BIG DATA MANAGEMENT ON THE CLOUD

CSE3BGX

2020

Credit points: 15

Subject outline

Companies are acquiring massive amounts of data while also providing internet-based services to millions of people. This is extremely challenging because of the scale of the data involved and the huge number of concurrent user requests. In this subject we will study current state-of-the-art technologies for analysing huge amounts of data and responding to millions of user requests within a second. Currently, the most cost-efficient way of achieving this is to use large-scale cloud-based services offered by vendors such as Amazon, Google, IBM and Microsoft. We will study how to use the cloud services provided by these vendors to meet the big data needs of businesses. In particular, this subject covers the following topics: cloud architectures, parallel database systems, MapReduce, key-value stores, transaction support in the cloud, virtualization, and multi-tenant database systems.

School: Engineering and Mathematical Sciences (Pre 2022)

Subject Co-ordinator: Zhen He

Available to Study Abroad/Exchange Students: No

Subject year level: Year Level 3 - UG

Available as Elective: No

Learning Activities: N/A

Capstone subject: No

Subject particulars

Subject rules

Prerequisites: (CSE2DBX OR CSE1OFX OR CSE2DCX) AND CSE1IOX
Students must be admitted in one of the following courses: SBAIO, SBACTO

Co-requisites: N/A

Incompatible subjects: N/A

Equivalent subjects: N/A

Quota Management Strategy: N/A

Quota-conditions or rules: N/A

Special conditions: N/A

Minimum credit point requirement: N/A

Assumed knowledge: N/A

Learning resources

Hadoop in Action

Resource Type: Book

Resource Requirement: Recommended

Author: Lam C., Davis M., Gaddam A.

Year: 2016

Edition/Volume: N/A

Publisher: Manning

ISBN: N/A

Chapter/article title: N/A

Chapter/issue: N/A

URL: N/A

Other description: N/A

Source location: N/A

Hadoop: The Definitive Guide

Resource Type: Book

Resource Requirement: Recommended

Author: White T.

Year: 2015

Edition/Volume: N/A

Publisher: O'Reilly Media

ISBN: N/A

Chapter/article title: N/A

Chapter/issue: N/A

URL: N/A

Other description: N/A

Source location: N/A

Big Data Management on the Cloud

Resource Type: Book

Resource Requirement: Recommended

Author: Didasko Digital

Year: 2018

Edition/Volume: N/A

Publisher: Didasko

ISBN: N/A

Chapter/article title: N/A

Chapter/issue: N/A

URL: N/A

Other description: N/A

Source location: N/A

Career Ready

Career-focused: No

Work-based learning: No

Self sourced or Uni sourced: N/A

Entire subject or partial subject: N/A

Total hours/days required: N/A

Location of WBL activity (region): N/A

WBL additional requirements: N/A

Graduate capabilities & intended learning outcomes

Graduate Capabilities

Intended Learning Outcomes

01. Compare and contrast the benefits of using cloud computing over traditional methods of managing big data for clients.
02. Critically evaluate the best type of cloud-based service to use for a particular application scenario.
03. Design and develop efficient frameworks such as MapReduce to analyse large data sets.
04. Implement cloud-hosted database systems on a cloud computing platform.
05. Develop efficient programs that query cloud-hosted database systems.

Online (Didasko), 2020, Study block 1, Online

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Zhen He

Class requirements

Unscheduled Online Class
Week: 2 - 13
One 3-hour unscheduled online class per week, on any day including weekends, during the day, from week 2 to week 13, delivered online.

Assessments

Assessment element: Online test (30 minutes) (equivalent to 500 words)
Multiple-choice and/or short answer questions on cloud computing. Test to be conducted in Week 5.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 15; ILO*: SILO1, SILO2

Assessment element: Practical scenario-based report on processing of big data sets (equivalent to 1500 words)
A practical scenario-based report on processing of big data sets using MapReduce programming model.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO3

Assessment element: Practical scenario-based report on cloud-hosted database systems (equivalent to 1500 words)
A practical scenario-based report on setting up and querying cloud-based database systems
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO4, SILO5

Assessment element: Online subject test (60 minutes) (equivalent to 1000 words)
Multiple-choice and/or short answer questions test that covers the theoretical knowledge
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 25; ILO*: SILO3, SILO4, SILO5

Online (Didasko), 2020, Study block 10, Online

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Zhen He

Class requirements

Unscheduled Online Class
Week: 41 - 52
One 3-hour unscheduled online class per week, on any day including weekends, during the day, from week 41 to week 52, delivered online.

Assessments

Assessment element: Online test (30 minutes) (equivalent to 500 words)
Multiple-choice and/or short answer questions on cloud computing. Test to be conducted in Week 5.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 15; ILO*: SILO1, SILO2

Assessment element: Practical scenario-based report on processing of big data sets (equivalent to 1500 words)
A practical scenario-based report on processing of big data sets using MapReduce programming model.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO3

Assessment element: Practical scenario-based report on cloud-hosted database systems (equivalent to 1500 words)
A practical scenario-based report on setting up and querying cloud-based database systems
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO4, SILO5

Assessment element: Online subject test (60 minutes) (equivalent to 1000 words)
Multiple-choice and/or short answer questions test that covers the theoretical knowledge
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 25; ILO*: SILO3, SILO4, SILO5

Online (Didasko), 2020, Study block 11, Online

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Zhen He

Class requirements

Unscheduled Online Class
Week: 45 - 0
One 3-hour unscheduled online class per week, on any day including weekends, during the day, from week 45 to week 0, delivered online.

Assessments

Assessment element: Online test (30 minutes) (equivalent to 500 words)
Multiple-choice and/or short answer questions on cloud computing. Test to be conducted in Week 5.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 15; ILO*: SILO1, SILO2

Assessment element: Practical scenario-based report on processing of big data sets (equivalent to 1500 words)
A practical scenario-based report on processing of big data sets using MapReduce programming model.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO3

Assessment element: Practical scenario-based report on cloud-hosted database systems (equivalent to 1500 words)
A practical scenario-based report on setting up and querying cloud-based database systems
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO4, SILO5

Assessment element: Online subject test (60 minutes) (equivalent to 1000 words)
Multiple-choice and/or short answer questions test that covers the theoretical knowledge
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 25; ILO*: SILO3, SILO4, SILO5

Online (Didasko), 2020, Study block 12, Online

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Zhen He

Class requirements

Unscheduled Online Class
Week: 49 - 0
One 3-hour unscheduled online class per week, on any day including weekends, during the day, from week 49 to week 0, delivered online.

Assessments

Assessment element: Online test (30 minutes) (equivalent to 500 words)
Multiple-choice and/or short answer questions on cloud computing. Test to be conducted in Week 5.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 15; ILO*: SILO1, SILO2

Assessment element: Practical scenario-based report on processing of big data sets (equivalent to 1500 words)
A practical scenario-based report on processing of big data sets using MapReduce programming model.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO3

Assessment element: Practical scenario-based report on cloud-hosted database systems (equivalent to 1500 words)
A practical scenario-based report on setting up and querying cloud-based database systems
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO4, SILO5

Assessment element: Online subject test (60 minutes) (equivalent to 1000 words)
Multiple-choice and/or short answer questions test that covers the theoretical knowledge
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 25; ILO*: SILO3, SILO4, SILO5

Online (Didasko), 2020, Study block 2, Online

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Zhen He

Class requirements

Unscheduled Online Class
Week: 6 - 17
One 3-hour unscheduled online class per week, on any day including weekends, during the day, from week 6 to week 17, delivered online.

Assessments

Assessment element: Online test (30 minutes) (equivalent to 500 words)
Multiple-choice and/or short answer questions on cloud computing. Test to be conducted in Week 5.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 15; ILO*: SILO1, SILO2

Assessment element: Practical scenario-based report on processing of big data sets (equivalent to 1500 words)
A practical scenario-based report on processing of big data sets using MapReduce programming model.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO3

Assessment element: Practical scenario-based report on cloud-hosted database systems (equivalent to 1500 words)
A practical scenario-based report on setting up and querying cloud-based database systems
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO4, SILO5

Assessment element: Online subject test (60 minutes) (equivalent to 1000 words)
Multiple-choice and/or short answer questions test that covers the theoretical knowledge
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 25; ILO*: SILO3, SILO4, SILO5

Online (Didasko), 2020, Study block 3, Online

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Zhen He

Class requirements

Unscheduled Online Class
Week: 10 - 21
One 3-hour unscheduled online class per week, on any day including weekends, during the day, from week 10 to week 21, delivered online.

Assessments

Assessment element: Online test (30 minutes) (equivalent to 500 words)
Multiple-choice and/or short answer questions on cloud computing. Test to be conducted in Week 5.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 15; ILO*: SILO1, SILO2

Assessment element: Practical scenario-based report on processing of big data sets (equivalent to 1500 words)
A practical scenario-based report on processing of big data sets using MapReduce programming model.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO3

Assessment element: Practical scenario-based report on cloud-hosted database systems (equivalent to 1500 words)
A practical scenario-based report on setting up and querying cloud-based database systems
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO4, SILO5

Assessment element: Online subject test (60 minutes) (equivalent to 1000 words)
Multiple-choice and/or short answer questions test that covers the theoretical knowledge
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 25; ILO*: SILO3, SILO4, SILO5

Online (Didasko), 2020, Study block 4, Online

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Zhen He

Class requirements

Unscheduled Online Class
Week: 14 - 25
One 3-hour unscheduled online class per week, on any day including weekends, during the day, from week 14 to week 25, delivered online.

Assessments

Assessment element: Online test (30 minutes) (equivalent to 500 words)
Multiple-choice and/or short answer questions on cloud computing. Test to be conducted in Week 5.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 15; ILO*: SILO1, SILO2

Assessment element: Practical scenario-based report on processing of big data sets (equivalent to 1500 words)
A practical scenario-based report on processing of big data sets using MapReduce programming model.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO3

Assessment element: Practical scenario-based report on cloud-hosted database systems (equivalent to 1500 words)
A practical scenario-based report on setting up and querying cloud-based database systems
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO4, SILO5

Assessment element: Online subject test (60 minutes) (equivalent to 1000 words)
Multiple-choice and/or short answer questions test that covers the theoretical knowledge
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 25; ILO*: SILO3, SILO4, SILO5

Online (Didasko), 2020, Study block 5, Online

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Zhen He

Class requirements

Unscheduled Online Class
Week: 19 - 30
One 3-hour unscheduled online class per week, on any day including weekends, during the day, from week 19 to week 30, delivered online.

Assessments

Assessment element: Online test (30 minutes) (equivalent to 500 words)
Multiple-choice and/or short answer questions on cloud computing. Test to be conducted in Week 5.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 15; ILO*: SILO1, SILO2

Assessment element: Practical scenario-based report on processing of big data sets (equivalent to 1500 words)
A practical scenario-based report on processing of big data sets using MapReduce programming model.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO3

Assessment element: Practical scenario-based report on cloud-hosted database systems (equivalent to 1500 words)
A practical scenario-based report on setting up and querying cloud-based database systems
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO4, SILO5

Assessment element: Online subject test (60 minutes) (equivalent to 1000 words)
Multiple-choice and/or short answer questions test that covers the theoretical knowledge
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 25; ILO*: SILO3, SILO4, SILO5

Online (Didasko), 2020, Study block 6, Online

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Zhen He

Class requirements

Unscheduled Online Class
Week: 23 - 34
One 3-hour unscheduled online class per week, on any day including weekends, during the day, from week 23 to week 34, delivered online.

Assessments

Assessment element: Online test (30 minutes) (equivalent to 500 words)
Multiple-choice and/or short answer questions on cloud computing. Test to be conducted in Week 5.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 15; ILO*: SILO1, SILO2

Assessment element: Practical scenario-based report on processing of big data sets (equivalent to 1500 words)
A practical scenario-based report on processing of big data sets using MapReduce programming model.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO3

Assessment element: Practical scenario-based report on cloud-hosted database systems (equivalent to 1500 words)
A practical scenario-based report on setting up and querying cloud-based database systems
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO4, SILO5

Assessment element: Online subject test (60 minutes) (equivalent to 1000 words)
Multiple-choice and/or short answer questions test that covers the theoretical knowledge
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 25; ILO*: SILO3, SILO4, SILO5

Online (Didasko), 2020, Study block 7, Online

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Zhen He

Class requirements

Unscheduled Online Class
Week: 27 - 38
One 3-hour unscheduled online class per week, on any day including weekends, during the day, from week 27 to week 38, delivered online.

Assessments

Assessment element: Online test (30 minutes) (equivalent to 500 words)
Multiple-choice and/or short answer questions on cloud computing. Test to be conducted in Week 5.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 15; ILO*: SILO1, SILO2

Assessment element: Practical scenario-based report on processing of big data sets (equivalent to 1500 words)
A practical scenario-based report on processing of big data sets using MapReduce programming model.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO3

Assessment element: Practical scenario-based report on cloud-hosted database systems (equivalent to 1500 words)
A practical scenario-based report on setting up and querying cloud-based database systems
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO4, SILO5

Assessment element: Online subject test (60 minutes) (equivalent to 1000 words)
Multiple-choice and/or short answer questions test that covers the theoretical knowledge
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 25; ILO*: SILO3, SILO4, SILO5

Online (Didasko), 2020, Study block 8, Online

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Zhen He

Class requirements

Unscheduled Online Class
Week: 32 - 43
One 3-hour unscheduled online class per week, on any day including weekends, during the day, from week 32 to week 43, delivered online.

Assessments

Assessment element: Online test (30 minutes) (equivalent to 500 words)
Multiple-choice and/or short answer questions on cloud computing. Test to be conducted in Week 5.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 15; ILO*: SILO1, SILO2

Assessment element: Practical scenario-based report on processing of big data sets (equivalent to 1500 words)
A practical scenario-based report on processing of big data sets using MapReduce programming model.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO3

Assessment element: Practical scenario-based report on cloud-hosted database systems (equivalent to 1500 words)
A practical scenario-based report on setting up and querying cloud-based database systems
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO4, SILO5

Assessment element: Online subject test (60 minutes) (equivalent to 1000 words)
Multiple-choice and/or short answer questions test that covers the theoretical knowledge
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 25; ILO*: SILO3, SILO4, SILO5

Online (Didasko), 2020, Study block 9, Online

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Zhen He

Class requirements

Unscheduled Online Class
Week: 36 - 47
One 3-hour unscheduled online class per week, on any day including weekends, during the day, from week 36 to week 47, delivered online.

Assessments

Assessment element: Online test (30 minutes) (equivalent to 500 words)
Multiple-choice and/or short answer questions on cloud computing. Test to be conducted in Week 5.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 15; ILO*: SILO1, SILO2

Assessment element: Practical scenario-based report on processing of big data sets (equivalent to 1500 words)
A practical scenario-based report on processing of big data sets using MapReduce programming model.
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO3

Assessment element: Practical scenario-based report on cloud-hosted database systems (equivalent to 1500 words)
A practical scenario-based report on setting up and querying cloud-based database systems
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 30; ILO*: SILO4, SILO5

Assessment element: Online subject test (60 minutes) (equivalent to 1000 words)
Multiple-choice and/or short answer questions test that covers the theoretical knowledge
Comments: N/A; Category: N/A; Contribution: N/A; Hurdle: No; %: 25; ILO*: SILO3, SILO4, SILO5