DATA EXPLORATION AND ANALYSIS

CSE5DEV

2020

Credit points: 15

Subject outline

The goal of this subject is to provide you with the specialist knowledge and tools required to formulate solutions to complex data problems encountered by data scientists. You will learn various data exploration techniques and analysis tools. Selected problems include numerical data exploration, data cleaning and normalization, data reduction, clustering analysis and predictive analysis. One or more applications associated with each problem will also be discussed. To solve these problems, you will learn the fundamentals of exploratory data analysis, data reduction tools, statistical learning, logistic regression and predictive analysis. You will also learn how to implement data exploration methods and analysis tools using the R programming language.

School: Engineering and Mathematical Sciences (Pre 2022)

Subject Co-ordinator: Nasser Sabar

Available to Study Abroad/Exchange Students: Yes

Subject year level: Year Level 5 - Masters

Available as Elective: No

Learning Activities: N/A

Capstone subject: No

Subject particulars

Subject rules

Prerequisites: CSE4DBF or MAT4NLA, or admission into one of the following courses: SMIOTB

Co-requisites: N/A

Incompatible subjects: N/A

Equivalent subjects: N/A

Quota Management Strategy: N/A

Quota-conditions or rules: N/A

Special conditions: N/A

Minimum credit point requirement: N/A

Assumed knowledge: N/A

Learning resources

The Elements of Statistical Learning: Data Mining, Inference, and Prediction

Resource Type: Book

Resource Requirement: Prescribed

Author: Trevor Hastie, Robert Tibshirani, Jerome Friedman

Year: 2009

Edition/Volume: N/A

Publisher: Springer

ISBN: N/A

Chapter/article title: N/A

Chapter/issue: N/A

URL: N/A

Other description: N/A

Source location: N/A

Data Analysis with R

Resource Type: Book

Resource Requirement: Recommended

Author: Tony Fischetti

Year: 2015

Edition/Volume: N/A

Publisher: Packt Publishing

ISBN: N/A

Chapter/article title: N/A

Chapter/issue: N/A

URL: N/A

Other description: N/A

Source location: N/A

Evolving Fuzzy Systems --- Fundamentals, Reliability, Interpretability, Useability, Applications

Resource Type: Book

Resource Requirement: Recommended

Author: Edwin Lughofer

Year: 2011

Edition/Volume: N/A

Publisher: Springer

ISBN: N/A

Chapter/article title: N/A

Chapter/issue: N/A

URL: N/A

Other description: N/A

Source location: N/A

Think Stats: Exploratory Data Analysis

Resource Type: Book

Resource Requirement: Recommended

Author: Allen B. Downey

Year: 2011

Edition/Volume: N/A

Publisher: Amazon

ISBN: N/A

Chapter/article title: N/A

Chapter/issue: N/A

URL: N/A

Other description: N/A

Source location: N/A

Machine Learning: A Probabilistic Perspective

Resource Type: Book

Resource Requirement: Prescribed

Author: Kevin P. Murphy

Year: 2012

Edition/Volume: N/A

Publisher: The MIT Press

ISBN: N/A

Chapter/article title: N/A

Chapter/issue: N/A

URL: N/A

Other description: N/A

Source location: N/A

Data Mining: Practical Machine Learning Tools and Techniques

Resource Type: Book

Resource Requirement: Recommended

Author: Ian H. Witten, Eibe Frank, Mark A. Hall

Year: 2006

Edition/Volume: N/A

Publisher: Morgan Kaufmann

ISBN: N/A

Chapter/article title: N/A

Chapter/issue: N/A

URL: N/A

Other description: N/A

Source location: N/A

Career Ready

Career-focused: No

Work-based learning: No

Self sourced or Uni sourced: N/A

Entire subject or partial subject: N/A

Total hours/days required: N/A

Location of WBL activity (region): N/A

WBL additional requirements: N/A

Graduate capabilities & intended learning outcomes

Graduate Capabilities

Intended Learning Outcomes

01. Investigate and critically analyse common problems encountered by data scientists in practice.
02. Formulate comprehensive solutions to data science problems.
03. Effectively construct data analytics tools for application to complex data sets.
04. Develop comprehensive data reduction and data cleaning techniques for application to dimensionality problems.
05. Critically evaluate the performance of data exploration and data analysis techniques.

Bendigo, 2020, Semester 2, Day

Overview

Online enrolment: No

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Nasser Sabar

Class requirements

Computer Laboratory
Week: 31 - 43
One 2-hour computer laboratory per week on weekdays during the day from week 31 to week 43, delivered face-to-face.

Lecture
Week: 31 - 43
One 1-hour lecture per week on weekdays during the day from week 31 to week 43, delivered face-to-face.

Assessments

Assessment element: Assignment on data exploration (equivalent to 1,300 words); written report
Comments: N/A
Category: N/A
Contribution: N/A
Hurdle: No
%: 25
ILO*: SILO1, SILO2, SILO4

Assessment element: Assignment on data analysis (equivalent to 1,300 words); written report
Comments: N/A
Category: N/A
Contribution: N/A
Hurdle: No
%: 25
ILO*: SILO1, SILO2, SILO3, SILO4

Assessment element: One 2-hour examination equivalent to 2,000 words
Comments: N/A
Category: N/A
Contribution: N/A
Hurdle: No
%: 50
ILO*: SILO1, SILO2, SILO4, SILO5

Melbourne (Bundoora), 2020, Semester 2, Day

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Nasser Sabar

Class requirements

Computer Laboratory
Week: 31 - 43
One 2-hour computer laboratory per week on weekdays during the day from week 31 to week 43, delivered face-to-face.

Lecture
Week: 31 - 43
One 2-hour lecture per week on weekdays during the day from week 31 to week 43, delivered face-to-face.

Assessments

Assessment element: Assignment on data exploration (equivalent to 1,300 words); written report
Comments: N/A
Category: N/A
Contribution: N/A
Hurdle: No
%: 25
ILO*: SILO1, SILO2, SILO4

Assessment element: Assignment on data analysis (equivalent to 1,300 words); written report
Comments: N/A
Category: N/A
Contribution: N/A
Hurdle: No
%: 25
ILO*: SILO1, SILO2, SILO3, SILO4

Assessment element: One 2-hour examination equivalent to 2,000 words
Comments: N/A
Category: N/A
Contribution: N/A
Hurdle: No
%: 50
ILO*: SILO1, SILO2, SILO4, SILO5