PROBABILITY AND STATISTICS FOR DATA SCIENCE

STM4PSD

2020

Credit points: 15

Subject outline

This subject develops an understanding of probability and statistics as applied to Data Science. Probability topics include joint and conditional probability, Bayes' Theorem, and distributions such as the uniform, binomial, Poisson and normal distributions, as well as properties of random variables and the Central Limit Theorem. Statistical inference and data analysis are also covered, including, among other topics, significance testing and confidence intervals, with an introduction to methods such as ANOVA, linear and nonlinear regression, and model verification. Applications to data science are considered, and students will be exposed to the R statistical package as well as the mathematical typesetting package LaTeX.

School: Engineering and Mathematical Sciences (Pre 2022)

Subject Co-ordinator: Mitra Jazayeri

Available to Study Abroad/Exchange Students: Yes

Subject year level: Year Level 4 - UG/Hons/1st Yr PG

Available as Elective: Yes

Learning Activities: N/A

Capstone subject: No

Subject particulars

Subject rules

Prerequisites: N/A

Co-requisites: N/A

Incompatible subjects: STA4SS OR STM4PM

Equivalent subjects: N/A

Quota Management Strategy: N/A

Quota-conditions or rules: N/A

Special conditions: This subject will be offered subject to sufficient enrolment numbers

Minimum credit point requirement: N/A

Assumed knowledge: N/A

Learning resources

Online learning materials

Resource Type: Web resource

Resource Requirement: Prescribed

Author: Luke Prendergast

Year: 2017

Edition/Volume: N/A

Publisher: La Trobe University

ISBN: N/A

Chapter/article title: N/A

Chapter/issue: N/A

URL: N/A

Other description: N/A

Source location: N/A

Career Ready

Career-focused: No

Work-based learning: No

Self sourced or Uni sourced: N/A

Entire subject or partial subject: N/A

Total hours/days required: N/A

Location of WBL activity (region): N/A

WBL additional requirements: N/A

Graduate capabilities & intended learning outcomes

Graduate Capabilities

Intended Learning Outcomes

01. Identify probabilistic traits of data science problems and choose methods which can be employed to determine valid and informative solutions.
02. Defend or question the validity of probability models applied to data science problems.
03. Demonstrate an ability to solve a variety of Data Science problems using applications of probability models.
04. Define a statistical hypothesis with applications to Data Science that may be tested using data.
05. Identify and apply statistical methods for hypothesis testing and estimation with applications in Data Science.
06. Present clear, well-structured summaries of findings, both probabilistic and data-based, using appropriate mathematical and statistical vocabulary.

Melbourne (Bundoora), 2020, Semester 1, Blended

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Mitra Jazayeri

Class requirements

Computer Laboratory
Week: 10 - 22
One 2-hour computer laboratory per week, on weekdays during the day, from week 10 to week 22, delivered face-to-face.

Unscheduled Online Class
Week: 10 - 22
One 2-hour unscheduled online class per week, on any day including weekends, during the day, from week 10 to week 22, delivered online.
Pre-recorded lecture.

Assessments

Assessment element: Four written assignments (500-words equivalent each, 2,000-words total)
Calculations and associated written discussion and conclusions.
Comments: N/A
Category: N/A
Contribution: N/A
Hurdle: No
%: 40
ILO*: SILO1, SILO2, SILO3, SILO4, SILO5, SILO6

Assessment element: 3 hour final exam (3000-words equivalent)
Following release of results, papers can be reviewed in accordance with University policy.
Comments: N/A
Category: N/A
Contribution: N/A
Hurdle: No
%: 60
ILO*: SILO1, SILO2, SILO3, SILO4, SILO5, SILO6

Melbourne (Bundoora), 2020, LTU Term 5, Blended

Overview

Online enrolment: Yes

Maximum enrolment size: N/A

Subject Instance Co-ordinator: Mitra Jazayeri

Class requirements

Computer Laboratory
Week: 37 - 42
Two 2-hour computer laboratories per week, on weekdays during the day, from week 37 to week 42, delivered online.
Two 2-hour labs per week.

Unscheduled Online Class
Week: 37 - 42
Two 2-hour unscheduled online classes per week, on any day including weekends, during the day, from week 37 to week 42, delivered online.
Online readings and videos, 2 x 2 hours per week.

Assessments

Assessment element: Four written assignments (500-words equivalent each, 2,000-words total)
Calculations and associated written discussion and conclusions.
Comments: N/A
Category: N/A
Contribution: N/A
Hurdle: No
%: 40
ILO*: SILO1, SILO2, SILO3, SILO4, SILO5, SILO6

Assessment element: 3 hour final exam (3000-words equivalent)
Following release of results, papers can be reviewed in accordance with University policy.
Comments: N/A
Category: N/A
Contribution: N/A
Hurdle: No
%: 60
ILO*: SILO1, SILO2, SILO3, SILO4, SILO5, SILO6