STK-IN4300 – Statistical Learning Methods in Data Science
Book
The Elements of Statistical Learning by Hastie, Tibshirani and Friedman
Chapter 1: All
Chapter 2: Sections 2.1-2.6, Carmichael & Marron (2018)
Chapter 3: Sections 3.2-3.6 (not 3.5.2), 3.8
Chapter 4: Section 4.4.4, Zou (2006) (only the results described in the slides for lecture 4)
Chapter 5: Sections 5.1, 5.2, 5.4, 5.5, 5.7
Chapter 6: Sections 6.1-6.4, 6.6.1, 6.8, Hjort & Glad (1995) (only what is shown on the slides)
Chapter 7: Sections 7.1-7.5, 7.7
Chapter 8: Section 8.7
Chapter 9: Sections 9.1, 9.2
Chapter 10: Sections 10.1-10.5, 10.9, 10.10, 10.11, 10.12.1, Bühlmann & Yu (2003) (only supplementary material), Chen & Guestrin (2016) (only supplementary material)
Chapter 11: Schmidhuber (2015) (only supplementary material)
Chapter 15: Sections 15.1, 15.2
Chapter 16: Sections 16.1, 16.2
Chapter 18: Sections 18.1, 18.7
1 Introduction
Learning · Statistical learning
Supervised learning
Unsupervised learning
Prediction model · Learner
Classification
Regression
2 Overview of Supervised Learning
2.1 Introduction
Input · Predictor · Independent variable · Feature
Output · Response · Dependent variable
Qualitative variable · Categorical variable · Discrete variable · Factor
Quantitative variable
2.2 Variable Types and Terminology
Regression
Classification
Ordered categorical variable
Target
Dummy variable
Training data
2.3 Two Simple Approaches to Prediction
Linear model
Intercept · Bias
Least squares
Residual sum of squares · RSS
Normal equations
Linear classification model
Decision boundary
Mixture · Mixture model
\(k\)-nearest neighbour model · \(k\)-nearest neighbour fit
Voronoi tessellation
Effective number of parameters
Enhancement of linear and local models
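The two approaches in this section can be contrasted in a few lines of code. The following is a minimal sketch, assuming a single predictor and made-up toy data; the function names `least_squares_fit`, `rss` and `knn_fit` are hypothetical and not taken from the book. The linear model is fitted by solving the normal equations for the intercept and slope, while the \(k\)-nearest neighbour fit averages the responses of the \(k\) closest training points.

```python
# Illustrative sketch only: one predictor, toy data, hypothetical helper names.

def least_squares_fit(xs, ys):
    """Solve the 2x2 normal equations for intercept b0 and slope b1."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Normal equations:  n*b0  + sx*b1  = sy
    #                    sx*b0 + sxx*b1 = sxy
    det = n * sxx - sx * sx
    b0 = (sy * sxx - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1


def rss(xs, ys, b0, b1):
    """Residual sum of squares of the fitted line on the training data."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


def knn_fit(xs, ys, x0, k):
    """Average the responses of the k training points closest to x0."""
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x0))[:k]
    return sum(y for _, y in nearest) / k


# Toy training data lying exactly on the line y = 1 + 2x (an assumption
# made so that the fitted coefficients are easy to check by hand).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

b0, b1 = least_squares_fit(xs, ys)
print(b0, b1, rss(xs, ys, b0, b1))
print(knn_fit(xs, ys, 2.0, 3))
```

The linear model uses two parameters here regardless of the sample size, whereas the \(k\)-nearest neighbour fit has roughly \(N/k\) effective parameters, so it becomes more flexible as \(k\) decreases.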
2.4 Statistical Decision Theory
Decision theory
Loss function
Squared error loss
EPE · Expected prediction error
EPE minimized pointwise
Regression function
Squared error loss optimizer is conditional mean
kNN models conditional mean
kNN large sample properties
LS vs kNN
Additive model
\(L_1\) loss
\(L_1\) loss optimizer is conditional median
Categorical loss
Categorical loss matrix
Zero-one loss
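The step from squared error loss to the conditional mean, and from \(L_1\) loss to the conditional median, can be written out as a short derivation. This is a sketch in the notation of the book's Section 2.4, where \(c\) denotes a candidate prediction at a fixed point \(x\); the variance decomposition in the second line is the standard argument, added here to make the pointwise minimization explicit.

```latex
\begin{align*}
\mathrm{EPE}(f) &= \mathrm{E}\,(Y - f(X))^2
  = \mathrm{E}_X\, \mathrm{E}_{Y|X}\!\left( [Y - f(X)]^2 \mid X \right), \\
\mathrm{E}\!\left( [Y - c]^2 \mid X = x \right)
  &= \mathrm{Var}(Y \mid X = x) + \left( \mathrm{E}(Y \mid X = x) - c \right)^2, \\
f(x) &= \operatorname*{argmin}_c \mathrm{E}\!\left( [Y - c]^2 \mid X = x \right)
  = \mathrm{E}(Y \mid X = x), \\
\hat f(x) &= \operatorname{median}(Y \mid X = x)
  \quad \text{under the } L_1 \text{ loss } \mathrm{E}\,|Y - f(X)|.
\end{align*}
```

Because the outer expectation over \(X\) is an average of the inner conditional expectations, minimizing the inner term separately at each \(x\) minimizes the EPE as a whole.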
2.5 Local Methods in High Dimensions
Supervised learning as learning by example
2.6 Statistical Models, Supervised Learning and Function Approximation
Supervised learning as function approximation
Parameter
Linear basis expansion
Nonlinear basis expansion
Sigmoid transformation
Least squares
Maximum likelihood estimation
Maximum likelihood principle
Log-likelihood · Cross-entropy
LS and ML equivalent for normal additive model
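The equivalence of least squares and maximum likelihood under the normal additive model follows from writing out the log-likelihood. A sketch of the argument, assuming a training sample \((x_i, y_i)\), \(i = 1, \dots, N\), independent errors, and a fixed error variance \(\sigma^2\):

```latex
\begin{align*}
Y &= f_\theta(X) + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2), \\
L(\theta) &= \sum_{i=1}^N \log p_\theta(y_i \mid x_i)
  = -\frac{N}{2}\log(2\pi) - N\log\sigma
    - \frac{1}{2\sigma^2} \sum_{i=1}^N \left( y_i - f_\theta(x_i) \right)^2, \\
\operatorname*{argmax}_\theta L(\theta)
  &= \operatorname*{argmin}_\theta \sum_{i=1}^N \left( y_i - f_\theta(x_i) \right)^2
  = \operatorname*{argmin}_\theta \mathrm{RSS}(\theta).
\end{align*}
```

Only the last term of \(L(\theta)\) depends on \(\theta\), and it equals \(\mathrm{RSS}(\theta)\) up to a negative scalar factor, so the two criteria select the same \(\theta\).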