The aim of the course is to introduce some of the basic algorithms and methods of statistical learning theory at an intermediate level. These are essential tools for making sense of the vast and complex data sets (cf. big data) that have emerged in recent decades in fields ranging from biology to marketing to astrophysics. The course presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, Bayesian learning, resampling methods, shrinkage approaches, tree-based methods, and clustering.
This covers a good part of the background required for a career in data analytics. The course is taught and examined in English.
Recommended prerequisites:
- SF 1901 or an equivalent course of the type 'a first course in probability and statistics (for engineers)'
- Multivariate normal distribution
- Basic differential and integral calculus, basic linear algebra.
Lecturers:
- Timo Koski (examiner)
- Pierre Nyquist
- Jimmy Olsson
- Tetyana Pavlenko
Course literature:
The textbook ISL can be bought at THS Kårbokhandel, Drottning Kristinas väg 15-19.
Examination:
- Computer homework (3.0 cu): there are two compulsory computer projects (homework assignments), each to be submitted as a written report. Each report should be written by a group of two (2) students. The reports are examined at the project presentation seminars on Tuesday the 22nd of November and Friday the 16th of December, 2016. The computer homework is graded Pass/Fail.
- There will be a written exam (4.5 cu), consisting of five (5) assignments, on Thursday 12th of January, 2017, 08.00-13.00.
- Bonus for summaries of the guest lectures and papers: an individually written summary (max. two A4 pages) of the scientific contents of a guest lecture (two by EA, one by LK) will provide one (1) bonus point for the exam. In addition, bonus points can be gained by written summaries of at most two scientific articles (TBA). Each summary is expected to be based on the student's own notes taken during the lecture or while reading the paper.
The summaries must be submitted by the deadline, Friday 16th of December at 15.00. The bonus points are valid for the ordinary exam on Thursday 12th of January, 2017, and for the re-examination on Tuesday 11th of April, 2017. The maximum number of bonus points to be gained is five (5).
- Important: Students who are admitted to a course and who intend to attend it need to activate themselves in Rapp. Log in there using your KTH-id and click on "activate" (aktivera). The codename for sf2935 in Rapp is statin15.
Registration for the written examination via "mina sidor"/"my pages" is required.
Grades are set according to the quality of the written examination. Grades are given in the range A-F, where A is the best and F means failed.
Fx means that you have the right to a complementary examination (to reach the grade E). The criterion for Fx is a grade F on the exam, together with an identifiable, isolated part of the course where you have shown a particular lack of knowledge, such that the grade E can be given after a complementary examination on this part.
Supervision for computer projects
Teaching assistant Felix Rios will be available for advice and supervision on the computer projects at times to be announced.
Plan of lectures
KTH Social .
(TK = Timo Koski, JO = Jimmy Olsson, TP = Tetyana Pavlenko, PN = Pierre Nyquist, EA = Erik Aurell, LK = Lukas Käll, FR = Felix Rios, ISL = the textbook)
The addresses of the lecture halls, with directions, are found by clicking on the Hall links below.
Day | Date | Time | Hall | Topic | Lecturer
Tue | 01/11 | 13-15 | E35 | Lecture 1: Introduction to statistical learning and the course work. Introduction to computer projects. Chapter 2 in ISL. | TK
Thu | 03/11 | 08-10 | E51 | Lecture 2: Multiple Regression reviewed and recollected. Chapter 3 in ISL. | TP
Fri | 04/11 | 10-12 | V33 | Lecture 3: Supervised Learning Part I. Chapter 4 in ISL. | TP
Mon | 07/11 | 15-17 | Baltzar | Lecture 4: Introduction to R in a computer class. Chapter 2 in ISL. | FR
Tue | 08/11 | 14-16 | H32 | Lecture 5: Supervised Learning Part II. Chapter 4 in ISL. | TP
Thu | 10/11 | 08-10 | E51 | Lecture 6: Supervised Learning Part III (logistic regression). Chapter 4 in ISL, handouts. | TK
Mon | 14/11 | 13-15 | V35 | Lecture 7: Bootstrap | TP
Tue | 15/11 | 13-15 | V3 | Lecture 8: Bayesian Learning part I. Handout. | TK
Thu | 17/11 | 08-10 | E31 | Lecture 9: Guest Lecture: An insight into computational and statistical mass spectrometry-based proteomics | LK
Tue | 22/11 | 13-15 | H32 | Lecture 10: Project presentation seminar 1 | TK, TP
Thu | 24/11 | 08-10 | H32 | Lecture 11: Bayesian Learning part II. Handout. | TK
Fri | 25/11 | 08-10 | TBA | Lecture 12: Cross-validation. Chapter 5 in ISL. | TP
Mon | 28/11 | 13-15 | V12 | Lecture 13: Linear model selection and regularization. Chapter 6 in ISL. | TP
Thu | 01/12 | 08-10 | D32 | Lecture 14: Support vector machines I. Chapter 9 in ISL. | PN
Fri | 02/12 | 13-15 | E31 | Lecture 15: Guest Lecture: Inferring protein structures from many protein sequences | EA
Tue | 06/12 | 13-15 | H32 | Lecture 16: Support vector machines II. Chapter 9 in ISL. | PN
Thu | 08/12 | 08-10 | V34 | Lecture 17: Unsupervised learning part I. Chapter 10 in ISL. | TK
Fri | 09/12 | 10-12 | H32 | Lecture 18: Unsupervised learning part II. Chapter 10 in ISL. | TK
Mon | 12/12 | 13-15 | D34 | Lecture 19: Random Trees and Classification. Chapter 8 in ISL. | JO
Tue | 13/12 | 13-15 | V33 | Lecture 20: Guest Lecture: Inferring protein structures from many protein sequences II | EA
Fri | 16/12 | 10-12 | E51 | Lecture 21: Project presentation seminar 2 | TK, TP
Thu | 12/01/2017 | 08-13 | B21 etc. | Exam | TK
Tue | 11/04/2017 | 08-13 | Q26, Q31 | Re-exam | TK
Welcome, we hope you will enjoy the course (and learn (sic) a lot)!
Tetyana, Jimmy & Timo
Published by: Timo Koski
Updated: 2016-11-1