Statistical learning is about the construction and study of systems that can automatically learn from data. With the massive datasets commonly encountered today, the need for powerful machine learning methods is acute. Examples of successful applications include effective web search, anti-spam software, computer vision, robotics, practical speech recognition, and a deeper understanding of the human genome.

This course gives an introduction to this exciting field. In the first part, we will introduce basic techniques such as logistic regression, multilayer perceptrons, and nearest-neighbor approaches, from both a theoretical and a methodological point of view. In the second part, we will focus on more advanced techniques such as kernel methods, which are a versatile tool for representing data, in combination with (un)supervised learning techniques that are agnostic to the type of data being learned from. The learning techniques covered include regression, classification, clustering, and dimension reduction. We will cover both the theoretical underpinnings of kernels and a series of kernels that are important in practical applications. Finally, we will touch upon topics of active research, such as large-scale kernel methods and the use of kernel methods to develop theoretical foundations of deep learning models.

Grading is based on (i) one final exam and (ii) two homework assignments.

- Vapnik, The nature of statistical learning theory. Springer.
- Hastie, Tibshirani, Friedman, The elements of statistical learning. (free online).
- Shawe-Taylor, Cristianini. Kernel methods for pattern analysis. 2004.
- Francis Bach, Learning Theory from First Principles, 2022.

Lectures are scheduled from 9:45 to 11:15 on Mondays and from 8:15 to 9:45 on Thursdays. The first part of the class will recap basic supervised learning techniques, theory, and algorithms. Lecture notes or slides will be updated here on the fly.

You can download the slides for all lectures of the second part of the class (advanced kernel methods) here! Each lecture corresponds to a range of slides. Slides are frequently updated. Please let us know if you spot typos!

**Homeworks.**
The first homework (data) is due by **Monday, November 21, 2022**. It is to be sent by email to Pierre Gaillard as a PDF file containing all results and figures, together with the code file (e.g., a notebook or Python file). The report can be written in English or in French.

| # | Date | Room | Teacher | Topic |
|---|---|---|---|---|
| 1 | 26/09/2022 | H102 | PG | Supervised learning basics |
| 2 | 29/09/2022 | H204 | PG | Linear regression |
| 3 | 03/10/2022 | D117 | PG | Linear regression |
| 4 | 06/10/2022 | H206 | PG | Logistic regression |
| 5 | 10/10/2022 | H203 | PG | Logistic regression and Maximum likelihood |
| 6 | 17/10/2022 | H105 | PG | Maximum likelihood and K-Nearest Neighbors |
| 7 | 20/10/2022 | H204 | PG | K-Nearest Neighbors (notebook) |
| 8 | 24/10/2022 | H104 | PG | Multilayer Perceptrons |
| 9 | 27/10/2022 | H204 | MA | Positive definite kernels, slides 1-23 (Homework 1) |
| 10 | 07/11/2022 | H103 | MA | RKHS I, slides 25-55 |
| 11 | 10/11/2022 | H206 | MA | RKHS II, slides 25-55 |
| 12 | 14/11/2022 | H105 | MA | Kernel tricks, slides 56-82 |
| 13 | 17/11/2022 | D111 | MA | Kernel Ridge regression, slides 83-103 |
| 14 | 21/11/2022 | H103 | MA | (Homework 1 due date) Large-margin classifiers, slides 115-132. Supplementary material: Rademacher complexity |
| 15 | 24/11/2022 | H103 | JM | Support Vector Machines, slides 134-165 |
| 16 | 28/11/2022 | Video only | JM | Kernel PCA, slides 166-202, Video I |
| 17 | 01/12/2022 | Video only | JM | Kernel K-means, kernel CCA, slides 166-202, Video II |
| 18 | 05/12/2022 | H104 | PG | Q&A on the videos. Kernel Jungle, slides 203-... |
| 19 | 08/12/2022 | H102 | JM | |
| 20 | 12/12/2022 | H205 | MA | |
| 21 | 15/12/2022 | H202 | MA | |
| 22 | 02/01/2023 | H204 | JM | |
| 23 | 05/01/2023 | H203 | JM | |
| 24 | 12/01/2023 | D211 | MA | |
| | Week of 23/01 | | | Exam |