From Basic Machine Learning models to Advanced Kernel Learning (2023

Course description

Statistical learning is about the construction and study of systems that can automatically learn from data. With the emergence of massive datasets commonly encountered today, the need for powerful machine learning is of acute importance. Examples of successful applications include effective web search, anti-spam software, computer vision, robotics, practical speech recognition, and a deeper understanding of the human genome.

This course gives an introduction to this exciting field. In the first part, we will introduce basic techniques such as logistic regression, multilayer perceptrons, and nearest neighbor approaches, from both a theoretical and a methodological point of view. In the second part, we will focus on more advanced techniques such as kernel methods, a versatile tool for representing data, in combination with (un)supervised learning techniques that are agnostic to the type of data being learned from. The learning techniques covered include regression, classification, clustering, and dimension reduction. We will cover both the theoretical underpinnings of kernels and a series of kernels that are important in practical applications. Finally, we will touch upon topics of active research, such as large-scale kernel methods and the use of kernel methods to develop theoretical foundations of deep learning models.

Evaluation

Grading is based on (i) one final exam and (ii) two homework assignments.

Reading material

Vapnik, The nature of statistical learning theory. Springer.
Hastie, Tibshirani, Friedman, The elements of statistical learning. (free online).
Shawe-Taylor, Cristianini. Kernel methods for pattern analysis. 2004.
Francis Bach, Learning Theory from First Principles, 2023.

Optional exercises (ungraded)

Additional exercises can be found in this booklet.

Final exam

The final exam will take place on the 31st of January 2024 from 2pm to 4pm (the location is to be confirmed). You are allowed to bring one single A4 sheet of handwritten notes.

Schedule

Lectures are scheduled from 9:45 to 11:15 on Mondays and Thursdays. The first part of the class will recap basic supervised learning techniques, theory, and algorithms. Lecture notes or slides will be updated here on the fly.

You can download the slides for all lectures of the second part of the class (advanced kernel methods) here! Each lecture corresponds to a range of slides. Slides are frequently updated. Please let us know if you spot typos!

Homework 1. The first homework (data) is due by Friday, November 24, 2023. The report can be written in English or in French. It is to be uploaded using the form here as a pdf report and a code file (.py or .ipynb). If the upload does not succeed (for some reason), send an email to but only after you have tried the upload.

Homework 2. The second homework is due by Thursday, January 09, 2024 at 9:00 AM. Students must submit a single PDF on the GradeScope platform. Please follow the instructions to create an account on GradeScope and to submit the homework. To access the course platform, you need to provide the following course entry code: WBK2JJ. For questions about the homework, please send an email to "michael.arbel@inria.fr"; the subject of the email must start with the course entry code.

Previous exams: 2023.

#	Date	Room	Teacher	Topic
1	25/09/2023	D213	PG	Supervised learning basics
2	28/09/2023	D111	PG	Linear regression
3	02/10/2023	Amphi D-D-123	PG	Linear regression (solution exercise 2.1)
4	05/10/2023	D109	PG	Logistic regression
5	09/10/2023	D213	PG	Maximum likelihood (notebook)
6	12/10/2023	D109	PG	K-Nearest Neighbors
7	16/10/2023	H105	PG	Lasso
8	19/10/2023	Amphi D-D123	PG	Multilayer Perceptrons
9	23/10/2023	H105	JM	Positive definite kernels, slides 1-23
10	26/10/2023	H105	JM	RKHS, slides 25-55
11	06/11/2023	H105	JM	Kernel tricks, slides 56-69
12	09/11/2023	D109	JM	Representer theorem, kernel ridge regression, slides 72-94
13	13/11/2023	H105	JM	Kernel logistic regression, slides 95-111
14	16/11/2023	D111	JM	Large margin classifiers, machine learning theory, slides 112-129
15	20/11/2023	D111	MA	Unsupervised learning, slides 163-
16	23/11/2023	D109	JM	Support Vector Machines, slides 130-162
17	27/11/2023	H105	JM
18	30/11/2023	D109	MA
19	04/12/2023	H105	MA
20	07/12/2023	H105	MA
21	11/12/2023	H105	MA
22	14/12/2023	H105	MA
23	18/12/2023	H105	MA
24	21/12/2023	H105	MA
#	31/01/2024			Exam

Instructors

Julien Mairal

Pierre Gaillard

Michael Arbel