From Basic Machine Learning models to Advanced Kernel Learning


Course description

Statistical learning is about the construction and study of systems that can automatically learn from data. With the emergence of the massive datasets commonly encountered today, powerful machine learning methods are acutely needed. Examples of successful applications include effective web search, anti-spam software, computer vision, robotics, practical speech recognition, and a deeper understanding of the human genome.

This course gives an introduction to this exciting field. In the first part, we will introduce basic techniques such as logistic regression, multilayer perceptrons, and nearest neighbor approaches, from both a theoretical and a methodological point of view. In the second part, we will focus on more advanced techniques such as kernel methods, which are a versatile tool for representing data, in combination with (un)supervised learning techniques that are agnostic to the type of data being learned from. The learning techniques that will be covered include regression, classification, clustering, and dimension reduction. We will cover both the theoretical underpinnings of kernels and a series of kernels that are important in practical applications. Finally, we will touch upon topics of active research, such as large-scale kernel methods and the use of kernel methods to develop theoretical foundations of deep learning models.


Evaluation

Grading is based on (i) one final exam and (ii) two homework assignments.

Reading material

Optional exercises (ungraded)

Additional exercises can be found in this booklet.

Final exam

The final exam will take place in January 2025 and will last 2 hours (the location and precise date are to be confirmed). You are allowed to bring a single A4 sheet of handwritten notes.

Schedule

Lectures are scheduled from 9:45 to 11:15 on Mondays and from 8:15 to 9:45 on Thursdays. The first part of the class will recap basic supervised learning techniques, theory, and algorithms. Lecture notes or slides will be updated here on the fly.

You can download the slides for all lectures of the second part of the class (advanced kernel methods) here! Each lecture corresponds to a range of slides. Slides are frequently updated. Please let us know if you spot typos!

Homework 1. The first homework is due by Friday, November 22, 2024. The report can be written in English or in French. It is to be uploaded using the form here as a pdf report and a code file (.py or .ipynb). If the upload does not succeed (for some reason), send an email to but only after you have tried the upload.

# Date Room Teacher Topic
1 23/09/2024 C008 PG Supervised learning basics
2 26/09/2024 C008 PG Linear regression
3 30/09/2024 C008 MA Linear regression (solution exercise 2.1)
4 03/10/2024 C010 MA Logistic regression
5 07/10/2024 C003 PG Maximum likelihood (notebook)
10/10/2024 No Class
6 14/10/2024 C003 PG K-Nearest Neighbors
7 17/10/2024 C010 PG Lasso
8 21/10/2024 H101 PG Multilayer Perceptrons
9 24/10/2024 C010 JM Positive definite kernels and RKHS, slides 1-30
10 04/11/2024 C009 JM RKHSs, slides 31-47
11 07/11/2024 online JM (online) Smoothness functional, kernel tricks, slides 48-61
(1) Try to do the quiz on slide 48
(2) Watch the last 7 minutes of this video.
(3) Watch the first 16 minutes of this video.
(4) The interactive session will be at 9am on Zoom, meeting ID 937 8058 5239, using the password given during class.
12 14/11/2024 C010 MA
13 18/11/2024 C010 MA
14 21/11/2024 C010 MA
15 25/11/2024 C010 MA
16 28/11/2024 C010 MA
17 02/12/2024 C010 JM
18 05/12/2024 C010 PG
19 09/12/2024 C010 JM
20 12/12/2024 C010 JM
21 16/12/2024 C010 JM
22 19/12/2024 C010 JM
23 06/01/2025 C010 MA
24 09/01/2025 C010 MA
# End/01/2025 Exam

Instructors

Contact: firstname.lastname@inria.fr