This page is intended for students following the TC0 class (in the M1 of CS, « parcours IA » (AI) of Paris Saclay).

See http://lptms.u-psud.fr/francois-landes/machine-learning-resources/ for **resources and materials** to self-train or to complete this course’s material.

Course material: https://drive.google.com/drive/folders/1xqIHX8oOYE8u5rl-9O1LHVBwII-uJscn?usp=sharing

Lecture: **NEW LINK !** https://eu.bbcollab.com/guest/3b4bf65215e84ca5a41be4cf88b8573d (although everyone should have the link !) ~~https://eu.bbcollab.com/guest/3e35006e16c445a9b63685d87b8328c2~~

NOTE ! Next time we should try discord for the tutorial sessions, and have a single big group of everybody, so you are free to choose your partner in the class.

**NOTE ! About MCQ at beginning of the class: most likely yes, there will be one. I will tell you in advance by email. It will deal with the material covered in the Lecture notes (the annotated notes are now available online)**

For tutorials, group #1 & #2 -> moving to discord.

Title: **INTRODUCTION TO MACHINE LEARNING**

**Teachers:** François LANDES francois.landes@u-psud.fr (Lecture + 1 group) + Giancarlo Fissore (1 group)

**IMPORTANT –** the courses will be fully online, until further notice. *(My initial plan was to have the first course as much in person as possible.)* 🙁

**IMPORTANT 2 –** make sure you have an **updated version of python3 and jupyter-notebook**, with at least *numpy, scipy, matplotlib* **installed. Shortly we will also need** *sklearn (scikit-learn)*, possibly *pandas*. *Seaborn* is always nice to have (I am not an expert in it).
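A quick way to see which of these packages your interpreter can find is sketched below. This is only a minimal check, not an official setup script: the helper name `check_packages` is made up for this page, and the reported version relies on each package exposing a `__version__` attribute, which is a convention rather than a guarantee.

```python
import importlib
import importlib.util
import sys


def check_packages(names):
    """Return {import_name: version string, "unknown", or None if not installed}."""
    report = {}
    for name in names:
        # find_spec returns None when the top-level module cannot be found
        if importlib.util.find_spec(name) is None:
            report[name] = None
            continue
        module = importlib.import_module(name)
        report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    wanted = ["numpy", "scipy", "matplotlib", "sklearn", "pandas", "seaborn"]
    for name, version in check_packages(wanted).items():
        print(f"{name:12s}", version if version else "MISSING")
```

Note that scikit-learn is imported as `sklearn`, which is why that name appears in the list.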

**Alternative Solution 1:** Use https://jupytercloud.lal.in2p3.fr/ . Use your institutional (Paris-Saclay, typically) account to connect for the first time. This will open a jupyter-notebook work session that runs in the cloud, or more precisely, on the servers of the LAL (Laboratoire de l’Accélérateur Linéaire). You can click on the blue « upload » button in the top right corner to import a notebook file onto the cloud, and then edit and run it online. Your files are kept there over time.

**Alternative Solution 2 (worse):** same thing, but using https://colab.research.google.com/notebooks/intro.ipynb instead (downsides: it’s Google, you need an account, and data privacy is poor).

**Pre-requisites:**

PRE1,PRE2,PRE3,PRE4.

Very strongly recommended: also follow TC2 and OPT9 (during the same term, T2).

**Follow-ups:**

This course is mandatory for the students of the “parcours AI”. It is necessary, or at least very strongly recommended, in order to follow most of the subsequent AI courses.

**General description:**

This course will be the *theoretical counterpart* of OPT9 (Datacomp, applying ML to concrete projects, which is more hands-on than this course). It is algorithms-oriented, i.e. we will sketch the great principles of ML, but focus on how algorithms work in practice, *including all necessary mathematical aspects*.

Assuming a *knowledge of fundamental maths notions* (Bayesian inference, Algebra, Analysis, some optimization), we will *cover the inner workings of ML algorithms in detail*. Beyond their technical implementation, we will also explain their theoretical foundations (mathematical definitions, limits, when and why they fail or work, etc).

The course will be supported by pen-and-paper sessions and lab sessions, in groups of ~20, where we will re-code and play with algorithms, using Python.

**Typical organization:** 1h30 of Lecture, then 2h00 of tutorials (either paper-pencil, or lab session).

We will use python to code. If you don’t have an updated environment, please make sure you can connect to Paris-Saclay’s solutions.

**Note !** An important part of the course material will be delivered on the black/white/digital board. You are expected to take notes, either individually or in groups. To adjust for covid-related constraints, motivated students are encouraged to self-organize to type up a set of notes, which we may proofread and then share with the class. (Although the class will be recorded, I recommend taking notes: a video does not replace good notes.)

**Time:** Friday afternoons, 14:00 – 17:45

**Rooms:** either E212 (lecture and tutorials, TD) or D103 (lab sessions, TP)

Exam day: E212 and E210.

**Online** sessions (no need for e-campus account) : https://eu.bbcollab.com/guest/3e35006e16c445a9b63685d87b8328c2

For tutorials, group #1 : https://eu.bbcollab.com/guest/ef27d892af654515aa33f32231a9db38

For tutorials, group #2 : https://eu.bbcollab.com/guest/3bea2be815f04720964fde05c3fea170

**MCC (grades):**

- Session 1: 0.4 × CC + 0.6 × EE
  - EE 60%: **Limited-time written exam** (about theory of ML) (attendance mandatory, except if impossible, with a good reason). (Limitless) documents will be allowed.
  - CC 20%: **small project** (groups of 2), focused on experimenting how and why this or that works/does not work. We will ask for a written report (4 pages max) + code + 5 min oral presentation, in front of the whole class (if time allows).
  - CC 20%: **5 min quizzes** at the beginning of class, will count in the CC total.
- Session 2: 0.4 × CC + 0.6 × EE (2nd chance exam)
  - EE: New written exam (if very few candidates, oral instead of written exam)
  - CC: the grade of 1st session is kept.
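As a small worked illustration of the weighting above (60% written exam, 20% project, 20% quizzes), here is a sketch of the arithmetic. The function name and the marks used are purely hypothetical, and it assumes all three marks are given on the same scale.

```python
def final_grade(exam, project, quizzes):
    """Weighted total: 0.4 * CC + 0.6 * EE, with CC split evenly between project and quizzes."""
    # The project and the quizzes each weigh 20% of the total,
    # i.e. each is half of the 40% CC component.
    cc = 0.5 * project + 0.5 * quizzes
    return 0.4 * cc + 0.6 * exam


# Hypothetical marks, all on the same scale
print(final_grade(exam=12, project=16, quizzes=10))
```

For session 2, the same function would be called with the new exam mark and the unchanged project and quiz marks.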

Detailed Table of contents : see the slides in the GoogleDrive (TODO)

Detailed content: **tentative** program.

**TO BE ADJUSTED with other courses**

Tentative lab session program (1 subject ≠ 1 session, some are longer, some shorter)

- perceptron + linear reg.: coding from scratch
- PCA, not entirely from scratch: from np.linalg.eig
- CV, from scratch (a bit silly, no?)
- SVM: coding from scratch (? homework: SVM with Kernel ?)
- Naive Bayes: from scratch (image classif.)
- EM, from scratch (images clustering)
- Decision trees: at least the entropy+mutual info from scratch. Then the rest?
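To give an idea of what “coding from scratch” means for the first item, here is a minimal perceptron sketch in plain Python. It is an illustration under simple assumptions (labels in {-1, +1}, linearly separable toy data), not the lab's reference solution, and the function names are chosen for this example only. A numpy version would vectorize the inner products.

```python
def train_perceptron(X, y, epochs=100, lr=1.0):
    """X: list of feature lists; y: labels in {-1, +1}. Returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in zip(X, y):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if target * activation <= 0:  # misclassified, or exactly on the boundary
                # Perceptron rule: move the hyperplane towards the misclassified point
                w = [wi + lr * target * xi for wi, xi in zip(w, x)]
                b += lr * target
                mistakes += 1
        if mistakes == 0:  # a full pass without error: stop
            break
    return w, b


def predict(w, b, x):
    """Sign of the linear score, as a label in {-1, +1}."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Toy linearly separable data: the logical AND
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [-1, -1, -1, 1]
w, b = train_perceptron(X, y)
print([predict(w, b, x) for x in X])
```

On data that is not linearly separable the loop simply stops after `epochs` passes, without any guarantee on the result.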

Tentative Lectures program: (1 subject ≠ 1 session, some are longer, some shorter)

(not necessarily in chronological order)

- what is ML (context of the course, why theory matters)
- ML vocabulary, intuitions of the tasks (probably already covered in OPT9)
- two working examples: the Perceptron and the linear regression
- Supervised learning as the optimization of a cost function (strong link with TC2)
- Basic Metrics
- Multi-class classif., multi-variate regression: don’t be afraid
- Overfitting: “the enemy”

- SVM (Support Vector Machine) (Linear)
- intuitions
- definition, proof
- SVR : intuition

- The Kernel trick
- intuition
- Polynomial case: in deep detail
- higher dimensional spaces, embedding, representation
- factored vs. non factored (+interpretation)

- Key methodological point: Training, Validation, Testing **(this point may be almost entirely left to OPT9?)**
- Bias vs Variance tradeoff
- Model Complexity vs Generalization and how to control it: Regularization
- Data exhaustion, Hyper-parameters over-fitting.
- which hyper-parameters do you know?
- Validation methods: CV, LOO, boosting, …

- Unsupervised learning: Clustering
- Unsupervised learning as the optimization of a cost function (strong link with TC2)
- Clustering: K-means, quick overview (some familiarity is assumed)

- Bayesian models: (assuming PRE1 was followed)
- the Naive Bayes model (for e.g. image classification). Non-naive case: overview.
- regularization: the prior
- K-means with Gaussian blobs ? (soft assignments, GMM)
- the EM algorithm (for e.g. unsupervised or semi-supervised classif.)

- Unsupervised learning: Density Estimation
- Histograms (!)
- Kernel Density Estimation
- GMM (Gaussian Mixture Model)

- Unsupervised learning: Dimensionality reduction
- PCA: algebraic intuition, examples (images)
- PCA as the maximization of a cost function… or of another one
- other pre-processings: feature selection, ICA, t-SNE, etc: overview

- Decision Trees
- Decision trees intuition: from manual decision boundaries to automated ones
- Entropy, Mutual information
- def of a tree (algo)

- a word on a more advanced metric: ROC, AUC
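For the decision-tree item in the program above (entropy and mutual information), the two quantities can be sketched in a few lines of plain Python. This is a minimal illustration with hypothetical function names: the information gain of a split, computed here, is the mutual information between the split variable and the class label.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy reduction when `labels` is partitioned into `groups` (lists of labels)."""
    n = len(labels)
    # Weighted average entropy of the children of the split
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - remainder


labels = ["cat", "cat", "dog", "dog"]
print(entropy(labels))                                            # balanced binary labels
print(information_gain(labels, [["cat", "cat"], ["dog", "dog"]]))  # perfect split
print(information_gain(labels, [["cat", "dog"], ["cat", "dog"]]))  # uninformative split
```

A tree-building algorithm would evaluate this gain for each candidate split and keep the best one.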

**How to prepare for the written exam ?** In the Bishop book, you have:

The table of contents (for the 2006 edition, available online) is on page xiii (page 13).

Section 1.2 in particular and chapter 1 in general are pre-requisites that we consider should be known, and thus may be very useful reminders.

For those who didn’t know about the perceptron, section 4.1.7 is recommended.

Chapter 9 in general and sections 9.2 and 9.3 in particular cover most of the Bayesian Learning we do.

Decision Trees are mentioned in chapter 14 (if you want to dig deeper); however, the simplest is probably to read papers, such as « Induction of decision trees », Machine Learning, by Quinlan J. R.

SVMs will be covered in the last lecture, and correspond to chapter 7.

**Python & technical notes**

coding skills: python (or other languages, python is easy to learn).

**If you can bring your own laptop, and have python3 + jupyter notebook installed, that’s best. Otherwise, please check that your python interpreter works on your available machine.**

**Packages needed / recommended:**

**python3, ipython, numpy, scikit-learn, matplotlib, pandas, seaborn,** [numba – if you love loops and hate numpy + can do without jupyter’s features]

IDEs: with jupyter, no IDE is needed. If you don’t like jupyter, you can always do File>Download as>Python(.py) and use ''' (a triple quote) to comment out your python file from the current point to the end. But be careful not to re-load or re-compute everything every time you run your program. You’ll need to break down your workflow into several files, and produce intermediate files, to be compute-time-efficient.
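One possible way to produce such intermediate files is sketched here with the standard `pickle` module. The helper `cached`, the placeholder `slow_preprocessing` and the file name are all hypothetical and only serve the illustration. For large numerical arrays, numpy's own save/load functions may be a better fit.

```python
import os
import pickle
import tempfile


def cached(path, compute):
    """Load the result from `path` if it exists; otherwise compute it and save it."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    result = compute()
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result


def slow_preprocessing():
    # Placeholder for an expensive step (loading data, training a model, ...)
    return [i * i for i in range(10)]


# Hypothetical location of the intermediate file
cache_file = os.path.join(tempfile.gettempdir(), "tc0_squares_demo.pkl")
squares = cached(cache_file, slow_preprocessing)
print(squares[:3])
```

The first run computes and stores the result, and later runs only read the file. Delete the file whenever the computation itself changes, otherwise the stale result is reused.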

Technical note:

To launch a « jupyter notebook », the first time in particular, you may need to check out the result of the command:

jupyter notebook list

In a web browser, go to http://localhost:8888/ipython/ (it may also be at 8080 or 8000)

Then follow instructions. On Univ. Paris-Sud computers, it may sometimes be necessary to set by hand the address or port (8888 or other) as « localhost ».

Python version? In this class we prefer python3.

Python help ? check out the web. Stackoverflow is great. Try typing « python cheat sheet » if you don’t feel inspired.