# CoDaDri

# Breaking news:

- Homework 1 has been evaluated and sent to you. If you did not receive it, please contact us.
- Here you can find the MCQ given last year.

**The Solution** [1]

# Computational and Data Driven Physics

Modern physics is characterized by an increasing complexity of systems under investigation, in domains as diverse as condensed matter, astrophysics, biophysics, etc. Establishing adequate models to describe these systems and being able to make quantitative predictions from those models is extremely challenging. The goal of the course is to provide the tools and concepts necessary to tackle those systems.

## Course description

We will first cover many algorithms used in many-body problems and complex systems, with special emphasis on Monte Carlo methods, molecular dynamics, and optimization in complex landscapes.

Second, we will provide statistical inference and machine learning tools to harness the growing availability of experimental data to design accurate models of the underlying, complex, strongly non-homogeneous and interacting systems.

Each theoretical lecture will be followed by a tutorial illustrating the concepts with practical applications borrowed from various domains of physics. We will focus on methods, algorithms and physics, not on programming and heavy numerics! You will have to hand in 3 homeworks.

## The Team

- Alberto Rosso (Computational physics)
- Rémi Monasson (Data-driven physics)
- Simona Cocco & David Lacoste & Michel Ferrero (Tutorials)

## Where and When

- Lectures on Fridays: 14:00-16:00
- Tutorials on Fridays: 16:00-18:00
- ENS, 29 rue D'Ulm, salle Borel + Djebar

## Slack

If you have questions or want to discuss topics related to the lectures, the exercises or the homeworks, you can use the Computational and Data Driven Physics Slack. To join the Slack, use the following invitation link.

## Computer Requirements

**No previous experience in programming is required.**

**Programming Language: Python**

For practical installation, we recommend using either Anaconda (see Memento Python) or Google Colab.

The Colaboratory platform from Google is a good way to use a powerful computer without buying one: it requires no specific hardware or software, and even allows you to use GPU computing for free, all by writing a Jupyter notebook that you can then share.
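Whichever option you choose, a quick way to check that your environment is ready is to test whether the packages used in the introductory notebooks can be found. The snippet below is only a minimal sketch: the package list (`numpy`, `matplotlib`) is taken from the introductory notebooks listed in the schedule, and the helper name `missing_packages` is purely illustrative.

```python
import importlib.util
import sys

# Packages assumed to be needed, based on the introductory notebooks
# (python, numpy and matplotlib). Adjust the list if a tutorial asks for more.
REQUIRED = ["numpy", "matplotlib"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python version:", sys.version.split()[0])
missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

The check only looks the packages up without importing them, so it runs the same way in a local Anaconda installation and in a Colab notebook cell.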

## Grading

**Computational Physics:**

- Homework 1: 5 points
- Homework 2: 5 points
- Multiple Choice Questions in November: 10 points

**Data Driven Physics:**

- Final exam in January: 20 points

## Schedule

**Friday, September 3, 2021**

- Lecture 1: Introduction to Monte Carlo

- Tutorial 1: Markov Matrix

**Friday, September 10, 2021**

- Introductory notebooks: python, numpy and matplotlib

- Tutorial 2: Markov matrices (solutions)

- Tutorial 3: Thumb rule (solutions)

- Homework 1 (deadline October 1)

**Friday, September 17, 2021**

- Lecture 2: Basic Sampling

- Lecture 3: Errors and Precision

**Friday, September 24, 2021**

- Lecture 4: Ising model and phase transitions

- Tutorial 4: Ising model and phase transitions (solutions)

**Friday, October 1, 2021**

- Lecture 5: Optimization & Dijkstra algorithm

- Tutorial 5: Simulated annealing (solutions)

- Send your copy of Homework 1 to numphys.icfp at gmail.com. Thanks!

- Homework 2 (deadline October 22)

**Friday, October 8, 2021**

- Lecture 6: Introduction to Bayesian inference

- Tutorial 6: Bayesian inference and single-particle tracking. Questions. Data. Starting Notebook. Google Colab version. Solutions [2]. Notebook [3]

**Friday, October 15, 2021**

- Lecture 7: Importance sampling

- Tutorial 7: Faster than the clock algorithms (solutions)

**Friday, October 22, 2021**

- Lecture 8: Asymptotic inference and information. Extra material: Proof of Cramér-Rao bound [4]

- Tutorial 8: Questions. Data. Starting Notebook. Bibliography. Solutions. Notebook.

- Send your copy of Homework 2 to numphys.icfp at gmail.com. Thanks!

**Friday, October 29, 2021**

- Lecture 9: High-dimensional inference and Principal Component Analysis. Extra material: Handwritten notes on the derivation of Marčenko-Pastur spectral density [5]

- Tutorial 9: Replay of the neuronal activity during sleep after a task. Data. Initial Notebook. Biblio.

**Friday, November 12, 2021, 2 pm: The Quiz.**

The MCQ is composed of 19 questions (one of them counts double). For each question you have 4 choices, 3 wrong and 1 correct. If you check the correct one, you get 1 point. If you are wrong, you lose 1/4 of a point. If you give no answer, you get zero points.

MCQ Solution (correct answers in bold): [6]
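To make the scoring rule concrete, here is a small sketch that tallies a score and computes the expected gain from guessing at random among the 4 choices. It is an illustration, not the official grading script: the function names, the position of the double question (placed last), and the assumption that the double weight applies to the penalty as well as to the reward are all hypothetical.

```python
from fractions import Fraction

POINT_CORRECT = Fraction(1)     # points for a correct answer
PENALTY_WRONG = Fraction(1, 4)  # points lost for a wrong answer

# Assumption: 18 questions of weight 1 and one question of weight 2.
# Which question counts double is not specified; it is put last here.
WEIGHTS = [1] * 18 + [2]


def mcq_score(outcomes, weights=WEIGHTS):
    """Sum the score for a list of 'correct', 'wrong' or 'blank' outcomes."""
    total = Fraction(0)
    for outcome, weight in zip(outcomes, weights):
        if outcome == "correct":
            total += weight * POINT_CORRECT
        elif outcome == "wrong":
            total -= weight * PENALTY_WRONG  # assumed to scale with the weight
        # a blank answer contributes zero points
    return total


def expected_guess_score(n_choices=4):
    """Expected score of a uniformly random guess on a weight-1 question."""
    p_correct = Fraction(1, n_choices)
    return p_correct * POINT_CORRECT - (1 - p_correct) * PENALTY_WRONG


# Example: 10 correct, 4 wrong, 5 blank (including the double question).
example = ["correct"] * 10 + ["wrong"] * 4 + ["blank"] * 5
print("Example score:", mcq_score(example))
print("Expected score of a random guess:", expected_guess_score())
```

With these rules a blind guess is worth 1/4 - 3/4 × 1/4 = 1/16 of a point on average, so the penalty reduces the benefit of guessing without making it negative.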

**Friday, November 26, 2021**

- Lecture 10: Priors, regularisation, sparsity

- Tutorial 10: Bayesian Inference and Priors for the analysis of gravitational waves. Starting notebook on artificial data. Biblio. Notebook on real data. Notebook on artificial data. Corrections.

**Friday, December 3, 2021**

- Lecture 11: Probabilistic graphical models

- Tutorial 11: Analysis of protein sequence data to infer protein structure. Starting notebook and data. Biblio. Solutions. Final notebook.

**Friday, December 10, 2021**

- Lecture 12: Hidden Markov Models. Extra material: Pedagogical introduction to Kalman filters [7]

- Tutorial 12: Hidden Markov Models for identification of recombinations in SARS-CoV-2 viral genomes. Starting Notebook and Data. Bibliography. Final Notebook. Solutions.

**Friday, December 17, 2021**

- Lecture 13: Unsupervised learning and representations

- Tutorial 13: How restricted Boltzmann machines learn (solutions)

**Final examination of the data-driven course (January 7, 2022)**

- Example of exam: On-line Principal Component Analysis [8]

- On-line version of the book [9]

- Examination repository [10]