Lecture notes
Lecture notes will be posted here throughout the semester.
Table of contents
- Topic 1: Geometry and probability in high dimension
- Topic 2: Orthogonality, QR and least squares
- Topic 3: Matrix norms, low-rank approximations, and SVD
- Topic 4: Introduction to spectral graph theory
- Topic 5: Convexity, gradient descent and automatic differentiation
- Topic 6: Probabilistic modeling, inference and sampling
Topic 1: Geometry and probability in high dimension
Theory
- a first data science example: species delimitation (html, ipynb)
- review (html, ipynb)
- high-dimensional space (html, ipynb)
- clustering: an objective, an algorithm, and a toy example (html, ipynb)
Applications
- k-means clustering (html, ipynb)
- datasets: iris-measurements.csv, iris-species.csv
Topic 2: Orthogonality, QR and least squares
Theory
- motivating example: predicting sales (html, ipynb)
- review (html, ipynb)
- orthogonality (html, ipynb)
- least squares (html, ipynb)
Applications
- linear regression (html, ipynb)
- datasets: advertising.csv, msn-flight-data-19.csv
Topic 3: Matrix norms, low-rank approximations, and SVD
Theory
- motivating example: movie recommendations (html, ipynb)
- matrix norms and approximating subspaces (html, ipynb)
- singular value decomposition (html, ipynb)
- condition numbers (html, ipynb)
Applications
- dimensionality reduction (html, ipynb)
- datasets: movielens-small-movies.csv, movielens-small-ratings.csv, h3n2-snp.csv, h3n2-other.csv
Topic 4: Introduction to spectral graph theory
Theory
- motivating example: community detection (html, ipynb, slides)
- review: spectral decomposition (html, ipynb, slides)
- elements of graph theory (html, ipynb, slides)
- Laplacian matrix (html, ipynb, slides)
- graph partitioning (html, ipynb, slides)
Applications
Topic 5: Convexity, gradient descent and automatic differentiation
Theory
- motivating example: handwritten digit recognition (html, ipynb, slides)
- review: functions of several variables (html, ipynb, slides)
- optimality conditions (html, ipynb, slides)
- convexity (html, ipynb, slides)
- gradient descent (html, ipynb, slides)
Applications
- logistic regression (html, ipynb, slides)
- deep neural networks (html, ipynb, slides)
- datasets: lebron.csv, advertising.csv, SAHeart.csv
Topic 6: Probabilistic modeling, inference and sampling
Theory
- motivating example (html, ipynb, slides)
- review (html, ipynb, slides)
- joint distributions: marginalization and conditional independence (html, ipynb, slides)
- inference and parameter estimation: variable elimination and expectation-maximization (see Sections 9.2.1-2, 9.3.1-3, 13.1, 13.2.1-2 in [Bis])
- sampling: Markov chain Monte Carlo methods (see Sections 11.2-3 in [Bis])
Applications
- Twitter sentiment analysis (html, ipynb, slides)
- Kalman filtering (html, ipynb, slides)
- datasets: twitter-sentiment.csv