Lecture 1. Intro and theory for shallow networks
- Perceptron convergence theorem
- Universal approximation theorem
- Approximation rates for shallow neural networks
- Barron spaces
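As a warm-up for the first topic, here is a minimal numerical sketch of the perceptron convergence theorem on synthetic data (the data, seed, and margin value are illustrative assumptions, not from the lecture): on linearly separable data with margin gamma inside a ball of radius R, the classic update rule makes at most (R / gamma)^2 mistakes before separating the sample.

```python
import numpy as np

# Synthetic linearly separable data with an enforced positive margin.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])          # hypothetical "true" separator
X = rng.normal(size=(300, 2))
X = X[np.abs(X @ w_true) > 0.5]         # keep only points with a clear margin
y = np.sign(X @ w_true)                 # labels in {-1, +1}

w = np.zeros(2)
mistakes = 0
converged = False
for epoch in range(100):
    errors = 0
    for xi, yi in zip(X, y):
        if yi * (xi @ w) <= 0:          # misclassified (or on the boundary)
            w += yi * xi                # perceptron update
            errors += 1
            mistakes += 1
    if errors == 0:                     # a full clean pass: data separated
        converged = True
        break

print(f"converged={converged} after {mistakes} mistakes")
```

The theorem guarantees the mistake counter stops growing after finitely many updates, regardless of the order in which the points are visited.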
Lecture 2. Theory for deep networks
- Advantages of additional hidden layers

- Deep ReLU networks
- Misclassification error for image deformation models
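One standard way to see the advantage of additional hidden layers (a sketch of the Telgarsky-style sawtooth construction, offered here as an assumed example rather than the lecture's own): composing a two-unit ReLU "hat" with itself k times produces a sawtooth with 2^k linear pieces, while a one-hidden-layer ReLU network needs on the order of 2^k units to match it.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def hat(t):
    # One "tent" on [0, 1] from two ReLU units:
    # hat(t) = 2t on [0, 1/2] and 2(1 - t) on [1/2, 1].
    return 2 * relu(t) - 4 * relu(t - 0.5)

k = 4
x = np.linspace(0.0, 1.0, 1025)
y = x.copy()
for _ in range(k):          # depth-k composition: O(k) units total
    y = hat(y)

# The k-fold composition is a sawtooth with 2^k linear pieces,
# attaining its maximum value 1 at exactly 2^(k-1) dyadic points.
peaks = int(np.sum(y == 1.0))
print(peaks)                # -> 8
```

The comparison (the values here are exact because all intermediate numbers are dyadic) is exponential: linear pieces grow like 2^k with depth but only linearly with width at fixed depth.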
Lecture 3. Theory of gradient descent in machine learning
- Optimization in machine learning
- Weight balancing phenomenon
- Analysis of dropout
- Benign overfitting
- Grokking
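The weight-balancing phenomenon can be seen in the smallest possible example (a numerical sketch with illustrative constants, assuming the standard two-layer scalar linear model rather than the lecture's exact setup): for f(x) = a*b*x trained by gradient flow on a squared loss, the quantity a^2 - b^2 is exactly conserved; with a small discrete step size it is conserved approximately while the product a*b converges to the target.

```python
import numpy as np

# Gradient descent on L(a, b) = 0.5 * (a*b - c)^2, a two-layer
# scalar linear "network" fit to a hypothetical target c = 3.
c = 3.0
a, b = 2.0, 0.5
lr = 1e-3
balance0 = a ** 2 - b ** 2          # the balancedness invariant

for _ in range(20000):
    r = a * b - c                   # residual
    # simultaneous gradient updates: dL/da = r*b, dL/db = r*a
    a, b = a - lr * r * b, b - lr * r * a

print(abs(a * b - c))               # loss is driven to (near) zero
print(abs((a ** 2 - b ** 2) - balance0))  # invariant drifts only O(lr^2)
```

One can check by expanding the update that a^2 - b^2 is multiplied by (1 - lr^2 * r^2) at each step, so the drift vanishes as the step size goes to zero, recovering exact conservation under gradient flow.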