Lecture slides are here
Reading material:
Sections on weight initialization, data normalization, and batch normalization: http://cs231n.github.io/neural-networks-2
Sections on the learning process and optimization: http://cs231n.github.io/neural-networks-3/
Relevant video links: Lectures 6, 7, and 9 from CS231n at Stanford, link here
Exercise: training a CNN using PyTorch is here
Exercise solution