Readings for week 6
Chapter 8, especially Approximate Value Iteration, Approximate Policy Iteration and Policy Gradient.
External reading, if you have access to any of these books by Bertsekas:
"Neuro-Dynamic Programming", Chapter 6
The book "Reinforcement Learning and Optimal Control"
"Dynamic Programming and Optimal Control", Chapter 6
Published Feb. 16, 2021 13:30
- Last modified Feb. 16, 2021 13:36