Mandatory Assignment 2 - policy loss should be negated!

The specification of the policy loss in Section 2.3 contains an error. The term given there is the quantity we want to maximize. Since a loss is something we minimize, the policy loss should be the negative of that term.
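As a rough illustration of the sign fix, here is a minimal sketch in plain Python. It assumes the term in Section 2.3 has the common policy-gradient form, the mean of log pi(a_t | s_t) times an advantage (or return). That form, and the names `policy_objective`, `policy_loss`, `log_probs`, and `advantages`, are assumptions made for this example and are not taken from the assignment text.

```python
import math


def policy_objective(log_probs, advantages):
    """Quantity we want to MAXIMIZE (assumed form): mean of log-prob * advantage."""
    total = sum(lp * adv for lp, adv in zip(log_probs, advantages))
    return total / len(log_probs)


def policy_loss(log_probs, advantages):
    """Quantity we MINIMIZE: the negative of the objective above."""
    return -policy_objective(log_probs, advantages)


# Hypothetical toy data: two time steps with made-up action probabilities.
log_probs = [math.log(0.5), math.log(0.25)]
advantages = [1.0, 2.0]

objective = policy_objective(log_probs, advantages)
loss = policy_loss(log_probs, advantages)
```

Minimizing `loss` with a gradient-descent optimizer is then equivalent to maximizing `objective`, which is the intent of the correction.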

Published Oct. 14, 2019 21:40 - Last modified Oct. 14, 2019 21:40