I have a tendency to make a lot of typos. My brain reads what I meant, not what I typed. On occasion, I may also make an error (:gasp:) or state something too imprecisely.
I don’t have collaborators for this site to check for errors of any kind. Therefore, if you spot an error of any sort, please let me know by reporting it on my GitHub issues page or by contacting me some other way (see the about page).
To encourage people to let me know about issues, I will list user-submitted revisions below. As of now, there are not many user-submitted revisions because the site is too new, too unpopular, or because I’ve been uncharacteristically error-free. Feel free to remedy that!
User-submitted revisions by date
- Jan 12, Cedric Brendel (offline report): suggestion to change “finite” to “bounded” in “What is the “horizon” in reinforcement learning?”
- Nov 17, Michael Littman (offline report): typo correction for “If Q-learning is off-policy, why doesn’t it require importance sampling?”.
- Nov 13, araffin: typo/phrasing corrections for “Why does experience replay require off-policy learning and how is it different from on-policy learning?”
- Nov 13, Craig Sherstan (offline report): typo corrections for “What is the difference between V(s) and Q(s,a)?” and “Why does the policy gradient include a log probability term?”
- Nov 11, Bram Grooten (offline report): typo corrections for “About” and “Notation” pages.
- Nov 10, Craig Sherstan (offline report): typo correction for “Why does experience replay require off-policy learning and how is it different from on-policy learning?”