000 | 01281nam a2200205 4500 | ||
---|---|---|---|
008 | 220505b ||||| |||| 00| 0 eng d | ||
020 | _a9781886529397 (hb.) | ||
082 | _a519.703 _bBER | ||
100 | _aBertsekas, Dimitri P. _9809 | ||
245 | _aReinforcement learning and optimal control | ||
260 | _aMassachusetts _bAthena Scientific _c2019 | ||
300 | _axiv, 373 p. | ||
500 | _ahttp://www.athenasc.com/rlbook_athena.html | ||
520 | _aThis book considers large and challenging multistage decision problems, which can in principle be solved by dynamic programming (DP) but whose exact solution is computationally intractable. We discuss solution methods that rely on approximations to produce suboptimal policies with adequate performance. These methods are collectively known by several essentially equivalent names: reinforcement learning, approximate dynamic programming, and neuro-dynamic programming. They have been at the forefront of research for the last 25 years, and they underlie, among others, the recent impressive successes of self-learning in the context of games such as chess and Go. | ||
650 | _aMathematics _9435 | ||
650 | _aMathematical optimization _9688 | ||
650 | _aDynamic programming _9640 | ||
650 | _aReinforcement learning _9859 | ||
942 | _cBK | ||
999 | _c7874 _d7874 | ||
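
The 520 abstract describes dynamic programming as the exact-but-intractable baseline on which the book's approximation methods build. As a minimal sketch of that baseline (not taken from the book), here is textbook value iteration on a toy two-state Markov decision process; the transition matrices `P`, rewards `R`, and discount factor `gamma` are all invented for illustration.

```python
import numpy as np

# Toy MDP: 2 states, 2 actions. P[a][s, s'] is the probability of moving
# from state s to s' under action a; R[a][s] is the expected reward.
P = {
    0: np.array([[0.9, 0.1], [0.2, 0.8]]),  # action 0
    1: np.array([[0.5, 0.5], [0.4, 0.6]]),  # action 1
}
R = {0: np.array([1.0, 0.0]), 1: np.array([0.5, 2.0])}
gamma = 0.95  # discount factor (assumed)

V = np.zeros(2)
for _ in range(1000):
    # Bellman optimality update:
    # V(s) <- max_a [ R(s, a) + gamma * sum_s' P(s' | s, a) V(s') ]
    Q = np.stack([R[a] + gamma * P[a] @ V for a in (0, 1)])
    V_new = Q.max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=0)  # greedy policy w.r.t. the converged values
print("V*:", V, "policy:", policy)
```

Exact iteration like this enumerates every state, which is exactly what becomes intractable for the large problems the abstract mentions; the book's reinforcement learning methods replace the table `V` with approximations.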