Philosophy Dictionary of Arguments
Markov decision process: A Markov decision process (MDP) is a mathematical framework for modeling sequential decision making in environments where the outcomes are partly random and partly under the control of a decision maker.
MDPs are defined by a set of states, a set of actions, a transition function, and a reward function. See also Decisions, Decision theory, Decision making process, Artificial Intelligence.
_____________
Annotation: The above characterizations of concepts are neither definitions nor exhaustive presentations of problems related to them. Instead, they are intended to give a short introduction to the contributions below. - Lexicon of Arguments.
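The four components just named (states, actions, transition function, reward function) can be made concrete with a small sketch. The three-state toy problem below (state names, probabilities, and rewards) is invented purely for illustration; the update applied to it is the standard value iteration (Bellman backup) discussed in the contributions below.

```python
# A minimal sketch of an MDP and value iteration on a hypothetical
# three-state toy problem. All numbers here are illustrative assumptions.

states = ["s0", "s1", "s2"]
actions = ["stay", "move"]

# Transition function: T[s][a] is a list of (next_state, probability) pairs.
T = {
    "s0": {"stay": [("s0", 1.0)], "move": [("s1", 0.8), ("s0", 0.2)]},
    "s1": {"stay": [("s1", 1.0)], "move": [("s2", 0.8), ("s1", 0.2)]},
    "s2": {"stay": [("s2", 1.0)], "move": [("s2", 1.0)]},
}

# Reward function: only state s2 yields a reward.
R = {"s0": 0.0, "s1": 0.0, "s2": 1.0}

def value_iteration(gamma=0.9, eps=1e-6):
    """Iterate the Bellman optimality update until values stop changing."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Best expected value of the successor state over all actions.
            best = max(
                sum(p * V[s_next] for s_next, p in T[s][a])
                for a in actions
            )
            new_v = R[s] + gamma * best
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < eps:
            return V
```

With these assumed numbers, the values converge so that the rewarding state s2 is worth 1/(1 - 0.9) = 10, and states closer to it are worth more than states farther away.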
Peter Norvig on Markov Decision Process - Dictionary of Arguments
Norvig I 686
Markov decision process/MDP/AI Research/Norvig/Russell: As one might expect, AI researchers have pushed MDPs in the direction of more expressive representations that can accommodate much larger problems than the traditional atomic representations based on transition matrices (cf. >Sequential decision making/Norvig). The use of a dynamic >Bayesian network to represent transition models was an obvious idea, but work on factored MDPs (Boutilier et al., 2000(1); Koller and Parr, 2000(2); Guestrin et al., 2003b(3)) extends the idea to structured representations of the value function, with provable improvements in complexity. Relational MDPs (Boutilier et al., 2001(4); Guestrin et al., 2003a(5)) go one step further, using structured representations to handle domains with many related objects. The observation that a partially observable MDP can be transformed into a regular MDP over belief states is due to Astrom (1965)(6) and Aoki (1965)(7). >Sequential decision making/Norvig.

Norvig I 687
Markov games: Game theory and MDPs are combined in the theory of Markov games, also called stochastic games (Littman, 1994(8); Hu and Wellman, 1998(9)). Shapley (1953)(10) actually described the value iteration algorithm independently of Bellman, but his results were not widely appreciated, perhaps because they were presented in the context of Markov games. >Game theory/AI Research.

Norvig I 688
Evolutionary game theory (Smith, 1982(11); Weibull, 1995(12)) looks at strategy drift over time: if your opponent's strategy is changing, how should you react? Textbooks on game theory from an economics point of view include those by Myerson (1991)(13), Fudenberg and Tirole (1991)(14), Osborne (2004)(15), and Osborne and Rubinstein (1994)(16); Mailath and Samuelson (2006)(17) concentrate on repeated games. From an AI perspective we have Nisan et al. (2007)(18), Leyton-Brown and Shoham (2008)(19), and Shoham and Leyton-Brown (2009)(20).

1. Boutilier, C., Dearden, R., and Goldszmidt, M. (2000). Stochastic dynamic programming with factored representations. AIJ, 121, 49-107.
2. Koller, D. and Parr, R. (2000). Policy iteration for factored MDPs. In UAI-00, pp. 326-334.
3. Guestrin, C., Koller, D., Parr, R., and Venkataraman, S. (2003b). Efficient solution algorithms for factored MDPs. JAIR, 19, 399-468.
4. Boutilier, C., Reiter, R., and Price, B. (2001). Symbolic dynamic programming for first-order MDPs. In IJCAI-01, pp. 467-472.
5. Guestrin, C., Koller, D., Gearhart, C., and Kanodia, N. (2003a). Generalizing plans to new environments in relational MDPs. In IJCAI-03.
6. Astrom, K. J. (1965). Optimal control of Markov decision processes with incomplete state estimation. J. Math. Anal. Applic., 10, 174-205.
7. Aoki, M. (1965). Optimal control of partially observable Markov systems. J. Franklin Institute, 280(5), 367-386.
8. Littman, M. L. (1994). Markov games as a framework for multi-agent reinforcement learning. In ICML-94, pp. 157-163.
9. Hu, J. and Wellman, M. P. (1998). Multiagent reinforcement learning: Theoretical framework and an algorithm. In ICML-98, pp. 242-250.
10. Shapley, L. S. (1953). Stochastic games. In PNAS, Vol. 39, pp. 1095-1100.
11. Smith, J. M. (1982). Evolution and the Theory of Games. Cambridge University Press.
12. Weibull, J. (1995). Evolutionary Game Theory. MIT Press.
13. Myerson, R. (1991). Game Theory: Analysis of Conflict. Harvard University Press.
14. Fudenberg, D. and Tirole, J. (1991). Game Theory. MIT Press.
15. Osborne, M. J. (2004). An Introduction to Game Theory. Oxford University Press.
16. Osborne, M. J. and Rubinstein, A. (1994). A Course in Game Theory. MIT Press.
17. Mailath, G. and Samuelson, L. (2006). Repeated Games and Reputations: Long-Run Relationships. Oxford University Press.
18. Nisan, N., Roughgarden, T., Tardos, E., and Vazirani, V. (Eds.). (2007). Algorithmic Game Theory. Cambridge University Press.
19. Leyton-Brown, K. and Shoham, Y. (2008).
Essentials of Game Theory: A Concise, Multidisciplinary Introduction. Morgan Claypool.
20. Shoham, Y. and Leyton-Brown, K. (2009). Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. Cambridge University Press.
_____________
Explanation of symbols: Roman numerals indicate the source, Arabic numerals indicate the page number. The corresponding books are indicated on the right hand side. ((s)…): Comment by the sender of the contribution.
The note [Concept/Author], [Author1]Vs[Author2] or [Author]Vs[term] resp. "problem:"/"solution:", "old:"/"new:" and "thesis:" is an addition from the Dictionary of Arguments. If a German edition is specified, the page numbers refer to this edition.
Norvig I
Peter Norvig / Stuart J. Russell
Artificial Intelligence: A Modern Approach
Upper Saddle River, NJ 2010