Dictionary of Arguments


Philosophical and Scientific Issues in Dispute
 


 

Find counter arguments by entering NameVs… or …VsName.


The author or concept searched is found in the following 18 entries.
Disputed term/author/ism | Author | Entry | Reference
Chess Norvig Norvig I 192
Chess/artificial intelligence/Norvig/Russell: Chess was one of the first tasks undertaken in AI, with early efforts by many of the pioneers of computing, including Konrad Zuse in 1945, Norbert Wiener in his book Cybernetics (1948)(1), and Alan Turing in 1950 (see Turing et al., 1953)(2). But it was Claude Shannon’s article Programming a Computer for Playing Chess (1950)(3) that had the most complete set of ideas, describing a representation for board positions, an evaluation function, quiescence search, and some ideas for selective (nonexhaustive) game-tree search. Slater (1950)(4) and the commentators on his article also explored the possibilities for computer chess play. D. G. Prinz (1952)(5) completed a program that solved chess endgame problems but did not play a full game. Stan Ulam and a group at the Los Alamos National Lab produced a program that played chess on a 6×6 board with no bishops (Kister et al., 1957)(6). It could search 4 plies deep in about 12 minutes. Alex Bernstein wrote the first documented program to play a full game of standard chess (Bernstein and Roberts, 1958)(7). The first computer chess match featured the Kotok–McCarthy program from MIT (Kotok, 1962)(8) and the ITEP program written in the mid-1960s at Moscow’s Institute of Theoretical and Experimental Physics (Adelson-Velsky et al., 1970)(9).
The first chess program to compete successfully with humans was MIT’s MACHACK-6 (Greenblatt et al., 1967)(10). The $10,000 prize for the first program to achieve a USCF (United States Chess Federation) rating of 2500 (near the grandmaster level) was awarded to DEEP THOUGHT (Hsu et al., 1990) (11) in 1989. The grand prize, $100,000, went to DEEP BLUE (Campbell et al., 2002(12); Hsu, 2004(13)) for its landmark victory over world champion Garry Kasparov in a 1997 exhibition match.
Norvig I 193
HYDRA (Donninger and Lorenz, 2004(14)) is rated somewhere between 2850 and 3000, based mostly on its trouncing of Michael Adams. The RYBKA program is rated between 2900 and 3100, but this is based on a small number of games and is not considered reliable. Ross (2004)(15) shows how human players have learned to exploit some of the weaknesses of the computer programs.


1. Wiener, N. (1948). Cybernetics. Wiley.
2. Turing, A., Strachey, C., Bates, M. A., and Bowden, B. V. (1953). Digital computers applied to games. In Bowden, B. V. (Ed.), Faster than Thought, pp. 286–310. Pitman.
3. Shannon, C. E. (1950). Programming a computer for playing chess. Philosophical Magazine, 41(4), 256–275.
4. Slater, E. (1950). Statistics for the chess computer and the factor of mobility. In Symposium on Information Theory, pp. 150–152. Ministry of Supply.
5. Prinz, D. G. (1952). Robot chess. Research, 5, 261–266.
6. Kister, J., Stein, P., Ulam, S., Walden, W., and Wells, M. (1957). Experiments in chess. JACM, 4, 174–177.
7. Bernstein, A. and Roberts, M. (1958). Computer vs. chess player. Scientific American, 198(6), 96–105.
8. Kotok, A. (1962). A chess playing program for the IBM 7090. AI project memo 41, MIT Computation Center.
9. Adelson-Velsky, G. M., Arlazarov, V. L., Bitman, A. R., Zhivotovsky, A. A., and Uskov, A. V. (1970). Programming a computer to play chess. Russian Mathematical Surveys, 25, 221–262.
10. Greenblatt, R. D., Eastlake, D. E., and Crocker, S. D. (1967). The Greenblatt chess program. In Proc. Fall Joint Computer Conference, pp. 801–810.
11. Hsu, F.-H., Anantharaman, T. S., Campbell, M. S., and Nowatzyk, A. (1990). A grandmaster chess machine. Scientific American, 263(4), 44–50.
12. Campbell, M. S., Hoane, A. J., and Hsu, F.-H. (2002). Deep Blue. AIJ, 134(1–2), 57–83.
13. Hsu, F.-H. (2004). Behind Deep Blue: Building the Computer that Defeated the World Chess Champion. Princeton University Press.
14. Donninger, C. and Lorenz, U. (2004). The chess monster Hydra. In Proc. 14th International Conference on Field-Programmable Logic and Applications, pp. 927–932.
15. Ross, P. E. (2004). Psyching out computer chess players. IEEE Spectrum, 41(2), 14–15.

Norvig I
Peter Norvig
Stuart J. Russell
Artificial Intelligence: A Modern Approach Upper Saddle River, NJ 2010

Chess Russell Norvig I 192
Chess/artificial intelligence/Norvig/Russell: Chess was one of the first tasks undertaken in AI, with early efforts by many of the pioneers of computing, including Konrad Zuse in 1945, Norbert Wiener in his book Cybernetics (1948)(1), and Alan Turing in 1950 (see Turing et al., 1953)(2). But it was Claude Shannon’s article Programming a Computer for Playing Chess (1950)(3) that had the most complete set of ideas, describing a representation for board positions, an evaluation function, quiescence search, and some ideas for selective (nonexhaustive) game-tree search. Slater (1950)(4) and the commentators on his article also explored the possibilities for computer chess play. D. G. Prinz (1952)(5) completed a program that solved chess endgame problems but did not play a full game. Stan Ulam and a group at the Los Alamos National Lab produced a program that played chess on a 6×6 board with no bishops (Kister et al., 1957)(6). It could search 4 plies deep in about 12 minutes. Alex Bernstein wrote the first documented program to play a full game of standard chess (Bernstein and Roberts, 1958)(7). The first computer chess match featured the Kotok–McCarthy program from MIT (Kotok, 1962)(8) and the ITEP program written in the mid-1960s at Moscow’s Institute of Theoretical and Experimental Physics (Adelson-Velsky et al., 1970)(9).
The first chess program to compete successfully with humans was MIT’s MACHACK-6 (Greenblatt et al., 1967)(10). The $10,000 prize for the first program to achieve a USCF (United States Chess Federation) rating of 2500 (near the grandmaster level) was awarded to DEEP THOUGHT (Hsu et al., 1990) (11) in 1989. The grand prize, $100,000, went to DEEP BLUE (Campbell et al., 2002(12); Hsu, 2004(13)) for its landmark victory over world champion Garry Kasparov in a 1997 exhibition match.
Norvig I 193
HYDRA (Donninger and Lorenz, 2004(14)) is rated somewhere between 2850 and 3000, based mostly on its trouncing of Michael Adams. The RYBKA program is rated between 2900 and 3100, but this is based on a small number of games and is not considered reliable. Ross (2004)(15) shows how human players have learned to exploit some of the weaknesses of the computer programs. >Artificial Intelligence, >Machine Learning.

1. Wiener, N. (1948). Cybernetics. Wiley.
2. Turing, A., Strachey, C., Bates, M. A., and Bowden, B. V. (1953). Digital computers applied to games. In Bowden, B. V. (Ed.), Faster than Thought, pp. 286–310. Pitman.
3. Shannon, C. E. (1950). Programming a computer for playing chess. Philosophical Magazine, 41(4), 256–275.
4. Slater, E. (1950). Statistics for the chess computer and the factor of mobility. In Symposium on Information Theory, pp. 150–152. Ministry of Supply.
5. Prinz, D. G. (1952). Robot chess. Research, 5, 261–266.
6. Kister, J., Stein, P., Ulam, S., Walden, W., and Wells, M. (1957). Experiments in chess. JACM, 4, 174–177.
7. Bernstein, A. and Roberts, M. (1958). Computer vs. chess player. Scientific American, 198(6), 96–105.
8. Kotok, A. (1962). A chess playing program for the IBM 7090. AI project memo 41, MIT Computation Center.
9. Adelson-Velsky, G. M., Arlazarov, V. L., Bitman, A. R., Zhivotovsky, A. A., and Uskov, A. V. (1970). Programming a computer to play chess. Russian Mathematical Surveys, 25, 221–262.
10. Greenblatt, R. D., Eastlake, D. E., and Crocker, S. D. (1967). The Greenblatt chess program. In Proc. Fall Joint Computer Conference, pp. 801–810.
11. Hsu, F.-H., Anantharaman, T. S., Campbell, M. S., and Nowatzyk, A. (1990). A grandmaster chess machine. Scientific American, 263(4), 44–50.
12. Campbell, M. S., Hoane, A. J., and Hsu, F.-H. (2002). Deep Blue. AIJ, 134(1–2), 57–83.
13. Hsu, F.-H. (2004). Behind Deep Blue: Building the Computer that Defeated the World Chess Champion. Princeton University Press.
14. Donninger, C. and Lorenz, U. (2004). The chess monster Hydra. In Proc. 14th International Conference on Field-Programmable Logic and Applications, pp. 927–932.
15. Ross, P. E. (2004). Psyching out computer chess players. IEEE Spectrum, 41(2), 14–15.

Russell I
B. Russell/A.N. Whitehead
Principia Mathematica Frankfurt 1986

Russell II
B. Russell
The ABC of Relativity, London 1958, 1969
German Edition:
Das ABC der Relativitätstheorie Frankfurt 1989

Russell IV
B. Russell
The Problems of Philosophy, Oxford 1912
German Edition:
Probleme der Philosophie Frankfurt 1967

Russell VI
B. Russell
"The Philosophy of Logical Atomism", in: B. Russell, Logic and KNowledge, ed. R. Ch. Marsh, London 1956, pp. 200-202
German Edition:
Die Philosophie des logischen Atomismus
In
Eigennamen, U. Wolf (Hg) Frankfurt 1993

Russell VII
B. Russell
On the Nature of Truth and Falsehood, in: B. Russell, The Problems of Philosophy, Oxford 1912 - German: "Wahrheit und Falschheit"
In
Wahrheitstheorien, G. Skirbekk (Hg) Frankfurt 1996


Norvig I
Peter Norvig
Stuart J. Russell
Artificial Intelligence: A Modern Approach Upper Saddle River, NJ 2010
Computer Games Norvig Norvig I 189
Computer Games/artificial intelligence/algorithms/Norvig/Russell: A game can be defined by the initial state (how the board is set up), the legal actions in each state, the result of each action, a terminal test (which says when the game is over), and a utility function that applies to terminal states. In two-player zero-sum games with perfect information, the minimax algorithm can select optimal moves by a depth-first enumeration of the game tree.
The alpha–beta search algorithm computes the same optimal move as minimax, but achieves much greater efficiency by eliminating subtrees that are provably irrelevant.
Usually, it is not feasible to consider the whole game tree (even with alpha–beta), so we
Norvig I 190
need to cut the search off at some point and apply a heuristic evaluation function that estimates the utility of a state. Many game programs precompute tables of best moves in the opening and endgame so that they can look up a move rather than search.
Games of chance can be handled by an extension to the minimax algorithm that evaluates a chance node by taking the average utility of all its children, weighted by the probability of each child.
Optimal play in games of imperfect information, such as Kriegspiel and bridge, requires reasoning about the current and future belief states of each player. A simple approximation can be obtained by averaging the value of an action over each possible configuration of missing information.
>Minimax algorithm.
History of mechanical games: The most notorious of these was Baron Wolfgang von Kempelen’s (1734–1804) “The Turk,” a supposed chess-playing automaton that defeated Napoleon before being exposed as a magician’s trick cabinet housing a human chess expert (see Levitt, 2000)(1).
Computer chess: In 1846, Charles Babbage (who had been fascinated by the Turk) appears to have contributed the first serious discussion of the feasibility of computer chess and checkers (Morrison and Morrison, 1961)(2).
VsBabbage: He did not understand the exponential complexity of search trees, claiming “the combinations involved in the Analytical Engine enormously surpassed any required, even by the game of chess.”
Chess: The NSS chess program (Newell et al., 1958)(3) used a simplified version of alpha-beta; it was the first chess program to do so. Alpha–beta pruning was described by Hart and Edwards (1961)(4) and Hart et al. (1972)(5). Alpha-beta was used by the “Kotok–McCarthy” chess program written by a student of John McCarthy (Kotok, 1962)(6). Knuth and Moore (1975)(7) proved the correctness of alpha-beta and analysed its time complexity. Pearl (1982b)(8) shows alpha–beta to be asymptotically optimal among all fixed-depth game-tree search algorithms. >Chess/artificial intelligence/Norvig/Russell.



1. Levitt, G. M. (2000). The Turk, Chess Automaton. McFarland and Company.
2. Morrison, P. and Morrison, E. (Eds.). (1961). Charles Babbage and His Calculating Engines: Selected Writings by Charles Babbage and Others. Dover.
3. Newell, A., Shaw, J. C., and Simon, H. A. (1958). Chess playing programs and the problem of complexity. IBM Journal of Research and Development, 4(2), 320–335.
4. Hart, T. P. and Edwards, D. J. (1961). The tree prune (TP) algorithm. Artificial intelligence project memo 30, Massachusetts Institute of Technology.
5. Hart, P. E., Nilsson, N. J., and Raphael, B. (1972). Correction to “A formal basis for the heuristic determination of minimum cost paths”. SIGART Newsletter, 37, 28–29.
6. Kotok, A. (1962). A chess playing program for the IBM 7090. AI project memo 41, MIT Computation Center.
7. Knuth, D. E. and Moore, R. W. (1975). An analysis of alpha–beta pruning. AIJ, 6(4), 293–326.
8. Pearl, J. (1982b). The solution for the branching factor of the alpha-beta pruning algorithm and its optimality. CACM, 25(8), 559–564.

Norvig I
Peter Norvig
Stuart J. Russell
Artificial Intelligence: A Modern Approach Upper Saddle River, NJ 2010

Computer Games Russell Norvig I 189
Computer Games/artificial intelligence/algorithms/Norvig/Russell: A game can be defined by the initial state (how the board is set up), the legal actions in each state, the result of each action, a terminal test (which says when the game is over), and a utility function that applies to terminal states. In two-player zero-sum games with perfect information, the minimax algorithm can select optimal moves by a depth-first enumeration of the game tree.
The alpha–beta search algorithm computes the same optimal move as minimax, but achieves much greater efficiency by eliminating subtrees that are provably irrelevant.
Usually, it is not feasible to consider the whole game tree (even with alpha–beta), so we
Norvig I 190
need to cut the search off at some point and apply a heuristic evaluation function that estimates the utility of a state. Many game programs precompute tables of best moves in the opening and endgame so that they can look up a move rather than search.
Games of chance can be handled by an extension to the minimax algorithm that evaluates a chance node by taking the average utility of all its children, weighted by the probability of each child.
Optimal play in games of imperfect information, such as Kriegspiel and bridge, requires reasoning about the current and future belief states of each player. A simple approximation can be obtained by averaging the value of an action over each possible configuration of missing information.
>Minimax algorithm.
History of mechanical games: The most notorious of these was Baron Wolfgang von Kempelen’s (1734–1804) “The Turk,” a supposed chess-playing automaton that defeated Napoleon before being exposed as a magician’s trick cabinet housing a human chess expert (see Levitt, 2000)(1).
Computer chess: In 1846, Charles Babbage (who had been fascinated by the Turk) appears to have contributed the first serious discussion of the feasibility of computer chess and checkers (Morrison and Morrison, 1961)(2).
VsBabbage: He did not understand the exponential complexity of search trees, claiming “the combinations involved in the Analytical Engine enormously surpassed any required, even by the game of chess.”
Chess: The NSS chess program (Newell et al., 1958)(3) used a simplified version of alpha-beta; it was the first chess program to do so. Alpha–beta pruning was described by Hart and Edwards (1961)(4) and Hart et al. (1972)(5). Alpha-beta was used by the “Kotok–McCarthy” chess program written by a student of John McCarthy (Kotok, 1962)(6). Knuth and Moore (1975)(7) proved the correctness of alpha-beta and analysed its time complexity. Pearl (1982b)(8) shows alpha–beta to be asymptotically optimal among all fixed-depth game-tree search algorithms.
>Chess/artificial intelligence/Norvig/Russell.

1. Levitt, G. M. (2000). The Turk, Chess Automaton. McFarland and Company.
2. Morrison, P. and Morrison, E. (Eds.). (1961). Charles Babbage and His Calculating Engines: Selected Writings by Charles Babbage and Others. Dover.
3. Newell, A., Shaw, J. C., and Simon, H. A. (1958). Chess playing programs and the problem of complexity. IBM Journal of Research and Development, 4(2), 320–335.
4. Hart, T. P. and Edwards, D. J. (1961). The tree prune (TP) algorithm. Artificial intelligence project memo 30, Massachusetts Institute of Technology.
5. Hart, P. E., Nilsson, N. J., and Raphael, B. (1972). Correction to “A formal basis for the heuristic determination of minimum cost paths”. SIGART Newsletter, 37, 28–29.
6. Kotok, A. (1962). A chess playing program for the IBM 7090. AI project memo 41, MIT Computation Center.
7. Knuth, D. E. and Moore, R. W. (1975). An analysis of alpha–beta pruning. AIJ, 6(4), 293–326.
8. Pearl, J. (1982b). The solution for the branching factor of the alpha-beta pruning algorithm and its optimality. CACM, 25(8), 559–564.

Russell I
B. Russell/A.N. Whitehead
Principia Mathematica Frankfurt 1986

Russell II
B. Russell
The ABC of Relativity, London 1958, 1969
German Edition:
Das ABC der Relativitätstheorie Frankfurt 1989

Russell IV
B. Russell
The Problems of Philosophy, Oxford 1912
German Edition:
Probleme der Philosophie Frankfurt 1967

Russell VI
B. Russell
"The Philosophy of Logical Atomism", in: B. Russell, Logic and KNowledge, ed. R. Ch. Marsh, London 1956, pp. 200-202
German Edition:
Die Philosophie des logischen Atomismus
In
Eigennamen, U. Wolf (Hg) Frankfurt 1993

Russell VII
B. Russell
On the Nature of Truth and Falsehood, in: B. Russell, The Problems of Philosophy, Oxford 1912 - German: "Wahrheit und Falschheit"
In
Wahrheitstheorien, G. Skirbekk (Hg) Frankfurt 1996


Norvig I
Peter Norvig
Stuart J. Russell
Artificial Intelligence: A Modern Approach Upper Saddle River, NJ 2010
Language Acquisition Chomsky I 281
Learning/Chomsky: a child learns Japanese as readily as English - it is pointless to ask "which hypotheses it reduces" - there must be more than the ability to associate - structural grammar does not yield the structures that we have to postulate as generative grammar. >Grammar, >Generative grammar.
I 283
Internal organization plays an important role in perception; it determines an extremely restrictive initial scheme.
I 285
VsGoodman: Learning a second language is not that different. >Learning.
I 299
Learning/Chomsky: whether the evaluation function is learned or is itself the basis for learning is an empirical question.
II 324
Language learning: behaviorist/Quine: conditioning, association. ChomskyVsQuine: additional principles; only through them can infinitely many sentences be explained. ((s) For the question whether there are infinitely many possible sentences, see >Researchgate.)


Upton I 74
Language acquisition/Chomsky/Upton: Chomsky (1979)(1) argues that there must therefore be an innate mechanism for language learning. He calls this the language acquisition device (LAD). LAD: Through the LAD the child is hard-wired to recognise the grammar of whatever language they are exposed to in infancy. This LAD matures over time, allowing the child to use increasingly complex language.
VsChomsky/Upton: Contemporary theories of language development tend to be less extreme. Both sides have modified their position, so that nativists recognise that the environment has a role to play in language acquisition, and environmentalists accept that imitation and reinforcement are insufficient to explain the child’s entry into the complex world of language.
>Language acquisition/Nativism, >Language acquisition/Bruner.

1. Chomsky, N. (1979). Human language and other semiotic systems. Semiotica, 25: 31–44.

Chomsky I
Noam Chomsky
"Linguistics and Philosophy", in: Language and Philosophy, (Ed) Sidney Hook New York 1969 pp. 51-94
In
Linguistik und Philosophie, G. Grewendorf/G. Meggle Frankfurt/M. 1974/1995

Chomsky II
Noam Chomsky
"Some empirical assumptions in modern philosophy of language" in: Philosophy, Science, and Method, Essays in Honor of E. Nagel (Eds. S. Morgenbesser, P. Suppes and M- White) New York 1969, pp. 260-285
In
Linguistik und Philosophie, G. Grewendorf/G. Meggle Frankfurt/M. 1974/1995

Chomsky IV
N. Chomsky
Aspects of the Theory of Syntax, Cambridge/MA 1965
German Edition:
Aspekte der Syntaxtheorie Frankfurt 1978

Chomsky V
N. Chomsky
Language and Mind Cambridge 2006


Upton I
Penney Upton
Developmental Psychology 2011
Minimax Algorithm Norvig Norvig I 164
Minimax Algorithm/gaming/Norvig/Russell: The minimax value of a node is the utility (for MAX) of being in the corresponding state, assuming that both players play optimally from there to the end of the game.
Norvig I 165
The minimax algorithm (…) computes the minimax decision from the current state. It uses a simple recursive computation of the minimax values of each successor state, directly implementing the defining equations. The recursion proceeds all the way down to the leaves of the tree, and then the minimax values are backed up through the tree as the recursion unwinds. The minimax algorithm performs a complete depth-first exploration of the game tree. For real games, of course, the time cost is totally impractical, but this algorithm serves as the basis for the mathematical analysis of games and for more practical algorithms.
Norvig I 167
The problem with minimax search is that the number of game states it has to examine is exponential in the depth of the tree. Unfortunately, we can’t eliminate the exponent, but it turns out we can effectively cut it in half. The trick is that it is possible to compute the correct minimax decision without looking at every node in the game tree. Solution: alpha–beta pruning. When applied to a standard minimax tree, it returns the same move as minimax would, but prunes away branches that cannot possibly influence the final decision.
The general principle is this: consider a node n
Norvig I 168
somewhere in the tree (...), such that Player has a choice of moving to that node. If Player has a better choice m either at the parent node of n or at any choice point further up, then n will never be reached in actual play. So once we have found out enough about n (by examining some of its descendants) to reach this conclusion, we can prune it.
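As a concrete illustration of the pruning rule just described, here is a minimal Python sketch of minimax with alpha–beta pruning. It is not the book's pseudocode; the game interface (to_move, is_terminal, utility, actions, result) is an assumption for illustration.

```python
import math

# Hedged sketch of minimax with alpha-beta pruning over an assumed game interface.
def alphabeta(game, state, alpha=-math.inf, beta=math.inf):
    """Return the minimax value of `state`, pruning subtrees that cannot
    influence the decision at the root."""
    if game.is_terminal(state):
        return game.utility(state)
    if game.to_move(state) == "MAX":
        value = -math.inf
        for a in game.actions(state):
            value = max(value, alphabeta(game, game.result(state, a), alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:        # MIN already has a better option higher up: prune
                break
        return value
    else:  # MIN to move
        value = math.inf
        for a in game.actions(state):
            value = min(value, alphabeta(game, game.result(state, a), alpha, beta))
            beta = min(beta, value)
            if beta <= alpha:        # MAX already has a better option higher up: prune
                break
        return value
```

Called on the root with the default window (-inf, +inf), it returns the same value as plain minimax while skipping the provably irrelevant branches.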
Norvig I 190
The minimax algorithm is traced to a 1912 paper by Ernst Zermelo(1), the developer of modern set theory. The paper unfortunately contained several errors and did not describe minimax correctly. On the other hand, it did lay out the ideas of retrograde analysis and proposed (but did not prove) what became known as Zermelo’s theorem: that chess is determined - White can force a win or Black can or it is a draw; we just don’t know which. Zermelo says that should we eventually know, “Chess would of course lose the character of a game at all.” A solid foundation for game theory was developed in the seminal work Theory of Games and Economic Behavior (von Neumann and Morgenstern, 1944)(2), which included an analysis showing that some games require strategies that are randomized (or otherwise unpredictable).
Norvig I 191
VsMinimax algorithms: D. F. Beal (1980)(3) and Dana Nau (1980(4), 1983(5)) studied the weaknesses of minimax applied to approximate evaluations. They showed that under certain assumptions about the distribution of leaf values in the tree, minimaxing can yield values at the root that are actually less reliable than the direct use of the evaluation function itself. >Computer Games/Norvig.

1. Zermelo, E. (1913). Über eine Anwendung der Mengenlehre auf die Theorie des Schachspiels. In Proc. Fifth International Congress of Mathematicians, Vol. 2, pp. 501–504.
2. von Neumann, J. and Morgenstern, O. (1944). Theory of Games and Economic Behavior (first edition). Princeton University Press.
3. Beal, D. F. (1980). An analysis of minimax. In Clarke, M. R. B. (Ed.), Advances in Computer Chess 2, pp. 103–109. Edinburgh University Press.
4. Nau, D. S. (1980). Pathology on game trees: A summary of results. In AAAI-80, pp. 102–104.
5. Nau, D. S. (1983). Pathology on game trees revisited, and an alternative to minimaxing. AIJ, 21(1–2), 221–244.

Norvig I
Peter Norvig
Stuart J. Russell
Artificial Intelligence: A Modern Approach Upper Saddle River, NJ 2010

Minimax Algorithm Russell Norvig I 164
Minimax Algorithm/gaming/Norvig/Russell: The minimax value of a node is the utility (for MAX) of being in the corresponding state, assuming that both players play optimally from there to the end of the game.
Norvig I 165
The minimax algorithm (…) computes the minimax decision from the current state. It uses a simple recursive computation of the minimax values of each successor state, directly implementing the defining equations. The recursion proceeds all the way down to the leaves of the tree, and then the minimax values are backed up through the tree as the recursion unwinds. The minimax algorithm performs a complete depth-first exploration of the game tree. For real games, of course, the time cost is totally impractical, but this algorithm serves as the basis for the mathematical analysis of games and for more practical algorithms.
Norvig I 167
The problem with minimax search is that the number of game states it has to examine is exponential in the depth of the tree. Unfortunately, we can’t eliminate the exponent, but it turns out we can effectively cut it in half. The trick is that it is possible to compute the correct minimax decision without looking at every node in the game tree. Solution: alpha–beta pruning. When applied to a standard minimax tree, it returns the same move as minimax would, but prunes away branches that cannot possibly influence the final decision.
The general principle is this: consider a node n
Norvig I 168
somewhere in the tree (...), such that Player has a choice of moving to that node. If Player has a better choice m either at the parent node of n or at any choice point further up, then n will never be reached in actual play. So once we have found out enough about n (by examining some of its descendants) to reach this conclusion, we can prune it.
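Since searching to the leaves is impractical for real games (see above), programs cut off the search at some depth and apply a heuristic evaluation. A minimal sketch of that variant follows; the game interface and the caller-supplied evaluate() function are assumptions for illustration, not the book's code.

```python
# Hedged sketch of depth-limited minimax with a heuristic cutoff.
def minimax_cutoff(game, state, depth, evaluate):
    """Minimax value of `state`, stopping at `depth` and using evaluate(state)
    as an estimate of the true utility beyond the cutoff."""
    if game.is_terminal(state):
        return game.utility(state)
    if depth == 0:
        return evaluate(state)               # heuristic estimate, not true utility
    children = [minimax_cutoff(game, game.result(state, a), depth - 1, evaluate)
                for a in game.actions(state)]
    return max(children) if game.to_move(state) == "MAX" else min(children)
```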
Norvig I 190
The minimax algorithm is traced to a 1912 paper by Ernst Zermelo(1), the developer of modern set theory. The paper unfortunately contained several errors and did not describe minimax correctly. On the other hand, it did lay out the ideas of retrograde analysis and proposed (but did not prove) what became known as Zermelo’s theorem: that chess is determined - White can force a win or Black can or it is a draw; we just don’t know which. Zermelo says that should we eventually know, “Chess would of course lose the character of a game at all.” A solid foundation for game theory was developed in the seminal work Theory of Games and Economic Behavior (von Neumann and Morgenstern, 1944)(2), which included an analysis showing that some games require strategies that are randomized (or otherwise unpredictable).
Norvig I 191
VsMinimax algorithms: D. F. Beal (1980)(3) and Dana Nau (1980(4), 1983(5)) studied the weaknesses of minimax applied to approximate evaluations. They showed that under certain assumptions about the distribution of leaf values in the tree, minimaxing can yield values at the root that are actually less reliable than the direct use of the evaluation function itself. >Computer Games/Norvig.

1. Zermelo, E. (1913). Über eine Anwendung der Mengenlehre auf die Theorie des Schachspiels. In Proc. Fifth International Congress of Mathematicians, Vol. 2, pp. 501–504.
2. von Neumann, J. and Morgenstern, O. (1944). Theory of Games and Economic Behavior (first edition). Princeton University Press.
3. Beal, D. F. (1980). An analysis of minimax. In Clarke, M. R. B. (Ed.), Advances in Computer Chess 2, pp. 103–109. Edinburgh University Press.
4. Nau, D. S. (1980). Pathology on game trees: A summary of results. In AAAI-80, pp. 102–104.
5. Nau, D. S. (1983). Pathology on game trees revisited, and an alternative to minimaxing. AIJ, 21(1–2), 221–244.

Russell I
B. Russell/A.N. Whitehead
Principia Mathematica Frankfurt 1986

Russell II
B. Russell
The ABC of Relativity, London 1958, 1969
German Edition:
Das ABC der Relativitätstheorie Frankfurt 1989

Russell IV
B. Russell
The Problems of Philosophy, Oxford 1912
German Edition:
Probleme der Philosophie Frankfurt 1967

Russell VI
B. Russell
"The Philosophy of Logical Atomism", in: B. Russell, Logic and KNowledge, ed. R. Ch. Marsh, London 1956, pp. 200-202
German Edition:
Die Philosophie des logischen Atomismus
In
Eigennamen, U. Wolf (Hg) Frankfurt 1993

Russell VII
B. Russell
On the Nature of Truth and Falsehood, in: B. Russell, The Problems of Philosophy, Oxford 1912 - German: "Wahrheit und Falschheit"
In
Wahrheitstheorien, G. Skirbekk (Hg) Frankfurt 1996


Norvig I
Peter Norvig
Stuart J. Russell
Artificial Intelligence: A Modern Approach Upper Saddle River, NJ 2010
Models Stalnaker I 146
Model/Stalnaker: a model is a pair consisting of an object domain D and a valuation function V. >Valuation function, >Domains.
I 149
Model: For our modal predicate logic, a model is then a quadruple ‹W,R,D,v›. D is the domain function from W to sets of individuals. For w ∈ W, Dw is the domain of the world w.
Valuation function: the valuation function assigns intensions to descriptive expressions.
Intension: the intension here is a function of possible worlds on extensions.
>Intensions, >Extensions.
Necessity operator: The semantic rule of the necessity operator remains unchanged.
>Operators.
I 150
The rules for predicate logic are generalizations of the extensional rules. We only add an index for the worlds. E.g. rule for Universal quantification/universal quantifier/Stalnaker:
If Φ has the form ∀F, then νs,w(Φ) = 1 iff νs,w(F) = Dw; otherwise νs,w(Φ) = 0.
>Quantification, >Universal quantification.

Stalnaker I
R. Stalnaker
Ways a World may be Oxford New York 2003

Reinforcement Learning AI Research Norvig I 831
Reinforcement Learning/AI Research/Norvig/Russell: In many complex domains, reinforcement learning [by reward and punishment] is the only feasible way to train a program to perform at high levels. For example, in game playing, it is very hard for a human to provide accurate and consistent evaluations of large numbers of positions, which would be needed to train an evaluation function directly from examples. Instead, the program can be told when it has won or lost, and it can use this information to learn an evaluation function that gives reasonably accurate estimates of the probability of winning from any given position. Similarly, it is extremely difficult to program an agent to fly a helicopter; yet given appropriate negative rewards for crashing, wobbling, or deviating from a set course, an agent can learn to fly by itself.
A. Passive reinforcement learning
Situation: an agent is placed in an environment and must learn to behave successfully therein.
A utility-based agent learns a utility function on states and uses it to select actions that maximize the expected outcome utility.
A Q-learning agent learns an action-utility function, or Q-function, giving the expected utility of taking a given action in a given state.
A reflex agent learns a policy that maps directly from states to actions.
Exploration: an agent must experience as much as possible of its environment in order to learn how to behave in it. >Markov decision processes/Norvig.
Norvig I 833
Passive reinforcement learning: A simple method for direct utility estimation was invented in the late 1950s in the area of adaptive control theory by Widrow and Hoff (1960)(1). The idea is that the utility of a state is the expected total reward from that state onward (called the expected reward-to-go), and each trial provides a sample of this quantity for each state visited. Utility: the utilities of states are not independent! The utility of each state equals its own reward plus the expected utility of its successor states. That is, the utility values obey the Bellman equations for a fixed policy. (>Values/AI Research).
Problem: By ignoring the connections between states, direct utility estimation misses opportunities for learning.
Norvig I 834
Adaptive Dynamic Programming /ADP: An adaptive dynamic programming (or ADP) agent takes advantage of the constraints among the utilities of states by learning the transition model that connects them and solving the corresponding Markov decision process using a dynamic programming method. Alternatively, we can adopt the approach of modified policy iteration (…), using a simplified value iteration process to update the utility estimates after each change to the learned model.
Norvig I 836
Temporal difference learning/TD: All temporal-difference methods work by adjusting the utility estimates towards the ideal equilibrium that holds locally when the utility estimates are correct.
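The TD adjustment just described can be written as a one-line update. Below is a minimal sketch for a passive learner (the variable names are assumptions, not the book's code): after observing a transition from s to s' with reward r, the estimate U(s) is moved toward r + γU(s').

```python
# Hedged sketch of the TD(0) utility update for a passive learner.
# U: dict of utility estimates, alpha: learning rate, gamma: discount factor.
def td_update(U, s, reward, s_next, alpha=0.1, gamma=0.9):
    U.setdefault(s, 0.0)
    U.setdefault(s_next, 0.0)
    # Nudge U[s] toward the observed sample reward + gamma * U[s_next].
    U[s] += alpha * (reward + gamma * U[s_next] - U[s])
    return U
```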
Norvig I 839
B. Active reinforcement learning: A passive learning agent has a fixed policy that determines its behavior. An active agent must decide what actions to take. First, the agent will need to learn a complete model with outcome probabilities for all actions, (…). Next, we need to take into account the fact that the agent has a choice of actions. The utilities it needs to learn are those defined by the optimal policy; they obey the >Bellman equations (…). The final issue is what to do at each step. Having obtained a utility function U that is optimal for the learned model, the agent can extract an optimal action by one-step look-ahead to maximize the expected utility; alternatively, if it uses policy iteration, the optimal policy is already available, so it should simply execute the action the optimal policy recommends.
Norvig I 843
Q-Learning: There is an alternative TD method, called Q-learning, which learns an action-utility representation instead of learning utilities. [A] TD [temporal difference] agent that learns a Q-function does not need a model of the form P(s’| s, a), either for learning or for action selection. For this reason, Q-learning is called a model-free method.
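A minimal sketch of the model-free Q-learning update (the tabular representation and names are assumptions for illustration): the agent never needs the transition model P(s'|s, a), only the observed transition and reward.

```python
from collections import defaultdict

# Hedged sketch of a tabular, model-free Q-learning update.
Q = defaultdict(float)                  # Q[(state, action)] -> estimated utility

def q_update(Q, s, a, reward, s_next, actions_next, alpha=0.1, gamma=0.9):
    # Target uses the best action available in the successor state; no
    # transition model is consulted, which is why Q-learning is model-free.
    best_next = max((Q[(s_next, a2)] for a2 in actions_next), default=0.0)
    Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
    return Q
```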
Norvig I 845
Function approximation: simply means using any sort of representation for the Q-function other than a lookup table. The representation is viewed as approximate because it might not be the case that the true utility function or Q-function can be represented in the chosen form.
Norvig I 846
Generalization: The compression achieved by a function approximator allows the learning agent to generalize from states it has visited to states it has not visited. That is, the most important aspect of function approximation is not that it requires less space, but that it allows for inductive generalization over input states.
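A minimal sketch of what "any representation other than a lookup table" can look like in the linear case: Q(s, a) is approximated as a weighted sum of features, and the TD error updates the shared parameters, which is exactly what produces generalization from visited to unvisited states. The feature vector and names are assumptions, not the book's code.

```python
# Hedged sketch of linear function approximation for the Q-function.
def q_hat(theta, features):
    """Approximate Q(s, a) as the dot product theta . f(s, a)."""
    return sum(t * f for t, f in zip(theta, features))

def td_q_gradient_update(theta, features, reward, best_next_q, alpha=0.05, gamma=0.9):
    # Same TD error as tabular Q-learning, but it adjusts the shared parameters
    # theta, so experience in one state carries over to similar states.
    delta = reward + gamma * best_next_q - q_hat(theta, features)
    return [t + alpha * delta * f for t, f in zip(theta, features)]
```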
Norvig I 848
Policies: a policy π is a function that maps states to actions. (…) we could represent π by a collection of parameterized Q-functions, one for each action, and take the action with the highest predicted value (…). If the policy is represented by Q-functions, then policy search results in a process that learns Q-functions. This process is not the same as Q-learning! In Q-learning with function approximation, the algorithm finds a value of θ such that Q̂θ is “close” to Q∗, the optimal Q-function. Policy search: Policy search, on the other hand, finds a value of θ that results in good performance; (…).
VsPolicy search: Problems: One problem with policy representations of the kind (…) is that the policy is a discontinuous function of the parameters when the actions are discrete.
Solution: This means that the value of the policy may also change discontinuously, which makes gradient-based search difficult. For this reason, policy search methods often use a stochastic policy representation πθ(s, a), which specifies the probability of selecting action a in state s.
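A common way to make the policy a smooth function of its parameters is a softmax over the per-action values, so that small parameter changes change the action probabilities continuously rather than discontinuously. A minimal sketch follows; the per-action value estimates are assumed inputs, not the book's code.

```python
import math
import random

# Hedged sketch of a stochastic (softmax) policy representation pi(s, a).
def softmax_policy(action_values):
    """action_values: {action: estimated value} -> {action: probability}."""
    m = max(action_values.values())                       # subtract max for numerical stability
    exp_vals = {a: math.exp(v - m) for a, v in action_values.items()}
    z = sum(exp_vals.values())
    return {a: e / z for a, e in exp_vals.items()}

def sample_action(action_values):
    probs = softmax_policy(action_values)
    actions = list(probs)
    return random.choices(actions, weights=[probs[a] for a in actions], k=1)[0]
```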
Norvig I 854
History of reinforcement learning: Turing (1948(2), 1950(3)) proposed the reinforcement-learning approach, although he was not convinced of its effectiveness, writing, “the use of punishments and rewards can at best be a part of the teaching process.” Arthur Samuel’s work (1959)(4) was probably the earliest successful machine learning research. Around the same time, researchers in adaptive control theory (Widrow and Hoff, 1960)(1), building on work by Hebb (1949)(5), were training simple networks using the delta rule. The cart–pole work of Michie and Chambers (1968)(6) can also be seen as a reinforcement learning method with a function approximator. The psychological literature on reinforcement learning is much older; Hilgard and Bower (1975)(7) provide a good survey. Neuroscience: The neuroscience text by Dayan and Abbott (2001)(8) describes possible neural implementations of temporal-difference learning, while Dayan and Niv (2008)(9) survey the latest evidence from neuroscientific and behavioral experiments.
Markov decision process: The connection between reinforcement learning and Markov decision processes was first made by Werbos (1977)(10), but the development of reinforcement learning in AI stems from work at the University of Massachusetts in the early 1980s (Barto et al., 1981)(11). The paper by Sutton (1988) provides a good historical overview.
Temporal difference learning: The combination of temporal-difference learning with the model-based generation of simulated experiences was proposed in Sutton’s DYNA architecture (Sutton, 1990)(12). The idea of prioritized sweeping was introduced independently by Moore and Atkeson (1993)(13) and
Norvig I 855
Peng and Williams (1993)(14). Q-learning: was developed in Watkins’s Ph.D. thesis (1989)(15), while SARSA appeared in a technical report by Rummery and Niranjan (1994)(16).
Function approximation: Function approximation in reinforcement learning goes back to the work of Samuel, who used both linear and nonlinear evaluation functions and also used feature-selection methods to reduce the feature space. Later methods include the CMAC (Cerebellar Model Articulation Controller) (Albus, 1975)(17), which is essentially a sum of overlapping local kernel functions, and the associative neural networks of Barto et al. (1983)(18). Neural networks are currently the most popular form of function approximator. The best-known application is TD-Gammon (Tesauro, 1992(19), 1995(20)), (…).
Policy search: Policy search methods were brought to the fore by Williams (1992(21)), who developed the REINFORCE family of algorithms. Later work by Marbach and Tsitsiklis (1998)(22), Sutton et al. (2000)(23), and Baxter and Bartlett (2000)(24) strengthened and generalized the convergence results for policy search. The method of correlated sampling for comparing different configurations of a system was described formally by Kahn and Marshall (1953)(25), but seems to have been known long before that. Its use in reinforcement learning is due to Van Roy (1998)(26) and Ng and Jordan (2000)(27); the latter paper also introduced the PEGASUS algorithm and proved its formal properties.
Norvig I 857
Inverse reinforcement learning: Russell (1998)(28) describes the task of inverse reinforcement learning - figuring out what the reward function must be from an example path through that state space. This is useful as a part of apprenticeship learning, or as a part of doing science—we can understand an animal or robot by working backwards from what it does to what its reward function must be. Cf. >Learning, >Generalization, >Understanding.


1. Widrow, B. and Hoff, M. E. (1960). Adaptive switching circuits. In 1960 IRE WESCON Convention Record, pp. 96–104.
2. Turing, A. (1948). Intelligent machinery. Tech. rep., National Physical Laboratory. Reprinted in (Ince, 1992).
3. Turing, A. (1950). Computing machinery and intelligence. Mind, 59, 433–460.
4. Samuel, A. L. (1959). Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 3(3), 210–229.
5. Hebb, D. O. (1949). The Organization of Behavior. Wiley.
6. Michie, D. and Chambers, R. A. (1968). BOXES: An experiment in adaptive control. In Dale, E. and Michie, D. (Eds.), Machine Intelligence 2, pp. 125–133. Elsevier/North-Holland.
7. Hilgard, E. R. and Bower, G. H. (1975). Theories of Learning (4th edition). Prentice-Hall.
8. Dayan, P. and Abbott, L. F. (2001). Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. MIT Press.
9. Dayan, P. and Niv, Y. (2008). Reinforcement learning and the brain: The good, the bad and the ugly. Current Opinion in Neurobiology, 18(2), 185–196.
10. Werbos, P. (1977). Advanced forecasting methods for global crisis warning and models of intelligence. General Systems Yearbook, 22, 25–38.
11. Barto, A. G., Sutton, R. S., and Brouwer, P. S. (1981). Associative search network: A reinforcement learning associative memory. Biological Cybernetics, 40(3), 201–211.
12. Sutton, R. S. (1990). Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In ICML-90, pp. 216–224.
13. Moore, A. W. and Atkeson, C. G. (1993). Prioritized sweeping—Reinforcement learning with less data and less time. Machine Learning, 13, 103–130.
14. Peng, J. and Williams, R. J. (1993). Efficient learning and planning within the Dyna framework. Adaptive Behavior, 2, 437–454.
15. Watkins, C. J. (1989). Models of Delayed Reinforcement Learning. Ph.D. thesis, Psychology Department, Cambridge University.
16. Rummery, G. A. and Niranjan, M. (1994). Online Q-learning using connectionist systems. Tech. rep. CUED/F-INFENG/TR 166, Cambridge University Engineering Department.
17. Albus, J. S. (1975). A new approach to manipulator control: The cerebellar model articulation controller (CMAC). J. Dynamic Systems, Measurement, and Control, 97, 270–277.
18. Barto, A. G., Sutton, R. S., and Anderson, C. W. (1983). Neuron-like adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man and Cybernetics, 13, 834–846.
19. Tesauro, G. (1992). Practical issues in temporal difference learning. Machine Learning, 8(3–4), 257–277.
20. Tesauro, G. (1995). Temporal difference learning and TD-Gammon. CACM, 38(3), 58–68.
21. Williams, R. J. (1992). Simple statistical gradient following algorithms for connectionist reinforcement learning. Machine Learning, 8, 229–256.
22. Marbach, P. and Tsitsiklis, J. N. (1998). Simulation based optimization of Markov reward processes. Technical report LIDS-P-2411, Laboratory for Information and Decision Systems, Massachusetts Institute of Technology.
23. Sutton, R. S., McAllester, D. A., Singh, S. P., and Mansour, Y. (2000). Policy gradient methods for reinforcement learning with function approximation. In Solla, S. A., Leen, T. K., and Müller, K.-R. (Eds.), NIPS 12, pp. 1057–1063. MIT Press.
24. Baxter, J. and Bartlett, P. (2000). Reinforcement learning in POMDP’s via direct gradient ascent. In ICML-00, pp. 41–48.
25. Kahn, H. and Marshall, A. W. (1953). Methods of reducing sample size in Monte Carlo computations. Operations Research, 1(5), 263–278.
26. Van Roy, B. (1998). Learning and value function approximation in complex decision processes. Ph.D. thesis, Laboratory for Information and Decision Systems, MIT.
27. Ng, A. Y. and Jordan, M. I. (2000). PEGASUS: A policy search method for large MDPs and POMDPs. In UAI-00, pp. 406–415.
28. Russell, S. J. (1998). Learning agents for uncertain environments (extended abstract). In COLT-98, pp. 101–103.


Norvig I
Peter Norvig
Stuart J. Russell
Artificial Intelligence: A Modern Approach Upper Saddle River, NJ 2010
Reinforcement Learning Bostrom I 230
Reinforcement learning/superintelligence/values/Bostrom: Often, the learning algorithm involves the gradual construction of some kind of evaluation function, which assigns values to states, state-action pairs, or policies. Problem: The evaluation function, which is continuously updated in light of experience, could be regarded as incorporating a form of learning about value. However, what is being learned is not new final values but increasingly accurate estimates of the instrumental values of reaching particular states (or of taking particular actions in particular states, or of following particular policies). Insofar as a reinforcement-learning agent can be described as having a final goal, that goal remains constant: to maximize future reward. And reward consists of specially designated percepts received from the environment. Therefore, the wireheading syndrome remains a likely outcome in any reinforcement agent that develops a world model sophisticated enough to suggest this alternative way of maximizing reward. >Values/superintelligence/Bostrom.

Bostrom I
Nick Bostrom
Superintelligence. Paths, Dangers, Strategies Oxford: Oxford University Press 2017

Relations Cresswell II 107
Relativization/Cresswell: E.g. "admires" is a two-place relation - i.e. it depends on two arguments - but it can be reduced to a one-place predicate: "admires someone". We achieve independence by adding the relation. That is, it is either true or false that x admires someone. It is no longer relative to two things.
V*: the evaluation function (in the idiolect) can then play the role of a bound variable.
>Valuation, >Bound variable, >Idiolect.

Cr I
M. J. Cresswell
Semantical Essays (Possible worlds and their rivals) Dordrecht Boston 1988

Cr II
M. J. Cresswell
Structured Meanings Cambridge Mass. 1984

Search Algorithms Norvig Norvig I 64
Search Algorithms/Russell/Norvig: uninformed search algorithms [are] algorithms that are given no information about the problem other than its definition. Although some of these algorithms can solve any solvable problem, none of them can do so efficiently. Informed search algorithms, on the other hand, can do quite well given some guidance on where to look for solutions.
A. Uninformed search.
Norvig I 81
Breadth-first search is a simple strategy in which the root node is expanded first, then all the successors of the root node are expanded next, then their successors, and so on. In general, all the nodes are expanded at a given depth in the search tree before any nodes at the next level are expanded.
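A minimal Python sketch of breadth-first search over an explicit graph (the adjacency-dict representation is an assumption for illustration, not the book's code):

```python
from collections import deque

# Hedged sketch of breadth-first search; returns a path from start to goal, or None.
def breadth_first_search(graph, start, goal):
    frontier = deque([start])
    parent = {start: None}
    while frontier:
        node = frontier.popleft()          # expand the shallowest node first
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return list(reversed(path))
        for succ in graph.get(node, []):
            if succ not in parent:         # avoid re-expanding reached states
                parent[succ] = node
                frontier.append(succ)
    return None
```

For example, breadth_first_search({'A': ['B', 'C'], 'B': ['D'], 'C': [], 'D': []}, 'A', 'D') returns ['A', 'B', 'D'].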
Norvig I 83
The memory requirements are a bigger problem for breadth-first search than is the execution time. (…) exponential-complexity search problems cannot be solved by uninformed methods for any but the smallest instances.
Norvig I 85
Depth-first search always expands the deepest node in the current frontier of the search tree. Both versions are nonoptimal.
Norvig I 87
Backtracking search: uses less memory. (…) only one successor is generated at a time rather than all successors; each partially expanded node remembers which successor to generate next. In this way, only O(m) memory is needed rather than O(bm). Backtracking search facilitates yet another memory-saving (and time-saving) trick: the idea of generating a successor by modifying the current state description directly rather than copying it first. This reduces the memory requirements to just one state description and O(m) actions. For this to work, we must be able to undo each modification when we go back to generate the next successor.
B. Informed (heuristic) search.
Norvig I 92
Best-first search is an instance of the general tree-search or graph-search algorithm in which a node is selected for expansion based on an evaluation function, f(n). The evaluation function is construed as a cost estimate, so the node with the lowest evaluation is expanded first. Greedy best-first search tries to expand the node that is closest to the goal, on the grounds that this is likely to lead to a solution quickly. Thus, it evaluates nodes by using just the heuristic function; that is, f(n) = h(n).
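A minimal sketch of greedy best-first search, ordering the frontier by h(n) alone (f(n) = h(n)); the adjacency dict and heuristic table are assumptions for illustration, not the book's code.

```python
import heapq

# Hedged sketch of greedy best-first search: expand the node with the lowest h(n).
def greedy_best_first(graph, h, start, goal):
    frontier = [(h[start], start, [start])]      # (priority, node, path so far)
    explored = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)  # lowest heuristic estimate first
        if node == goal:
            return path
        if node in explored:
            continue
        explored.add(node)
        for succ in graph.get(node, []):
            if succ not in explored:
                heapq.heappush(frontier, (h[succ], succ, path + [succ]))
    return None
```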
Norvig I 108
A general tree search algorithm considers all possible paths to find a solution, whereas a graph search algorithm avoids consideration of redundant paths.
Norvig I 120
Online search: here, the agent is faced with a state space that is initially unknown and must be explored.
Norvig I 121
Local search algorithms: If the path to the goal does not matter, we might consider a different class of algorithms, ones that do not worry about paths at all. Local search algorithms operate using a single current node (rather than multiple paths) and generally move only to neighbors of that node. These algorithms have two key advantages: (1) they use very little memory - usually a constant amount; and (2) they can often find reasonable solutions in large or infinite (continuous) state spaces for which systematic algorithms are unsuitable. ((s) For problems cf. >Local minima (local maxima); for a solution: >Simulated annealing.)
Norvig I 125
Local beam search: (beam search is a path-based algorithm). The local beam search algorithm keeps track of k states rather than
Norvig I 126
just one. It begins with k randomly generated states. At each step, all the successors of all k states are generated. If any one is a goal, the algorithm halts. Otherwise, it selects the k best successors from the complete list and repeats. >Genetic algorithms.
And-Or search problem: see >Terminology/Norvig.
Norvig I 147
Online search: >Online search/Norvig.
Norvig I 154
Literature for local search: (Newton, 1671(1); Raphson, 1690(2)) can be seen as a very efficient local search method for continuous spaces in which gradient information is available.
Brent (1973)(3) is a classic reference for optimization algorithms that do not require such information.
Beam search, which we have presented as a local search algorithm, originated as a bounded-width variant of dynamic programming for speech recognition in the HARPY system (Lowerre, 1976)(4). A related algorithm is analyzed in depth by Pearl (1984)(5).
The topic of local search was reinvigorated in the early 1990s by surprisingly good results
for large constraint-satisfaction problems such as n-queens (Minton et al., 1992)(6) and logical reasoning (Selman et al., 1992)(7) and by the incorporation of randomness, multiple simultaneous searches, and other improvements.
Tabu search: a variant of hill climbing called tabu search has gained popularity
(Glover and Laguna, 1997)(8). This algorithm maintains a tabu list of k previously visited states that cannot be revisited; as well as improving efficiency when searching graphs, this list can allow the algorithm to escape from some local minima.
Stage algorithm: Another useful improvement on hill climbing is the stage algorithm (Boyan and Moore, 1998)(9). The idea is to use the local maxima found by random-restart hill climbing to get an idea of the overall shape of the landscape. >Constraint Satisfaction Problems/Norvig.
Norvig I 227
Constraint satisfaction problems (CSPs) represent a state with a set of variable/value pairs and represent the conditions for a solution by a set of constraints on the variables. Many important real-world problems can be described as CSPs. A number of inference techniques use the constraints to infer which variable/value pairs are consistent and which are not. These include node, arc, path, and k-consistency.
Backtracking search, a form of depth-first search, is commonly used for solving CSPs.
Inference can be interwoven with search.
The minimum-remaining-values and degree heuristics are domain-independent methods for deciding which variable to choose next in a backtracking search. The least-constraining-value heuristic helps in deciding which value to try first for a given variable. Backtracking occurs when no legal assignment can be found for a variable. Conflict-directed backjumping backtracks directly to the source of the problem.
Local search using the min-conflicts heuristic has also been applied to constraint satisfaction
problems with great success.
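A minimal sketch of min-conflicts local search, illustrated on n-queens (one queen per column, the value is its row); this is an illustrative instance, not the book's code.

```python
import random

# Hedged sketch of the min-conflicts heuristic for CSPs, on n-queens.
def conflicts(rows, col, row):
    """Number of queens attacking a queen placed at (col, row)."""
    return sum(1 for c, r in enumerate(rows)
               if c != col and (r == row or abs(r - row) == abs(c - col)))

def min_conflicts_queens(n=8, max_steps=10000):
    rows = [random.randrange(n) for _ in range(n)]       # random complete assignment
    for _ in range(max_steps):
        conflicted = [c for c in range(n) if conflicts(rows, c, rows[c]) > 0]
        if not conflicted:
            return rows                                  # solution found
        col = random.choice(conflicted)
        # Reassign the chosen variable to the value that minimizes conflicts.
        rows[col] = min(range(n), key=lambda r: conflicts(rows, col, r))
    return None
```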
For forward chaining, backward chaining: see >Software agents/Norvig.


1. Newton, I. (1664–1671). Methodus fluxionum et serierum infinitarum. Unpublished notes.
2. Raphson, J. (1690). Analysis aequationum universalis. Apud Abelem Swalle, London.
3. Brent, R. P. (1973). Algorithms for minimization without derivatives. Prentice-Hall.
4. Lowerre, B. T. (1976). The HARPY Speech Recognition System. Ph.D. thesis, Computer Science Department, Carnegie-Mellon University.
5. Pearl, J. (1984). Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley.
6. Minton, S., Johnston, M. D., Philips, A. B., and Laird, P. (1992). Minimizing conflicts: A heuristic repair method for constraint satisfaction and scheduling problems. AIJ, 58(1–3), 161–205.
7. Selman, B., Levesque, H. J., and Mitchell, D. (1992). A new method for solving hard satisfiability problems. In AAAI-92, pp. 440–446.
8. Glover, F. and Laguna, M. (Eds.). (1997). Tabu search. Kluwer.
9. Boyan, J. A. and Moore, A. W. (1998). Learning evaluation functions for global optimization and Boolean satisfiability. In AAAI-98.

Norvig I
Peter Norvig
Stuart J. Russell
Artificial Intelligence: A Modern Approach Upper Saddle River, NJ 2010

Search Algorithms Russell Norvig I 64
Search Algorithms/Russell/Norvig: uninformed search algorithms [are] algorithms that are given no information about the problem other than its definition. Although some of these algorithms can solve any solvable problem, none of them can do so efficiently. Informed search algorithms, on the other hand, can do quite well given some guidance on where to look for solutions.

A. Uninformed search.
Norvig I 81
Breadth-first search is a simple strategy in which the root node is expanded first, then all the successors of the root node are expanded next, then their successors, and so on. In general, all the nodes are expanded at a given depth in the search tree before any nodes at the next level are expanded.
Norvig I 83
The memory requirements are a bigger problem for breadth-first search than is the execution time. (…) exponential-complexity search problems cannot be solved by uninformed methods for any but the smallest instances.
Norvig I 85
Depth-first search always expands the deepest node in the current frontier of the search tree. Both versions are nonoptimal.
Norvig I 87
Backtracking search: uses less memory. (…) only one successor is generated at a time rather than all successors; each partially expanded node remembers which successor to generate next. In this way, only O(m) memory is needed rather than O(bm). Backtracking search facilitates yet another memory-saving (and time-saving) trick: the idea of generating a successor by modifying the current state description directly rather than copying it first. This reduces the memory requirements to just one state description and O(m) actions. For this to work, we must be able to undo each modification when we go back to generate the next successor.
B. Informed (heuristic) search.
Norvig I 92
Best-first search is an instance of the general tree-search or graph-search algorithm in which a node is selected for expansion based on an evaluation function, f(n). The evaluation function is construed as a cost estimate, so the node with the lowest evaluation is expanded first. Greedy best-first search tries to expand the node that is closest to the goal, on the grounds that this is likely to lead to a solution quickly. Thus, it evaluates nodes by using just the heuristic function; that is, f(n) = h(n).
Norvig I 108
A general tree search algorithm considers all possible paths to find a solution, whereas a graph search algorithm avoids consideration of redundant paths.
Norvig I 120
Online search: here, the agent is faced with a state space that is initially unknown and must be explored.
Norvig I 121
Local search algorithms: If the path to the goal does not matter, we might consider a different class of algorithms, ones that do not worry about paths at all. Local search algorithms operate using a single current node (rather than multiple paths) and generally move only to neighbors of that node. These algorithms have two key advantages: (1) they use very little memory - usually a constant amount; and (2) they can often find reasonable solutions in large or infinite (continuous) state spaces for which systematic algorithms are unsuitable. ((s) For problems cf. >Local minima (local maxima); for a solution: >Simulated annealing.)
Norvig I 125
Local beam search: (beam search is a path-based algorithm). The local beam search algorithm keeps track of k states rather than
Norvig I 126
just one. It begins with k randomly generated states. At each step, all the successors of all k states are generated. If any one is a goal, the algorithm halts. Otherwise, it selects the k best successors from the complete list and repeats. >Genetic algorithms.
And-Or search problem: see >Terminology/Norvig.
Norvig I 147
Online search: >Online search/Norvig.
Norvig I 154
Literature for local search: (Newton, 1671(1); Raphson, 1690(2)) can be seen as a very efficient local search method for continuous spaces in which gradient information is available.
Brent (1973)(3) is a classic reference for optimization algorithms that do not require such information.
Beam search, which we have presented as a local search algorithm, originated as a bounded-width variant of dynamic programming for speech recognition in the HARPY system (Lowerre, 1976)(4). A related algorithm is analyzed in depth by Pearl (1984(5)).
The topic of local search was reinvigorated in the early 1990s by surprisingly good results
for large constraint-satisfaction problems such as n-queens (Minton et al., 1992)(6) and logical reasoning (Selman et al., 1992)(7) and by the incorporation of randomness, multiple simultaneous searches, and other improvements.
Tabu search: a variant of hill climbing called tabu search has gained popularity (Glover and Laguna, 1997)(8). This algorithm maintains a tabu list of k previously visited states that cannot be revisited; as well as improving efficiency when searching graphs, this list can allow the algorithm to escape from some local minima.
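A hedged sketch of the idea (illustrative only; real tabu search implementations differ in how the tabu list and aspiration criteria are handled):

def tabu_search(initial, neighbours, value, tabu_size=10, max_steps=1000):
    """Hill climbing with a list of the last tabu_size visited states that may
    not be revisited, which can carry the search past some local optima."""
    current = best = initial
    tabu = [initial]
    for _ in range(max_steps):
        candidates = [s for s in neighbours(current) if s not in tabu]
        if not candidates:
            break
        current = max(candidates, key=value)   # best non-tabu neighbour, even if worse
        tabu.append(current)
        if len(tabu) > tabu_size:
            tabu.pop(0)                        # keep only the most recent states
        if value(current) > value(best):
            best = current
    return best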
Stage algorithm: Another useful improvement on hill climbing is the stage algorithm (Boyan and Moore, 1998)(9). The idea is to use the local maxima found by random-restart hill climbing to get an idea of the overall shape of the landscape.
>Constraint Satisfaction Problems/Norvig.
Norvig I 227
Constraint satisfaction problems (CSPs) represent a state with a set of variable/value pairs and represent the conditions for a solution by a set of constraints on the variables. Many important real-world problems can be described as CSPs. A number of inference techniques use the constraints to infer which variable/value pairs are consistent and which are not. These include node, arc, path, and k-consistency.
Backtracking search, a form of depth-first search, is commonly used for solving CSPs.
Inference can be interwoven with search.
The minimum-remaining-values and degree heuristics are domain-independent methods for deciding which variable to choose next in a backtracking search. The least-constraining-value heuristic helps in deciding which value to try first for a given variable. Backtracking occurs when no legal assignment can be found for a variable. Conflict-directed backjumping backtracks directly to the source of the problem.
Local search using the min-conflicts heuristic has also been applied to constraint satisfaction problems with great success.
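A small sketch of min-conflicts applied to n-queens, the example cited above (illustrative implementation, not the one analyzed by Minton et al.):

import random

def min_conflicts_nqueens(n, max_steps=100000):
    """Start from a random complete assignment; repeatedly pick a conflicted queen
    and move it to the column that minimizes its conflicts."""
    cols = [random.randrange(n) for _ in range(n)]   # cols[r] = column of the queen in row r

    def conflicts(row, col):
        return sum(1 for r in range(n) if r != row and
                   (cols[r] == col or abs(cols[r] - col) == abs(r - row)))

    for _ in range(max_steps):
        conflicted = [r for r in range(n) if conflicts(r, cols[r]) > 0]
        if not conflicted:
            return cols                              # a solution
        row = random.choice(conflicted)
        cols[row] = min(range(n), key=lambda c: conflicts(row, c))
    return None                                      # no solution found within the step budget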
For forward chaining, backward chaining see >Software agents/Norvig.

1. Newton, I. (1664–1671). Methodus fluxionum et serierum infinitarum. Unpublished notes
2. Raphson, J. (1690). Analysis aequationum universalis. Apud Abelem Swalle, London.
3. Brent, R. P. (1973). Algorithms for minimization without derivatives. Prentice-Hall
4. Lowerre, B. T. (1976). The HARPY Speech Recognition System. Ph.D. thesis, Computer Science Department, Carnegie-Mellon University.
5. Pearl, J. (1984). Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-
Wesley.
6. Minton, S., Johnston, M. D., Philips, A. B., and Laird, P. (1992). Minimizing conflicts: A heuristic repair method for constraint satisfaction and scheduling problems. AIJ, 58(1–3), 161–205.
7. Selman, B., Levesque, H. J., and Mitchell, D. (1992). A new method for solving hard satisfiability problems. In AAAI-92, pp. 440–446.
8. Glover, F. and Laguna, M. (Eds.). (1997). Tabu search. Kluwer
9. Boyan, J. A. and Moore, A. W. (1998). Learning evaluation functions for global optimization and Boolean satisfiability. In AAAI-98

Russell I
B. Russell/A.N. Whitehead
Principia Mathematica Frankfurt 1986

Russell II
B. Russell
The ABC of Relativity, London 1958, 1969
German Edition:
Das ABC der Relativitätstheorie Frankfurt 1989

Russell IV
B. Russell
The Problems of Philosophy, Oxford 1912
German Edition:
Probleme der Philosophie Frankfurt 1967

Russell VI
B. Russell
"The Philosophy of Logical Atomism", in: B. Russell, Logic and KNowledge, ed. R. Ch. Marsh, London 1956, pp. 200-202
German Edition:
Die Philosophie des logischen Atomismus
In
Eigennamen, U. Wolf (Hg) Frankfurt 1993

Russell VII
B. Russell
On the Nature of Truth and Falsehood, in: B. Russell, The Problems of Philosophy, Oxford 1912 - Dt. "Wahrheit und Falschheit"
In
Wahrheitstheorien, G. Skirbekk (Hg) Frankfurt 1996


Norvig I
Peter Norvig
Stuart J. Russell
Artificial Intelligence: A Modern Approach Upper Saddle River, NJ 2010
Similarity Metrics Bigelow I 129
Counterfactual Conditional/Valuation/Valuation Function/Valuation Rules/Bigelow/Pargetter:
V9 If a = (ß would be > would be γ) then V(a) is the set of all possible worlds w ∈ W such that there is a possible world u in which ß is true and γ is true, and every possible world v in which ß is true and γ is false is less accessible from w than u.

>Similarity metrics.
Similarity/possible worlds/similarity metrics/counterfactual conditional/Bigelow/Pargetter: rule V9 states that a counterfactual conditional (ß would be > would be γ) is true in a possible world if the closest ß-worlds are all γ-worlds (a schematic sketch follows the example below).
>Counterfactual conditional.
For example, Violet says: "If I were a blackbird, I would sing." This is true in the actual world because the closest worlds in which Violet is a blackbird are possible worlds in which she sings.
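A schematic Python rendering of rule V9, assuming a finite set of worlds and a numeric distance(w, x) standing in for relative accessibility from w (an illustrative simplification, not Bigelow and Pargetter's own formulation):

def counterfactual_true_at(w, worlds, distance, beta, gamma):
    """(beta would-be > would-be gamma) is true at w iff there is a beta-and-gamma
    world u such that every beta-and-not-gamma world is farther from w than u."""
    good = [u for u in worlds if beta(u) and gamma(u)]
    bad = [v for v in worlds if beta(v) and not gamma(v)]
    return any(all(distance(w, v) > distance(w, u) for v in bad) for u in good)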
Similarity metrics/Bigelow/Pargetter: the possible worlds in which Violet is a blackbird on the tree or on the mailbox may be at the same distance from the actual world.
>Possible worlds.
Connection/Possible worlds: the question of whether such "connections" can exist is discussed in connection with the conditionally excluded middle (see above).
Complexity/Bigelow/Pargetter: the complexity of V9 is due to the desired generality.
>Generality, >Generalization.
I 130
Resemblance/similarity metrics/counterfactual conditional/proximity/possible worlds/Bigelow/Pargetter: we want a possible world in which the antecedent and the consequent of the counterfactual conditional are both true to be closer than one in which only the antecedent is true and the consequent is false.
I 209
Possible World/Variant/Bigelow/Pargetter: we could also specify individuals by describing their position over the course of their existence, through an infinite sequence of quadruples. There are many variants, including more economical ones.
We can combine all the positions of a particle into one function. This is also possible for other properties that we attribute to a particle. So we can combine a particle not only with numbers, but also with whole functions.
Function: these functions could describe the changes of the particle.
Book/Bigelow/Pargetter: a book for such a described world could be a Hilbert space. But a book is not a world yet! A book for the actual world would consist of
two components:
1. a world property, or a maximum specific structural universal
2. to something that instantiates this universal, that is the world itself.
This applies to the actual world!
Other possible worlds correspond to a universal, but this is not instantiated, so there is no world here.
>Universals, >Instantiation.
Representation/Bigelow/Pargetter: now the numbers representing these world properties could seem all too abstract.
I 210
But they are not! They represent the proportions in which the properties of the parts that we have chosen as units are related to each other. Crossworld-relations/world properties/property theory/Bigelow/Pargetter: now it seems as if our theory is making a surprising turn: it seems to provide a measure for the distance between possible worlds that we have been unable to gain so far. And that measure would not be arbitrary!
>Cross world identity.
Accessibility: could we get it under control with this?
>Accessibility.
If the possible worlds contain the same individuals, it is even easy to construct a similarity metrics for them.
If the individuals are different, it is more difficult.

Big I
J. Bigelow, R. Pargetter
Science and Necessity Cambridge 1990

Terminology Norvig Norvig I 8
Terminology/Russell/Norvig: Although decidability and computability are important to an understanding of computation, the notion of tractability has had an even greater impact. Roughly speaking, a problem is called intractable if the time required to solve instances of the problem grows exponentially with the size of the instances. The distinction between polynomial and exponential growth in complexity was first emphasized in the mid-1960s (Cobham, 1964(1); Edmonds, 1965(2)). It is important because exponential growth means that even moderately large instances cannot be solved in any reasonable time.
Norvig I 106
Pattern databases: The idea behind them is to store these exact solution costs for every possible subproblem instance (…)
Norvig I 108
Def Problem: A problem consists of five parts: the initial state, a set of actions, a transition model describing the results of those actions, a goal test function, and a path cost function. The environment of the problem is represented by a state space. A path through the state space from the initial state to a goal state is a solution.
Norvig I 135
And/or nodes/search trees: Or: In a deterministic environment, the only branching is introduced by the agent's own choices in each state. We call these nodes OR nodes. And: In a nondeterministic environment, branching is also introduced by the environment's choice of outcome for each action. We call these nodes AND nodes. A solution for an AND–OR search problem is a subtree that (1) has a goal node at every leaf, (2) specifies one action at each of its OR nodes, and (3) includes every outcome branch at each of its AND nodes.
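A depth-first sketch of such an AND-OR search (illustrative; states are assumed hashable, and actions(s) and results(s, a) - the set of possible outcomes of an action - are placeholder interfaces):

def and_or_search(initial, is_goal, actions, results):
    """Return a conditional plan: one action per OR node and a sub-plan for every
    outcome at each AND node, or None if no such plan exists."""
    def or_search(state, path):
        if is_goal(state):
            return []
        if state in path:                 # avoid cycles
            return None
        for action in actions(state):
            plan = and_search(results(state, action), path + [state])
            if plan is not None:
                return [action, plan]
        return None

    def and_search(states, path):
        outcome_plans = {}
        for s in states:                  # every outcome branch must lead to a goal
            plan = or_search(s, path)
            if plan is None:
                return None
            outcome_plans[s] = plan
        return outcome_plans

    return or_search(initial, [])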
Norvig I 148
Competitive ratio: Typically, the agent’s objective is to reach a goal state while minimizing cost. (Another possible objective is simply to explore the entire environment.) The cost is the total path cost of the path that the agent actually travels. It is common to compare this cost with the path cost of the path the agent would follow if it knew the search space in advance—that is, the actual shortest path (or shortest complete exploration). In the language of online algorithms, this is called the competitive ratio; we would like it to be as small as possible. >Online search/Norvig.
Norvig I 162
Def Pruning: Pruning allows us to ignore portions of the search tree that make no difference to the final choice. Def Heuristic evaluation functions: allow us to approximate the true utility of a state without doing a complete search.
Def utility function: (also called an objective function or payoff function), defines the final numeric value for a game that ends in terminal state s for a player p. In chess, the outcome is a win, loss, or draw, with values +1, 0, or ½. Some games have a wider variety of possible outcomes; the payoffs in backgammon range from 0 to +192.
Def Zero-sum game: is (confusingly) defined as one where the total payoff to all players is the same for every instance of the game. Chess is zero-sum. “Constant-sum” would have been a better term, but zero-sum is traditional and makes sense if you imagine each player is charged an entry fee of 1/2 .
Norvig I 165
Minimax Algorithm/gaming: The minimax algorithm (…) computes the minimax decision from the current state. It uses a simple recursive computation of the minimax values of each successor state, directly implementing the defining equations. The recursion proceeds all the way down to the leaves of the tree, and then the minimax values are backed up through the tree as the recursion unwinds. The minimax algorithm performs a complete depth-first exploration of the game tree. For real games, of course, the time cost is totally impractical, but this algorithm serves as the basis for the mathematical analysis of games and for more practical algorithms.
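A compact sketch of the recursion (illustrative; the game object with to_move, actions, result, terminal_test and utility is an assumed interface in the spirit of the book's pseudocode, not its code):

def minimax_decision(state, game):
    """Pick the action whose worst-case (minimax) value is highest for the player to move."""
    player = game.to_move(state)

    def max_value(s):
        if game.terminal_test(s):
            return game.utility(s, player)
        return max(min_value(game.result(s, a)) for a in game.actions(s))

    def min_value(s):
        if game.terminal_test(s):
            return game.utility(s, player)
        return min(max_value(game.result(s, a)) for a in game.actions(s))

    return max(game.actions(state), key=lambda a: min_value(game.result(state, a)))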
Norvig I 208
Def node consistency: A single variable (corresponding to a node in the CSP network) is node-consistent if all the values in the variable’s domain satisfy the variable’s unary constraints. Def arc consistency: A variable in a CSP is arc-consistent if every value in its domain satisfies the variable’s binary constraints. More formally, Xi is arc-consistent with respect to another variable Xj if for every value in the current domain Di there is some value in the domain Dj that satisfies the binary constraint on the arc (Xi,Xj). >Constraint Satisfaction Problems/CSP/Norvig.
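A sketch of enforcing arc consistency with an AC-3-style propagation loop (illustrative; the domains dict, neighbours map, and the binary constraint function are assumed interfaces):

from collections import deque

def revise(domains, xi, xj, constraint):
    """Remove from xi's domain every value with no supporting value in xj's domain."""
    revised = False
    for x in list(domains[xi]):
        if not any(constraint(xi, xj, x, y) for y in domains[xj]):
            domains[xi].remove(x)
            revised = True
    return revised

def ac3(domains, neighbours, constraint):
    """Propagate until every arc is consistent; False means some domain became empty."""
    queue = deque((xi, xj) for xi in domains for xj in neighbours[xi])
    while queue:
        xi, xj = queue.popleft()
        if revise(domains, xi, xj, constraint):
            if not domains[xi]:
                return False
            for xk in neighbours[xi]:
                if xk != xj:
                    queue.append((xk, xi))
    return True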
Norvig I 210
Def Path consistency: Arc consistency tightens down the domains (unary constraints) using the arcs (binary constraints). To make progress on problems like map coloring, we need a stronger notion of consistency. Path consistency tightens the binary constraints by using implicit constraints that are inferred by looking at triples of variables.
Norvig I 211
Def K-consistency: Stronger forms of propagation can be defined with the notion of k-consistency. A CSP is k-consistent if, for any set of k − 1 variables and for any consistent assignment to those variables, a consistent value can always be assigned to any kth variable. For forward chaining, backward chaining: see >Software agents/Norvig.
Norvig I 266
Propositions: The idea of associating propositions with time steps extends to any aspect of the world that changes over time. Fluent: We use the word fluent (from the Latin for "flowing") to refer to an aspect of the world that changes. "Fluent" is a synonym for "state variable".

Norvig I 346
Skolemize: Skolemization is the process of removing existential quantifiers by elimination. In the simple case, it is just like the Existential Instantiation rule (…): translate ∃x P(x) into P(A), where A is a new constant.
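For illustration (an example added here, not from the source): when the existential quantifier lies inside a universal one, the new symbol must be a Skolem function of the universally quantified variables rather than a constant; ∀x ∃y Loves(x, y) becomes ∀x Loves(x, F(x)), where F is a new function symbol.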
Norvig I 410
Nondeterministic action: The programming languages community has coined the term demonic nondeterminism for the case where an adversary makes the choices, contrasting this with angelic nonde-
Norvig I 411
terminism, where the agent itself makes the choices. We borrow this term to define angelic semantics for HLA descriptions.
Norvig I 468
Closed-world assumption: as implemented in logic programs, provides a simple way to avoid having to specify lots of negative information. It is best interpreted as a default that can be overridden by additional information.

1. Cobham, A. (1964). The intrinsic computational difficulty of functions. In Proc. 1964 International
Congress for Logic, Methodology, and Philosophy of Science, pp. 24–30.
2. Edmonds, J. (1965). Paths, trees, and flowers. Canadian Journal of Mathematics, 17, 449–467.

Norvig I
Peter Norvig
Stuart J. Russell
Artificial Intelligence: A Modern Approach Upper Saddle River, NJ 2010

Terminology Russell Norvig I 8
Terminology/Russell/Norvig: Although decidability and computability are important to an understanding of computation, the notion of tractability has had an even greater impact. Roughly speaking, a problem is called intractable if the time required to solve instances of the problem grows exponentially with the size of the instances. The distinction between polynomial and exponential growth in complexity was first emphasized in the mid-1960s (Cobham, 1964(1); Edmonds, 1965(2)). It is important because exponential growth means that even moderately large instances cannot be solved in any reasonable time.
Norvig I 106
Pattern databases: The idea behind them is to store these exact solution costs for every possible subproblem instance (…)
Norvig I 108
Def Problem: A problem consists of five parts: the initial state, a set of actions, a transition model describing the results of those actions, a goal test function, and a path cost function. The environment of the problem is represented by a state space. A path through the state space from the initial state to a goal state is a solution.
Norvig I 135
And/or nodes/search trees: Or: In a deterministic environment, the only branching is introduced by the agent's own choices in each state. We call these nodes OR nodes. And: In a nondeterministic environment, branching is also introduced by the environment's choice of outcome for each action. We call these nodes AND nodes. A solution for an AND–OR search problem is a subtree that
(1) has a goal node at every leaf,
(2) specifies one action at each of its OR nodes, and
(3) includes every outcome branch at each of its AND nodes.
Norvig I 148
Competitive ratio: Typically, the agent’s objective is to reach a goal state while minimizing cost. (Another possible objective is simply to explore the entire environment.) The cost is the total path cost of the path that the agent actually travels. It is common to compare this cost with the path cost of the path the agent would follow if it knew the search space in advance—that is, the actual shortest path (or shortest complete exploration). In the language of online algorithms, this is called the competitive ratio; we would like it to be as small as possible. >Online search/Norvig.
Norvig I 162
Def Pruning: Pruning allows us to ignore portions of the search tree that make no difference to the final choice. Def Heuristic evaluation functions: allow us to approximate the true utility of a state without doing a complete search.
Def utility function: (also called an objective function or payoff function), defines the final numeric value for a game that ends in terminal state s for a player p. In chess, the outcome is a win, loss, or draw, with values +1, 0, or ½. Some games have a wider variety of possible outcomes; the payoffs in backgammon range from 0 to +192.
Def Zero-sum game: is (confusingly) defined as one where the total payoff to all players is the same for every instance of the game. Chess is zero-sum. “Constant-sum” would have been a better term, but zero-sum is traditional and makes sense if you imagine each player is charged an entry fee of 1/2 .
Norvig I 165
Minimax Algorithm/gaming: The minimax algorithm (…) computes the minimax decision from the current state. It uses a simple recursive computation of the minimax values of each successor state, directly implementing the defining equations. The recursion proceeds all the way down to the leaves of the tree, and then the minimax values are backed up through the tree as the recursion unwinds. The minimax algorithm performs a complete depth-first exploration of the game tree. For real games, of course, the time cost is totally impractical, but this algorithm serves as the basis for the mathematical analysis of games and for more practical algorithms.
Norvig I 208
Def node consistency: A single variable (corresponding to a node in the CSP network) is node-consistent if all the values in the variable’s domain satisfy the variable’s unary constraints. Def arc consistency: A variable in a CSP is arc-consistent if every value in its domain satisfies the variable’s binary constraints. More formally, Xi is arc-consistent with respect to another variable Xj if for every value in the current domain Di there is some value in the domain Dj that satisfies the binary constraint on the arc (Xi,Xj). >Constraint Satisfaction Problems/CSP/Norvig.
Norvig I 210
Def Path consistency: Arc consistency tightens down the domains (unary constraints) using the arcs (binary constraints). To make progress on problems like map coloring, we need a stronger notion of consistency. Path consistency tightens the binary constraints by using implicit constraints that are inferred by looking at triples of variables.
Norvig I 211
Def K-consistency: Stronger forms of propagation can be defined with the notion of k-consistency. A CSP is k-consistent if, for any set of k − 1 variables and for any consistent assignment to those variables, a consistent value can always be assigned to any kth variable. For forward chaining, backward chaining: see >Software agents/Norvig.
Norvig I 266
Propositions: The idea of associating propositions with time steps extends to any aspect of the world that changes over time. Fluent: We use the word fluent (from the Latin for "flowing") to refer to an aspect of the world that changes. "Fluent" is a synonym for "state variable".

Norvig I 346
Skolemize: Skolemization is the process of removing existential quantifiers by elimination. In the simple case, it is just like the Existential Instantiation rule (…): translate ∃x P(x) into P(A), where A is a new constant.
Norvig I 410
Nondeterministic action: The programming languages community has coined the term demonic nondeterminism for the case where an adversary makes the choices, contrasting this with angelic nonde-
Norvig I 411
terminism, where the agent itself makes the choices. We borrow this term to define angelic semantics for HLA descriptions.
Norvig I 468
Closed-world assumption: as implemented in logic programs, provides a simple way to avoid having to specify lots of negative information. It is best interpreted as a default that can be overridden by additional information.
1. Cobham, A. (1964). The intrinsic computational difficulty of functions. In Proc. 1964 International
Congress for Logic, Methodology, and Philosophy of Science, pp. 24–30.
2. Edmonds, J. (1965). Paths, trees, and flowers. Canadian Journal of Mathematics, 17, 449–467.

Russell I
B. Russell/A.N. Whitehead
Principia Mathematica Frankfurt 1986

Russell II
B. Russell
The ABC of Relativity, London 1958, 1969
German Edition:
Das ABC der Relativitätstheorie Frankfurt 1989

Russell IV
B. Russell
The Problems of Philosophy, Oxford 1912
German Edition:
Probleme der Philosophie Frankfurt 1967

Russell VI
B. Russell
"The Philosophy of Logical Atomism", in: B. Russell, Logic and KNowledge, ed. R. Ch. Marsh, London 1956, pp. 200-202
German Edition:
Die Philosophie des logischen Atomismus
In
Eigennamen, U. Wolf (Hg) Frankfurt 1993

Russell VII
B. Russell
On the Nature of Truth and Falsehood, in: B. Russell, The Problems of Philosophy, Oxford 1912 - Dt. "Wahrheit und Falschheit"
In
Wahrheitstheorien, G. Skirbekk (Hg) Frankfurt 1996


Norvig I
Peter Norvig
Stuart J. Russell
Artificial Intelligence: A Modern Approach Upper Saddle River, NJ 2010
Valuation Bigelow I 125
Valuation function V/Bigelow/Pargetter: its definition is complex because it has to be recursive. It assigns an interpretation or a semantic value to each expression of the language. >Recursion.
Valuation: First, semantic values are assigned to the non-logical constants.
>Semantic value.
Rules are then created for semantic values from compound expressions.
Logical constants: their valuation is specified by recursive rules.
>Logical constants.
Domain: can also be restricted, e.g. if you want to exclude the Barcan formula.
>Domains.
For example, restriction: for each world w you can assume a separate individual domain Dw, which, for example, consists only of the possibilia of this possible world.
>Possibilia, >Possible worlds.
I 126
Def partition/Bigelow/Pargetter: a family of individual domains that do not overlap, i.e. no individual is in more than one possible world. That would correspond to Lewis's counterpart theory. >Counterpart theory.
I 129
Counterfactual Conditional/Valuation/Valuation Function/Valuation Rules/Bigelow/Pargetter:
V9 If a = (ß would be > would be γ) then V(a) is the set of all possible worlds w ∈ W such that there is a possible world u in which ß is true and γ is true, and every possible world v in which ß is true and γ is false is less accessible from w than u.

>Similarity metrics, >Counterfactual conditional.
Similarity/possible worlds/similarity metrics/counterfactual conditional/Bigelow/Pargetter: Rule V9 states that a counterfactual conditional (ß would be > would be γ) is true in a possible world if the closest ß-worlds are all γ-worlds.

Big I
J. Bigelow, R. Pargetter
Science and Necessity Cambridge 1990

Values Bostrom I 226
Values/superintelligence/software-agents/Bostrom: While the agent is unintelligent, it might lack the capability to understand or even represent any humanly meaningful value. Problem: It is impossible to enumerate all possible situations a superintelligence might find itself in and to specify for each what action it should take. Similarly, it is impossible to create a list of all possible worlds and assign each of them a value.
Motivation: A motivation system, therefore, cannot be specified as a comprehensive lookup table. It must instead be expressed more abstractly, as a formula or rule that allows the agent to decide what to do in any given situation. ((s) Cf. the philosophical discussion of principles against content: >Principles, >Utilitarianism, >Deontology.)
I 227
Utility: Creating a machine that can compute a good approximation of the expected utility of the actions available to it is an AI-complete problem (…) - a problem that remains even if the problem of making machines intelligent is solved. We can use this framework of a utility-maximizing agent to consider the predicament of a future seed-AI programmer who intends to solve the control problem by endowing the AI with a final goal that corresponds to some plausible human notion of a worthwhile outcome. E.g., the programmer has some particular human value in mind that he would like the AI to promote. (…) let us say that it is happiness. But how could he express such a utility function in computer code? Computer languages do not contain terms such as "happiness" as primitives.
I 228
If we cannot transfer human values into an AI by typing out full-blown representations in computer code, what else might we try?
I 230
[Possible methods for acquiring values]: - Reinforcement learning: Often, the learning algorithm involves the gradual construction of some kind of evaluation function, which assigns values to states, state–action pairs, or policies.
Problem: The evaluation function, which is continuously updated in light of experience, could be regarded as incorporating a form of learning about value. However, what is being learned is not new final values but increasingly accurate estimates of the instrumental values of reaching particular states (or of taking particular actions in particular states, or of following particular policies). Insofar as a reinforcement-learning agent can be described as having a final goal, that goal remains constant: to maximize future reward. And reward consists of specially designated percepts received from the environment. Therefore, the wireheading syndrome remains a likely outcome in any reinforcement agent that develops a world model sophisticated enough to suggest this alternative way of maximizing reward.
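A minimal sketch of the kind of evaluation-function update being described (a standard temporal-difference rule, offered here only as an illustration; V is an assumed table of state-value estimates):

def td_update(V, state, reward, next_state, alpha=0.1, discount=0.99):
    """Nudge the value estimate for `state` toward the observed reward plus the
    discounted estimate of the successor; only the estimates change, not the
    underlying goal of maximizing future reward."""
    target = reward + discount * V.get(next_state, 0.0)
    V[state] = V.get(state, 0.0) + alpha * (target - V.get(state, 0.0))
    return V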
I 233
- Motivational scaffolding: It involves giving the seed AI an interim goal system, with relatively simple final goals that we can represent by means of explicit coding or some other feasible method. Once the AI has developed more sophisticated representational faculties, we replace this interim scaffold goal system with one that has different final goals. Problem: Because the scaffold goals are not just instrumental but final goals for the AI, the AI might be expected to resist having them replaced (goal-content integrity being a convergent instrumental value). This creates a hazard. If the AI succeeds in thwarting the replacement of its scaffold goals, the method fails.
I 234
Further problems: (1) The motivational scaffolding (…) carries the risk that the AI could become too powerful while it is still running on its interim goal system. (2) Installing the ultimately intended goals in a human-level AI is not necessarily that much easier than doing so in a more primitive AI.
I 235
-Value learning: [in order to use the] AI's intelligence to learn the values (…), we must provide a criterion for the AI that at least implicitly picks out some suitable set of values. (…) the value learning approach retains an unchanging final goal throughout the AI's developmental and operational phases. Learning does not change the goal. It changes only the AI's beliefs about the goal. Criteria: The AI thus must be endowed with a criterion that it can use to determine which percepts constitute evidence in favor of some hypothesis about what the ultimate goal is, and which percepts constitute evidence against.
Problem: creating artificial general intelligence in the first place, which requires a powerful learning mechanism that can discover the structure of the environment from limited sensory inputs.
I 240
Understanding/motivation: (…) the difficulty here is not so much how to ensure that the AI can understand human intentions. A superintelligence should easily develop such understanding. Rather, the difficulty is ensuring that the AI will be motivated to pursue the described values in the way we intended. This is not guaranteed by the AI’s ability to understand our intentions: an AI could know exactly what we meant and yet be indifferent to that interpretation of our words (being motivated instead by some other interpretation of the words or being indifferent to our words altogether).
Solution: the correct motivation should ideally be installed in the seed AI before it becomes capable of fully representing human concepts or understanding human intentions.
I 253
[Further] value-loading techniques: - Evolutionary selection: Powerful search may find a design that satisfies the formal search criteria but not our intentions.
- Value accretion: (…) the human value-accretion dispositions might be complex and difficult to replicate in a seed AI.
Problem: A bad approximation may yield an AI that generalizes differently than humans do and therefore acquires unintended final goals.
I 254
- Motivational scaffolding: encourage a system to develop internal high-level representations that are transparent to humans (while keeping the system’s capabilities below the dangerous level) and then to use those representations to design a new goal system. - Emulation modulation: If machine intelligence is achieved via the emulation pathway, it would likely be possible to tweak motivations through the digital equivalent of drugs or by other means.
- Institution design: Various strong methods of social control could be applied in an institution composed of emulations. In principle, social control methods could also be applied in an institution composed of artificial intelligences. >Ethics/superintelligence/Bostrom, >Ethics/superintelligence/Yudkowsky, >Norms/Bostrom.

Bostrom I
Nick Bostrom
Superintelligence. Paths, Dangers, Strategies Oxford: Oxford University Press 2017


The author or concept searched is found in the following controversies.
Disputed term/author/ism Author Vs Author
Entry
Reference
Chomsky, N. Putnam Vs Chomsky, N. Chomsky I 293
PutnamVsChomsky: Putnam assumes that, for phonetics, the universal grammar only needs a single list of sounds. This would not require a sophisticated explanatory hypothesis, only "memory span and powers of recollection". "No upright behaviorist would deny that these are innate properties." ChomskyVsPutnam: but very strong empirical hypotheses have been set up about the selection of the universal distinctive features, none of which seems to be explainable on the basis of memory restrictions.
Chomsky I 298
PutnamVsChomsky: Thesis: instead of an innate schematism, "general multipurpose strategies" could be assumed. This innate base would have to be the same for the acquisition of any knowledge, so that there is nothing special about language acquisition.
Chomsky I 299
ChomskyVsPutnam: with that he is no longer entitled to assume something is innate. Furthermore, it only shifts the problem. PutnamVsChomsky: the evaluation functions proposed in universal grammar constitute "the kind of fact which the theory of learning tries to explain, but not the required explanation itself".
ChomskyVsPutnam: E.g. no one would say that the genetic basis for the development of arms instead of wings was "the kind of fact that the theory of learning attempts to explain". Rather, they are the basis for an explanation of other facts of human behavior.
Whether the evaluation function is learned or is the basis of learning, is an empirical question.
PutnamVsChomsky: certain ambiguities can only be discovered by routine; therefore, their postulated explanation by Chomsky's grammar is not very impressive.
ChomskyVsPutnam: he misunderstands this; in fact, it refers to competence and not to performance (actual practice).
What the grammar explains is why e.g. in "criticism of students" "student" can be understood as subject or object, whereas e.g. "grain" in "the growing of the grain" can only be subject.
The question of routine does not matter here.
Chomsky I 300
Innate Ideas/ChomskyVsPutnam: the innate representation of universal grammar indeed solves the problem of learning (at least partly) if it is really true that this is the basis for language acquisition, which may very well be the case!
Putnam III 87
Putnam/Chomsky: Putnam proposes: correctness in linguistics is what the currently available data best explain about the behavior of the speaker under a current interest. What is true today will be false tomorrow. PutnamVsChomsky: I never said that what is right today will be wrong tomorrow.
Putnam: Chomsky's hidden main theses:
1) that we are free to choose our interests at will,
2) that interests themselves are not subject to normative criticism.
E.g. the explanation of Hans' heart attack lies in his defiance of medical recommendations. Another explanation: high blood pressure. It may be, in fact, that on one day one fact is more in the interest of the speaker, and on the next day another one.
III 88
PutnamVsChomsky: 1) we cannot just pick and choose our interests. 2) It sometimes happens that the relevance of a particular interest is disputed. How can it be, however, that some interests are more reasonable than others? Reasonableness is supposed to depend on different conditions in different contexts. There is no general answer.
III 88/89
The assertion that a concept is interest-relative does not come to the same as the thesis that all interests are equally reasonable.

Putnam I
Hilary Putnam
Von einem Realistischen Standpunkt
In
Von einem realistischen Standpunkt, Vincent C. Müller Frankfurt 1993

Putnam I (a)
Hilary Putnam
Explanation and Reference, In: Glenn Pearce & Patrick Maynard (eds.), Conceptual Change. D. Reidel. pp. 196--214 (1973)
In
Von einem realistischen Standpunkt, Vincent C. Müller Reinbek 1993

Putnam I (b)
Hilary Putnam
Language and Reality, in: Mind, Language and Reality: Philosophical Papers, Volume 2. Cambridge University Press. pp. 272-90 (1995
In
Von einem realistischen Standpunkt, Vincent C. Müller Reinbek 1993

Putnam I (c)
Hilary Putnam
What is Realism? in: Proceedings of the Aristotelian Society 76 (1975):pp. 177 - 194.
In
Von einem realistischen Standpunkt, Vincent C. Müller Reinbek 1993

Putnam I (d)
Hilary Putnam
Models and Reality, Journal of Symbolic Logic 45 (3), 1980:pp. 464-482.
In
Von einem realistischen Standpunkt, Vincent C. Müller Reinbek 1993

Putnam I (e)
Hilary Putnam
Reference and Truth
In
Von einem realistischen Standpunkt, Vincent C. Müller Reinbek 1993

Putnam I (f)
Hilary Putnam
How to Be an Internal Realist and a Transcendental Idealist (at the Same Time) in: R. Haller/W. Grassl (eds): Sprache, Logik und Philosophie, Akten des 4. Internationalen Wittgenstein-Symposiums, 1979
In
Von einem realistischen Standpunkt, Vincent C. Müller Reinbek 1993

Putnam I (g)
Hilary Putnam
Why there isn’t a ready-made world, Synthese 51 (2):205--228 (1982)
In
Von einem realistischen Standpunkt, Vincent C. Müller Reinbek 1993

Putnam I (h)
Hilary Putnam
Pourqui les Philosophes? in: A: Jacob (ed.) L’Encyclopédie PHilosophieque Universelle, Paris 1986
In
Von einem realistischen Standpunkt, Vincent C. Müller Reinbek 1993

Putnam I (i)
Hilary Putnam
Realism with a Human Face, Cambridge/MA 1990
In
Von einem realistischen Standpunkt, Vincent C. Müller Reinbek 1993

Putnam I (k)
Hilary Putnam
"Irrealism and Deconstruction", 6. Giford Lecture, St. Andrews 1990, in: H. Putnam, Renewing Philosophy (The Gifford Lectures), Cambridge/MA 1992, pp. 108-133
In
Von einem realistischen Standpunkt, Vincent C. Müller Reinbek 1993

Putnam II
Hilary Putnam
Representation and Reality, Cambridge/MA 1988
German Edition:
Repräsentation und Realität Frankfurt 1999

Putnam III
Hilary Putnam
Renewing Philosophy (The Gifford Lectures), Cambridge/MA 1992
German Edition:
Für eine Erneuerung der Philosophie Stuttgart 1997

Putnam IV
Hilary Putnam
"Minds and Machines", in: Sidney Hook (ed.) Dimensions of Mind, New York 1960, pp. 138-164
In
Künstliche Intelligenz, Walther Ch. Zimmerli/Stefan Wolf Stuttgart 1994

Putnam V
Hilary Putnam
Reason, Truth and History, Cambridge/MA 1981
German Edition:
Vernunft, Wahrheit und Geschichte Frankfurt 1990

Putnam VI
Hilary Putnam
"Realism and Reason", Proceedings of the American Philosophical Association (1976) pp. 483-98
In
Truth and Meaning, Paul Horwich Aldershot 1994

Putnam VII
Hilary Putnam
"A Defense of Internal Realism" in: James Conant (ed.)Realism with a Human Face, Cambridge/MA 1990 pp. 30-43
In
Theories of Truth, Paul Horwich Aldershot 1994

SocPut I
Robert D. Putnam
Bowling Alone: The Collapse and Revival of American Community New York 2000

Chomsky I
Noam Chomsky
"Linguistics and Philosophy", in: Language and Philosophy, (Ed) Sidney Hook New York 1969 pp. 51-94
In
Linguistik und Philosophie, G. Grewendorf/G. Meggle Frankfurt/M. 1974/1995

Chomsky II
Noam Chomsky
"Some empirical assumptions in modern philosophy of language" in: Philosophy, Science, and Method, Essays in Honor of E. Nagel (Eds. S. Morgenbesser, P. Suppes and M- White) New York 1969, pp. 260-285
In
Linguistik und Philosophie, G. Grewendorf/G. Meggle Frankfurt/M. 1974/1995

Chomsky IV
N. Chomsky
Aspects of the Theory of Syntax, Cambridge/MA 1965
German Edition:
Aspekte der Syntaxtheorie Frankfurt 1978

Chomsky V
N. Chomsky
Language and Mind Cambridge 2006