Philosophy Dictionary of Arguments


AI Research on Goals - Dictionary of Arguments

Bostrom I 126
Goals/superintelligence/AI Research/Bostrom: Is it possible to say anything about what a superintelligence with a decisive
I 127
strategic advantage would want?
I 129
Motivation/intelligence/superintelligent will/orthogonality/Bostrom: Intelligent search for instrumentally optimal plans and policies can be performed in the service of any goal. Intelligence and motivation are in a sense orthogonal: we can think of them as two axes spanning a graph in which each point represents a logically possible artificial agent. Some qualifications could be added to this picture. For instance, it might be impossible for a very unintelligent system to have very complex motivations.
I 130
Def Orthogonality thesis/Bostrom: Intelligence and final goals are orthogonal: more or less any level of intelligence could in principle be combined with more or less any final goal.
According to the orthogonality thesis, artificial agents can have utterly non-anthropomorphic goals.
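The picture of intelligence and motivation as two independent axes can be illustrated with a toy model (a hypothetical sketch for illustration only, not anything from Bostrom's text): agents are represented as (capability, final goal) pairs, and any pairing of the two is constructible.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    """A point in the 'orthogonality' graph: one axis for capability
    (intelligence), one for the final goal pursued. Illustrative only."""
    capability: float                    # arbitrary intelligence level
    final_goal: Callable[[str], float]   # utility function over world states

# Per the orthogonality thesis, any goal can be paired with any capability:
paperclip_utility = lambda world: float(world.count("paperclip"))
human_welfare     = lambda world: float(world.count("flourishing"))

agents = [
    Agent(capability=0.1,  final_goal=paperclip_utility),  # dull maximizer
    Agent(capability=99.0, final_goal=paperclip_utility),  # superintelligent, same goal
    Agent(capability=99.0, final_goal=human_welfare),      # same capability, different goal
]
```

The model also accommodates the qualification Bostrom mentions: one could add a constraint excluding, say, very low capability paired with very complex goal representations.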
-Predictability through design:
I 131
(…) even before an agent has been created we might be able to predict something about its behavior, if we know something about who will build it and what goals they will want it to have.
-Predictability through inheritance: If a digital intelligence is created directly from a human template (as would be the case in a high-fidelity whole brain emulation), then the digital intelligence might inherit the motivations of the human template.
-Predictability through convergent instrumental reasons: (…) we may be able to infer something about its more immediate objectives by considering the instrumental reasons that would arise for any of a wide range of possible final goals in a wide range of situations.
I 132
Def Instrumental convergence thesis/Bostrom: Several instrumental values can be identified which are convergent in the sense that their attainment would increase the chances of the agent’s goal being realized for a wide range of final goals and a wide range of situations, implying that these instrumental values are likely to be pursued by a broad spectrum of situated intelligent agents. >Goals/Omohundro.
Where there are convergent instrumental values, we may be able to predict some aspects of a superintelligence’s behavior:
-Self-preservation: Most humans seem to place some final value on their own survival. This is not a necessary feature of artificial agents: some may be designed to place no final value whatever on their own survival.
-Goal-content integrity: If an agent retains its present goals into the future, then its present goals will be more likely to be achieved by its future self. This gives the agent a present instrumental reason to
I 133
prevent alterations of its final goals. For software agents, which can easily switch bodies or create exact duplicates of themselves, preservation of self as a particular implementation or a particular physical object need not be an important instrumental value. Advanced software agents might also be able to swap memories, download skills, and radically modify their cognitive architecture and personalities.
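The instrumental convergence argument can be made concrete with a toy calculation (a hypothetical sketch, not from the source): whatever the final goal's content, a destroyed agent attains none of it, so agents with very different final goals converge on preferring plans with higher survival probability.

```python
# Toy illustration of instrumental convergence: self-preservation is
# instrumentally valuable under a wide range of final goals, because
# expected goal attainment scales with the chance of surviving to pursue it.

def expected_value(plan, final_goal_value):
    """Expected attainment of the final goal under a given plan."""
    return plan["survival_prob"] * final_goal_value

plans = [
    {"name": "reckless",        "survival_prob": 0.2},
    {"name": "self-preserving", "survival_prob": 0.9},
]

# Two agents with very different final goals (here, different goal values)
# both rank the self-preserving plan highest:
for goal_value in (10.0, 1000.0):
    best = max(plans, key=lambda p: expected_value(p, goal_value))
    assert best["name"] == "self-preserving"
```

The same structure applies to goal-content integrity: replacing `survival_prob` with the probability that the agent's future self still pursues the present goal yields the same convergence.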
I 141
Orthogonality thesis/Bostrom: (see above) the orthogonality thesis suggests that we cannot blithely assume that a superintelligence will necessarily share any of the final values stereotypically associated with wisdom and intellectual development in humans (…).
I 270
Goals/ethics/morality/superintelligence/Bostrom: Consider, for example, the following “reasons-based” goal:
Do whatever we would have had most reason to ask the AI to do.
((s)VsBostrom: Here it is assumed that the AI has no reason to falsify our intentions.)
I 272
Bostrom: Components of the design choices that govern an AI's behavior:
-Goal content: What objective should the AI pursue? How should a description of this objective be interpreted?
-Decision theory: Should the AI use causal decision theory, evidential decision theory, updateless decision theory, or something else?
-Epistemology: What should the AI’s prior probability function be (…). What theory of anthropics should it use?
-Ratification: Should the AI’s plans be subjected to human review before being put into effect? If so, what is the protocol for that review process?
>Ethics/superintelligence/Bostrom, >Ethics/superintelligence/Yudkowsky, >Norms/Bostrom.
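The four components above can be read as a configuration space for AI design. The following sketch (hypothetical; all field names are illustrative, not Bostrom's) records one choice per component.

```python
# Hypothetical sketch: Bostrom's four design-choice components expressed
# as a configuration record. Field names and values are illustrative.

from dataclasses import dataclass

@dataclass
class AIDesignChoices:
    goal_content: str            # what objective, and how it is interpreted
    decision_theory: str         # "causal", "evidential", "updateless", ...
    epistemology: str            # prior probability function, anthropics
    requires_ratification: bool  # human review before plans take effect

design = AIDesignChoices(
    goal_content="do whatever we would have had most reason to ask",
    decision_theory="causal",
    epistemology="unspecified prior",  # placeholder; an open question in the text
    requires_ratification=True,
)
```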

Explanation of symbols: Roman numerals indicate the source, Arabic numerals indicate the page number. The corresponding books are indicated on the right-hand side. ((s)…): Comment by the sender of the contribution. Translations: Dictionary of Arguments
The notes [Concept/Author], [Author1]Vs[Author2], or [Author]Vs[term], as well as "problem:"/"solution:", "old:"/"new:", and "thesis:", are additions by the Dictionary of Arguments. If a German edition is specified, the page numbers refer to that edition.
AI Research
Bostrom I
Nick Bostrom
Superintelligence: Paths, Dangers, Strategies. Oxford: Oxford University Press 2017



Ed. Martin Schulz, access date 2022-01-18