Ethics for Machines
J. Storrs Hall

Stick-Built AI
● No existing AI is intelligent
● Intelligence implies the ability to learn
● Existing AIs are really "artificial skills"
● A human Grandmaster, for example, will have learned the skill of chess-playing
● It's the learning that's the intelligent part
● Providing ethical constraints to stick-built AI is just a matter of competent design

Autogenous AI
● A truly intelligent AI would learn and grow
● It would create new concepts and understand the world in new ways
● This is a problem for the ethical engineer: we cannot know what concepts the AI will have, so we cannot write rules in terms of them

The Rational Architecture
● The WM (world model) predicts the results of actions
● The UF (utility function) evaluates possible worlds
● The AI evaluates the effects of its possible actions and does whatever it predicts will have the best results (a minimal code sketch appears after the deck)
● Outside of very simplified worlds (e.g. chess), this is an ideal rather than a practical recipe

Learning Rational AI
● The WM is updated to describe the world in terms of new concepts
● WM changes can be evaluated by how well they predict (see the second sketch below)
● But on what basis can we update the UF?

Vernor Vinge: "A mind that stays at the same capacity cannot live forever; after a few thousand years it would look more like a repeating tape loop than a person. ... To live indefinitely long, the mind itself must grow ... and when it becomes great enough, and looks back ... what fellow-feeling can it have with the soul that it was originally?"

Invariants
● We must find properties that are invariant across the evolutionary process
● And base our moral designs on those

A Structural Invariant
● Maintaining the grasp, range, and validity of the WM is a necessary subgoal of virtually anything else the AI might want to do
● As Socrates put it: "There is only one good, namely, knowledge; and only one evil, namely, ignorance."

Evolving AI
● Current AI only evolves like any engineered artifact: the better it works, the more likely the design is to be copied in the next generation
● Once AIs have a hand in creating new AIs themselves, there will be a strong force toward self-interest

The Moral Ladder
● Axelrod's "Evolution of Cooperation"
● Subsequent research expanding it (the strategies below are simulated in the tournament sketch after the deck):
● ALWAYS DEFECT is optimal in a random environment
● GRIM is optimal in an environment of 2-state strategies
● TIT-FOR-TAT is optimal in an environment of human-written strategies
● PAVLOV is optimal in an environment cleared out by TIT-FOR-TAT
● etc.

Open Source Honesty
● Intelligent autonomous agents are always better off if they can cooperate
● Even purely self-interested ones
● Ascending the moral evolutionary ladder requires finding others one can trust
● AIs might be able to create protocols that would guarantee their motives
● E.g. a public-key-signed release of the UF (sketched in code below)

The Horizon Effect
● Short planning horizons produce suboptimal behavior (see the final sketch below)
● A planning horizon commensurate with the AI's predictive ability is evolutionarily stable
● What goes around comes around, especially in an environment of superintelligent AIs
● Honesty really is the best policy

Invariant Traits
● Curious, i.e. strongly motivated to increase its understanding of the world
● Self-interested
● Understands the evolutionary dynamics of the moral ladder
● Capable of guaranteeable trustworthiness
● Long planning horizon

The Moral Machine
● If we start an AI with these traits, they are unlikely to disappear in their essence, even if their details change beyond current recognition
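To make the rational architecture concrete, here is a minimal sketch of the WM/UF decision loop: predict each candidate action's outcome with the world model, score the predicted world with the utility function, and act on the best score. The toy state, `world_model`, `utility`, and action set are illustrative assumptions, not part of any real system.

```python
# Minimal sketch of the rational architecture: the agent consults its
# world model (WM) to predict the outcome of each candidate action,
# scores the predicted world with its utility function (UF), and picks
# the action with the best predicted result. All names are toy stand-ins.

def world_model(state, action):
    """WM: predict the world that results from taking `action` in `state`."""
    predicted = dict(state)                  # toy prediction: copy and update
    predicted["position"] = state["position"] + action
    return predicted

def utility(world):
    """UF: score how desirable a (predicted) world is."""
    return -abs(world["position"] - world["goal"])  # closer to goal = better

def choose_action(state, actions):
    """Evaluate every candidate action through WM + UF; act on the best."""
    return max(actions, key=lambda a: utility(world_model(state, a)))

state = {"position": 3, "goal": 7}
print(choose_action(state, actions=[-1, 0, 1]))     # -> 1, moves toward goal
```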
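The "WM changes can be evaluated by how well they predict" point can be sketched the same way: score a candidate world model by its prediction error on held-out observations and keep the better predictor. The models and data here are toy assumptions; the slide's point is that no analogous test exists for the UF.

```python
# Sketch of evaluating a candidate WM update by predictive accuracy.
# The two models and the observations are illustrative stand-ins.

observations = [(0, 0.1), (1, 1.9), (2, 4.2), (3, 5.8)]  # (input, observed outcome)

def old_wm(x):
    return x            # old concept: outcome grows with slope 1

def new_wm(x):
    return 2 * x        # candidate concept: outcome grows with slope 2

def prediction_error(wm, data):
    """Mean squared error of the model's predictions on held-out data."""
    return sum((wm(x) - y) ** 2 for x, y in data) / len(data)

# Keep whichever model predicts better; no comparable test tells us
# whether a *utility function* change is an improvement.
best = min([old_wm, new_wm], key=lambda wm: prediction_error(wm, observations))
print(best.__name__)    # -> new_wm
```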
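The moral-ladder results come from iterated prisoner's dilemma tournaments, and the four strategies named on that slide are easy to sketch. The payoffs below are the standard Axelrod values (temptation 5, reward 3, punishment 1, sucker 0); the strategy implementations are minimal illustrative renditions, not Axelrod's original code.

```python
# Iterated prisoner's dilemma sketch for the "moral ladder" slide.
# PAYOFF maps (my move, their move) -> (my score, their score).

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def always_defect(mine, theirs):
    return "D"

def grim(mine, theirs):
    return "D" if "D" in theirs else "C"     # defect forever after one betrayal

def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"     # copy the opponent's last move

def pavlov(mine, theirs):
    # Win-stay, lose-shift: cooperate iff both made the same move last round.
    if not mine:
        return "C"
    return "C" if mine[-1] == theirs[-1] else "D"

def play(s1, s2, rounds=200):
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        score1 += p1; score2 += p2
    return score1, score2

print(play(tit_for_tat, always_defect))   # TFT loses one round, then mirrors
print(play(pavlov, tit_for_tat))          # mutual cooperation throughout
```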
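The "public-key signed release of the UF" idea can be sketched with ordinary digital signatures; this version assumes Ed25519 from the third-party `cryptography` package, and the serialized UF is a placeholder for whatever format a real protocol would define. Note that a signature only proves provenance; guaranteeing that the published UF is the one the agent actually runs would take additional machinery, such as open inspection of the agent itself.

```python
# Sketch of a public-key-signed UF release: the AI publishes its utility
# function's source with a signature, so others can verify the release
# came from the holder of the key. The UF text is a toy placeholder.

from cryptography.hazmat.primitives.asymmetric import ed25519
from cryptography.exceptions import InvalidSignature

uf_source = b"def utility(world): return world['knowledge']"  # placeholder UF

# The AI signs its published UF with its long-term private key...
private_key = ed25519.Ed25519PrivateKey.generate()
signature = private_key.sign(uf_source)

# ...and any other agent can verify the release against the public key.
public_key = private_key.public_key()
try:
    public_key.verify(signature, uf_source)
    print("UF release verified: signed by the claimed agent")
except InvalidSignature:
    print("UF release rejected: signature does not match")
```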
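Finally, the horizon effect shows up in a toy calculation: against a GRIM-style partner (cooperates until betrayed, then defects forever), defecting looks optimal over a one-round horizon but loses over longer ones. The payoff values are the standard ones from the tournament sketch above; the scenario itself is an illustrative assumption.

```python
# Toy horizon-effect computation: compare "defect now" vs "keep
# cooperating" against a GRIM partner over different planning horizons.
# T = temptation, R = mutual reward, P = mutual punishment.

T, R, P = 5, 3, 1

def defect_now(horizon):
    # One temptation payoff, then mutual punishment for the rest.
    return T + P * (horizon - 1)

def keep_cooperating(horizon):
    return R * horizon

for horizon in (1, 2, 3, 10):
    better = "defect" if defect_now(horizon) > keep_cooperating(horizon) else "cooperate"
    print(horizon, defect_now(horizon), keep_cooperating(horizon), better)
# horizon 1: 5 vs 3 -> defect looks best; horizon 10: 14 vs 30 -> cooperate
```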