SID KASBEKAR

human compatible, by stuart russell

2/24/2020

When Steve Schwarzman donated a record-breaking $188M to Oxford in June 2019, I was a little surprised. The goal of steering the ethical adoption of artificial intelligence seemed vague and, frankly, over the top. Fast forward 8 months - I finally get it.

At the start of his book, Stuart Russell lets us know that our current march towards superhuman intelligence is unstoppable - but success might be the undoing of the human race. People are, without even realizing it, constantly developing narrowly focused AI using broader techniques that ultimately push us closer to general-purpose AI. But for super-intelligent AI to become a reality we need more than just fast computers - we need conceptual breakthroughs. Computers need to understand language, have common sense, discover actions, and more.

"Civilization advances by extending the number of important operations which we can perform without thinking about them." - Alfred North Whitehead

This all sounds great, and the economic impact of super-intelligent systems could be huge. But there are a lot of issues we need to work through before that happens. These include:
  • An Infopocalypse, where fake information floods the internet and enables misuses of AI like blackmail, deepfakes, and false identities
  • AI will inevitably take jobs from people - jobs that can be outsourced are prime candidates for automation
  • An intelligence explosion, where a super-intelligent machine recursively self-improves and creates even smarter machines. If this happens, at what point do humans become the tools of computer systems, instead of the other way around?

In light of these concerns, Russell says that our goal should be to design highly intelligent machines that solve tough problems without making us unhappy in the process. A little vague, I know. He lays out a few principles for these so-called "beneficial machines":
  • The machine’s only objective is to maximize the realization of human preferences
  • The machine is initially uncertain what these preferences are
  • The ultimate source of info about human preferences is human behavior
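The mechanism behind these principles is easier to see in code. Here's a minimal toy sketch (my own, not from the book - the outcomes, preference hypotheses, and softmax choice model are all illustrative assumptions): a machine that starts with a uniform prior over candidate human preferences, updates it by watching human choices, and acts to maximize expected human reward.

```python
import math

OUTCOMES = ["coffee", "tea"]

# Candidate hypotheses about the human's preferences: the reward the
# human assigns to each outcome (hypothetical, for illustration only).
HYPOTHESES = {
    "likes_coffee": {"coffee": 1.0, "tea": 0.0},
    "likes_tea":    {"coffee": 0.0, "tea": 1.0},
}

def update(prior, observed_choice, beta=2.0):
    # Principle 3: human behavior is the source of information. We assume
    # the human noisily picks higher-reward outcomes (softmax model),
    # so each observed choice shifts the posterior via Bayes' rule.
    posterior = {}
    for h, rewards in HYPOTHESES.items():
        z = sum(math.exp(beta * r) for r in rewards.values())
        posterior[h] = prior[h] * math.exp(beta * rewards[observed_choice]) / z
    total = sum(posterior.values())
    return {h: p / total for h, p in posterior.items()}

def best_action(belief):
    # Principle 1: maximize expected human reward under current belief.
    return max(OUTCOMES,
               key=lambda o: sum(p * HYPOTHESES[h][o] for h, p in belief.items()))

belief = {h: 1.0 / len(HYPOTHESES) for h in HYPOTHESES}  # principle 2: start uncertain
for choice in ["tea", "tea", "coffee", "tea"]:           # observed human behavior
    belief = update(belief, choice)

print(belief)               # posterior concentrates on "likes_tea"
print(best_action(belief))  # -> "tea"
```

The point of the toy: the machine never commits to a fixed objective. Its behavior is always driven by a revisable belief about what the human actually wants.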

Russell also raises some really interesting points that relate to these principles and super-intelligence in general:
  • The economic incentives to develop responsible AI should keep the industry in check. This reminds me of the cannabis industry in Amsterdam. Supplying cannabis into the country is actually illegal, but because tourism relies so heavily on it, the industry effectively self-regulates
  • Off Switch Problem: People often say that if a super-intelligent machine becomes problematic, we can just switch it off. But a machine with a fixed objective will not allow itself to be switched off, because being switched off would prevent it from achieving that objective. Disabling the off switch therefore becomes an instrumental goal, i.e. a subgoal of the original goal (there's a toy sketch of this after the list)
  • If human behavior is indeed the ultimate source of info about human preferences, and those preferences change over time - you see the problem here, right? My preferences are probably very different from, let's say, a Roman gladiator's
  • Intentional preference engineering is when machines (with the aid of humans) modify human preferences to do things such as increasing altruism or reducing envy - this introduces some serious ethical questions
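The off-switch point, in particular, falls straight out of expected-utility arithmetic. Here's a toy sketch (the framing comes from the "off-switch game" work by Russell's group; the payoffs and strategy names are my own illustrative assumptions):

```python
# The machine's action has some utility U *for the human*, but the machine
# only has a belief over U. It can ACT now, DISABLE the off switch and then
# act, or DEFER: propose the action and let the human veto it if U is bad.

def expected_value(belief_over_u, strategy):
    total = 0.0
    for u, p in belief_over_u:
        if strategy in ("act", "disable_switch"):
            total += p * u            # action happens regardless of U's sign
        elif strategy == "defer":
            total += p * max(u, 0.0)  # a rational human blocks any u < 0
    return total

# The machine is uncertain: the action might help (+1) or badly hurt (-2).
belief = [(+1.0, 0.6), (-2.0, 0.4)]

for s in ("act", "disable_switch", "defer"):
    print(s, expected_value(belief, s))
# act:            0.6*1 + 0.4*(-2) = -0.2
# disable_switch: same payoff as act, -0.2
# defer:          0.6*1 + 0.4*0   = +0.6
```

With these numbers, deferring beats disabling the switch. But if the machine were certain its action was good, deferring would add nothing and the switch would only be a liability - uncertainty about our preferences is precisely what makes it willing to leave the switch alone.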

I love books that make me think. If you want a book that will not only help you understand why super-intelligent AI is a relevant issue but also force you to think, I would definitely recommend starting with Human Compatible.
