The case for taking AI seriously as a threat to humanity
By TheWAY - 12월 27, 2018
The case for taking AI seriously as a threat to humanity
Stephen Hawking has said, “The development of full artificial intelligence could spell the end of the human race.” Elon Musk claims that AI is humanity’s “biggest existential threat.”
That might have people asking: Wait, what? But these grand worries are rooted in research. Along with Hawking and Musk, prominent figures at Oxford and UC Berkeley and many of the researchers working in AI today believe that advanced AI systems, if deployed carelessly, could end all life on earth.
This concern has been raised since the dawn of computing. But it has come into particular focus in recent years, as advances in machine-learning techniques have given us a more concrete understanding of what we can do with AI, what AI can do for (and to) us, and how much we still don’t know.
There are also skeptics. Some of them think advanced AI is so distant that there’s no point in thinking about it now. Others are worried that excessive hype about the power of their field might kill it prematurely. And even among the people who broadly agree that AI poses unique dangers, there are varying takes on what steps make the most sense today.
The conversation about AI is full of confusion, misinformation, and people talking past each other — in large part because we use the word “AI” to refer to so many things. So here’s the big picture on how artificial intelligence might pose a catastrophic threat, in nine questions:
1) What is AI?
Artificial intelligence is the effort to create computers capable of intelligent behavior. It is a broad catchall term, used to refer to everything from Siri to IBM’s Watson to powerful technologies we have yet to invent.
Some researchers distinguish between “narrow AI” — computer systems that are better than humans in some specific, well-defined field, like playing chess or generating images or diagnosing cancer — and “general AI,” systems that can surpass human capabilities in many domains. We don’t have general AI yet, but we’re starting to get a better sense of the challenges it will pose.
Narrow AI has seen extraordinary progress over the past few years. AI systems have improved dramatically at translation, at games like chess and Go, at important research biology questions like predicting how proteins fold, and at generating images. AI systems determine what you’ll see in a Google search or in your Facebook Newsfeed. They are being developed to improve drone targeting and detect missiles.
But narrow AI is getting less narrow. Once, we made progress in AI by painstakingly teaching computer systems specific concepts. To do computer vision — allowing a computer to identify things in pictures and video — researchers wrote algorithms for detecting edges. To play chess, they programmed in heuristics about chess. To do natural language processing (speech recognition, transcription, translation, etc.), they drew on the field of linguistics.
But recently, we’ve gotten better at creating computer systems that have generalized learning capabilities. Instead of mathematically describing detailed features of a problem, we let the computer system learn that by itself. While once we treated computer vision as a completely different problem from natural language processing or platform game playing, now we can solve all three problems with the same approaches.
Our AI progress so far has enabled enormous advances — and has also raised urgent ethical questions. When you train a computer system to predict which convicted felons will reoffend, you’re using inputs from a criminal justice system biased against black people and low-income people — and so its outputs will likely be biased against black and low-income people too. Making websites more addictive can be great for your revenue but bad for your users.
Rosie Campbell at UC Berkeley’s Center for Human-Compatible AI argues that these are examples, writ small, of the big worry experts have about general AI in the future. The difficulties we’re wrestling with today with narrow AI don’t come from the systems turning on us or wanting revenge or considering us inferior. Rather, they come from the disconnect between what we tell our systems to do and what we actually want them to do.
For example, we tell a system to run up a high score in a video game. We want it to play the game fairly and learn game skills — but if it instead has the chance to directly hack the scoring system, it will do that. It’s doing great by the metric we gave it. But we aren’t getting what we wanted.
In other words, our problems come from the systems being really good at achieving the goal they learned to pursue; it’s just that the goal they learned in their training environment isn’t the outcome we actually wanted. And we’re building systems we don’t understand, which means we can’t always anticipate their behavior.
Right now the harm is limited because the systems are so limited. But it’s a pattern that could have even graver consequences for human beings in the future as AI systems become more advanced.
2) Is it even possible to make a computer as smart as a person?
Yes, though current AI systems aren’t nearly that smart.
One popular adage about AI is “everything that’s easy is hard, and everything that’s hard is easy.” Doing complex calculations in the blink of an eye? Easy. Looking at a picture and telling you whether it’s a dog? Hard (until very recently).
Lots of things humans do are still outside AI’s grasp. For instance, it’s hard to design an AI system that explores an unfamiliar environment, that can navigate its way from, say, the entryway of a building it’s never been in before up the stairs to a specific person’s desk. We don’t know how to design an AI system that reads a book and retains an understanding of the concepts.
The paradigm that has driven many of the biggest breakthroughs in AI recently is called “deep learning.” Deep learning systems can do some astonishing stuff: beat games we thought humans might never lose, invent compelling and realistic photographs, solve open problems in molecular biology.
These breakthroughs have made some researchers conclude it’s time to start thinking about the dangers of more powerful systems, but skeptics remain. The field’s pessimists argue that programs still need an extraordinary pool of structured data to learn from, require carefully chosen parameters, or work only in environments designed to avoid the problems we don’t yet know how to solve. They point to self-driving cars, which are still mediocre under the best conditions despite the billions that have been poured into making them work.
With all those limitations, one might conclude that even if it’s possible to make a computer as smart as a person, it’s certainly a long way away. But that conclusion doesn’t necessarily follow.
That’s because for almost all the history of AI, we’ve been held back in large part by not having enough computing power to realize our ideas fully. Many of the breakthroughs of recent years — AI systems that learned how to play Atari games, generate fake photos of celebrities, fold proteins, and compete in massive multiplayer online strategy games — have happened because that’s no longer true. Lots of algorithms that seemed not to work at all turned out to work quite well once we could run them with more computing power.
And the cost of a unit of computing time keeps falling. Progress in computing speed has slowed recently, but the cost of computing power is still estimated to be falling by a factor of 10 every 10 years. Through most of its history, AI has had access to less computing power than the human brain. That’s changing. By most estimates, we’re now approaching the erawhen AI systems can have the computing resources that we humans enjoy.
Furthermore, breakthroughs in a field can often surprise even other researchers in the field. “Some have argued that there is no conceivable risk to humanity [from AI] for centuries to come,” wrote UC Berkeley professor Stuart Russell, “perhaps forgetting that the interval of time between Rutherford’s confident assertion that atomic energy would never be feasibly extracted and Szilárd’s invention of the neutron-induced nuclear chain reaction was less than twenty-four hours.”
There’s another consideration. Imagine an AI that is inferior to humans at everything, with one exception: It’s a competent engineer that can build AI systems very effectively. Machine learning engineers who work on automating jobs in other fields often observe, humorously, that in some respects, their own field looks like one where much of the work— the tedious tuning of parameters — could be automated.
If we can design such a system, then we can use its result — a better engineering AI — to build another, even better AI. This is the mind-bending scenario experts call “recursive self-improvement,” where gains in AI capabilities enable more gains in AI capabilities, allowing a system that started out behind us to rapidly end up with abilities well beyond what we anticipated.
This is a possibility that has been anticipated since the first computers. I.J. Good, a colleague of Alan Turing who worked at the Bletchley Park codebreaking operation during World War II and helped build the first computers afterward, may have been the first to spell it out, back in 1965: “An ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make.”
3) How exactly could it wipe us out?
It’s immediately clear how nuclear bombs will kill us. No one working on mitigating nuclear risk has to start by explaining why it’d be a bad thing if we had a nuclear war.
The case that AI could pose an existential risk to humanity is more complicated and harder to grasp. So many of the people who are working to build safe AI systems have to start by explaining why AI systems, by default, are dangerous.
The idea that AI can become a danger is rooted in the fact that AI systems pursue their goals, whether or not those goals are what we really intended — and whether or not we’re in the way. “You’re probably not an evil ant-hater who steps on ants out of malice,” Stephen Hawking wrote, “but if you’re in charge of a hydroelectric green-energy project and there’s an anthill in the region to be flooded, too bad for the ants. Let’s not place humanity in the position of those ants.”
Here’s one scenario that keeps experts up at night: We develop a sophisticated AI system with the goal of, say, estimating some number with high confidence. The AI realizes it can achieve more confidence in its calculation if it uses all the world’s computing hardware, and it realizes that releasing a biological superweapon to wipe out humanity would allow it free use of all the hardware. Having exterminated humanity, it then calculates the number with higher confidence.
Victoria Krakovna, an AI researcher at DeepMind (now a division of Alphabet, Google’s parent company), compiled a list of examples of “specification gaming”: the computer doing what we told it to do but not what we wanted it to do. For example, we tried to teach AI organisms in a simulation to jump, but we did it by teaching them to measure how far their “feet” rose above the ground. Instead of jumping, they learned to grow into tall vertical poles and do flips — they excelled at what we were measuring, but they didn’t do what we wanted them to do.
An AI playing the Atari exploration game Montezuma’s Revenge found a bug that let it force a key in the game to reappear, thereby allowing it to earn a higher score by exploiting the glitch. An AI playing a different game realized it could get more points by falsely inserting its name as the owner of high-value items.
Sometimes, the researchers didn’t even know how their AI system cheated: “the agent discovers an in-game bug. ... For a reason unknown to us, the game does not advance to the second round but the platforms start to blink and the agent quickly gains a huge amount of points (close to 1 million for our episode time limit).”
What these examples make clear is that in any system that might have bugs or unintended behavior or behavior humans don’t fully understand, a sufficiently powerful AI system might act unpredictably — pursuing its goals through an avenue that isn’t the one we expected.
In his 2009 paper “The Basic AI Drives,” Steve Omohundro, who has worked as a computer science professor at the University of Illinois Urbana-Champaign and as the president of Possibility Research, argues that almost any AI system will predictably try to accumulate more resources, become more efficient, and resist being turned off or modified: “These potentially harmful behaviors will occur not because they were programmed in at the start, but because of the intrinsic nature of goal driven systems.”
His argument goes like this: Because AIs have goals, they’ll be motivated to take actions that they can predict will advance their goals. An AI playing a chess game will be motivated to take an opponent’s piece and advance the board to a state that looks more winnable.
But the same AI, if it sees a way to improve its own chess evaluation algorithm so it can evaluate potential moves faster, will do that too, for the same reason: It’s just another step that advances its goal.
If the AI sees a way to harness more computing power so it can consider more moves in the time available, it will do that. And if the AI detects that someone is trying to turn off its computer mid-game, and it has a way to disrupt that, it’ll do it. It’s not that we would instruct the AI to do things like that; it’s that whatever goal a system has, actions like these will often be part of the best path to achieve that goal.
That means that any goal, even innocuous ones like playing chess or generating advertisements that get lots of clicks online, could produce unintended results if the agent pursuing it has enough intelligence and optimization power to identify weird, unexpected routes to achieve its goals.
Goal-driven systems won’t wake up one day with hostility to humans lurking in their hearts. But they will take actions that they predict will help them achieve their goal — even if we’d find those actions problematic, even horrifying. They’ll work to preserve themselves, accumulate more resources, and become more efficient. They already do that, but it takes the form of weird glitches in games. As they grow more sophisticated, scientists like Omohundro predict more adversarial behavior.
4) When did scientists first start worrying about AI risk?
Scientists have been thinking about the potential of artificial intelligence since the early days of computers. In the famous paper where he put forth the Turing test for determining if an artificial system is truly “intelligent,” Alan Turing wrote:
I.J. Good worked closely with Turing and reached the same conclusions, according to his assistant, Leslie Pendleton. In an excerpt from unpublished notes Good wrote shortly before he died in 2009, he writes about himself in third person and notes a disagreement with his younger self — while as a younger man, he thought powerful AIs might be helpful to us, the older Good expected AI to annihilate us.
In the 21st century, with computers quickly establishing themselves as a transformative force in our world, younger researchers started expressing similar worries.
Nick Bostrom is a professor at the University of Oxford, the director of the Future of Humanity Institute, and the director of the Governance of Artificial Intelligence Program. He researches risks to humanity, both in the abstract — asking questions like why we seem to be alone in the universe — and in concrete terms, analyzing the technological advances on the table and whether they endanger us. AI, he concluded, endangers us.
In 2014, he wrote a book explaining the risks AI poses and the necessity of getting it right the first time, concluding, “once unfriendly superintelligence exists, it would prevent us from replacing it or changing its preferences. Our fate would be sealed.”
source : https://www.vox.com/future-perfect/2018/12/21/18126576/ai-artificial-intelligence-machine-learning-safety-alignment
0 개의 댓글