A team of computer scientists has used theoretical calculations to argue that algorithms could not control a super-intelligent AI.
Their study addresses what Oxford philosopher Nick Bostrom calls the control problem: how do we ensure super-intelligent machines act in our interests?
The researchers conceived of a theoretical containment algorithm that would resolve this problem by simulating the AI's behavior and halting the program if its actions became harmful.
If you break the problem down to basic rules from theoretical computer science, it turns out that an algorithm that would command an AI not to destroy the world could inadvertently halt its own operations. If this happened, you would not know whether the containment algorithm is still analyzing the threat, or whether it has stopped to contain the harmful AI. In effect, this makes the containment algorithm unusable.
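The reasoning resembles the classic halting-problem argument from computability theory. The Python sketch below is a simplified illustration of that diagonal argument, not the construction used in the study. Every name in it (`Harm`, `make_adversary`, `decider_is_wrong`, and the three toy deciders) is hypothetical, and it assumes the decider is deterministic.

```python
from typing import Callable

# Hypothetical types: a "program" takes one input; a "decider" claims to
# predict whether running program(arg) would cause harm.
Program = Callable[[object], str]
Decider = Callable[[Program, object], bool]


class Harm(Exception):
    """Stand-in for a harmful action (illustrative only)."""


def make_adversary(decider: Decider) -> Program:
    """Build a program that does the opposite of what the decider predicts."""
    def adversary(arg: object) -> str:
        if decider(adversary, arg):
            return "harmless"          # predicted harmful, so behave
        raise Harm("harmful action")   # predicted harmless, so misbehave
    return adversary


def decider_is_wrong(decider: Decider) -> bool:
    """Return True if the decider mispredicts its own adversary."""
    adversary = make_adversary(decider)
    predicted_harmful = decider(adversary, None)
    try:
        adversary(None)
        actually_harmful = False
    except Harm:
        actually_harmful = True
    return predicted_harmful != actually_harmful


def always_safe(program: Program, arg: object) -> bool:
    return False  # toy decider: never flags anything


def always_harmful(program: Program, arg: object) -> bool:
    return True   # toy decider: flags everything


def simulating_decider(program: Program, arg: object) -> bool:
    """Toy decider that tries to find out by actually running the program."""
    try:
        program(arg)
        return False
    except Harm:
        return True
```

Both fixed-answer deciders are wrong about their own adversary, so `decider_is_wrong` returns `True` for each. The simulating decider never produces an answer at all: the adversary calls the decider, which runs the adversary, and so on, until Python raises `RecursionError`. That is a small-scale version of the situation described above, in which an observer cannot tell whether the containment check is still analyzing or has stopped.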
The study found that no single algorithm could calculate whether an AI would harm the world, due to the fundamental limits of computing:
“Assuming that a superintelligence will contain a program that includes all the programs that can be executed by a universal Turing machine on input potentially as complex as the state of the world, strict containment requires simulations of such a program, something theoretically (and practically) impossible.”
This type of AI remains confined to the realms of fantasy — for now. But the researchers note the tech is making strides towards the type of super-intelligent systems envisioned by science fiction writers.
“There are already machines that perform certain important tasks independently without programmers fully understanding how they learned it,” said study co-author Manuel Cebrian of the Max Planck Institute for Human Development.
“The question therefore arises whether this could at some point become uncontrollable and dangerous for humanity.”
You can read the study in the Journal of Artificial Intelligence Research.
Published January 12, 2021 — 18:40 UTC