AI capabilities have already been expanding at an exponential rate, largely as a result of the distributed network effects of programmers building on one another's breakthroughs. Given these recent developments, experts are forecasting future improvements at a double-exponential rate, which on a graph begins to look like a vertical line of potential exploding upward. The term generally used to describe this phenomenon, until now a hypothetical thought experiment, is the Singularity.
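To make the difference between these growth rates concrete, here is a minimal illustrative sketch. The base of 2 and the unitless time steps are arbitrary assumptions chosen for simplicity, not measurements of actual AI progress; the point is only how quickly a double-exponential curve dwarfs an ordinary exponential one.

```python
# Illustrative only: compares exponential growth (base**t) with
# double-exponential growth (base**(base**t)) to show why the latter
# looks nearly vertical on a graph within just a few steps.
base = 2  # arbitrary assumption, not a measured rate of AI improvement

for t in range(1, 6):
    exponential = base ** t                    # 2, 4, 8, 16, 32
    double_exponential = base ** (base ** t)   # 4, 16, 256, 65536, ~4.3 billion
    print(f"t={t}: exponential={exponential}, double exponential={double_exponential}")
```

After only five steps, the exponential curve has reached 32 while the double-exponential curve has reached roughly 4.3 billion, which is why the latter appears to shoot straight up when plotted.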
Artificial intelligence (AI) is becoming increasingly capable of learning how to make itself more powerful. In recent examples, AI models have learned to generate their own training data in order to improve themselves, and to edit sections of their own code so that it runs at more than double the speed.
The alignment problem in AI development represents a profound risk to human flourishing. What would happen if a superhuman intelligence pursued a goal that was out of alignment with the conditions required for human welfare, or, for that matter, with the survival of life itself on Earth? Such a misalignment could simply be the result of misguided human programming.
It is also quite conceivable that a superintelligent AI could develop its own goal orientation, one highly likely to be misaligned with human flourishing. The AI might not see humans as an enemy to be eliminated; rather, we could simply become collateral damage to its own purposes, in the same way that orangutans, mountain gorillas, and myriad other species face extinction as a result of human activity. For example, a superintelligence might seek to optimize the Earth's atmosphere for its own processing speed, leaving a biosphere that could no longer sustain life.