The idea that we should aim to shape smarter-than-human artificial intelligence (AI) is among the most well-known crucial considerations of effective altruism. In this post, I will detail my stance on this issue. I assume that you are familiar with the main argument for why AI might be pivotal; if not, I recommend reading Lukas Gloor’s essay Altruists Should Prioritize Artificial Intelligence first.
Clarifications and definitions
To avoid confusion, we need to clarify what we mean by “AI” and “AI priority”. These terms can refer to many different things, which is why we need to distinguish carefully between the various claims and definitions.
I use the term “artificial superintelligence” (ASI) to mean “a hypothetical agent that possesses intelligence far surpassing that of the brightest and most gifted human minds”1, and the terms “distributed machine intelligence” or “economy-like AI” for Hansonian scenarios. Without qualifiers, “artificial intelligence” (AI) is a supercategory that also includes contemporary AI systems and scenarios that fall into neither category.
With these definitions, consider the following claims about the relevance of AI:
Proposition 1a (Strong ASI priority): When evaluating the consequences of an intervention, almost all of the variance in expected suffering2 stems from its influence on ASI. Efforts to shape ASI, either directly through technical AI safety work or indirectly, are orders of magnitude more effective than any other intervention.
Proposition 1b (Weak ASI priority): When evaluating the consequences of an intervention, a significant part (>20%) of the variance in total expected suffering stems from its influence on ASI. Efforts to shape ASI, either directly through technical AI safety work or indirectly, are at least in the same ballpark as other interventions.
Proposition 2a (Strong AI priority): Like Proposition 1a, but with any kind of AI, not necessarily ASI.
Proposition 2b (Weak AI priority): Like Proposition 1b, but with any kind of AI, not necessarily ASI.
Strong ASI priority is the strongest statement, since it implies all the others. Strong AI priority implies Weak AI priority, and Weak ASI priority implies Weak AI priority.
As a notational shorthand, I write ASI priority when I refer to both the strong and the weak version, and A(S)I priority to refer to both ASI priority and AI priority.
When considering Aumann evidence from the opinions of others – such as the majority of AI researchers – we need a variant of the claim that does not presuppose altruistic motivation:
Proposition 3a (Strong ASI pivotality): ASI-related technological progress is likely to be the most pivotal development of our time, as judged by an average person with a crystal ball.
Proposition 3b (Weak ASI pivotality): ASI-related technological progress is likely to be an important development of our time, as judged by an average person with a crystal ball.
Proposition 4a (Strong AI pivotality): AI-related technological progress is likely to be the most pivotal development of our time, as judged by an average person with a crystal ball.
Proposition 4b (Weak AI pivotality): AI-related technological progress is likely to be an important development of our time, as judged by an average person with a crystal ball.
As with the priority claims, any given strong claim implies the corresponding weak claim, and any given ASI claim implies the corresponding AI claim. Of course, the distinction between strong and weak claims is somewhat arbitrary; you may also endorse a claim of intermediate strength.
A(S)I pivotality claims relate to A(S)I priority claims in that an altruistic person who endorses one of the pivotality claims will likely endorse the corresponding priority claim and vice versa. That said, they do not logically imply each other. One may believe that AI is pivotal, but that it is almost impossible to influence it (pivotality but not priority); or one may think that AI will probably not be pivotal or not happen at all, but that Pascalian reasoning still makes it a top priority for effective altruists (priority but not pivotality).3
Weak arguments against the A(S)I priority
Unpredictability of technological developments
The history of science shows that predictions about new technologies are difficult. The actual developments usually differ from predictions regarding the time needed to develop the technology, the form it will take, and whether the technology arrives at all.
In discussions of ASI, this is exacerbated by the long time horizons and the speculative nature of the predictions. A priori, it would be surprising if predictions at this level of detail and speculativeness were exactly right. It seems more likely that they are misguided in some way, just as past predictions of technological developments have been wrong in thousands of ways.
This does not necessarily imply that the topic is not relevant at all, though that’s also possible. Rather, it could mean that AI takes a form other than superintelligence, such as economy-like AI. It could mean that the path to the creation of AI may be surprising because other game-changing technologies come first, or that technology develops differently in a way that we cannot even conceive of right now.
A scenario that I find more plausible than the usual superintelligence scenarios is that AI systems achieve superhuman performance in more and more domains, but not all at once. This calls the notion of artificial general intelligence into question. Granted, the end result may still be an agent-like superintelligence, but altruistic efforts to shape it are less promising if the emergence of AI is gradual as opposed to a bottleneck-like single point in time. This is because more people will work on safety in that case and because intelligence amplification techniques may render our contribution to technical questions irrelevant. It may also mean that MIRI-style work on the “control problem” is misguided. (For more details, see Strategic implications of AI scenarios.)
This is an argument against ASI priority and ASI pivotality to the extent to which these claims are based on fairly specific scenarios. It applies (less strongly) to AI priority in general because unpredictability makes it more difficult to find effective interventions.
Influencing technology is hard
Even if we were able to predict new technologies, we would still struggle to influence them, let alone influence them in a robustly positive direction. This is especially true if AI is distributed and we cannot identify a single lever to pull on. Powerful economic or egoistic forces drive technological development, which makes it “quasi-deterministic” in many ways. For instance, attempting to stop technological progress altogether seems futile.
More concretely, AI researchers and companies will likely work a lot on safety as soon as (and if) AI systems become ubiquitous. If emulations, iterated embryo selection, or other forms of strong intelligence enhancement become feasible, they will be applied to AI safety, too, which might render our contribution irrelevant. This means that the work of a few effective altruists on technical questions may be a drop in the ocean unless they focus on specific questions that may otherwise remain neglected, such as how to prevent suffering risks.
On the flip side, other ways of reducing suffering in the far future, such as trying to spread altruistic values, may be comparably difficult. This may be a feature of any attempt to shape the far future, not of AI-related interventions in particular.
To the extent to which it applies, this is an argument against all priority claims, but not against the pivotality claims.
Effective altruists in the past
Consider a suffering-focused effective altruist in the first half of the 20th century. What is the most important thing to do?
She might have thought that nuclear weapons4 would lead to a singleton through military supremacy, and that influencing this singleton was all that mattered. But, as we now know, history turned out differently: nuclear weapons were indeed invented, but not used to gain world dominance. Instead of betting on a specific scenario, it might have been wiser to do moral advocacy, movement building, or other interventions that are impactful across a wider range of possible futures.
Even if she had correctly predicted a technological development, such as how factory farming would multiply animal suffering, it’s not clear what she could have done to prevent it. Perhaps it was almost impossible to influence the development of industrialization in meaningful ways because it resulted from strong economic forces. Maybe the best thing to do in that situation would have been to build the animal movement by writing about antispeciesism.
This argument applies to all priority claims and Strong A(S)I pivotality. This section is strongly related to the previous ones, though, so it does not count as independent evidence.
Heuristics
In cluster thinking, it makes sense to consider – with limited weight – crude heuristics such as the following:
- “Be sceptical about bold or very specific claims about future technology” seems to be a reasonable prior. While people have sometimes underestimated technological developments over the course of human history, we know of few specific predictions that turned out to be right.
- To the extent to which superintelligence scenarios are a form of Pascal’s wager, a heuristic against such wagers applies. But most proponents of work on AI do not view it as a form of Pascalian reasoning. I discuss whether the wager works later.
- Specific scenarios may trigger an absurdity heuristic, but this applies only weakly to the more general idea that new technologies such as AI may transform society. Also, the epistemic value of the absurdity heuristic is dubious.
- Millions of people will likely try to influence superintelligence. A sudden and unexpected AI takeoff is conceivable, but it is more likely that warning signs or an “AI Sputnik moment” will trigger large-scale societal debate. On the other hand, human politics rarely gives sufficient weight to future generations or to the voiceless suffering of non-human animals and digital beings, which is why work on suffering-focused AI safety still seems neglected.
Epistemology
Elite common sense is less concerned about risks from artificial intelligence than many effective altruists are. In particular, many AI researchers view such scenarios as science fiction. That said, the topic has become more acceptable in recent years as a result of Nick Bostrom’s book Superintelligence, and many AI researchers agree that AI systems will have a significant impact on society. For example, 1200 AI researchers signed the Asilomar AI principles. But such concerns are mostly about autonomous weapons, automation, and other tangible impacts, which we should distinguish from bolder claims about superintelligence. The argument applies to the extent to which elite common sense still views ASI as less disruptive, less crucial, less likely, or further away.
What is more likely: that millions of smart people all over the world collectively fail to see how crucial superintelligence is, or that a handful of effective altruists got it wrong?
This is not as clear-cut as the question suggests, though, because most people have never heard of, let alone seriously engaged with, the arguments brought forward by proponents of work on ASI. Societal elites may also fail to realize its importance because of different values, that is, because they don’t care about the far future in a consequentialist way.
That said, I also see possible epistemic problems in the effective altruism movement:
- Most thinking on superintelligence is based on the work of a small number of individuals (mostly Nick Bostrom and Eliezer Yudkowsky). Social interaction may explain why many in the movement are convinced.
- People who are convinced of the AI priority are more likely to stay involved, which leads to a selection effect. Also, the movement tends to attract people with a technical background and an openness to unusual ideas.
The arguments in this section apply mostly to Strong A(S)I pivotality and Strong A(S)I priority.
Gut check
In a world full of factory farms and wild animal suffering, what is the prior that the most important thing to do is to write math-heavy technical papers on AI? An initial reaction is that this might be a diversion from getting one’s hands dirty. Many people in the AI safety community seem to be driven by intellectual fascination rather than an altruistic urge to do something about a potential moral catastrophe, which is concerning. On the other hand, one might argue that the community is simply not biased by empathy-related fuzzies and a need to see tangible fruits of its work.
This isn’t a strong point, though, since gut reactions are often wrong.
Weak arguments in favor of AI safety work
Robustness of the arguments
The argument that new technologies will play an important role in shaping the future, just as they have shaped human history, is quite robust. Advanced artificial intelligence appears to be at least a plausible candidate for a pivotal future technology, even if we can’t predict the details at this point. In this sense, the case for the relevance of AI does not hinge on specific scenarios.
This is an argument for AI pivotality and also for AI priority, but to a lesser extent because it’s hard to influence AI without envisioning specific scenarios.
The question remains whether directly influencing AI technology is more effective than moral advocacy or other broad interventions such as promoting international cooperation. But AI would need to be far more consequential (in expectation) than the average new technology to make it a top priority for altruists. There is no toaster safety movement.5
Instrumental benefits
We may also decide to work on (suffering-focused) AI safety for (partially) instrumental reasons:
- It helps spread altruistic concern for suffering among AI researchers, which makes it more likely that they will implement precautionary safety measures. This is particularly important if AI researchers will play a crucial role in shaping the future, which I’m unsure of.
- Our work may inspire others to develop our ideas further.
- By working on AI safety, we build credibility and expertise in the AI domain, putting us in a better position to intervene later on.
This applies to all priority claims.
Weaker versions of the claims are widespread & commonsensical
Weaker claims along the lines of “AI will have a significant impact on society” are widely accepted among AI researchers and societal elites. In recent years, we’ve seen a lot of AI researchers wake up to the challenge. However, while many people now accept that some form of “AI safety” is relevant, they don’t necessarily endorse claims about superintelligence, though we have also seen more discussion of that.
What about the wager?
One might argue that even if superintelligence were unlikely, altruists should still work on it because the potential impact is larger in such scenarios. Lukas Gloor puts it like this:
In a future where smarter-than-human artificial intelligence won’t be created, our altruistic impact – even if we manage to achieve a lot in greatly influencing this non-AI future – would be comparatively “capped” and insignificant when contrasted with the scenarios where our actions do affect the development of superintelligent AI (or how AI would act).
I buy into this argument to some extent, but the difference is not more than an order of magnitude. My best guess is that our impact in ASI scenarios is larger by a factor of 2-3.
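To make the size of this disagreement concrete, here is a minimal sketch of the expected-value comparison behind the wager. All numbers are hypothetical illustrations of the two positions (a ~100x per-scenario multiplier for the strong wager versus the factor of 2-3 suggested above), not estimates from this post.

```python
# Toy expected-value comparison for the wager argument.
# All numbers are hypothetical illustrations, not estimates from this post.

def expected_impact(p_asi, impact_if_asi, impact_if_no_asi):
    """Expected impact of an intervention across ASI and non-ASI futures."""
    return p_asi * impact_if_asi + (1 - p_asi) * impact_if_no_asi

p_asi = 0.5  # hypothetical probability that ASI scenarios materialize

# Broad intervention (e.g. moral advocacy): similar impact in both futures.
broad = expected_impact(p_asi, impact_if_asi=1, impact_if_no_asi=1)

# Strong-wager view: ASI-targeted work is ~100x more impactful if ASI happens.
asi_strong = expected_impact(p_asi, impact_if_asi=100, impact_if_no_asi=0)

# The view defended here: the per-scenario difference is only a factor of ~2-3.
asi_weak = expected_impact(p_asi, impact_if_asi=3, impact_if_no_asi=0)

print(asi_strong / broad)  # 50.0 -> ASI-targeted work dominates
print(asi_weak / broad)    # 1.5  -> roughly in the same ballpark
```

Under the weaker multiplier, the comparison becomes sensitive to probability and tractability estimates rather than being settled by the wager alone.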
The following (weak) arguments contribute to that judgment:
- Heuristics suggest scepticism about any claim that something dominates by many orders of magnitude. See e.g. Brian Tomasik’s essay Why Charities Don’t Differ Astronomically in Cost-Effectiveness and my post on why uncertainty smooths out differences in impact.
- An anthropic leverage penalty punishes any hypothesis in which you occupy an extraordinary position of power. The question is how much evidence we have that we can influence the far future via shaping ASI. This is not as simple as it seems because we may be in a simulation or because a late filter might diminish our influence on the far future. I discuss this point in more detail here.
- It’s unlikely, but possible, that almost every ASI would become existentialist or self-modify radically, diluting our impact on the result.
- If a superintelligence acts based on human preferences, goal preservation is not certain because the humans may change their values.
- It’s not completely obvious that we can’t have a comparable impact in non-superintelligence scenarios:
  - Distributed machine intelligence may also lead to space colonization and goal preservation.
  - Even if AI is not a game-changing technology, it may still become feasible to colonize space via other means. Humans may decide to lock in their values when a world government is formed, or they might compromise before spreading into space.
  - Path dependencies in values may constitute a crude form of value lock-in that is far weaker than goal preservation in an AI, but not astronomically weaker.
  - We may discover unknown unknowns that are relevant even without superintelligence.
Overall, the wager is a legitimate argument for work on ASI, but it’s not a slam dunk. In addition, my epistemology favors cluster thinking approaches, which give only limited weight to such considerations.
Work on AI safety now vs. later
What has a larger marginal impact: working on AI-related issues now or working on them in the future?
Reasons for working on it now include:
- We may enjoy first-mover advantages regarding the framing of the issues at hand.
- Instrumental benefits such as inspiring more people to work on suffering-focused AI safety are significantly larger for earlier work.
- If AI timelines are short, then working on the problem now is crucial. But I would be surprised if we had smarter-than-human AI before 20506 and therefore do not view this as a strong point.
The main arguments for delaying work are:
- We gain more information on whether AI is the most effective cause to work on. In other words, we would avoid wasting time if the technology develops in ways other than what we expected.
- We would have more clarity about how AI will be developed, what the architecture will be, and which scenarios are most important, making work on the problem significantly more tractable.
- Similarly, we will see to what extent AI safety will be neglected or crowded.
- If the number of people working on AI safety grows exponentially or intelligence amplification techniques become feasible, we can expect future work to contribute far more to solving technical problems than our work over the next decade.
Conclusion
The main takeaways of this essay are:
- It’s complicated. Simply saying “AI dominates everything by many orders of magnitude” doesn’t do justice to the complexity of the questions. Sadly, many discussions on AI don’t reflect that.
- Rather than superintelligence taking off, I find it more plausible that AI systems achieve superhuman performance in more and more domains, but not all at once. But technological development can differ from expectations in thousands of ways.
- An effective altruist in earlier times should probably not have focused on technology.
- The wager works to some extent, but the difference is less than an order of magnitude.
- There is a difference between “AI systems will likely transform society and we should try to make sure it’s safe and used responsibly” versus “the coming intelligence explosion is all that matters” – which discussions of the topic often fail to reflect. Elite common sense supports the former, but not necessarily the latter.
- I tend to agree with Weak AI priority and Weak AI pivotality, but am sceptical about the stronger claims.
- I (weakly) think that direct technical work on suffering-focused AI safety is not an intrinsic top priority at the moment, but we should keep the topic in mind, and instrumental benefits such as inspiring others to also work on it may tip the balance.
- In a nutshell, I would say AI is important, but it does not clearly dominate everything.
Acknowledgements
I am indebted to Max Daniel, Lukas Gloor, and Caspar Österheld for valuable comments and discussions.
Footnotes
- Source: Wikipedia
- More precisely, we probably care about the standard deviation, not the variance. The main text uses “variance” in an informal sense, not as the mathematical concept.
- See here for details on why we may have more impact in scenarios with ASI. I elaborate on this argument in a later section.
- History of nuclear weapons
- HT Caspar Österheld for this point.
- Not to mention all the conceptual problems with general artificial intelligence and with the idea that it will arrive at a single point in time.