Prioritization researchers sometimes make statements (e.g. here) of the form
“The probability that intervention X reduces expected suffering is 60%.”
What exactly do we mean when we use such expressions? To help us think more accurately about prioritization, this post aims to place its basic concepts on a firm footing.
Credences and expectations
The simplest way to talk about the consequences of an intervention is to give a probability – in the sense of subjective Bayesian credence – that it is positive. The meaning of “I have 60% credence that controlled AI leads to less suffering than uncontrolled AI” is that if you sample a random trajectory[1] of the future conditional on controlled AI, and another trajectory conditional on uncontrolled AI, your credence that the former contains less suffering is 60%.
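As a toy illustration of this sampling view, here is a minimal Monte Carlo sketch. The lognormal suffering distributions are purely hypothetical stand-ins for a real model of future trajectories, with parameters chosen so the credence comes out near 60%:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Hypothetical stand-ins for "total suffering in a random trajectory"
# under each scenario; a real world-model would be far richer.
suffering_controlled = rng.lognormal(mean=1.0, sigma=1.0, size=N)
suffering_uncontrolled = rng.lognormal(mean=1.36, sigma=1.0, size=N)

# Credence that a random controlled-AI trajectory contains less suffering
# than a random uncontrolled-AI trajectory (comes out around 0.60 here).
credence = np.mean(suffering_controlled < suffering_uncontrolled)
print(f"P(controlled trajectory has less suffering) = {credence:.2f}")
```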
Talking about the expected value (EV) of an intervention is trickier. The expected value is a number, not a probability, so we cannot – strictly speaking – say something like “X is 70% likely to reduce expected suffering”. But we can define it via an idealization procedure: the statement means that if I had a lifetime to think about the question, if I were better at using all available evidence, and so on, then I would conclude with 70% probability that X is positive in expectation. More formally, we have logical uncertainty as to the value of the EV, and can talk about the credence that resolving this uncertainty will yield a positive result.
The meaning of the two kinds of statements differs in a nontrivial way:
- If an intervention is more than 50% likely to be positive, it can still be negative in expectation. For instance, an intervention may be mildly beneficial in most cases but negative in EV because of serious tail risks (both this case and the next are worked through in the sketch after this list). That said, under the additional assumption that the impact is of comparable magnitude in both directions, we can view the probability that an intervention is positive as a proxy for its EV. This kind of symmetry may or may not hold in practice.
- Conversely, one can sometimes be confident that the EV is positive, but not confident that the result will actually be favorable. This occurs if an intervention has high variance – for example, its impacts may be chaotic – but you have good reasons to assume that it is positive in expectation. A trivial example is betting on a fair coin flip at 2:1 odds: you’re only 50% likely to win, but you certainly get a good deal.
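To make both bullet points concrete, here is a minimal numeric sketch; the payoffs are invented for illustration and measured in arbitrary units of “suffering reduced”:

```python
# Mildly beneficial in 90% of cases, but a serious tail risk
# makes the EV negative despite P(positive) = 90%.
ev_tail = 0.9 * 1.0 + 0.1 * (-20.0)
print(f"P(positive) = 90%, EV = {ev_tail:+.1f}")  # EV = -1.1

# Fair coin flip at 2:1 odds: only 50% likely to win, yet the EV
# is clearly positive.
ev_coin = 0.5 * 2.0 + 0.5 * (-1.0)
print(f"P(positive) = 50%, EV = {ev_coin:+.1f}")  # EV = +0.5
```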
Robustness
For risk-neutral altruists, the probability that an intervention is positive in EV matters more than the probability that its realized outcome is positive. It is a useful measure of robustness: if the probability is close to 50%, you think that the sign is comparatively likely to flip under further investigation. If that is the case, then the practical upshot is to research the intervention further; if you are fairly certain that the sign will remain the same, then pursuing the object level is more attractive.[2]
Counterintuitively, an intervention may be robust in this sense even if the credence that it is actually beneficial is close to 50% – or even below it. This is rare, though, because the different kinds of probability correlate strongly in practice.
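One way to quantify this notion of robustness, under the purely illustrative assumption that our uncertainty about the true EV is normally distributed around our current estimate, is the probability that idealized further investigation flips the sign:

```python
from math import erf, sqrt

def flip_probability(ev_estimate: float, uncertainty: float) -> float:
    """Probability that the true EV has the opposite sign of our current
    estimate, assuming a normal distribution over the true EV (an
    illustrative modeling choice, not a claim about real interventions)."""
    z = abs(ev_estimate) / uncertainty
    return 0.5 * (1 - erf(z / sqrt(2)))  # normal tail probability

print(flip_probability(ev_estimate=2.0, uncertainty=1.0))  # ~0.02: robust
print(flip_probability(ev_estimate=0.1, uncertainty=1.0))  # ~0.46: not robust
```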
Magnitude of impact
We should consider the magnitude of impact, not just the sign. To evaluate how good an intervention is, we might, in addition to tractability and neglectedness, look at a combination of:
- Absolute impact: How much does the intervention change the world? How consequential is it in an absolute sense, that is, without considering the nature of the impacts? We need an “impact measure” to evaluate this, which is not hard to define. A possibility is to consider how much a randomly selected value system would (in expectation) care about the consequences.
- Targetedness: To what extent do the consequences affect the amount of suffering in the world, as opposed to the amount of art or the number of paperclips? In other words, how consequential is the intervention from the perspective of suffering-focused ethics?
- Robustness: How confident can we be that the effects are positive (in expectation) rather than negative?
Informally speaking, the expected impact is the product of these three factors.
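One hedged way to write this informal product, where A is absolute impact, T (between 0 and 1) is targetedness, and p is the credence that the intervention is positive in expectation, is the following; treating robustness as the signed credence 2p − 1 is a simplification, not the only option:

```latex
\mathbb{E}[\text{impact}] \;\approx\;
  \underbrace{A}_{\text{absolute impact}} \times
  \underbrace{T}_{\text{targetedness}} \times
  \underbrace{(2p - 1)}_{\text{robustness}}
```

With this formalization, a coin-flip intervention (p = 0.5) contributes nothing in expectation, and a certainly negative one (p = 0) counts fully negatively.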
What does it mean to say that a cause dominates?
When we talk about whether X dominates (e.g. X = far future, artificial intelligence), we need to distinguish between two somewhat different questions:
- What fraction of expected suffering is because of X?
- What fraction of variance in expected suffering in a “random intervention” is because of X? I define “random intervention” as “ask a randomly selected EA to come up with a reasonable thing to do”.
The second question is more action-relevant because it implicitly considers tractability and neglectedness. If we cannot influence a source of suffering, it is irrelevant to our decision-making.
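As a toy sketch of the second question, under the invented assumption that a random intervention’s effect on expected suffering splits into independent “via X” and “via everything else” channels:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000

# Hypothetical model of a "random intervention" (a reasonable idea from a
# randomly selected EA): its effect on expected suffering flows partly
# through cause X and partly through other causes. Scales are invented.
via_x = rng.normal(loc=0.0, scale=3.0, size=N)
via_other = rng.normal(loc=0.0, scale=1.0, size=N)
total = via_x + via_other

# With independent channels, Var(total) = Var(via_x) + Var(via_other),
# so this ratio is the fraction of variance attributable to X (~0.9 here).
print(via_x.var() / total.var())
```

In this toy model, X dominates in the second, action-relevant sense: most of the variation in how much good a random intervention does runs through X.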