A response to the EA forum (part 2/4)
Both types of arguments prove too much because they … are not specific to longtermism at all. They would e.g. imply that I can’t have a probability distribution over how many guests will come to my Christmas party tomorrow, which is absurd.
- Max_Daniel
First, your point about future expectations being undefined seems to prove too much. There are infinitely many ways of rolling a fair die (someone shouts ‘1!’ while the die is in the air, someone shouts ‘2!’, etc.). But there is clearly some sense in which I ought to assign a probability of 1/6 to the hypothesis that the die lands on 1.
- elliotthornley
This post begins a series in which I respond to a number of criticisms from the EA Forum’s response to my previous post, A Case Against Strong Longtermism. Make sure to read that piece before reading this one.
If you’re a completionist and want to follow the saga from the beginning, it starts on the podcast with the philosophy of probability series and goes through until episode 17, Against Longtermism, and the FAQ follow-up. Many of the ideas Ben and I discussed in Against Longtermism took written form (his, mine), which then started a Reddit thread and two lively discussions (his, mine) on the popular Effective Altruism Forum, the central hub for all things EA.
I couldn’t address all the comments in the forum, and promised to write a follow-up piece instead. And so here we are.
~
By far, most of the discussion on the forum focussed on my claim that the expected value of the future is undefined. To briefly recap, part of my argument against longtermism is that assumption 2.1 in The case for strong longtermism (TCFSL) doesn’t hold, because the set of all possible futures is infinite. Losing this assumption undermines longtermism itself, because without it one cannot meaningfully talk about the expected size of the future. The future could have any size at all, depending on the assumptions one makes.
While some acknowledged the argument was correct, most thought it was wrong - but failed to reach consensus on where exactly the problem lay. Some thought it was mistaken for physical reasons, others for axiomatic reasons, and others still said that while expected utility theory may perhaps be broken, it’s really just a rough heuristic anyway. But by far the most common response was that it couldn’t be true, because it “proves too much”.
Recall that the Proving Too Much strategy is when you challenge a claim by saying it couldn’t be true, because if it were, it would also imply some other conclusion which is obviously absurd. Specifically in this context, the claim is that if my argument were correct, it would imply one couldn’t have a probability distribution over Christmas party guests, or over die rolls.
So now I need to argue two separate claims. First, that the argument is correct, and second, that it doesn’t lead to the absurd conclusion. I’ll start with the first.
If you recall, part of my previous critique was regarding Shivani’s use of expected utility theory:
Then, using our figure of one quadrillion lives, the expected good done by Shivani contributing $10,000 to [preventing world domination by a repressive global political regime] would, by the lights of utilitarian axiology, be 100 lives. In contrast, funding for the Against Malaria Foundation, often regarded as the most cost-effective intervention in the area of short-term global health improvements, on average saves one life per $3500. (italics mine)
- Page 17, The Case For Strong Longtermism
I argued that the expected value of the future is undefined due to Hilbert’s paradox of the Grand Hotel, but even if it weren’t, Shivani’s comparison would still be illegitimate, because of the methodological error of equating made-up numbers with real data. I’ve recently learned, however, that the problem of undefined expectations runs much deeper than just the infinite set of future possibilities. It turns out that expected utility theory is itself so plagued with undefined expectations that there’s even a special name for them - expectation gaps. The most significant of these expectation gaps is called “The Pasadena game”, and it goes as follows:
In the Pasadena game, one constructs a game of chance whose expected value is a conditionally convergent series. (In the original version, a fair coin is tossed until it first lands heads; if that happens on toss \(n\), the payoff is \((-1)^{n-1} \, 2^n / n\) dollars, so the expectation is the alternating harmonic series \(\sum_n (-1)^{n-1}/n\).) Conditionally convergent series have been known to mathematicians since the 1800s, and their key property - the Riemann rearrangement theorem - is that one can rearrange the terms of the series to produce any value one likes. The Stanford Encyclopedia of Philosophy describes the situation:
Under the circumstances described here, we seem to have no reason to prefer any particular order in which to sum up the terms of the infinite series. So is the expected value of Pasadena game \(\ln 2\) or \(\frac{1}{2} \ln 2\) or \(\frac{1}{3}\) or \(-\infty\) or \(345.68\)? All these suggestions seem equally arbitrary.
Thus, the expected value of the Pasadena game is undefined and (as with the infinite set of possible futures) forms an expectation gap.
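The rearrangement property is easy to see numerically. Here is a minimal Python sketch, summing the game’s expected-value terms \((-1)^{n-1}/n\) in two different orders (the number of terms is an arbitrary choice, and this sums the series directly rather than simulating the game):

```python
import math

def pasadena_term(n):
    # n-th expected-value term: probability 2^-n of payoff (-1)^(n-1) * 2^n / n,
    # which contributes (-1)^(n-1) / n to the expectation
    return (-1) ** (n - 1) / n

# Natural order: partial sums approach ln 2
natural = sum(pasadena_term(n) for n in range(1, 200001))

# Rearranged order: one positive term followed by two negative terms
# (1 - 1/2 - 1/4 + 1/3 - 1/6 - 1/8 + ...) approaches (1/2) ln 2 instead
rearranged = 0.0
pos, neg = 1, 2  # next odd (positive) and even (negative) denominators
for _ in range(100000):
    rearranged += 1 / pos
    rearranged -= 1 / neg + 1 / (neg + 2)
    pos += 2
    neg += 4

print(natural)     # ≈ 0.6931, i.e. ln 2
print(rearranged)  # ≈ 0.3466, i.e. (1/2) ln 2
```

Same terms, same probabilities - the “expected value” depends entirely on an arbitrary choice of summation order.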
If that were the end of the story, it would be only a highly contrived curiosity with little application to decision theory as used in practice. The problem, however, is that once you have an expectation gap in hand, it multiplies and infects all other decision-making scenarios, like a virus. For this reason the Australian philosopher of mathematics Mark Colyvan calls this the contagion problem: the disease of expectation gaps easily spreads to every other decision-making scenario as well. Colyvan explains:
A second kind of contagion problem is what Bartha calls the failure of garden-variety decision-theoretic reasoning, or for short, garden-variety contagion. It is exemplified by the Pizza problem: choosing between ordering pizza and ordering Chinese food. (See Hajek and Smithson 2012.) This ought to be a straightforward problem, and if decision theory can’t handle it, decision theory is in serious trouble.
However, suppose that you assign some positive probability, however tiny, to the prospect of playing the Pasadena game after ordering pizza. Then ordering pizza is an expectation gap: it is a gamble, with outcomes pizza followed by the Pasadena game, and pizza not followed by the Pasadena game.
Presumably the probability that you assign to the former outcome is astronomically smaller than the latter. But that does not save the pizza option from contamination. So expected utility theory cannot value ordering pizza, and it cannot even place your pizza ordering on your preference ordering.
Furthermore, Bayesian orthodoxy cannot criticize you for assigning positive probability to your playing the game: you may do so while adhering perfectly to the probability axioms. Indeed, assigning zero probability to this contingent event is more liable to ruffle Bayesian feathers. After all, it is unclear why it should be zeroed out by your prior, and unclear how it could ever be zeroed out by conditionalization on your evidence. And yet once you assign positive probability to playing the game after having pizza, the pizza option is poisoned decision-theoretically.
This already shows that expected utility theory cannot even represent your choice between pizza and Chinese, still less advise you about it. Indeed, as long as any option in a given decision problem of yours is contaminated — however long your list of options may be — you cannot maximize expected utility over all your options. But while we’re at it, we might as well contaminate the Chinese option too. For the same reasoning that applies to the pizza option applies to it. We are left comparing a gap with a gap. If you do what decision theory tells you — nothing — your stomach will remain empty. (emphasis in original)
- Mark Colyvan, Making Ado Without Expectations
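The contagion argument is just arithmetic, and can be sketched in a few lines. The utility and credence values below are made-up illustrative numbers, with NaN standing in for the game’s undefined expectation:

```python
import math

u_pizza = 10.0          # utility of pizza on an ordinary night (illustrative)
ev_pasadena = math.nan  # the Pasadena game's expectation is undefined
epsilon = 1e-30         # tiny but positive credence that pizza leads to the game

# Expected utility of ordering pizza, as a mixture of the two outcomes
ev_pizza = (1 - epsilon) * u_pizza + epsilon * ev_pasadena
print(ev_pizza)               # nan
print(math.isnan(ev_pizza))   # True
```

However astronomically small epsilon is, any positive weight on a gap makes the whole expectation undefined - which is exactly Colyvan’s point about the pizza option being poisoned.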
Because of the contagion problem, in Making Ado Without Expectations Colyvan says expectation gaps “rock the very foundations of expected utility theory”, and indeed, the Pasadena game even forces the Stanford Encyclopedia of Philosophy to concede that the principle of maximizing expected utility might be mistaken:
However, until recently no one has seriously questioned that the principle of maximizing expected utility is the right principle to apply. The rich and growing literature on the many puzzles inspired by the St. Petersburg paradox indicate that this might have been a mistake. Perhaps the principle of maximizing expected utility should be replaced by some entirely different principle?
Perhaps it should - I’ll discuss alternatives to decision theory in the next post of this series. But by discussing The Pasadena game I’m shifting the goalpost - I still haven’t defended the ‘shouting a natural number’ argument. So I should say a few words about that.
I think it proves both too little and too much.
Too little, in the sense that it’s contingent on things which don’t seem that related to the heart of the objections you’re making. If we were certain that the accessible universe were finite (as is suggested by (my lay understanding of) current physical theories), and we had certainty in some finite time horizon (however large), then all of the EVs would become defined again and this technical objection would disappear.
In that world, would you be happy to drop your complaints? I don’t really think you should, so it would be good to understand what the real heart of the issue is.
Too much, in the sense that if we apply the argument naively then it appears to rule out using EVs as a decision-making tool in many practical situations (where subjective probabilities are fed into the process), including many where we have practical experience of it and it has a good track record. (emphasis in original)
First, we don’t consider physical constraints when constructing the set of future possibilities - they come into the picture later, by assigning probability zero to physically impossible events. (I wrote about this in greater detail here.) And in any case, we’re talking about possible universes here, not actual ones. This is why the ‘shouting a natural number’ construction establishes a one-to-one correspondence between the set of future possibilities and the natural numbers, and why we can say the set of future possibilities is (at least) countably infinite.
Infinite sets break expected utility theory. After posting the previous piece, I learned of an 80,000 Hours interview where MacAskill says the same thing:
Robert Wiblin: I guess it could get even worse than that because you could have one view that says something is absolutely prohibited, and another one that says that the same thing is absolutely mandatory. Then you’ve got a completely undefined value for it.
Will MacAskill: Absolutely. That’s the correct way of thinking about this. The problem of having infinite amounts of wrongness or infinite amounts of rightness doesn’t act as an argument in favor of one moral view over another. It just breaks expected utility theory. Because some probability of infinitely positive value, some probability of infinitely negative value, you try to take the sum product over that you end up just with undefined expected value.
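MacAskill’s “you end up just with undefined expected value” can be reproduced directly in floating-point arithmetic, where \(\infty - \infty\) is undefined (the two probabilities below are arbitrary placeholders):

```python
import math

p_good, p_bad = 0.5, 0.5       # some probability of each moral view being right
v_good, v_bad = math.inf, -math.inf  # infinitely right vs. infinitely wrong

# The sum-product MacAskill describes: inf + (-inf) has no defined value
expected_value = p_good * v_good + p_bad * v_bad
print(expected_value)              # nan
print(math.isnan(expected_value))  # True
```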
This point is also repeatedly raised by E.T. Jaynes, the founder of modern Bayesianism. For example in the preface of The Logic of Science he writes:
Infinite-set paradoxing has become a morbid infection that is today spreading in a way that threatens the very life of probability theory, and it requires immediate surgical removal. In our system, after this surgery, such paradoxes are avoided automatically; they cannot arise from correct application of our basic rules, because those rules admit only finite sets, and infinite sets that arise as well-defined and well-behaved limits of finite sets. (emphasis added)
- E.T. Jaynes, Probability Theory: The Logic of Science
And he dedicates all of Chapter 15 to the subject, saying with resignation:
There are too many paradoxes contaminating the current literature for us to analyze separately. Therefore we seek here to study a few representative examples in some depth, in the hope that the reader will then be on the alert for the kind of reasoning which leads to them.
- E.T. Jaynes, Probability Theory: The Logic of Science
So to sum up: There are an infinite number of possible (not actual) futures, and this creates an “expectation gap” - an undefined expectation that breaks expected utility theory. And even if it didn’t, the Pasadena game would.
But how do we square this with the fact that we can talk sensibly about the probability of a die roll, or of future party guests? In other words, why doesn’t this claim prove too much?
And with this question, we turn to the heart of the Proves Too Much argument - that undefined expectations lead to absurd conclusions.
How can we talk sensibly about the probability of a die roll? By adopting an instrumentalist view of the probability calculus, and abandoning Bayesian absolutism, which says we must always assign numbers between zero and one to our beliefs. To illustrate the difference, consider again assumption 2.1 in TCFSL:
In expectation, the future is vast in size.
This is an objectivist view of probability, because here, there is one “object” under consideration - the future - and it has the property vast number of expected beings associated with it. Consider now rewording it as follows:
Given the available evidence, a rational agent should believe that, in expectation, the future is vast in size.
This is a subjectivist view of probability because here, the property vast number of expected beings is associated not with the future, but with the belief of a “subject”, i.e. the rational agent. This interpretation is where credences start entering the conversation. Contrast this against the instrumentalist view:
In order to achieve a particular goal, it is useful to assume that, in expectation, the future is vast in size.
Here, we make no claim about how the future will actually look, only that assuming there will be a vast number of expected beings in the future is useful in achieving certain ends. The focus has now shifted away from probability, and towards the goals. This means we are free to change the assumptions, and even the instrument itself, if doing so lets us better accomplish our goals.
Instead of credences, we talk about assumptions. People are free to make different assumptions, and they can be judged only on their intrinsic reasonableness - about which we can argue - and on their usefulness in accomplishing the goal in question, which we can measure using data and observation. Note that this view is perfectly congruent with recognizing the value of Bayesian statistics in its ability to solve real world problems.
So as soon as one adopts the instrumentalist view, the ‘proving too much’ argument dissolves. One assigns a probability of \(1 / 6\) to a die rolling \(4\) because it is useful to do so. If the situation changed - perhaps the die is dropped in sand instead of bounced on a hard surface - one might assign different probabilities. But these modeling assumptions are always a choice we have, not something we are forced into. And again, to reiterate, we are free to choose other tools as well - especially if the tool is being used inappropriately, as when expected values are used to peer one billion years into the future, without data.
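On the instrumentalist reading, the \(1/6\) assignment earns its keep empirically, not axiomatically. A toy sketch, using Python’s pseudo-random generator as a stand-in for a physical die (the seed and sample size are arbitrary choices):

```python
import random

random.seed(0)  # fixed seed so the "experiment" is reproducible
N = 60000

# Simulate N rolls of a fair six-sided die
rolls = [random.randint(1, 6) for _ in range(N)]

# How often did it land on 4? The uniform model predicts 1/6 ≈ 0.1667
freq_four = rolls.count(4) / N
print(freq_four)
```

The assumption of \(1/6\) is vindicated here because observed frequencies match it; if the die were dropped in sand, or loaded, the data would tell us to change the assumption - or the tool.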
~
But couldn’t a longtermist just say: “It’s useful to make the assumption that I won’t face The Pasadena Game, and it’s also useful to assume a measure over the infinite set of possible futures, so I’ll just assume that and get back to my attempted EV maximizing”? Of course - that is a perfectly legitimate response, and is why the field of statistics and machine learning doesn’t need to worry about silly Pasadena games.
The catch, however, is that you now have to defend why your assumptions are better or worse than anyone else’s. And for that, you’ll either need data (as the statisticians will tell you, and as GiveWell has emphasized), or physical theories which justify the use of the probability calculus (like chaos theory or statistical mechanics). Such theories aren’t possible in the case of the long-term future of human society because of the impossibility proof from epistemology, and as we have seen from Shivani’s reasoning above, longtermists eschew data.
So this is why we are free to assign probability distributions over party guests, or die rolls, and why undefined expectations don’t prove too much. Expectation gaps just kill the Bayesian epistemology piece, that absolutist streak which says we must assign credences to our beliefs, and that we are irrational if we do not. Because if we must do so, then surely we must also assign some non-zero credence to playing The Pasadena game, and this breaks everything.
To Bayesians, the brain is an engine of accuracy: it processes and concentrates entangled evidence into a map that reflects the territory. The principles of rationality are laws in the same sense as the Second Law of Thermodynamics … Remember this, when you plead to be excused just this once. We can’t excuse you. It isn’t up to us.
- Eliezer Yudkowsky, No One Can Exempt You From Rationality’s Laws
So, to summarise the above, we have to assign probabilities to empirical hypotheses, on pain of getting Dutch-booked and accuracy-dominated. And all reasonable-seeming probability assignments imply that we should pursue longtermist interventions. (italics in original)
In the next post, I’ll address Dutch Books, accuracy-domination, and all of Eliezer Yudkowsky’s repeated claims that none of us are exempt from the so-called “Laws of Rationality”.