The Journal for the Theory of Social Behaviour, 2011, 41(2): 209-227.
What Makes a Good Decision? Robust Satisficing as a Normative Standard of Rational Decision Making
Technion—Israel Institute of Technology
The Methodist Hospital Research Institute
Most decisions in life involve ambiguity, where probabilities cannot be meaningfully specified, as much as they involve probabilistic uncertainty. In such conditions, the aspiration to utility maximization may be self-deceptive. We propose “robust satisficing” as an alternative to utility maximizing as the normative standard for rational decision making in such circumstances. Instead of seeking to maximize the expected value, or utility, of a decision outcome, robust satisficing aims to maximize the robustness to uncertainty of a satisfactory outcome. That is, robust satisficing asks, “what is a ‘good enough’ outcome,” and then seeks the option that will produce such an outcome under the widest set of circumstances. We explore the conditions under which robust satisficing is a more appropriate norm for decision making than utility maximizing.
In the thirty years or so since it began, the field of behavioral decision making, or behavioral economics, has developed an ever-growing catalogue of the mistakes human beings are susceptible to when they use a variety of heuristics and biases to evaluate information, make decisions, and then evaluate the results of those decisions. In assessing probability, people seem to interpret “how likely is this event to happen?” as “how typical is this of the class of events of which it is a member?” People treat the vividness of an event in memory as an indication of how frequently the event occurred in the past. People make risk-averse choices when choosing among possible gains and risk-seeking choices when choosing among possible losses. This is not, in itself, a problem, but it becomes a problem when variations in the language of description can induce people to treat the identical choice situation as one involving gains or as one involving losses. People organize inflows and outflows of money into a variety of mental accounts, which helps explain why they are willing to treat themselves to a luxury when they have a windfall, but otherwise not. This also helps explain why people will make deposits into savings accounts that pay 3% interest while at the same time making minimal payments to reduce credit card debt at 18% interest. People’s assessments of the value of a good at a given price depend on surrounding goods that provide “anchors” (e.g., a $600 suit may be a “steal” on a rack of $1000 suits, but an extravagance on a rack of $300 suits). Phenomena like these have grown out of the research program on heuristics and biases launched by Daniel Kahneman and Amos Tversky (e.g., Gilovich, Griffin, & Kahneman, 2002; Kahneman, 2003; Kahneman & Tversky, 1984, 2000). And they have led to a kind of “two-process” theory of judgment and decision making.
One process, which is rapid, automatic, and inaccessible to consciousness, delivers results to consciousness that are produced by these heuristics. Afterwards, the second, slower process, which is conscious and rule-governed, goes to work with logic, probability theory, and other formal systems. A decision maker need not accept the results of the automatic system as competent or definitive, but the automatic system delivers answers upon which consciousness acts.
The results of the operation of the heuristics and biases of the automatic system do not always lead to mistaken judgments and bad decisions. Indeed, much of the time, they serve us well (see, e.g., Gigerenzer, 2007). Nonetheless, thirty years of research documents that sometimes they can lead to serious errors.
In all the research on how heuristics and biases can lead people into bad decisions, the normative standard for comparison has rarely been called into question. However, in this paper, we will argue that many decisions we face cannot be handled by the formal systems that are taken for granted as normatively appropriate. Specifically, the world is a radically uncertain place. This uncertainty makes calculations of expected utility virtually meaningless, even for people who know how to do the calculations. We will illustrate some of the limitations of formal systems designed to maximize utility, and suggest an approach to decision making that handles radical uncertainty—information gaps—more adequately. The arguments below will be normative in intent. They will suggest that “robust satisficing,” not utility maximizing, is often the best decision strategy, not because of the psychological, information processing limitations of human beings (see Simon, 1955, 1956, 1957), but because of the epistemic, information limitations offered by the world in which decisions must be made.
We begin by discussing an example that illustrates severe uncertainty. The decision maker faces substantial gaps between what is known and what needs to be known in order to evaluate the quality of each option. This information gap precludes the evaluation of the options in terms of both value and probability. Expected utility theory and its extensions, such as rank-dependent expected utility (Quiggin, 1993), cannot be implemented by the decision maker given the information gap that we consider. An alternative normative approach (Ben-Haim, 2006) that enables decision makers to calculate robustness to uncertainty of satisfactory outcomes—what we call “robust satisficing”—is suggested.
We then discuss two issues regarding the domain over which our normative concerns extend. First, we try to specify what counts as “radical uncertainty,” by discussing various approaches to the meaning of statements of probability. Second, we argue that robust satisficing really is a different normative standard for making decisions and not just a prescriptive alternative to utility maximizing that acknowledges human information-processing limitations.
Choosing a College
Suppose you’ve been fortunate enough to be admitted to a half-dozen colleges. Now, you sit down to decide which one to attend. How should you go about this process? It is generally agreed that the best approach is to do a multi-attribute utility analysis (Keeney & Raiffa, 1993). First, put together a big spreadsheet. Then, list all the things that matter to you about college (e.g., size, location, reputation, quality of its program in field biology, social life, music department, housing, etc.). Then, attach a weight to each attribute, to reflect its importance to you. If you are devoted to field biology, it may get a weight of 1.0, while other dimensions get fractions of that weight. Next, evaluate each school on each dimension; give it a score, say from 1–10. Finally, multiply scores by weights, and add the products for each school. Choose the school with the highest total.
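The spreadsheet arithmetic can be made concrete with a minimal sketch. The school names, attributes, weights, and scores below are hypothetical values chosen only for illustration; nothing here comes from Keeney and Raiffa (1993) beyond the general idea of a weighted sum.

```python
# Hypothetical weights: field biology matters most (1.0), the others less.
weights = {"field_biology": 1.0, "location": 0.5, "social_life": 0.3}

# Hypothetical 1-10 scores for three made-up schools.
scores = {
    "School A": {"field_biology": 9, "location": 4, "social_life": 6},
    "School B": {"field_biology": 7, "location": 8, "social_life": 8},
    "School C": {"field_biology": 8, "location": 6, "social_life": 5},
}


def weighted_total(school_scores, weights):
    """Multiply each score by its weight and add the products."""
    return sum(weights[attr] * school_scores[attr] for attr in weights)


# One total per school, then pick the school with the highest total.
totals = {school: weighted_total(s, weights) for school, s in scores.items()}
best = max(totals, key=totals.get)

for school, total in totals.items():
    print(f"{school}: {total:.1f}")
print("Highest weighted total:", best)
```

With these made-up numbers, School B comes out on top even though School A scores highest on field biology, because the lower-weighted attributes still contribute to the total.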
This process can obviously be taxing and time consuming, but the situation is even more complex. When you assign scores for each school on each dimension, you’re making guesses or predictions. Your assessment of the music department, the field biology program, and the social life may be wrong. So to acknowledge uncertainty, you will need to assign probabilities to the values in each cell of the spreadsheet. Since this process is not like flipping a coin, it is also hard to judge the accuracy of your probability estimates, which themselves may be wrong. And the situation is more complex still. You may be wrong about how important field biology, social life, and location are to you. You’re only seventeen, after all, and people change. So the weights you attach to dimensions also need probabilities, and these probability estimates are also subject to error. There is an additional complexity. Even if your estimates of importance and quality are correct, you don’t know how it will actually feel to experience being a student at a school that has the qualities of the one you choose. You are making a prediction about a future subjective state, and as Daniel Gilbert, Timothy Wilson, and their various collaborators have amply documented (e.g., Gilbert, 2006; Wilson & Gilbert, 2005), such predictions are notoriously inaccurate. And there is one final matter. There are some influences on your satisfaction with college that just can’t be predicted. Will you get along with your roommate? Will the best professor in the biology department leave? Will you form a romantic attachment? These kinds of factors can play a major role in determining your college experience, and they are inherently uncertain. You can’t even pretend to attach probabilities to them, or even to identify all of them. Making this decision is tough. You could easily be wrong. Nonetheless, you do the best you can, and that seems to be multi-attribute utility calculation. It’s your best strategy.
Or is it? Suppose you know that all that really matters to you is field biology; everything else is window dressing. In that case, your decision-making process is easier. You can rate the schools strictly in terms of their offerings in field biology, and choose the school that finishes first. You use other features only to break ties. This process, sometimes called “lexicographic preference,” essentially gives infinite weight in your consideration to one dimension. Gigerenzer (e.g., 2007) refers to strategies like this as “one-reason decision making.” Of course, you can still be wrong, either in your assessment of the various schools or in your assessment of your commitment to field biology. But this process makes decision making considerably simpler. Though multi-attribute utility analysis and lexicographic preference are different (and see Baron, 2008, for a discussion of these and several other decision-making strategies), they have one important feature in common: the goal of maximizing utility. The idea is to use the best information you have in order to choose the best school for you, and the question is, what is the best way to do it.
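A lexicographic rule can be sketched in a few lines. The schools, scores, and the ordering of the tie-breaking attributes are hypothetical and serve only to show the mechanics: compare on field biology first, and consult the next attribute only when there is a tie.

```python
# Hypothetical 1-10 scores; School A and School C tie on field biology.
scores = {
    "School A": {"field_biology": 9, "location": 4, "social_life": 6},
    "School B": {"field_biology": 7, "location": 8, "social_life": 8},
    "School C": {"field_biology": 9, "location": 6, "social_life": 5},
}

# Attributes in strict priority order (an assumed ordering for illustration).
priority = ["field_biology", "location", "social_life"]


def lexicographic_key(school):
    """Tuple of scores in priority order; Python compares tuples left to right."""
    return tuple(scores[school][attr] for attr in priority)


choice = max(scores, key=lexicographic_key)
print("Lexicographic choice:", choice)
```

Here School B is eliminated on field biology alone, however strong its other scores, and location is used only to separate the two schools that tie on the top-priority attribute.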
But now, imagine a different goal. Given the multiple sources of uncertainty that are a part of the process, suppose your goal is to choose the school that is likely to be satisfactory, even if your estimates of its quality on various dimensions are wrong. Instead of maximizing utility if everything goes well, you are trying to maximize confidence in an acceptable outcome, even if you suffer the slings and arrows of outrageous fortune. We call such a goal “robust satisficing.” You are still trying to maximize something, but what you’re trying to maximize is your confidence of a good enough outcome even if things go poorly. There is no particular reason to assume that the school that is best in your utility calculation is also the school that is most robust to error in the data underlying that calculation.
What this scenario, and countless others (e.g., buying a car; choosing a place to go on vacation; choosing a job; choosing a treatment plan for a serious medical condition; choosing investments for your retirement), have in common is that you are faced with a decision that has multiple dimensions, with outcomes that are uncertain and influenced by factors that are difficult to evaluate or even identify. And they are not merely uncertain in a probabilistic sense. In many cases, you cannot even attach probabilities in a meaningful way. Your uncertainty is more radical than the uncertainty you face when rolling dice. Knight (1921) distinguished between probabilistic risk, which can be insured against, and non-probabilistic “true uncertainty,” as he called it, which is the source of entrepreneurial profit (and loss) in a competitive market. Ellsberg (1961) famously pointed out this distinction when he contrasted an urn with 50 red and 50 black balls with an urn that has 100 balls, some of which are red and some black. If their task is to pick a red ball, people typically prefer the first urn to the second, preferring (probabilistic) uncertainty to what Ellsberg termed “ambiguity.” The thrust of our advocacy of robust satisficing as a decision criterion is this:
1. Most of the decisions people face in life involve Knightian uncertainty or ambiguity at least as much as they involve probabilistic uncertainty. This is especially true when a key feature of a decision is the person’s estimation of how it will feel to have one outcome rather than another. For example, having a side effect (e.g., impotence) of prostate cancer surgery is one thing; estimating the subjective consequence of this side effect, before the fact, is quite another.
2. In conditions of radical uncertainty, utility maximization as a strategy is unreliable. Indeed, it may even be self-deceptive, in that it involves assigning probabilities to outcomes in a context in which probabilities cannot be specified.
3. There is a quite reasonable alternative to utility maximization. It is maximizing the robustness to uncertainty of a satisfactory outcome, or robust satisficing. Robust satisficing is particularly apt when probabilities are not known, or are known imprecisely. The maximizer of utility seeks the answer to a single question: which option provides the highest subjective expected utility? The robust satisficer answers two questions: first, what will be a “good enough” or satisfactory outcome? And second, of the options that will produce a good enough outcome, which one will do so under the widest range of possible future states of the world?
4. This alternative has been formalized as “info-gap decision theory” (Ben-Haim, 2006). Though we will not discuss it here, it has been used effectively as a decision-making framework in an extremely wide variety of domains, though none of them, to date, are psychological.
Info-gap decision theory is designed to handle situations of profound uncertainty. Since we do not know how wrong our data and models are, we evaluate a proposed decision by asking: what is the greatest horizon of uncertainty at which the decision will still yield acceptable results? How wrong can we be, in our understanding of the relevant processes and requirements, and the outcome of the decision still be acceptable? For instance, in selecting a college, you might ask: how wrong can my estimates be—estimates of the importance to me of field biology, estimates of the probability of different future emotional states, etc.—and any given school selection still be satisfactory? The answer to this question is the robustness function. The robustness function generates a preference ordering on the available decisions: a more robust decision is preferred over a less robust decision. Satisficing means doing well enough, or obtaining an adequate outcome. A satisficing decision strategy seeks a decision whose outcome is good enough, though perhaps sub-optimal. A robust-satisficing decision strategy maximizes the robustness to uncertainty and satisfices the outcome.
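To illustrate how a robustness function can reorder options, here is a minimal sketch built on the weighted-sum score from the college example. Everything specific in it is an assumption made for illustration and is not taken from Ben-Haim (2006): the schools, weights, nominal scores, per-attribute error scales, the required total of 10.0, and the simple uncertainty model in which, at horizon h, each true score may fall below its nominal value by at most h times its error scale. The sketch also ignores the 1–10 bounds on scores. Under these assumptions the worst-case total falls linearly in h, so the largest tolerable horizon has a closed form.

```python
# Hypothetical weights and nominal (best-guess) 1-10 scores.
weights = {"field_biology": 1.0, "location": 0.5, "social_life": 0.3}
nominal = {
    "School A": {"field_biology": 9, "location": 4, "social_life": 6},
    "School B": {"field_biology": 7, "location": 8, "social_life": 8},
    "School C": {"field_biology": 8, "location": 6, "social_life": 5},
}

# Hypothetical error scales: how shaky each nominal score is (larger = shakier).
error_scale = {
    "School A": {"field_biology": 2.0, "location": 2.0, "social_life": 2.0},
    "School B": {"field_biology": 3.0, "location": 3.0, "social_life": 3.0},
    "School C": {"field_biology": 1.0, "location": 1.0, "social_life": 1.0},
}


def nominal_total(school):
    """Weighted total if every best-guess score turns out to be right."""
    return sum(weights[a] * nominal[school][a] for a in weights)


def robustness(school, required_total):
    """Largest horizon h at which the worst-case total still meets the requirement.

    Assumed model: worst-case total = nominal total - h * sum(weight * error scale).
    """
    slack = nominal_total(school) - required_total
    if slack <= 0:
        return 0.0  # not satisfactory even if the estimates are exactly right
    sensitivity = sum(weights[a] * error_scale[school][a] for a in weights)
    return slack / sensitivity


REQUIRED_TOTAL = 10.0  # hypothetical "good enough" weighted total

utility_choice = max(nominal, key=nominal_total)
robust_choice = max(nominal, key=lambda s: robustness(s, REQUIRED_TOTAL))

for school in nominal:
    print(f"{school}: nominal {nominal_total(school):.1f}, "
          f"robustness {robustness(school, REQUIRED_TOTAL):.2f}")
print("Highest nominal total:", utility_choice)
print("Most robust at the required total:", robust_choice)
```

With these made-up numbers the two criteria disagree: School B has the highest nominal total, but its scores are the least certain, so School C tolerates the largest estimation error while still meeting the requirement. Changing the required total or the error scales can change the ranking, which is the point of treating the satisfactory level as an explicit input.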
Info-gap decision theory has been applied to a wide variety of different domains. Burgman (2005) devotes a chapter to info-gap theory as a tool for biological conservation and environmental management. Regan et al. (2005) use info-gap theory to devise a preservation program for an endangered rare species. McCarthy and Lindenmayer (2007) use info-gap theory to manage commercial timber harvesting that competes with urban water requirements. Knoke (2007) uses info-gap theory in a financial model for forest management. Carmel and Ben-Haim (2005) use info-gap theory in a theoretical study of foraging behavior of animals. Ben-Haim and Jeske (2003) use info-gap theory to explain the home-bias paradox, which is the anomalously large preference for assets in investors’ home countries, over more favorable foreign assets. Ben-Haim (2006) uses info-gap theory to study the equity premium puzzle (Mehra & Prescott, 1985), which is the anomalously large disparity in returns between stocks and bonds, and the paradoxes of Ellsberg (1961) and Allais (see Mas-Colell, Whinston & Green, 1995). Akram et al. (2006) use info-gap theory in formulating monetary policy. Fox et al. (2007) study the choice of the size of a statistical sample when the sampling distribution is uncertain. Klir (2006) discusses the relation between info-gap models of uncertainty and a broad taxonomy of measure-theoretic models of probability, likelihood, plausibility and so on. Moffitt et al. (2005) employ info-gap theory in designing container-inspection strategies for homeland security of shipping ports. Pierce et al. (2006) use info-gap theory to design artificial neural networks for technological fault diagnosis. Kanno and Takewaki (2006a, b) use info-gap theory in the analysis and design of civil engineering structures. Pantelides and Ganzerli (1998) study the design of trusses, and Ganzerli and Pantelides (2000) study the optimization of civil engineering structures.
Lindberg (1991) studies the dynamic pulse buckling of cylindrical structures with uncertain geometrical imperfections. Ben-Haim and Laufer (1998) and Regev et al. (2006) apply info-gap theory for managing uncertain task-times in projects. Ben-Haim and Hipel (2002) use info-gap theory in a game-theoretic study of conflict resolution. Thus, info-gap decision theory has been used productively to model circumstances of extreme uncertainty in a wide variety of different contexts and disciplines. But it has not been used, until now, to model the psychology of decision making.
What Does “Radical Uncertainty” Mean? When Does Robust Satisficing Apply?
In this section, we try to explicate the conditions under which robust satisficing applies, by explaining what we mean by “radical uncertainty.” This requires a brief excursion into the foundations of probability theory. What does it mean to say that the probability of throwing a “7” with two dice is .17, or that the probability of developing prostate cancer is .03, or that the probability that the New York Yankees will win the next World Series is .25? Baron (2008, and see Brown, 1993) nicely summarizes three different approaches to understanding what probability statements mean. The first, we might call “logical.” When the events that comprise a sample space are fully known, and their distributions can be specified, a probability statement is simply a matter of logic: in the sample space of outcomes of rolls of two dice, there are 36 equiprobable outcomes, of which six sum to “7.” Thus one-sixth of possible rolls (.17) will yield the outcome of interest. This is not an empirical matter. It is part of what it means to be throwing “fair” dice.
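The “logical” calculation for two dice can be checked by enumerating the sample space directly. This is a small sketch that assumes fair, six-sided dice, exactly as the text does; the variable names are our own.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of rolling two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# The outcomes whose faces sum to 7.
sevens = [roll for roll in outcomes if sum(roll) == 7]

# With equiprobable outcomes, probability is favorable count over total count.
p_seven = Fraction(len(sevens), len(outcomes))

print(len(outcomes), "equiprobable outcomes,", len(sevens), "of which sum to 7")
print("P(7) =", p_seven, "which is about", round(float(p_seven), 2))
```

No observation of actual dice enters the calculation; the answer follows entirely from the assumed structure of the sample space.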
The second, we might call “empirical.” If you follow a sample of 10,000 men between the ages of, say 40 and 75, and 300 of them develop prostate cancer, you might infer that the chances of any particular man developing prostate cancer are 300/10,000, or .03. You use the frequency of the event of interest in the past to infer the probability of the event with respect to any particular case in the future.
The final approach to probability we might call “personal” (see Savage, 1954). You are asked, in April, “will the Yankees win the World Series this year?” “I think they will,” you say. “How sure are you?” “I give them a 25% chance,” you say. Because each baseball season is a unique event in ways that matter to prediction, you can’t really rely on frequencies in the past to infer probabilities in the future. The number you supply is merely an expression of your confidence. As Baron (2008) points out, some have argued that it makes no sense to attach probabilities to unique events. But, of course, each throw of the dice is a “unique event,” and each middle-aged man is a “unique event,” so distinctions among these three approaches to understanding probability statements are not so easy to make sharply. This is especially true when it comes to distinguishing frequency and personal approaches to probability. What does it mean when the weather forecaster says there is a 50% chance of rain today? Does it mean that it has rained in Philadelphia on half of the April 1sts in the history of weather records? Or does it mean that in the past, when the various atmospheric conditions thought to affect the weather that are present today have been present, it has rained half the time? Or does it mean that past weather, together with our current understanding of meteorology, makes the forecaster 50% certain there will be showers?
One could argue that on closer analysis, frequency and personal approaches to probability run together. If one uses frequency as a guide to probability, one must determine what counts as a relevant past event. This raises two questions: can relevant past events be specified objectively, and if they are, do they give us the most perspicuous purchase on what is likely to happen today? The date, April 1, is not completely irrelevant to a weather forecast (if snow rather than rain were the issue, in Philadelphia, knowing the date would tell you a lot). But we have reason to believe there are better ways to count past events as relevant than by date. On the other hand, since forecasters don’t always agree on the forecast, there remains room for doubt about what is the most perspicuous set of past events. With respect to prostate cancer, gender and age can be unambiguously specified, so that a frequency approach to probability is meaningful. But as our understanding of the disease progresses, we will expect the counting of relevant past events to change. This progress may lead simultaneously to more accurate probability estimates and to more disagreements among the estimators, because not all doctors will agree on the way to construct the relevant class of past events in the way they could agree on gender and age assignment.
It is also true that even spaces that seem unambiguously characterized by the “logical” approach to probability can be characterized as “radically uncertain” (e.g., Baron, 2008; Baron & Frisch, 1994; Camerer & Weber, 1992). When you throw the dice, are they “true”? Are there little irregularities on the landing surface that might affect their path? What if someone standing around the table sneezes? How is predicting the outcome of the dice throw different from predicting the outcome of the college choice? If you get molecular enough, everything is radically uncertain.
So what, then, does it mean to call an event “radically uncertain” in a way that distinguishes throwing dice from choosing a college? What makes attaching probabilities to varying degrees of satisfaction with a college’s biology program different from predicting the weather? It might be that if you pushed a high school senior, she would attach a number to how likely she was to love biology at Swarthmore. But would the number mean anything? And if not, is there information available so that if she collected it assiduously, the number she attached would mean something? Even if the answer to this latter question is “yes,” if the meaning of the number is not entirely resolved by the added information, then there is radical uncertainty.
We don’t think these are easy questions to answer. It seems to us unlikely that there will ever be models of satisfaction with college that approximate the predictive power of meteorology, but that is an empirical question. There is no doubt that people can know more or less about a domain in question, so that estimates of probability from frequency can be more or less well justified. It may not mean much when a 10-year-old New Yorker tells you at the start of the baseball season that the Yankees have a 25% chance to win the World Series. It will mean more when a fanatic enthusiast of the statistical study of baseball that has come to be known as “sabermetrics” tells you the same thing. But even the most sophisticated sabermetrician is at the mercy of injuries or other personnel changes. The sabermetrician can use the past to assess confidence in the future given the team, as constituted. But if the team changes, these estimates will change as well. The sabermetrician could even try to estimate the likelihood of injury, which would increase his, and our, confidence in his estimates. Or he could do what we are advocating, and ask which team is likely to be the most robust to the uncertainties that each baseball season contains. In other words, in real-life decisions, we may never be confronted with the kind of uncertainty we face with Ellsberg’s urn, where any number of red balls, from 0 to 100, is possible. But before we attach probabilities to outcomes, we need to assess which of Ellsberg’s two urns the decision we face more closely resembles.
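The contrast between Ellsberg’s two urns can be spelled out with a short sketch. It assumes a single draw from an urn of 100 balls and simply lists what is and is not known; it does not assign any distribution over the compositions of the second urn, since the absence of such a distribution is what makes that urn ambiguous.

```python
from fractions import Fraction

TOTAL_BALLS = 100

# Urn 1: the composition is known (50 red, 50 black), so P(red) is one number.
p_red_known = Fraction(50, TOTAL_BALLS)

# Urn 2: any number of red balls from 0 to 100 is possible, so all we can list
# is the set of candidate values for P(red), with no weights attached to them.
p_red_candidates = [Fraction(red, TOTAL_BALLS) for red in range(TOTAL_BALLS + 1)]

lowest, highest = min(p_red_candidates), max(p_red_candidates)

print("Urn 1: P(red) =", p_red_known)
print("Urn 2: P(red) could be anywhere from", lowest, "to", highest)
```

For the first urn a probability statement has a definite value; for the second, the honest summary is a range, and any single number would have to be supplied by the decision maker rather than by the problem.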