Simple Heuristics That Make Us Smart

What is the best way to make decisions? Given a limited amount of time and information, how do we decide on the best course of action? And how do we determine what "the best" is?

Cognitive Science studies how people make decisions in real life. There's also a branch of cognitive science which tries to form models of decision making. I read Simple Heuristics That Make Us Smart as a way to figure out if there were any shortcuts to decision making. I didn't find anything globally applicable, but the book is fascinating, and I've been meaning to write about it for two years now... so bear with me while I run through this.

Decision making in academic literature is referred to as rational choice theory or rationality. The book goes into some detail about different kinds of rationality: unbounded rationality assumes total knowledge of the situation (as in the "perfect market" economic theory) whereas bounded rationality suggests that the human brain has limits and "uses approximate methods to handle most tasks." It also suggests that bounded rationality exploits the environment: different heuristics are applied in different environments, and a simple heuristic may work extremely well when applied to a specific environment.

The book divides bounded rationality into two categories: satisficing and "fast and frugal heuristics."

Satisficing is a heuristic organized around making an acceptable decision in the shortest possible time. It is literally "decide what you need, then pick the first acceptable solution." Don't bother looking at the other options. Don't even try to find out if there are other options. Pick the first solution that meets your needs.

Satisficing is a good heuristic in time constrained situations. It's a good strategy for buying a house. But it's not the best heuristic, because deciding what you need can be complex, and deciding what option is the best one can be complex. Satisficing can lead to sub-optimal solutions. But it's a great way of avoiding analysis paralysis.
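To make the "first acceptable option" idea concrete, here's a minimal sketch. The house listings and the acceptance rule are made up for illustration; nothing here comes from the book.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def satisfice(options: Iterable[T], is_acceptable: Callable[[T], bool]) -> Optional[T]:
    """Return the first option that meets the requirement, or None if none does."""
    for option in options:
        if is_acceptable(option):
            return option  # stop searching as soon as something is good enough
    return None


# Hypothetical listings: (name, price, bedrooms), in the order we happen to see them.
houses = [("A", 450_000, 2), ("B", 380_000, 3), ("C", 300_000, 4)]

# "Decide what you need": at most 400,000 and at least 3 bedrooms.
choice = satisfice(houses, lambda h: h[1] <= 400_000 and h[2] >= 3)
print(choice)
```

House C is cheaper and bigger than the one that gets picked, but it is never even looked at. That's the sub-optimality, and also the time saved.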

Fast and frugal heuristics are the approach that receives the most attention in the book. They are environment-specific heuristics. They may not work in some situations, but they will make the right choice more often than not. And they often work better than general decision-making methods in spite of using less information.

The best way to describe a frugal heuristic is by an example from the book. When a patient is admitted into a hospital, there are 15 different variables (also known as cues) which could describe whether the patient is at high risk for a heart attack. A doctor could look at every single variable and come up with a decision, but it turns out that only three decisions need to be made:

"Is the minimum systolic blood pressure over the initial 24 hour period more than 91?"
"Is Age > 62.5?"
"Is sinus tachycardia present?"

If the answer to these three questions is yes (or the first answer is no), then the patient is high risk. If not, the patient is low risk. This may seem like an oversimplified example, but this heuristic actually works better in the field than the one which examines all possible variables.
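The three questions can be written down as a tiny decision tree. This is just a sketch of the rule as described above; the function and parameter names are mine, and it is obviously not something to use for real triage.

```python
def heart_attack_risk(min_systolic_bp: float, age: float, sinus_tachycardia: bool) -> str:
    """Classify a patient as "high" or "low" risk using the three questions in order."""
    # Question 1: is the minimum systolic blood pressure over 91?
    if not min_systolic_bp > 91:
        return "high"  # a "no" on the first question is enough
    # Question 2: is age over 62.5?
    if not age > 62.5:
        return "low"
    # Question 3: is sinus tachycardia present?
    return "high" if sinus_tachycardia else "low"


print(heart_attack_risk(min_systolic_bp=85, age=40, sinus_tachycardia=False))
print(heart_attack_risk(min_systolic_bp=120, age=70, sinus_tachycardia=True))
print(heart_attack_risk(min_systolic_bp=120, age=50, sinus_tachycardia=True))
```

Notice that each question can end the process early, so most patients are classified without ever looking at all three cues.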

The interesting thing is that this works not because it's a generalized strategy. It works because the environment is fairly static -- patients come in and either get heart attacks or don't. Given enough time and data, heuristics like these can be derived. However, they can't be known ahead of time. You know this stuff from experience. It is, to the people who know the heuristics, "common sense", even if they can't define it.

The nice thing about heuristics is that they're practical. Using a heuristic will lead you to make the wrong decision some of the time, but it works better when you have to make a decision, and you know that making a decision in time is more important than always getting it right.

There are a number of useful heuristics that appear in the book, and the greatest challenge to writing this post is that I want to write about all of them. Instead, I'll just present summaries and go over them briefly.

Ignorance based decision making

People know when they know something. There's even a saying: "Better the devil you know, than the devil you don't." A recognized quantity is more likely to be better than an unrecognized quantity. Heuristics exploit recognition to put a greater weight on a cue with a known value. This principle opens the door to a number of heuristics, notably "Take the Best."

"Take the Best" can be described as follows: All things being equal, pick the movie with Tom Hanks in it. If not, at least go for an actor whose name you recognize over one you don't. There are other one-reason decision making rules, but I'll focus on this one because it's clearly the one the researchers like best.

Take the Best makes an intuitive amount of sense, but it does have one big weakness: it has to be trained to know what's a valid cue. Or as the book charmingly puts it, "the validity of a cue is defined as the number of correct inferences divided by the number of correct and incorrect inferences made using the cue alone." In short, you have to know who Tom Hanks is before Take the Best will do you any good.
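The validity definition quoted above is easy to compute once you have some past comparisons to learn from. Here's a rough sketch; the data layout (pairs of movies where we already know which one turned out better) and the cue name recognized_star are my own assumptions, not the book's.

```python
from typing import Dict, List, Optional, Tuple

Movie = Dict[str, int]


def cue_validity(pairs: List[Tuple[Movie, Movie]], cue: str) -> Optional[float]:
    """Correct inferences / (correct + incorrect) when deciding on this cue alone.

    Each pair is (better, worse). Pairs where the cue has the same value on
    both sides produce no inference and are skipped.
    """
    correct = incorrect = 0
    for better, worse in pairs:
        if better[cue] == worse[cue]:
            continue  # the cue does not discriminate here
        if better[cue] > worse[cue]:
            correct += 1
        else:
            incorrect += 1
    if correct + incorrect == 0:
        return None  # the cue never discriminated, so validity is undefined
    return correct / (correct + incorrect)


# Hypothetical training data: 1 means "has a star I recognize", 0 means it doesn't.
past_pairs = [
    ({"recognized_star": 1}, {"recognized_star": 0}),
    ({"recognized_star": 1}, {"recognized_star": 0}),
    ({"recognized_star": 1}, {"recognized_star": 0}),
    ({"recognized_star": 0}, {"recognized_star": 1}),  # the cue pointed the wrong way
    ({"recognized_star": 1}, {"recognized_star": 1}),  # tie, no inference
]
print(cue_validity(past_pairs, "recognized_star"))
```

This is the "training" part: without a history of comparisons there is nothing to compute validity from.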

In a competition pitting heuristics against each other to guess the sizes of Germany's largest cities, three fast and frugal heuristics, using at most 3 cues, were more accurate than heuristics that looked up all 10 of the cues. The effectiveness of the frugal heuristics varied with how much information was known, but was still higher than the non-frugal ones. Sometimes having all the information isn't a good thing.

In practice, people tend to use a strategy called LEX, a generalization of Take the Best. LEX is defined as "the highest cue value on the cue with the highest validity. If more than one alternative has the same highest cue value, then for these alternatives the cue with the second highest validity is considered, and so on." If there are two movies with Tom Hanks, then pick the one directed by Ron Howard over the one directed by Michael Bay.
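Here's how I'd sketch LEX in code, following the quoted definition: walk the cues from most to least valid, keeping only the alternatives tied for the highest value on each. The movie titles, cue names, and cue ordering are invented for the example.

```python
from typing import Dict, List, Sequence

Alternative = Dict[str, object]


def lex_choose(alternatives: Sequence[Alternative], cues_by_validity: List[str]) -> Alternative:
    """Pick an alternative lexicographically; cues must be ordered by validity, best first."""
    if not alternatives:
        raise ValueError("need at least one alternative")
    candidates = list(alternatives)
    for cue in cues_by_validity:
        best_value = max(alt[cue] for alt in candidates)
        # Keep only the alternatives tied for the highest value on this cue.
        candidates = [alt for alt in candidates if alt[cue] == best_value]
        if len(candidates) == 1:
            break  # one reason was enough, ignore the remaining cues
    return candidates[0]  # if still tied after all cues, take the first


# Hypothetical movies; 1 means the cue is present.
movies = [
    {"title": "Film A", "tom_hanks": 1, "ron_howard": 0},
    {"title": "Film B", "tom_hanks": 1, "ron_howard": 1},
    {"title": "Film C", "tom_hanks": 0, "ron_howard": 1},
]
pick = lex_choose(movies, ["tom_hanks", "ron_howard"])
print(pick["title"])
```

Film C drops out on the first cue, and the director cue is only consulted to break the tie between the two Tom Hanks movies.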

Why does it work?

This class of heuristics works in situations where the information is "non-compensatory" (i.e. a movie with an A-list actor and an A-list director is unlikely to have a D-list cinematographer) and the information is "J-shaped" (which I think corresponds to Zipf's law... there are only a few excellent movies each year). In addition, this heuristic works very well in situations with "scarce information" (you don't know everything about every movie) and "decreasing population" (that is, good movies run longer than bad ones). This covers a large percentage of situations, without tying the heuristic too tightly to a particular situation.

Satisficing in Mate Search

There are situations where a "Take the Best" heuristic is not the best one. While you can choose between several movies, only the determined or foolhardy dates several women at once and compares them against each other.

There's a puzzle called "the secretary problem" in which the best secretary must be picked from a pool of applicants. However, the applicants appear in random order, and once a secretary has been rejected, she doesn't come back. The best solution to this problem (and bear in mind that this problem assumes a lot) is to interview the first 37% of the candidates, and then pick the next candidate who is better than the best in the sample. Following this rule, you will end up with the best applicant 37% of the time.
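The 37% rule is easy to check with a quick simulation. This is only a sketch: the pool size of 100, the number of trials, and the fallback of taking the last applicant when nobody beats the sample are my own assumptions.

```python
import random
from typing import List


def cutoff_rule(scores: List[int], sample_fraction: float) -> int:
    """Look at the first chunk without hiring, then take the first applicant who beats it."""
    k = int(len(scores) * sample_fraction)
    benchmark = max(scores[:k]) if k > 0 else float("-inf")
    for score in scores[k:]:
        if score > benchmark:
            return score
    return scores[-1]  # nobody beat the sample, so we are stuck with the last one


def success_rate(n: int = 100, sample_fraction: float = 0.37,
                 trials: int = 10_000, seed: int = 0) -> float:
    """Fraction of trials in which the rule ends up with the single best applicant."""
    rng = random.Random(seed)
    scores = list(range(n))  # n - 1 is the best applicant
    wins = 0
    for _ in range(trials):
        rng.shuffle(scores)  # applicants arrive in random order
        if cutoff_rule(scores, sample_fraction) == n - 1:
            wins += 1
    return wins / trials


print(round(success_rate(), 2))
```

The printed rate should come out close to 0.37. Trying other values of sample_fraction is a nice way to convince yourself that moving the cutoff in either direction makes things worse.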

But the secretary problem misses one small issue: while you are searching for the best match, your match is also looking around for his or her best match. Ironically, there is not much research into the best practices for mutual search. There is some research into the mate search strategies that people actually use, notably the charmingly named "one bounce" rule, where people continually look for the best mate, but stop as soon as the next prospect is less attractive than the currently selected one.
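As I read it, the "one bounce" rule boils down to a few lines. This sketch treats attractiveness as a plain number, which is my simplification.

```python
from typing import Iterable, Optional


def one_bounce(prospects: Iterable[float]) -> Optional[float]:
    """Keep trading up until a prospect is worse than the current one, then stop."""
    iterator = iter(prospects)
    current = next(iterator, None)
    if current is None:
        return None  # nobody to date
    for prospect in iterator:
        if prospect < current:
            break  # first "bounce": stop searching and stay with the current one
        current = prospect
    return current


print(one_bounce([3, 5, 6, 4, 9]))
```

The search stops at the first dip, so the 9 at the end of the list is never seen.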

And here is where I go off the geek deep end, past even Dating Design Patterns: the book actually goes looking for simple satisficing search heuristics for "biologically realistic mate search problems." Oh baby.

If you have high date value, and know that you're a good catch for any prospective mate, then the simplest dating strategy is called "Try a dozen." The best chance you ever have of meeting the perfect mate is 37% -- not very good odds, and that's assuming you date 37% of the available population. The mitigating factor in mate search is that you don't have to find the absolute best mate: you just have to find a good mate. Instead of taking the next best after sampling 37% of the population, we can sample a smaller size and relax our restrictions a bit for a faster result.

By only dating 14% of the population, you have an 83% chance of finding a mate in the top 10% of the population. If you are willing to settle for a mate in the top 25% of the population, then your numbers get even better: dating 7% of the population will find you a suitable mate over 92% of the time.
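You can play with these trade-offs yourself by relaxing the success criterion in a secretary-style simulation. A rough sketch follows, assuming a population of 100 and assuming you settle for the last candidate if nobody beats your sample. I haven't tried to reproduce the book's exact setup, so expect numbers in the same neighbourhood rather than identical ones.

```python
import random
from typing import List


def pick_after_sample(ranks: List[int], sample_size: int) -> int:
    """Date sample_size people, then take the next one better than all of them.

    Ranks are 1 (best) to n (worst). Returns the rank of the chosen mate.
    """
    benchmark = min(ranks[:sample_size]) if sample_size > 0 else float("inf")
    for rank in ranks[sample_size:]:
        if rank < benchmark:
            return rank
    return ranks[-1]  # assumption: settle for the last candidate


def chance_of_top(n: int, sample_size: int, top_count: int,
                  trials: int = 10_000, seed: int = 1) -> float:
    """Estimate how often the chosen mate is among the top_count best."""
    rng = random.Random(seed)
    ranks = list(range(1, n + 1))
    hits = 0
    for _ in range(trials):
        rng.shuffle(ranks)
        if pick_after_sample(ranks, sample_size) <= top_count:
            hits += 1
    return hits / trials


print("sample 14, want top 10:", chance_of_top(100, 14, 10))
print("sample 7, want top 25:", chance_of_top(100, 7, 25))
print("sample 37, want the best:", chance_of_top(100, 37, 1))
```

The last line is the classic secretary problem for comparison: insisting on the single best mate is what makes the odds so poor.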

Even though this looks statistically inviting, 7% of the population still looks daunting given a large enough population. However, the interesting thing is that the fraction you need to sample shrinks as the population grows: in a larger population, only 1% or 2% of the population needs to be checked to find a mate in the top 25%. So Try a Dozen works for even large populations, although there does come a point where it doesn't scale.

But there's a catch. Try a Dozen only works when you have a high mate value and know that none of your candidates are going to reject you.

The researchers wrote a dating simulation program, where they could see the best overall strategy for mutual date search. In this simulation, individuals had the same problem that we do: it's hard to know how attractive you are to prospective mates from the inside. In fact, your dates have a better idea of how much of a catch you are than you do.

The only realistic way to know your own mate value is to go through a learning stage ("adolescence" in the book) where you date a number of candidates and determine your own attractiveness through feedback. If a highly attractive mate proposes to you in adolescence, then you increase your own rating by a value proportionate to the mate's attractiveness. If a low-rated mate rejects you, then you decrease your own rating proportionately.
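Here's one way the adolescence feedback loop could look in code. The 0 to 100 scale, the learning rate, and the exact adjustment formula (moving part of the way toward the other person's value) are all my guesses at "proportionately"; the book's simulation may well do it differently.

```python
from typing import Iterable, Tuple


def adjust_self_estimate(estimate: float, other_value: float,
                         proposed: bool, rate: float = 0.5) -> float:
    """Update your own estimated mate value after one adolescent encounter."""
    if proposed and other_value > estimate:
        # Someone more attractive than you thought you were said yes: aim higher.
        return estimate + rate * (other_value - estimate)
    if not proposed and other_value < estimate:
        # Someone less attractive than you thought you were said no: aim lower.
        return estimate - rate * (estimate - other_value)
    return estimate  # the encounter told you nothing new


def adolescence(initial: float, encounters: Iterable[Tuple[float, bool]],
                rate: float = 0.5) -> float:
    """Run a sequence of (other_value, proposed) encounters and return the final estimate."""
    estimate = initial
    for other_value, proposed in encounters:
        estimate = adjust_self_estimate(estimate, other_value, proposed, rate)
    return estimate


# Start out thinking you're a 50, then get a proposal from an 80 and a rejection from a 60.
print(adolescence(50, [(80, True), (60, False)]))
```

The longer the list of encounters, the more feedback the estimate gets to settle on.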

There are two measurements of success in this simulation. The first is the number of matches: if too many individuals have unrealistic expectations, not everyone will be matched. The second is the spread between matches: if a high value individual "settles" for a low value individual, then the match is not optimal.

It turns out that with a short adolescence, most of the high valued individuals pair off immediately with the first random person, and the low valued people remain single because they are overconfident. But give the individuals an extended adolescence, and the group equilibrium quickly comes within 10% of optimum. Best of all, this heuristic works even when the total size of the population is not known, and seems to be asymptotic after checking 20 individuals.

This also seems to make intuitive sense. However, the researchers explicitly state that the model is lacking some of the finer details common to dating. Populations are not fixed. Different age ranges may have different fitness criteria. The distribution of mate values is not as even as it was in the simulation, and individuals don't have a chance to change their mate value based on their adolescent experiences.

The researchers are also careful to explain that mate search is not just a matter of percentages, even if it can be simulated as such. Love, they carefully explain, is essential in making decisions stick: in the heuristic of dating, love is the stopping rule.