Input from marketing intelligence can be incredibly valuable, but it will never tell you right from wrong. We tend to hide this uncertainty and turn shades of grey into black and white. Yet understanding and accepting uncertainty ultimately leads to better decision making.
Very pregnant
A woman is either pregnant or not. She can’t be, say, 80% pregnant. We call this a natural dichotomy: it is either yes or no, 0 or 1, black or white. So are most decisions. Launch or not launch. Include TV in your media mix or not. Hire this new person or not. Managers, of course, prefer black and white over shades of grey.
Those same managers are increasingly dependent on data in their decision making. Unfortunately, data usually produce shades of grey. Your churn model flags customers that are more likely to leave. Your NPD research shows your new product has a high probability of success. Your acquisition study suggests that risk is at acceptable levels to go through with buying the target company.
Two types of wrong
So, any model or classification will yield hits and misses. Those misses come in two types. First, you may have classified something you should not have. This is a false positive: your customer stays, but your model predicted she was bound to leave. Or you may have failed to classify something you should have. This is a false negative: your customer leaves, but your model predicted she was bound to stay. Figure 1 shows more examples. I included some from outside marketing just to give an idea of how ubiquitous this 2-by-2 is.
Figure 1: Examples of possible model outcomes
The confusion matrix is valuable by itself, as it confronts you with all possible outcomes of your classifications. Usually we pay the most attention to the upper-left quadrant: how many did we get right? Without the other quadrants, however, you don’t know how well your model really performs. How many did we get wrong? How many of those were false positives? How many were false negatives? What are the costs of each? Do the benefits outweigh those costs? All important questions to consider, yet some of them are frequently ignored.
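To make the four quadrants concrete, here is a minimal sketch in Python; the churn labels and predictions are invented purely for illustration.

```python
# Minimal sketch: tallying the four quadrants for a hypothetical churn model.
# The labels below are invented for illustration only.

actual    = ["churn", "stay", "churn", "stay", "stay", "churn", "stay", "stay"]
predicted = ["churn", "churn", "stay", "stay", "stay", "churn", "stay", "churn"]

counts = {"true positive": 0, "false positive": 0,
          "false negative": 0, "true negative": 0}

for a, p in zip(actual, predicted):
    if p == "churn" and a == "churn":
        counts["true positive"] += 1    # flagged as churn, and she did leave
    elif p == "churn" and a == "stay":
        counts["false positive"] += 1   # flagged as churn, but she stayed
    elif p == "stay" and a == "churn":
        counts["false negative"] += 1   # not flagged, but she left anyway
    else:
        counts["true negative"] += 1    # not flagged, and she stayed

print(counts)
```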
Balancing false positives and false negatives
Even more important is the fact that your classifications don’t naturally come in the neat boxes ‘churn’ or ‘not churn’, ‘successful’ or ‘not successful’, ‘significant’ or ‘not significant’. We, human beings, have created those dichotomies. For example, you observe a difference in a piece of marketing research and your test flags it as ‘statistically significant’. It does so when the chance of observing a difference at least as large as the one in your sample, assuming there is no real difference, is small, i.e. less than 5%. A relevant question we usually don’t ask is: why 5%? Why not 1%? Wouldn’t that mean I’m more certain of my case?
The answer is: it’s about balancing false positives and false negatives. In this case, some scholars decided that 5% is a reasonable compromise between flagging differences that are not real and not flagging differences that are real.
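To make that compromise tangible, here is a small simulation in Python. It compares two groups of N=400 (as in Figure 2) many times over, once when there is no real difference and once when there is, and counts what a 5% versus a 1% threshold would flag. The 30% baseline rate and the 8-percentage-point ‘real’ difference are my own assumptions, chosen only to show the trade-off.

```python
import math
import random

random.seed(1)

def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value for the difference between two sample proportions (z-test)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = abs(p1 - p2) / se
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

def flag_rate(rate_a, rate_b, n=400, runs=2000, alpha=0.05):
    """Share of simulated studies in which the observed difference is flagged as significant."""
    flags = 0
    for _ in range(runs):
        x1 = sum(random.random() < rate_a for _ in range(n))
        x2 = sum(random.random() < rate_b for _ in range(n))
        if two_proportion_p_value(x1, n, x2, n) < alpha:
            flags += 1
    return flags / runs

for alpha in (0.05, 0.01):
    false_alarms = flag_rate(0.30, 0.30, alpha=alpha)  # no real difference: every flag is a false alarm
    misses = 1 - flag_rate(0.30, 0.38, alpha=alpha)    # real difference: every non-flag is a miss
    print(f"alpha = {alpha}: false alarms {false_alarms:.1%}, missed real differences {misses:.1%}")
```

With these made-up numbers, tightening the threshold from 5% to 1% cuts the false alarms from roughly one in twenty to roughly one in a hundred, but it also lets noticeably more real differences slip through undetected.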
It’s always about this balance. The police want to catch as many bad guys as possible but not at the expense of locking up innocent citizens. Doctors don’t want to overlook dangerous diseases at an early stage but should consider the psychological damage of false alarms. Marketers want their advertising to be seen by as many potential buyers as possible, but don’t want to waste their budget on people that are not buying the category.
Let’s call this side black and the other side white
Back to the significance example. The 5% might look like a hard demarcation, but it is not. The underlying statistics, p-values in this case, form a smooth line, as shown in Figure 2. The criterion you use to draw the line between significant and not significant is completely arbitrary. As statisticians Rosnow & Rosenthal joke: “we want to underscore that, surely, God loves the .06 nearly as much as the .05.”
The statistical significance logic holds for any model: we say a number is above or below a certain criterion, so we can comfortably put the outcome in a box: important or not, go or no go, etc. But be aware that it is people who define the boundaries of black and white; the underlying statistics are just shades of grey.
Figure 2: p-values for 10 different sample outcomes
Note: The horizontal axis shows 10 hypothetical sample results (in %) in which two groups are compared; with every sample the difference becomes a little bit bigger. The vertical axis shows the corresponding p-values. In this scenario, both groups have N=400 respondents.
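For readers who want to recreate the shape of Figure 2, the sketch below computes p-values for ten such sample outcomes with N=400 per group, using a standard two-proportion z-test. The 30% baseline and the one-percentage-point steps are assumptions for illustration, not values taken from the original figure.

```python
import math

def two_proportion_p_value(p1, p2, n=400):
    """Two-sided p-value for two observed proportions from equally sized groups (z-test)."""
    pooled = (p1 + p2) / 2                        # equal group sizes, so a simple average
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)
    z = abs(p1 - p2) / se
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

# Ten hypothetical sample outcomes: group A stays at 30%, group B creeps up
# by one percentage point per outcome. N = 400 respondents per group.
base = 0.30
for step in range(1, 11):
    p_b = base + step / 100
    p = two_proportion_p_value(base, p_b)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{base:.0%} vs {p_b:.0%}: p = {p:.3f} -> {verdict}")
```

The p-values fall smoothly as the observed difference grows; nothing special happens at 0.05 except that we have agreed to call everything below it ‘significant’.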
Risk versus opportunity
So, what does the above mean for decision making? First, accepting uncertainty implies accepting that making wrong decisions from time to time is just part of the game. Often, when decisions turn out to be wrong, people tend to distrust the data, the model, or even marketing intelligence altogether. Although you must manage risk, you can never reduce it to zero. And you wouldn’t want to, as it would instantly kill opportunity seeking.
Second, it makes you think twice about your criteria. Realizing that the criteria you use are arbitrary gives you a good reason to reconsider them in every situation. For example, if the stakes are higher, you want to be more certain of your case. Launching a new product that requires a large investment in the production process might need more thorough scrutiny and more checks and balances than picking the most promising idea for the new social media campaign. And if one data source is a bit messier than another, the output of the former deserves less trust than the output of the latter.
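As a sketch of what reconsidering your criteria can look like in practice, the snippet below moves the cut-off of a hypothetical churn model and reports how the two types of error shift; all scores and outcomes are made up.

```python
# Sketch: moving the cut-off of a hypothetical churn model.
# Scores and outcomes are invented for illustration only.

scores  = [0.10, 0.25, 0.35, 0.45, 0.55, 0.62, 0.70, 0.81, 0.88, 0.95]       # predicted churn risk
churned = [False, False, True, False, True, False, True, True, False, True]  # what actually happened

def errors_at(threshold):
    """Count false positives and false negatives when flagging scores at or above the threshold."""
    false_positives = false_negatives = 0
    for score, actual in zip(scores, churned):
        flagged = score >= threshold
        if flagged and not actual:
            false_positives += 1      # we acted, but she would have stayed anyway
        elif not flagged and actual:
            false_negatives += 1      # we did nothing, and she left
    return false_positives, false_negatives

for threshold in (0.3, 0.5, 0.8):
    fp, fn = errors_at(threshold)
    print(f"cut-off {threshold:.1f}: {fp} false positives, {fn} false negatives")
```

Raising the cut-off when acting on a flag is expensive reduces wasted effort on customers who would have stayed anyway, at the price of letting more actual churners go unnoticed; lowering it does the reverse.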
Criteria are just shades of grey forced into black and white. Realizing and accepting this might make the world a bit more complex, but at the end of the day your decisions will be better.