The use of "illustrative examples" is well known among teachers
who do not want to confuse beginners. For example, almost everyone
who read my "Oracle 911" article clearly understood that its examples were deliberately
over-simplified illustrations, used to emphasize each correct
Yet some people still criticized the article for using illustrative examples,
calling them fabrications (really,
I’m not making this up). In database consulting, everyone knows
that you cannot copy your client’s schema and data for “evidence”,
and only a beginner would suggest this.
In sum, I agree with the statement “Trust but Verify”. Personally,
I trust folks who are out there tuning Oracle every day, and I don't
need proof from someone I trust. Discrediting an important
statement from a real-world database just because it is impossible
to provide a “reproducible” proof is not good practice.
Let’s take a closer look at the “Beer and Diapers” concept:
Beer and Diapers
Teachers MUST create illustrative examples, and it is common
practice among professors to use simple illustrations to reinforce a
concept. Take the famous “beer and diapers” story. This example has been used
to explain the concept of data mining to countless university
students, and I’ve used it myself when teaching graduate school.
The professor admitted that his story was an illustrative example,
but it’s not fair to call academic examples “fabrications” or
A number of convenience store clerks, the story goes, noticed
that men often bought beer at the same time they bought diapers. The
store mined its receipts and proved the clerks' observations
correct. So, the store began stocking diapers next to the beer
coolers, and sales skyrocketed.
The story is a myth, but it shows how data mining seeks to
understand the relationship between different actions."
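The clerks' story can be made concrete with a few lines of Python. What follows is only a minimal sketch, offered in the same illustrative spirit: the receipts are invented toy data (not real sales figures), and the helper names `support`, `confidence`, and `lift` are my own labels for the usual market-basket measures, not any particular tool's API.

```python
# Toy receipts, invented purely for illustration (NOT real sales data).
receipts = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"beer", "diapers", "milk"},
]


def support(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Share of baskets with the antecedent that also hold the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """Confidence relative to the consequent's baseline support.

    A value above 1 suggests the items co-occur more often than chance.
    """
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


if __name__ == "__main__":
    print("support(beer, diapers):", support({"beer", "diapers"}, receipts))
    print("confidence(beer -> diapers):", confidence({"beer"}, {"diapers"}, receipts))
    print("lift(beer -> diapers):", lift({"beer"}, {"diapers"}, receipts))
```

With these toy receipts the lift for beer and diapers comes out slightly above 1, meaning the two items appear together a little more often than chance alone would predict. That is the kind of relationship the story describes a store looking for in its receipts.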
What is the "true story" about using data mining to identify a
relation between sales of beer and diapers?
This is one of those recurring questions related to a famous decision support example. The story of using data mining to find
a relation between "beer and diapers" is told, retold and added to
like any other legend or "tall tale". I can't recall exactly when
I first heard a version of the tale, but I have used the story
and added to it myself on occasion. The following are some versions of the
An article in The Financial Times of London (Feb. 7, 1996)
stated, "The oft-quoted example of what data mining can achieve is the case of a large US supermarket chain which discovered a strong
association for many customers between a brand of babies nappies
(diapers) and a brand of beer.
Hermiz and Manganaris (1999) stated "One of the most repeated (though likely fabricated) data mining stories is the discovery
that beer and diapers frequently appear together in a shopping
The explanation goes that when fathers are sent out on an errand
to buy diapers, they often purchase a six-pack of their favorite beer as