Indeed, much better described at: https://en.wikipedia.org/wiki/Simpson%27s_paradox
I agree. To really get into the "paradox", start with the WP article. Pearl is heavily referenced and linked in it anyway.
For me, this is pretty much the core argument that makes the paradox vanish: "For instance, naive students of probability may expect the average of a product to equal the product of the averages" (OP pdf, page 4). It's not too hard to convince yourself that you can't stir numbers together at random and get away with it, even if it nearly looks sensible at first. The language says it all: "average of products" and "product of averages".
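If you want to convince yourself quickly, here is a tiny sketch in Python. The two lists are made-up numbers chosen only for illustration; nothing here comes from the paper.

```python
from statistics import mean

# Made-up illustrative data: x rises while y falls.
x = [1, 2, 3, 4]
y = [4, 3, 2, 1]

# Average of the products: (1*4 + 2*3 + 3*2 + 4*1) / 4
avg_of_product = mean([a * b for a, b in zip(x, y)])

# Product of the averages: 2.5 * 2.5
product_of_avgs = mean(x) * mean(y)

print(avg_of_product, product_of_avgs)  # 5 vs 6.25
```

The two only agree when x and y are uncorrelated in the sample, which is exactly the kind of assumption that sneaks in unnoticed.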
The Vector Interpretation in the WP article gives a pretty decent visual explanation. If you find the maths a bit off-putting, just squint past the notation. You can pretty much replace all occurrences of funky symbols and "vector" with the word "line", accept that you can add these special lines together, and you should be able to get the gist of what is going on.
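For anyone who prefers code to pictures, the "adding lines" idea can be sketched roughly like this. Each line is an (attempts, successes) pair and its steepness is the success rate. The numbers and the helper names `slope` and `add` are my own illustrative choices, not anything taken from the paper.

```python
from fractions import Fraction

def slope(v):
    """Steepness of the line (attempts, successes), i.e. the success rate."""
    attempts, successes = v
    return Fraction(successes, attempts)

def add(u, v):
    """Add two lines tip to tail: attempts and successes both accumulate."""
    return (u[0] + v[0], u[1] + v[1])

# Illustrative (attempts, successes) for two people over two weeks.
lisa = [(3, 0), (7, 5)]
bart = [(7, 1), (3, 3)]

for week, (l, b) in enumerate(zip(lisa, bart), start=1):
    print(f"week {week}: Lisa {slope(l)}, Bart {slope(b)}")

lisa_total = add(*lisa)
bart_total = add(*bart)
print(f"total: Lisa {slope(lisa_total)}, Bart {slope(bart_total)}")
```

Bart's line is steeper in each week, yet Lisa's summed line is steeper overall, because the long lines (where most of the attempts are) dominate the sum.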
I appreciate how the characters in the example are named Bart and Lisa.
Yep, the first picture in the upper right of the Wikipedia article is worth a thousand words. In it we can clearly see there is a third hidden variable which is somehow getting mixed in with x.
Pages 9 and 10 (Appendix A) are a little more approachable in that way.
Fascinating, and I gotta say, the Wikipedia article explained what Simpson's Paradox actually was in a way that worked a whole lot better for me than this article. But, interesting article nonetheless.
> The idea that statistical data, however large, is insufficient for determining what is “sensible,” and that it must be supplemented with extra-statistical knowledge to make sense was considered heresy in the 1950s
You talk about onion peeling: "At first glance B is better. Look deeper and A is better. Look deeper still and B may be better again."
I found this unbelievable so I wrote a program to generate random contingency tables and search for a double reversal. When I found one I wrote a little story around the numbers.
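This is not the original program, just a minimal sketch of how such a search could look, assuming two treatments A and B and two binary factors F and G. All names (`is_double_reversal`, `search`, and so on) and the hand-made example tables are hypothetical.

```python
import random

# Each table maps a cell (f, g) to (successes, trials).
CELLS = [(f, g) for f in (0, 1) for g in (0, 1)]

def better(x, y):
    """True if rate x = (successes, trials) strictly beats rate y."""
    return x[0] * y[1] > y[0] * x[1]  # cross-multiply to avoid floats

def pooled(table, cells):
    """Sum successes and trials over the given cells."""
    return (sum(table[c][0] for c in cells), sum(table[c][1] for c in cells))

def is_double_reversal(a, b):
    """B wins overall, A wins within each F stratum, B wins in every (F, G) cell."""
    overall = better(pooled(b, CELLS), pooled(a, CELLS))
    by_f = all(
        better(pooled(a, [(f, 0), (f, 1)]), pooled(b, [(f, 0), (f, 1)]))
        for f in (0, 1)
    )
    by_fg = all(better(b[c], a[c]) for c in CELLS)
    return overall and by_f and by_fg

def random_table(rng, max_trials=50):
    """One random table: trials in 1..max_trials, successes in 0..trials."""
    table = {}
    for c in CELLS:
        n = rng.randint(1, max_trials)
        table[c] = (rng.randint(0, n), n)
    return table

def search(max_tries=100_000, seed=0):
    """Try random table pairs; return the first double reversal, or None."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        a, b = random_table(rng), random_table(rng)
        if is_double_reversal(a, b):
            return a, b
    return None

# A hand-made double reversal, useful for checking the checker itself.
example_a = {(0, 0): (0, 30), (0, 1): (50, 70), (1, 0): (4, 5), (1, 1): (19, 20)}
example_b = {(0, 0): (1, 7), (0, 1): (3, 3), (1, 0): (83, 100), (1, 1): (10, 10)}
print(is_double_reversal(example_a, example_b))  # True
```

In the hand-made tables, B wins overall (97/120 vs 73/125), A wins in both F strata (50/100 vs 4/10 and 23/25 vs 93/110), and B wins again in all four cells. Calling `search()` may return `None` if the budget runs out, so raise `max_tries` or change `seed` as needed.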
I enjoyed that. Thanks for sharing!
Can you share the granular data that would lead to the results in your paper? In other words, the 504 cases that would yield different results based on each additional factor included?
I wrote a blog post about Simpson’s Paradox. One of my more popular pieces.
Any report making conclusions about trends in data that doesn't defend against possible Simpson's Paradox issues shouldn't be taken seriously.
Wikipedia got the joke: "Suppose two people, Lisa and Bart, ..."
Me too. I hoped for a really good and long analysis of why Homer is right.
I clicked on the link thinking this has something to do with Homer Simpson
Thought this was a reference to "the simpsons did it"