
Talk:Simpson's paradox: Difference between revisions

From Wikipedia, the free encyclopedia


Revision as of 06:28, 30 January 2011

WikiProject Mathematics: B-class, Mid-priority
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as B-class on Wikipedia's content assessment scale.
This article has been rated as Mid-priority on the project's priority scale.
WikiProject Statistics: Unassessed
This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has not yet received a rating on Wikipedia's content assessment scale.
This article has not yet received a rating on the importance scale.


I'd like to change the first few paragraphs of this article to make it friendlier to folks afraid of math, and was wondering what other people thought. Here's a possibility:

Simpson's paradox is a statistical paradox described by E. H. Simpson in 1951, in which the accomplishments of several groups seem to be reversed when the groups are combined. This seemingly impossible result is encountered surprisingly often in social science and medical statistics.
As an example, suppose two people, Ann and Bob, are let loose on Wikipedia. In the first test, Ann improves 60 percent of the articles she edits while Bob improves 90 percent of the articles he edits. In the second test, Ann improves just 10 percent of the articles she edits while Bob improves 30 percent.
Both times, Bob improved a much higher percentage of articles than Ann - yet when the two tests are combined, Ann has improved a much higher percentage than Bob!
The result comes about this way: In the first test, Ann edits 100 articles, improving 60 of them, while Bob edits just 10 articles, improving 9 of them. In the second test, Ann edits only 10 articles, improving 1 of them, while Bob edits 100 articles, improving 30 of them. When the two tests are added together, both edited 110 articles, yet Ann improved 69 of them (63 percent) while Bob improved only 40 of them (36 percent)!
Seems reasonable enough to me, although I wouldn't say "accomplishments" for "successes". "Success" in statistical jargon is not necessarily a positive thing! How about "ratings" instead?
I presume you are intending to leave the remaining paragraphs unchanged? -- Securiger
That was my thought, yes. So I'll go ahead and do this, then. DavidWBrooks 13:13, 17 Feb 2004 (UTC)
(However, looking it over again, I'll do my arithmetic correctly before I post it! Oops ... DavidWBrooks)
Is it a problem that the example explicitly refers to Wikipedia? (I'm thinking WP:SELF.) Avram 21:24, 10 March 2006 (UTC)
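
The arithmetic in the Ann/Bob proposal above can be checked mechanically. The following is a minimal sketch, not part of the original discussion; the names `ann`, `bob`, `rate` and `combined` are illustrative. Using the per-test counts as quoted, the pooled totals come out as 61 of 110 for Ann (about 55 percent) and 39 of 110 for Bob (about 35 percent), rather than the 69 and 40 given in the draft, which fits DavidWBrooks's remark about redoing the arithmetic. The reversal itself still holds.

```python
from fractions import Fraction

# (improved, edited) counts per test, taken from the proposal above
ann = {"test 1": (60, 100), "test 2": (1, 10)}
bob = {"test 1": (9, 10), "test 2": (30, 100)}

def rate(improved, edited):
    """Share of edited articles that were improved."""
    return Fraction(improved, edited)

def combined(results):
    """Pool the counts over all tests, then take the rate of the pooled counts."""
    improved = sum(i for i, _ in results.values())
    edited = sum(e for _, e in results.values())
    return improved, edited, rate(improved, edited)

# Bob has the higher rate in each test taken separately
for test in ann:
    print(test, "Ann", float(rate(*ann[test])), "Bob", float(rate(*bob[test])))

# Ann has the higher rate once the tests are pooled
print("combined Ann", combined(ann)[:2], "Bob", combined(bob)[:2])
```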

Order

I am not a frequent editor but shouldn't the description come before the examples and not the other way around? -- Preceding unsigned comment added by 88.234.7.51 (talk) 12:08, 28 November 2010 (UTC)

Nice work

I have recently been browsing the logic & game theory articles. This is the best I have seen so far. Congratulations to all concerned.

John Moore 309 12:36, 24 April 2006 (UTC)[reply]

I just read this article too, having come from Texture filtering and I am very impressed! This article is brilliant! --137.205.76.219 15:48, 27 January 2007 (UTC)[reply]

The same paradox?

I wonder if this is the same paradox and if it could be used as an example. I find it very easy to understand — and from real life.

Assume a population with 50% men and 50% women, and that competence is distributed in the same way in both groups. Imagine a situation where women are required to have more competence to get a promotion to management. You will then notice that women on the management level are more competent than male managers and that women in sub-management are more competent than men on the same level. This seems paradoxical at first considering that, on the whole, women and men are equally competent. Samulili

It's a nice example. In order to convince myself (and perhaps others) that it's the same paradox, I'll now assume that on average, the women are slightly less competent than the men (no offence, just to sharpen the paradox and make it clearer that Simpson is involved), and I'll add some numbers:
Suppose we have 100 men and 100 women. 18 of the men are highly competent, and 14 of them are in the management. Of the 82 less competent men, 6 are in the management. 17 of the women are highly competent, but only 8 of them are in the management. Of the 83 less competent women, 2 are in the management. Then, of the women in the management, 8/10=80% are highly competent, and of the sub-management women, 9/90=10% are highly competent. Of the men, only 14/20=70% of those in the management group are highly competent, and only 4/80=5% in the sub-management group are highly competent. So, in both groups, more of the women than of the men are highly competent, but combined, only 17/100=17% of the women are highly competent, while 18/100=18% of the men are.
Conclusion: This is indeed a Simpson paradox, and the only change compared to that suggested above is that I made it a little sharper by making the women less competent overall instead of just equally competent. However, I like the original better, and I think someone should go ahead and add it to the article. I'm afraid it takes skills beyond mine to write it in a simple way that makes it clear that it is a Simpson's paradox.--Niels Ø 20:04, 2 May 2006 (UTC)
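
The figures in Niels Ø's worked example can be verified with a short script. This is only a sketch; the dictionary layout and the helper names `share` and `overall` are assumptions made for illustration, while the counts are those given in the comment above.

```python
# counts[sex][level] = (highly competent, total), from the figures above
counts = {
    "women": {"management": (8, 10), "sub-management": (9, 90)},
    "men": {"management": (14, 20), "sub-management": (4, 80)},
}

def share(pair):
    """Fraction of a group that is highly competent."""
    competent, total = pair
    return competent / total

def overall(sex):
    """Fraction highly competent after pooling both levels."""
    competent = sum(c for c, _ in counts[sex].values())
    total = sum(t for _, t in counts[sex].values())
    return competent / total

for level in ("management", "sub-management"):
    print(level, "women", share(counts["women"][level]),
          "men", share(counts["men"][level]))
print("overall women", overall("women"), "men", overall("men"))
```

Within each level the women's share is higher, while the pooled share is higher for the men, which is the reversal the comment describes.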

How is this a paradox?

For Ann, the time that she royally screwed up barely counts, while the time that she did poorly counts the most. For Bob, the time that he royally screwed up hugely affected his total, while the time that he did amazing barely counts at all. I don't quite see why the results are surprising. Anyone care to enlighten me?

It all makes sense in the end, but it's still initially surprising for most people who are not aware of the explanation or suspect it. If you only know the partial percentages, then the total percentages would come as a surprise to most people. Obviously, once the weights are introduced, the initial surprise is exchanged for comprehension, but then a paradox is only a seemingly self-contradictory statement anyway, so I see nothing wrong with calling this a paradox. -Kvaks 01:09, 2 September 2005 (UTC)[reply]
It's not strictly a paradox, since there is a straightforward solution. But it's widely known by that name, so we ought to keep it. --best, kevin ···Kzollman | Talk··· 04:19, September 2, 2005 (UTC)
I do not support the idea that the phenomenon is not "really" a paradox. Many good paradoxes are based on representing a situation in such a way that a false conclusion seems obvious.--Niels Ø 08:18, 6 October 2006 (UTC)[reply]
Agreed with Niels. The key point to remember is that in the baseball batting average example, there are large differences in the number of at-bats between years. Rock8591 (talk) 06:18, 9 August 2009 (UTC)[reply]

One of the finer Wiki entries

The storytelling conceit, complete with sly reference to those other Simpsons, "Bart" and "Lisa," works well for me. This kind of explanation helps me in explaining a concept to others, even as I work to fully grasp it myself. The inclusion of the Wikipedia within the definition does not seem overly self-referential, as one observer has worried. Entries like this are the reason I seek out Wikipedia's take on things before looking to other, traditional sources. Thanks for an entertaining and elucidating entry! Matthew Treder 18:42, 2 May 2006 (UTC)[reply]

Agreed. The examples are clear, well written, and logical. And the references to Bart & Lisa Simpson are not only clever and fun, they also make it EXTREMELY easy for many people to remember this phenomenon as well as its associated name. If we name them Dick & Jane it would be far less memorable. How great it is when practicality and humor intersect! Jon Miller
Indeed! --WikiSlasher (talk) 13:01, 11 December 2007 (UTC)[reply]
The first (graphical) example could be made a whole lot clearer if the symbols x and y and relationships between them were explicitly defined. I would be pleased to contribute to this cause, but -- well, I am still bewildered by it. The other (real life) examples work quite well and make the fictitious, original, self-referent narrative unnecessary. Finally, if a vote gets taken, please cast mine in favor of "fallacy" -- not to exclude "paradox" but to strengthen the importance of this entry. 24.130.61.77 (talk) 19:46, 2 January 2011 (UTC)
Why did I not get this earlier... mattbuck (talk) 14:22, 11 December 2007 (UTC)[reply]

I'm new to commenting here, so I apologize if I'm doing this wrong.

The question was raised as to whether or not it's appropriate for this article to reference Wikipedia [WP:Self]. I believe it may be, but should certainly be discussed. The point of avoiding self references, as I read that guideline, is to not use phrases such as "elsewhere on this site" or "in another Wikipedia article". The point is NOT to pretend that Wikipedia doesn't exist.

The article could reference bowling or mowing lawns or a great host of other activities where the characters' performance can be quantified. I suspect the Wikipedia reference was used simply because the author assumes that those reading it will be familiar with the process.

However, I don't believe that the act of editing Wikipedia articles is a good example of much of anything, because most people I know who read Wikipedia have never edited anything. I've been reading for years and only today even created an account to post anything. So the example took a little more effort for me to understand than many other possible analogies could have.

And, continuing that thought and going back to the self reference guideline, the plan as I have understood it is to eventually do a printed Wikipedia. Regardless of the form, any time this article appears outside the wikipedia.org website the chances of the reader understanding the example become greatly diminished.

In other words, I like the example used here, but a different example may be more comprehensible and practical.

Ha! That's funny! Thanks for putting Bart and Lisa in the Simpson's paradox. --69.67.229.185 03:02, 26 August 2006 (UTC)

A word

The Lisa-Bart example ends in this sentence: But it is possible to retell the story so that it appears obvious that Bart is more diligent. Would it not be more natural to say "tell" instead of "retell", since it is the original statement of the situation that appears to have this conclusion?--Niels Ø 08:18, 6 October 2006 (UTC)[reply]

Good point. Thanks. I've changed that line to something that I think is even better: But it is possible to have told the story in a way which would make it appear obvious that Bart is more diligent. --Keeves 12:13, 6 October 2006 (UTC)

The kidney case

I expanded the text on the two factors at the end of the section to relate more specifically to the medical example. Reading what I've written, it seems natural to ask: Why did doctors give the inferior treatment B to the milder cases, when A is better in those cases too? I have not consulted the references on this case story, but perhaps someone who has (or will) can answer my question. I imagine one of two answers: (i) Before this particular investigation, they did not know that B was inferior even in the milder cases. (ii) Treatment A is more expensive, and is therefore primarily given to those patients who need it the most. In fact, if there are no other confounding variables involved, and if A is more expensive than B, then, within a given budget, the largest number of cures is obtained by treating as many as possible from the large-stone-group with A.--Niels Ø 13:29, 13 October 2006 (UTC)[reply]

Thanks for your changes, it reads more clearly. I don't have access to the original study, but from the review and title it appears to compare surgery, ultrasound and/or using catheters. Unsurprisingly the open surgery (treatment A) is the most effective, and is probably the most expensive, with the greatest post-treatment complications. TobyK 13:36, 31 October 2006 (UTC)

Suggested addition to aid paradoxical comprehension

existing section under 'Explanation by example' subtitle

[Who is more accomplished? Lisa and Bart's mutual friends think Lisa is better—her overall success rate is higher. But it is possible to have told the story in a way which would make it appear obvious that Bart is more diligent.]

append with the addition of

+ [However, some will note that the use of statistical analysis to present a biased view is not uncommon, for example in politics. On close inspection, one may find that Bart's edits are of a higher quality, elucidating complex subjects poorly understood by the general populace. Although Lisa and Bart's mutual friends think Lisa is better, history may judge Bart's legacy to humanity to be more significant.]

This may help answer those who fail to comprehend the paradoxical nature

Teeteetee 09:51, 2 March 2007 (UTC)[reply]

How so? The quality of the edits is unrelated to the paradox we're dealing with here; it's entirely about the number of edits.--Niels Ø (noe) 09:56, 2 March 2007 (UTC)[reply]
Extracted from the article's sub-section. . . .
" worth of work/Success/managed/achieved successful/worse/we feel/disappointed/accomplished/mutual friends think/better/diligent "
Are these "entirely about the number of edits" ? Teeteetee 19:34, 4 March 2007 (UTC)[reply]
OK, I didn't put that as clearly as I should have. The point is, we need not distinguish very good edits from minor improvements; that's not what the example is about. Whether they elucidate complex subjects is utterly irrelevant. However, the words accomplished and diligent that you quote may be misleading for the same reason: They seem to suggest some edits not merely improve articles, but that they display particular diligence, which (though of course true) is, as I said, utterly irrelevant.--Niels Ø (noe) 20:25, 4 March 2007 (UTC)
I do not understand your meaning.
I have tried several times to understand.
If you could avoid criticising existing aspects of the article I might better understand.
....
Do you agree with the following statement ?
"If Bart only edited one article (and that one edit brought about world peace), Lisa's lifetime of editing thousands of articles may statistically appear better (to friends, family, politicians, religious leaders, and others viewing the statistical view), but may be judged by history to be worth less than Bart's one edit."
Teeteetee 11:52, 8 March 2007 (UTC)[reply]
Sure, but it's got nothing to do with Simpson's paradox. The Bart-and-Lisa example is solely about the number of edits that were improvements, and the number that were not. It does not distinguish between large improvements and small improvements.--Niels Ø (noe) 16:24, 8 March 2007 (UTC)
By using "it"(in the sentence above "It does not distinguish..."), I assume you mean Simpson's Paradox.
If so, you appear to be writing "Simpson's Paradox does not distinguish between large improvements and small improvements"
....
or, put alternatively,
When Simpson's Paradox occurs improvements can be difficult to distinguish.
Teeteetee 17:29, 12 March 2007 (UTC)[reply]
If you are seriously suggesting changes to the article, I think you should either be bold and make those changes, or explain clearly at this talk page what you'd like to change, and why. I've no idea what your point is.--Niels Ø (noe) 22:01, 12 March 2007 (UTC)[reply]
Thank you for the advice, but I was bold on 1 March 2007. Also, I hoped I had clearly explained my suggestion above (at 09:51, 2 March 2007).
My original article edit can be found here [1], at the end of the 'Explanation by example' section. Teeteetee 12:31, 13 March 2007 (UTC)

Well, I believe I have made my concerns clear, whereas I do not understand what your point is. Do you think your contribution is related to Simpson's paradox, or does it merely offer an alternative angle on the Lisa-and-Bart example, an angle unrelated to Simpson's paradox? Do you actually understand Simpson's paradox, or are you trying to understand it?--Niels Ø (noe) 12:57, 13 March 2007 (UTC)

I believe I understand Simpson's paradox.
I also believe context aids understanding.
I was attempting to provide others with some context. Teeteetee 13:50, 3 April 2007 (UTC)[reply]
Then I am at a loss. I am certain I understand Simpson's paradox, and I am certain it (in the Bart-Lisa example) has nothing to do with distinguishing between large and small improvements. The context is clear (Wikipedia editing, some edits being improvements, others not). Adding more context - irrelevant to the paradox - will confuse matters by having readers trying to understand how it is relevant. Please explain, what is the point?--Niels Ø (noe) 14:49, 3 April 2007 (UTC)

How is the Electoral College an example of Simpson's paradox?

In both the Lisa/Bart example and the kidney stones example, there is a 3x2 table with 6 entries. How can the Electoral College data be presented in this way? There are the 2 parties, so that's the "2" dimension. But what is the "3" dimension?

Example            | the "2" dimension | the "3" dimension
Lisa / Bart        | Lisa / Bart       | Week 1 / Week 2 / Total
kidney stones      | Treatment A / B   | small stones / large stones / together
Electoral College  | Rep / Dem         | ??? / ??? / total number of Electoral College votes

--Occultations 21:46, 15 May 2007 (UTC)[reply]

I suspect the analogy is that one can "win" the nationwide popular vote but, under certain circumstances, lose in the College. (The College cannot reproduce the paradox exactly, since the outcome in each state is related to the difference in votes only through the sign of the difference, not its magnitude. One could not lose the College if every state was won.) Baccyak4H (Yak!) 03:07, 16 May 2007 (UTC)
I've removed the Electoral College example, it's not an example of Simpson's paradox. Unless, that is, someone can show how it fits the 3x2 table pattern. --Occultations 12:53, 28 May 2007 (UTC)[reply]
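
The distinction Baccyak4H draws can be illustrated with a small sketch. Everything here is hypothetical: the vote counts, the elector numbers and the function names are invented for illustration and do not describe any real election. The College adds up electors by the sign of each state's margin, so a candidate who wins every state cannot lose the College, whereas in Simpson's paradox a side can win every subgroup comparison and still lose the pooled one.

```python
# Each state is (votes for A, votes for B, electors); HYPOTHETICAL numbers.
split_result = [(90, 10, 3), (45, 55, 4), (45, 55, 4)]
clean_sweep = [(51, 49, 3), (51, 49, 4), (51, 49, 4)]

def popular_winner(states):
    """Winner by total votes (ties are not handled in this sketch)."""
    a = sum(va for va, vb, _ in states)
    b = sum(vb for va, vb, _ in states)
    return "A" if a > b else "B"

def college_winner(states):
    """Winner by electors, awarded winner-take-all per state (ties not handled)."""
    a = sum(e for va, vb, e in states if va > vb)
    b = sum(e for va, vb, e in states if vb > va)
    return "A" if a > b else "B"

# A wins the popular vote but loses the College
print(popular_winner(split_result), college_winner(split_result))
# Winning every state always wins the College
print(popular_winner(clean_sweep), college_winner(clean_sweep))
```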

Do we need the fake example?

We have four different real-world examples now, some with statistics. Do we need the "bart/lisa" fake example to explain it any more? At the very least, I'd like to move the real examples up above the pretend one - I think lots of people stop reading when the article lurches into "explaining" mode. - DavidWBrooks 23:41, 22 May 2007 (UTC)[reply]

I was about to make an almost identical heading. It's a pretty asinine self-reference in addition to being original research. Milto LOL pia 04:32, 23 May 2007 (UTC)[reply]
I agree with the removal of fake examples (as I've just done with the baseball example). This section should be moved below the examples, and then transformed into a general discussion of what may cause the paradox to appear (talking about weighted averages, confounding variables, etc). Schutz 07:12, 23 May 2007 (UTC)[reply]
Then I'll do the move, and we can do the transformation later. - DavidWBrooks 10:00, 23 May 2007 (UTC) .. oops, never mind: somebody already did.[reply]
You're still welcome to do the transformation now that I have done the move :-) Schutz 13:44, 23 May 2007 (UTC)[reply]
But that will require thought and skill - I hoped I could get away with a nice, mindless move. - DavidWBrooks 14:00, 23 May 2007 (UTC)[reply]
Too late :-) I'll think about the transformation, but, as you say, it requires quite a bit of thinking first. Before that, I'll add a few more references and reformat the examples, and hopefully (if I can get around to doing it), add 2 images. Schutz 21:27, 23 May 2007 (UTC)[reply]

I have re-added the example after User:Miltopia removed it, since the consensus above was for now to move the example rather than delete it. We all agree that we have enough real examples and do not need fake examples on top of that; however, this section is the only one that goes beyond giving an example and also discusses the question of weighted averages. I don't think it is very good, or that it covers everything it should, but at the moment it is better than nothing. If nothing happens with it in the near future, then it can be removed. Schutz 07:44, 24 May 2007 (UTC)

The Bart-Lisa example is pointless and misleading. The whole point of Simpson's paradox is that differences in underlying groups may be causing changes that lead to misleading results when the groups are not taken into account - the underlying groups are important in themselves and must be investigated for a proper analysis. But in the Bart-Lisa case, the underlying groups are 'week 1' and 'week 2'. Why are the success rates of editing being divided into weeks? The only reason for doing so would be that the success rates are changing consistently across weeks for both Bart and Lisa. But I can see no obvious reason why 'week' would be an appropriate grouping factor. This example gives the impression that you should divide your data into different groups for no reason and assess across those meaningless groups - perhaps doing so until you get the answer you want (e.g. Bart should be better than Lisa. We don't see it across both weeks, so we divide into weeks and aha! there we see it. If we hadn't seen it within weeks, maybe we should divide into days...) 124.197.3.68 (talk) 14:39, 18 February 2010 (UTC)

Correlation/Causation

Would it be an idea to add Correlation does not imply causation into the 'See also' section? Apologies if this has already been covered, I don't find any references to it. Flex Flint 08:57, 17 July 2007 (UTC)[reply]

I'd also suggest that Milo Schield's fine paper "Simpson's Paradox and Cornfield's Conditions" (http://web.augsburg.edu/~schield/MiloPapers/99ASA.pdf) be added to the references, and mention Cornfield's conditions somewhere in the main sections. Haruhiko Okumura (talk) 08:42, 14 August 2008 (UTC)[reply]

The correlation/causation issue is important in its own right, but has little to do with "Simpson's Paradox." I would suggest removing this part of the text in the extant introduction. Scrooge62 (talk) 18:28, 2 December 2009 (UTC)[reply]
Yes I think Correlation does not imply causation should have a link somewhere from this article - if it's not appropriate at any other point, it should be in "See also".
I think the lead is fine as it stands. Correlation/causation is a much wider topic than Simpson's paradox, but it seems to me the ONLY relevance of Simpson's paradox is that it is ONE of the counterexamples that can be used to reject the intuition saying that correlation DOES imply causation.--Noe (talk) 08:15, 3 December 2009 (UTC)[reply]
I would emphatically stress that Correlation does not imply causation is very strongly connected to Simpson's Paradox. Correlation is based on the unconditional (or marginal) relationship between two variables. But causation would be based on their conditional relationship controlling for confounding factors. The fact that a conditional relationship can have the opposite sign of an unconditional relationship is precisely Simpson's Paradox and is also precisely the reason why correlation cannot be taken to imply causation. No two concepts could be more strongly related! --Geomon (talk) 06:10, 18 January 2010 (UTC)
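
The point about conditional versus marginal relationships can be written out with the law of total probability. The symbols below are assumptions introduced here for illustration: X is the binary "treatment" or group, Y the binary outcome, and Z the conditioning (confounding) variable.

```latex
% Marginal rate as a weighted average of the conditional rates
P(Y=1 \mid X=x) \;=\; \sum_{z} P(Y=1 \mid X=x,\, Z=z)\; P(Z=z \mid X=x)

% The conditional comparison may hold in every stratum z ...
P(Y=1 \mid X=1,\, Z=z) \;>\; P(Y=1 \mid X=0,\, Z=z) \quad \text{for all } z

% ... while the marginal comparison goes the other way
P(Y=1 \mid X=1) \;<\; P(Y=1 \mid X=0)
```

The reversal is possible only because the weights P(Z=z | X=x) depend on x. If Z is independent of X, both marginal rates are averages with the same weights, so the marginal comparison must agree with the stratum-level comparisons.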

Vector vs. Line

I reverted a diff [2] changing vector to line in one instance. First, the section it's in is called "Vector Interpretation", so referring to vectors is the expected language of that section. Second, the word change was made in only one instance, making the whole paragraph internally inconsistent as it switched from line in the first instance to vector in all the others. qitaana (talk) 22:17, 26 February 2008 (UTC)

Low birth weight paradox

How is this an example of Simpson's paradox? From the information given, I see only a medical "paradox", not a statistical one. 72.75.98.88 (talk) 22:23, 15 May 2009 (UTC)[reply]

I agree. It looks like the example states that, given that a child is low birth weight, it has a lower infant mortality rate if born to a smoking mother. It would only be an example of Simpson's paradox if, given the child is born to a smoking mother, it has a lower infant mortality rate if it were low birth weight. JokeySmurf (talk) 05:36, 16 May 2009 (UTC)[reply]
I don't see how that would be Simpson's paradox either. If low birth weight meant lower mortality in both smokers and non-smokers, but higher mortality in the population as a whole, that would be an example of Simpson's paradox. 72.75.98.88 (talk) 13:52, 16 May 2009 (UTC)[reply]
It's poorly stated, but the paradox is that normal birth weight infants of smokers have about the same mortality rate as normal birth weight infants of non-smokers, and low birth weight infants of smokers have a much lower mortality rate than low birth weight infants of non-smokers, but infants of smokers overall have a much higher mortality rate than infants of non-smokers. This is (of course) because many more infants of smokers are low birth weight, and low birth weight babies have a much higher mortality rate than normal birth weight babies. The reference does explicitly state that it is an example of Simpson's paradox. 129.22.208.134 (talk) 20:18, 8 July 2009 (UTC)[reply]
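
The pattern described in the last comment can be reproduced with made-up numbers. The counts below are purely hypothetical, chosen only so that the three stated properties hold; they are not taken from the cited reference, and the names `groups`, `mortality` and `overall` are illustrative.

```python
# HYPOTHETICAL counts, chosen only to reproduce the pattern described above.
# groups[mother][birth weight] = (deaths, births)
groups = {
    "smoker": {"low": (30, 300), "normal": (7, 700)},
    "non-smoker": {"low": (20, 100), "normal": (9, 900)},
}

def mortality(deaths, births):
    """Mortality rate within one group."""
    return deaths / births

def overall(mother):
    """Mortality rate after pooling low and normal birth weight infants."""
    deaths = sum(d for d, _ in groups[mother].values())
    births = sum(b for _, b in groups[mother].values())
    return deaths / births

for weight in ("low", "normal"):
    print(weight, {m: mortality(*groups[m][weight]) for m in groups})
print("overall", {m: overall(m) for m in groups})
```

In this sketch the infants of smokers do no worse in either weight class, yet worse overall, because a much larger share of them fall in the high-mortality low birth weight class.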

Health care disparities

The newly added section Health care disparities sounds interesting. However, as it stands, I don't think it belongs. EITHER, it should be expanded to make it an illuminating example of the paradox, OR it should be removed or boiled down to at most one sentence and a reference.--Noe (talk) 08:00, 24 September 2009 (UTC)

Stigler's law

To where it reads, Since Edward Simpson did not actually discover this statistical paradox, I propose to add [note 1: See Stigler's law]. To see how this would affect the over-all appearance of this article, view the proposed revision in my sandbox. --Pawyilee (talk) 02:31, 22 February 2010 (UTC)[reply]

There being no objection, I moved it into the article. --Pawyilee (talk) 14:34, 23 February 2010 (UTC)[reply]

Kidney stones

user:DavidWBrooks recently removed the first table in the kidney stone example, which showed only the results when no distinction is made for kidney stone sizes. As the section now stands, I don't find it satisfactory. I think it needs to be made clearer that false conclusions may be drawn when the lurking variable is not identified. One way to clarify this would be to put back the table (reverting half the edit in question), and I'm inclined to do that - but I'll wait and see...-- (talk) 15:37, 7 April 2010 (UTC)[reply]

I removed it because it seemed redundant, unnecessary - the current table (it seems to me) shows everything that the first table showed; in fact, it contains that entire table. Listing two different tables made it seem, I thought, as if something changed between them, but the second table was merely an expansion. However, if others disagree, then I certainly will bow to the majority. - DavidWBrooks (talk) 17:19, 7 April 2010 (UTC)
Yes, the second table contains all the info, but the way the section reads now fails to make an important point clear. The easiest way to fix that is to revert your edit, but I'm sure there are other ways (and probably better ways) to fix it. Feel free.-- (talk) 07:11, 8 April 2010 (UTC)
It seems to me that these sentences following the table make the point clear: "The paradoxical conclusion is that treatment A is more effective when used on small stones, and also when used on large stones, yet treatment B is more effective when considering both sizes at the same time. In this example the "lurking" variable (or confounding variable) of the stone size was not previously known to be important until its effects were included." But perhaps not; perhaps the matter needs to be expanded or clarified. - DavidWBrooks (talk) 13:18, 8 April 2010 (UTC)[reply]
What made the fallacy clearer was that the "combined case" and the "obvious" conclusion were stated before the extra information was added and the refined conclusion reached. I think this was a more paedagogical presentation.-- (talk) 18:47, 8 April 2010 (UTC)
We have clarified our disagreement: It struck me as redundant, even a bit confusing. Anybody else have an opinion? - DavidWBrooks (talk) 19:28, 8 April 2010 (UTC)[reply]
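
For readers following this thread without the article open, the two presentations under discussion can be compared side by side. The sketch below uses the success counts as commonly cited from the kidney stone study (Charig et al., 1986); those figures are not quoted on this talk page, and the helper names are illustrative. It prints the combined comparison first and the breakdown by stone size second, which is the ordering argued for above.

```python
# (successes, patients) per treatment and stone size; figures as commonly
# cited from Charig et al. (1986), not quoted on this talk page
results = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def success_rate(successes, patients):
    return successes / patients

def combined(treatment):
    """Pool both stone sizes for one treatment."""
    successes = sum(s for s, _ in results[treatment].values())
    patients = sum(p for _, p in results[treatment].values())
    return successes, patients

# Step 1: the combined table only (B looks better)
for t in results:
    s, p = combined(t)
    print(f"Treatment {t}: {s}/{p} = {success_rate(s, p):.0%}")

# Step 2: the same data split by stone size (A looks better in both rows)
for size in ("small", "large"):
    for t in results:
        s, p = results[t][size]
        print(f"{size} stones, treatment {t}: {s}/{p} = {success_rate(s, p):.0%}")
```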

Fallacy

Although this situation is called Simpson's paradox, this article is very useful in illustrating a fallacy in statistics that can be corrected. Of course, Simpson's paradox goes away when one properly accounts for external variables. For example in the Male/Female admissions lawsuit, the statistics can be shown with a common weighting of departments (apples-to-apples-comparison). If this is done, there is no paradox. —Preceding unsigned comment added by Fulldecent (talkcontribs) 06:38, 14 November 2010 (UTC)[reply]
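
The "common weighting" idea can be sketched with the Ann/Bob counts from the top of this page, since the admissions figures are not given here. This is an illustration of direct standardization under assumed weights (each test weighted by its share of all edits, pooled over both editors); the function and variable names are hypothetical.

```python
from fractions import Fraction

# (improved, edited) counts per test, from the Ann/Bob proposal on this page
ann = {"test 1": (60, 100), "test 2": (1, 10)}
bob = {"test 1": (9, 10), "test 2": (30, 100)}

def common_weights(*people):
    """Weight each test by its share of all edits, pooled over everyone."""
    totals = {}
    for person in people:
        for test, (_, edited) in person.items():
            totals[test] = totals.get(test, 0) + edited
    grand_total = sum(totals.values())
    return {test: Fraction(n, grand_total) for test, n in totals.items()}

def standardized_rate(person, weights):
    """Average the per-test rates using the same weights for every person."""
    return sum(weights[t] * Fraction(i, e) for t, (i, e) in person.items())

weights = common_weights(ann, bob)
print("weights", weights)
print("Ann", standardized_rate(ann, weights), "Bob", standardized_rate(bob, weights))
```

With the same weights applied to both editors, Bob comes out ahead overall as well as in each test, so the apparent reversal disappears.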

True - like most paradoxes, it's only paradoxical when described in a misleading way. You can see a paradox as a challenge to find the right way of describing the situation. - Do you suggest a change to the article?--Nø (talk) 11:04, 14 November 2010 (UTC)

Removed: "how likely"

I've just removed the section on "how likely" Simpson's paradox is. The reason for this is that in order to make sense of the statement you need to assume a probability distribution for the entries of a 2x2x2 table (presumably what the section's statement about "assuming certain conditions" was a reference to). My basic argument here is that without a statement of those "certain conditions" the statement is essentially meaningless, so we have to go to the paper to find out what it means.

Omitting explicit reference to a probability distribution in choosing an object "at random" is commonly done in elementary expositions of statistical concepts when the situation is simple enough that the distribution can be inferred from the surrounding context, or there is in some other sense enough "intuition" to suggest a natural choice. Examples like the "Bertrand paradox" show that this is not unproblematic. I would argue that here, there is not enough context or "intuition" to give those without any precise understanding of statistics/probability any sense of what it means to fill in a 2x2x2 table "at random" according to the distribution assumed in the Perlman paper, and it is potentially misleading to present the context-free assertion as if it has enough context to determine an intuitive meaning. (I should point out that the paper itself makes no claims that this distribution is "the only one" worth considering, nor does it argue, for example, that actual statistical practice in filling in 2x2x2 tables is at all comparable to the model they assume when they calculate the .0166 figure cited here. It just computes various probabilities in a model.)
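
To make concrete why the answer depends on the assumed distribution, here is a minimal Monte Carlo sketch. It assumes one particular model, a uniform (flat Dirichlet) distribution over the eight cell probabilities of the 2x2x2 table; this talk page does not state which model the paper uses, so treat that choice, and every name in the code, as an assumption. A different distribution over tables would give a different estimate.

```python
import random

def random_table(rng):
    """Draw 8 cell probabilities uniformly from the simplex (flat Dirichlet).

    Cells are indexed p[x][z][y]: group x, stratum z, outcome y.
    """
    draws = [rng.expovariate(1.0) for _ in range(8)]
    total = sum(draws)
    cells = iter(d / total for d in draws)
    return [[[next(cells) for _y in range(2)] for _z in range(2)] for _x in range(2)]

def is_simpson(p):
    """True if both strata agree in sign and the pooled comparison is reversed."""
    def rate(x, z):
        return p[x][z][1] / (p[x][z][0] + p[x][z][1])

    def pooled(x):
        success = p[x][0][1] + p[x][1][1]
        failure = p[x][0][0] + p[x][1][0]
        return success / (success + failure)

    within = [rate(1, z) - rate(0, z) for z in range(2)]
    overall = pooled(1) - pooled(0)
    return (all(d > 0 for d in within) and overall < 0) or \
           (all(d < 0 for d in within) and overall > 0)

def estimate(n, seed=0):
    """Fraction of n random tables that show a Simpson reversal."""
    rng = random.Random(seed)
    return sum(is_simpson(random_table(rng)) for _ in range(n)) / n

print(estimate(20000))
```

The function `is_simpson` also accepts raw counts, so it can be applied to the worked examples elsewhere on this page.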