illuminating science

29/6/2008

The inner crowd or just statistics?

Filed under: — Joel @ 2:50 am

Synopsis: A recent journal article claims (literally) that second guessing ourselves gives better accuracy. I think this is wrong, and that what they’re seeing is just statistics. I’d like some opinions on this!

A recent paper in the journal Psychological Science (and much publicised by The Economist in an easier-to-digest article) makes some interesting claims. Basically the story goes that if you ask two people to estimate something (like the number of jelly beans in a jar, or the percentage of world airports in the USA), then averaging their two guesses gives you a better estimate than either alone (on average). It’s called the wisdom of crowds; extend it to a hundred people, and up to a point, the group guesses get better.

Fair enough, I could believe this, I think (more specifically, the group will average towards the “group bias”, and you’re assuming this is a “good” guess. I digress…) But the new paper goes one step further: it suggests that even one person can improve their guess by making two guesses and averaging them, improving accuracy by 10 percent. The article’s authors suggest:

Although people assume that their first guess about a matter of fact exhausts the best information available to them, a forced second guess contributes additional information, such that the average of two guesses is better than either guess alone. This observed benefit of averaging multiple responses from the same person suggests that responses made by a subject are sampled from an internal probability distribution, rather than deterministically selected on the basis of all the knowledge a subject has.

Translation: we don’t come up with one best guess straight off; instead, each “guess” comes from a range of possible values our brain has computed. Furthermore, they note that a delay of three weeks between the first and second guess improves the average, presumably by making the guesses more “independent”.

But something about this bugged me, so I did a little computational experiment myself: Generate a random number (between 0.0 and 1000.0, fractions allowed) which is the “right” answer and generate two more random numbers which are my guesses (also between 0.0 and 1000.0). Then, look at the difference between the first guess and the “correct” answer, and between the average of my two guesses and the “correct” answer. Repeat.

So to be clear: I’m randomly choosing the correct answer, and then I’m randomly making two guesses with no information other than a lower and upper bound. This would be perfectly reasonable for questions like the airport percentage above (which was mentioned in the article) where you know the answer is between 0-100. I did 1200 tests (in Excel; it’s not enough for the averages to be absolutely constant, but it doesn’t change the qualitative results.)
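The experiment described above can be sketched in a few lines of Python instead of Excel. This is only a minimal sketch: the function name `simulate`, its parameters and the fixed seed are my own choices for illustration, and it assumes (as above) that the answer and both guesses are drawn independently and uniformly between 0 and 1000.

```python
import random

def simulate(trials=1200, low=0.0, high=1000.0, seed=None):
    """Return (mean |first guess - answer|, mean |average of two guesses - answer|)."""
    rng = random.Random(seed)  # a private generator keeps runs reproducible
    err_first = 0.0
    err_avg = 0.0
    for _ in range(trials):
        answer = rng.uniform(low, high)  # the randomly chosen "right" answer
        g1 = rng.uniform(low, high)      # first guess, no information used
        g2 = rng.uniform(low, high)      # second, independent guess
        err_first += abs(g1 - answer)
        err_avg += abs((g1 + g2) / 2 - answer)
    return err_first / trials, err_avg / trials

if __name__ == "__main__":
    first, averaged = simulate(trials=1200, seed=1)
    print(f"first guess off by {first:.0f}, averaged guess off by {averaged:.0f}")
```

With only 1200 trials the two numbers wobble by several units from run to run; raising `trials` makes them settle down.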

The results: On average, the first guess was off by 330 (which I’m sure you could argue theoretically, too). But for the average (mean) of my two guesses, I was only off by 290 - more than 10% better than before! Here’s a summary:

Average “answer”: 492
First Guess average: 495
Average of two guesses average: 499
Deviation of Guess 1 (average): 332
Deviation of AvGuess (average): 293
Percentage change: -12%
RMSE Guess 1: 406
RMSE AvGuess: 360
Percentage change in RMSE: -11%

Notice the averages are in about the right place; my data is pretty random, and yet simply taking the average gets me closer.

I’ve also included there the root mean square errors (RMSE), which were discussed in the paper [Disclaimer: I don’t really have a stats background: for the RMSE I took the difference between each guess and the corresponding correct answer, squared each difference and added the results, then averaged this sum and took the square root. I hope that’s right…] Unfortunately, I don’t know exactly the data numbers and ranges used in the article, so I’m not quite sure how to compare them directly, but the percentage changes are what’s important, I think, and they fit nicely with the paper’s predictions (of between 5-15%).
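The RMSE recipe in the disclaimer above (square each difference, add them up, divide by the number of differences, take the square root) looks like this in Python. The function name `rmse` and the toy numbers are hypothetical and only there to show the arithmetic.

```python
import math

def rmse(guesses, answers):
    """Root mean square error between paired guesses and correct answers."""
    if len(guesses) != len(answers) or not guesses:
        raise ValueError("need two non-empty lists of equal length")
    # Square each difference, then sum the squares.
    squared = [(g - a) ** 2 for g, a in zip(guesses, answers)]
    # Average the sum, then take the square root.
    return math.sqrt(sum(squared) / len(squared))

if __name__ == "__main__":
    # Two made-up guesses that miss by 300 and 400.
    print(rmse([100, 900], [400, 500]))
```

Because the differences are squared before averaging, a few large misses push the RMSE up more than they push up the plain average deviation.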

Obviously, I made some assumptions, in particular that no person had the faintest clue what the “right” answer should be. But that’s not ridiculous, and it shouldn’t be hard to redo the data with a normal distribution of guesses around the mean; I’d expect it to give the same result, but the conclusion is now trivial (average two deliberately normally distributed guesses, and you’re going to get closer to the mean, on average :) ).

I also did a couple of trials where I fixed the “correct answer” to the same thing for every person. If the correct answer was 0 (out of 1000), then, as you’d expect, a single guess and an averaged guess both produced the same average deviation, 500. Interestingly, however, their RMSEs were quite different, and the averaged guess again came out 6% better by this metric. Same for a correct answer of 1000. Why? Averaging the two numbers produces a lower standard deviation around the average (500); the RMSE weights big differences more, so the values further away from 0 (or 1000, respectively) contribute much more, even allowing for those corresponding closer values. (Does that make sense?) So RMSE might not be a great measure here.

What about if we take 500 as the correct answer? Then the effect is even more pronounced - the averaged guess has a lower standard deviation and is closer to 500 on average, so the improvement is much bigger. 250 or 750 were similar, and I’d expect this to be true for the whole range.
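The fixed-answer trials can be reproduced with a sketch like the one below. The function name `fixed_answer_trial`, the dictionary keys and the seed are my own illustrative choices, and the guesses are again assumed to be independent and uniform between 0 and 1000.

```python
import math
import random

def fixed_answer_trial(answer, trials=1200, low=0.0, high=1000.0, seed=None):
    """Compare one guess with the mean of two guesses against a fixed answer.

    Returns the mean absolute error (MAE) and RMSE for both strategies.
    """
    rng = random.Random(seed)
    abs_single = abs_avg = sq_single = sq_avg = 0.0
    for _ in range(trials):
        g1 = rng.uniform(low, high)
        g2 = rng.uniform(low, high)
        avg = (g1 + g2) / 2
        abs_single += abs(g1 - answer)
        abs_avg += abs(avg - answer)
        sq_single += (g1 - answer) ** 2
        sq_avg += (avg - answer) ** 2
    return {
        "mae_single": abs_single / trials,
        "mae_avg": abs_avg / trials,
        "rmse_single": math.sqrt(sq_single / trials),
        "rmse_avg": math.sqrt(sq_avg / trials),
    }

if __name__ == "__main__":
    for answer in (0, 250, 500, 750, 1000):
        result = fixed_answer_trial(answer, trials=100_000, seed=1)
        print(answer, {key: round(value) for key, value in result.items()})
```

For an answer of 0 or 1000 the two MAE values should agree while the RMSE values differ; for an answer of 500 both measures should favour the averaged guess by a wide margin.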

In conclusion, I would argue that any benefit seen in this study is simply from statistics, not from some innate feature of our brain’s estimating abilities. The only thing they did get right is that the time delay probably allows the second guess to be more “random” which seems to be useful, on average!

Thoughts? If this makes sense, I’m going to write a rebuttal/similar, but I’d love some feedback first (particularly if I’m completely wrong and/or have missed something obvious!) The preprint article is missing a key figure, but I don’t think I’ve misunderstood or missed anything important (except for their actual data values). All comments welcome!

Addendum: I finally did some simple analytical calculations - when both guesses and the answer are chosen randomly and independently from 0-1000, the RMSE of one guess should be 410, compared to 355 for two averaged guesses, just like the simulations predict! I’ve also checked that the average absolute differences for one guess should be 1000/3, which looks right, but I haven’t yet slogged through the maths for the average guess case (does anyone know a shortcut for analytical expectation values of absolute values?!)
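For anyone who wants to check the RMSE figures, here is one way the calculation can go. It is a sketch under the same assumptions as the simulation: the answer A and the guesses G_1, G_2 are independent and uniform on [0, L] with L = 1000. The symbols are mine, not the paper's.

```latex
% A uniform variable on [0, L] has mean L/2 and variance L^2/12.
% All means are equal, so the expected squared error is a sum of variances.
\begin{align*}
\mathbb{E}\big[(G_1 - A)^2\big]
  &= \operatorname{Var}(G_1) + \operatorname{Var}(A)
   = \frac{L^2}{12} + \frac{L^2}{12} = \frac{L^2}{6},
  &\text{RMSE} &= \frac{L}{\sqrt{6}} \approx 408, \\
\mathbb{E}\Big[\Big(\tfrac{G_1 + G_2}{2} - A\Big)^2\Big]
  &= \frac{1}{4}\cdot\frac{2L^2}{12} + \frac{L^2}{12} = \frac{L^2}{8},
  &\text{RMSE} &= \frac{L}{\sqrt{8}} \approx 354.
\end{align*}
```

These agree with the rounded figures of 410 and 355 quoted above, and the ratio between them is an improvement of about 13%.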

Addendum #2: I’ve now done the averaged-guess case for the absolute difference (I’m so slow sometimes…) and I predict 333 for the single guess or 291 for the averaged guess - right on the money! Thanks to Tim for useful comments about triangular probability distributions!
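One possible shortcut for the absolute-difference case, which may not be the route I took with the triangular distribution, is to condition on the guess first. The sketch assumes the answer A is uniform on [0, L] with L = 1000, and that X is any guess independent of A whose distribution is symmetric about L/2 (a single uniform guess G, or the average M of two).

```latex
\begin{align*}
\mathbb{E}\,\lvert x - A\rvert
  &= \frac{1}{L}\int_0^x (x - a)\,da + \frac{1}{L}\int_x^L (a - x)\,da
   = \frac{x^2 + (L - x)^2}{2L}, \\
\mathbb{E}\,\lvert X - A\rvert
  &= \frac{\mathbb{E}[X^2] + \mathbb{E}[(L - X)^2]}{2L}
   = \frac{\mathbb{E}[X^2]}{L}
   \quad\text{(by symmetry about } L/2\text{)}, \\
\text{single guess: } \mathbb{E}[G^2] &= \frac{L^2}{3}
  \;\Rightarrow\; \mathbb{E}\,\lvert G - A\rvert = \frac{L}{3} \approx 333, \\
\text{averaged guess: } \mathbb{E}[M^2] &= \frac{L^2}{4} + \frac{L^2}{24} = \frac{7L^2}{24}
  \;\Rightarrow\; \mathbb{E}\,\lvert M - A\rvert = \frac{7L}{24} \approx 292.
\end{align*}
```

This avoids integrating an absolute value against the triangular density directly, because only the second moment of the guess is needed.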

25/6/2008

Crazy talk

Filed under: — Joel @ 7:21 pm

I was reading about the Salvation Army today, and was quite shocked to discover that, in the US at least, religious organisations are exempt from many of the civil rights laws if such laws would violate the “creed” of the religion. In particular, the question is whether religious groups must obey laws preventing discrimination against homosexuals or requiring the payment of benefits to same-sex partners. This is from an old New York Times article, but still relevant:

There is no federal law barring discrimination against gays and lesbians, and there are provisions in federal law that exempt religious groups from civil rights requirements that violate their creeds. But 12 states and more than 100 cities and counties have laws to protect gays. At issue is whether religious groups administering social programs with public money in those places should be — or are — bound by those laws.

Furthermore, as a religious group, despite being a billion-dollar multinational organisation, the Salvation Army is exempt from various reporting requirements that apply to all other charities.

I’m just sayin’, a rational person looking at these sorts of arrangements would really start to question the special status given to religion and religious groups in today’s society. Not that I’m saying there shouldn’t be religious groups (well, not in this post) but they should have to live by the same laws as everyone else. Surely that’s reasonable?

In other news, Heinz pulled an ad today which showed two men kissing, because:

…viewers complained to the Advertising Standards Authority that it was “offensive” and “inappropriate to see two men kissing”.

Other complaints include that the ad was “unsuitable to be seen by children” and that it raised the difficult problem of parents having to discuss the issue of same-sex relationships with younger viewers.

Gah.

16/6/2008

Great Astrology Paragraph

Filed under: — Joel @ 7:00 pm

From a really good high school textbook, “New Century Senior Physics”, comes this choice quote on astrology:

Astrology slowly became the mumbo-jumbo side of sky watching, and was relegated to the irrational, non-scientific and hoaxers club together with pyramid power, clairvoyance, ESP, water divining, flat earth theory, numerology, faxes from the dead, Feng Shui, Tarot cards, Bermuda Triangle, runes, UFOs, levitation, Philadelphia experiment, faces on the Moon and Elvis sightings, to name just a few.

Astrology has stagnated in the pre-scientific theories of thousands of years ago. It has no testable hypotheses, no statistically reliable evidence of past successes, no research program, no predictive power that can be tested by experiment. Astrologers make many extravagant claims of success but they have never stood up to rigorous scientific scrutiny. In short - astrology is an article of faith, of pseudo-scientific hocus-pocus and rightly belongs in the comic section of the newspaper.

Awwwwwwesome.

15/6/2008

Binge drinking

Filed under: — Joel @ 10:38 pm

The Australian Medical Association has just released guidelines which define binge drinking as four standard drinks per night.

This is pretty tight - an average bottle of wine is about seven standard drinks, so under the new guidelines, if two of you order a bottle of wine over dinner you are flying awfully close to the sun. (Assuming you are lucky enough to have a partner to share the wine with - otherwise, you’re in real trouble.) And if one of you drinks more than your fair share, then you’re binging.

I’m not certain how to respond to this. I’m not a heavy drinker for the most part, and I do think that many people drink too much. (Even without a degree in medicine, it’s hard to imagine that drinking to the point of passing out is good for your body…) But is four drinks (per night) really excessive? I haven’t seen any discussion of the medical reasons (and Wikipedia just talks about bladder ruptures!) There’s nothing in the news stories so far, and I think it’s something that really needs to be put out there. Otherwise, this could run the risk of being portrayed as scare tactics in the media. From the AMA’s national president:

“The definition of binge drinking is something that perhaps hasn’t been brought down to the level of four drinks per night, four standard drinks per night,” she said.

“I think many Australians will be reflecting on their habits at home and wonder whether we are binge drinking on a very regular basis.”

Dr Capolingua says this could make people think twice about how much they are drinking.

The fact that we need to bring drinking to people’s attention isn’t a medical argument. Sure, it worked for me - I don’t think of myself as a heavy drinker, but I would on occasion have four or five drinks over an evening. (“Am I really binge drinking even on that? Yikes!”) But I can’t help but think this weakens the stigma (however slight…) attached to “binge drinking”. It’s similar to something I’ve often seen around roadworks: on a 60 km/h road, many people will obey a 40 km/h roadworks sign, but ignore a 20 km/h sign as just too low, and travel at their regular speed!

I’ll be interested to see the commercial media’s response, and the general community feeling. I suspect, though, that it’s not going to do good things for the AMA’s reputation.

4/6/2008

What is a “human”?

Filed under: — Joel @ 12:19 am

Here’s an interesting story about a group of animal rights activists attempting to get a chimpanzee declared a human in an Austrian court of law.

Animal rights activist and teacher Paula Stibbe…wants the chimpanzee, named Matthew Hiasl Pan, declared a person. That way, Stibbe says she can become the primate’s legal guardian if the bankrupt animal sanctuary where Matthew lives closes.

Although this court case is really just an attempt to get around legal obstacles, it does raise a fascinating point, and one which has been around for a while - what is a human? Chimpanzees are the closest relatives of humans, sharing about 94% of our DNA. They can use tools, show altruism, have been taught sign language and been able to communicate, albeit only roughly and with debatable grasp of grammar. Young chimps even outperform college students on memory tests (which I spoke about on the radio recently!)

If you say that a certain level of competency is required, then what about babies, or small children? The next step is usually to talk about potential, in that a baby has potential to become much more, whereas a chimp doesn’t. But what about people with disabilities, particularly severe disabilities? And so on.

There’s the other side too:

If Matthew the chimp were declared a person, scientists foresee it would open a messy can of worms.

“In general, I don’t think that it’s a good idea to grant chimpanzees legal human rights,” [said John Mitani, a primate behavioral ecologist at the University of Michigan] “Chimpanzees are well-known to kill each other. What would we do to perpetrators of those ‘crimes?’”

That’s a neat question. Could a chimp plead diminished responsibility/capacity? But if that will always apply to a chimp, surely they don’t deserve to be classed as human? But what about those humans who do make this defence? And so on.

Very similar arguments come up a lot in spiritual discussions (e.g., whether animals have souls), in making the case for or against abortion, and broader questions of ethics such as euthanasia. It also ties nicely into the broader question of the definition of “life” and “intelligent life” - but that’s for another post!

What do you think? If this were being decided by jury, how would you vote and why? If you were to invite animals into the Human Clan, who would they be? Is the plot of The Bee Movie going to become a reality?!
