That's a really good reminder of what the rules are to make the wisdom of the crowd produce a better estimate than an expert guess.
While I bet someone in the audience might have known the actual number of Roman emperors, and the sum of the numbers on a roulette wheel is well-known enough trivia that some people *had* to know it, it's a different game entirely when we talk about unknowable answers.
Folks in Galton's study had no means of knowing the ox's weight upfront. A famous wisdom-of-crowds story is the search for the location of the USS Scorpion after it sank. No one had the answer.
And it's a similar story with any estimates we make. We don't have a way of knowing upfront how long something will take.
Yet, all the wisdom of the crowd's caveats still apply:
a) People need to have relevant information
b) They need to act independently (which in many estimation contexts doesn't work)
c) There needs to be diversity of opinions
Just publicly throwing out numeric guesses about features we have little understanding of *is not* the wisdom of crowds. It's even worse if one person in the room has more decision leverage than the others (yup, planning poker, I'm looking at you).
I can't get over the fact that the median person thought there were only 10 Roman emperors. I would have underestimated the number, but just 10 is insane, and almost half of the people had to have thought it was even fewer than that.
seems to be more of a confirmation that most of what people know is wrong
I think that is unfair. You're right that if everyone knew all the answers there wouldn't be a result to talk about, but the estimation problems were chosen for the experiment exactly because they are hard to answer and are things that most people don't know.
I really doubt this works the way the authors suggest. When you ask a group of people a question that many of them have no clue about, you just end up with random guesses. Forming small groups just allows the guessers a chance to adopt the answers of those who at least have a rationale for their answer. This is nothing like Galton's original problem, which involved a group of people all likely to have some insight into the correct solution. All it proves is that asking people who have no relevant information is a really bad way to get an answer. If a high enough proportion of people do know something, then forming groups just weights their answers higher. If very few have any information, you still get garbage, as with the emperor question.
If I've understood your comment correctly, and the study authors, I think you are in agreement. The group discussion allows those without a rationale to adjust their answers in favour of those who do have one (indeed, there's scarcely enough time to do more). This is why there is improvement even on the questions - like the Roman emperor one - where most people have bad information/starting points.
>The idea was that maybe group discussion just performed a statistical function which could be matched by the averaging rule. No averaging rule tested gave answers as good as the group consensus answer, suggesting that group - even in that single minute - is able to integrate and evaluate information in a more sophisticated way.
If you asked people to rate their confidence as well, there probably is such an averaging rule, or close to it.
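To make that concrete, here is a toy sketch of what such a confidence-weighted rule could look like. All the numbers are made up for illustration, and the experiment itself didn't collect confidence ratings, so this is purely hypothetical:

```python
# A toy sketch of a confidence-weighted averaging rule.
# Estimates and confidence ratings below are invented for illustration.
estimates  = [8.0, 15.0, 70.0, 40.0]   # individual guesses
confidence = [0.2, 0.3, 0.9, 0.5]      # hypothetical self-ratings, 0..1

# Plain mean: every guess counts equally, so noise from
# low-confidence guessers drags the crowd answer around.
plain_mean = sum(estimates) / len(estimates)

# Confidence-weighted mean: approximates "yield to the people
# who actually have a rationale for their answer".
weighted_mean = sum(e * c for e, c in zip(estimates, confidence)) / sum(confidence)

print(plain_mean)     # 33.25
print(weighted_mean)  # ~46.9
```

If weighting by stated confidence got you close to the group-consensus answers, that would suggest discussion mostly filters for who to trust rather than generating new information.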
No, they are interpreting the group as some kind of exchange of ideas that improves everyone's estimates. That is different from just weeding out the people who have no idea. People with no knowledge that can help them answer are just noise, often very biased noise. That is why the groups don't improve the answer to the emperor question; there might be one scholar of Roman history, but he can only influence a few people.
The groups didn't discuss the Roman emperor question, so it is a bad example to choose. The oil barrel question is maybe structurally similar - individual answers were a long way from the correct answer. Here, as with the other questions, the post-discussion agreed answer (and the mean of post-discussion individual answers) was an improvement on the mean of pre-discussion answers, suggesting that discussion caused people to yield to better-reasoned answers and adjust their own estimates in that direction.
there were about 70 emperors though? iirc
How many Roman emperors - so tell me first how you define the Roman Empire. Ending in 1453 with the fall of Constantinople? When was the final emperor in the west (much disputed)? Do you include all, some, or none of the disputed emperors at different periods? What about Holy Roman Emperors through to 1806? It is a null question.
Calories in butter? Surely that depends on the butter...
As ever, where the survey questions are rubbish, the results will be rubbish.
This criticism fails because even with ambiguous questions (which all questions are to some extent), success in figuring out which answer the question expects can still be measured and improved (e.g. by group discussion). If the questions really were impossible, the answers would vary randomly and there would be no directional trend of improvement, because there couldn't be better or worse answers to impossible questions.
I did not say the question was impossible. I said your posing of it was ignorant. To the extent you got answers, they were people agreeing on what is commonly thought, with a heavy serve of the group's cultural knowledge.
AI gives us that now without all your palaver.
If I had been in the group I would have said I had no basis on which to estimate.
Grand theory on shaky empirical roots is shonky stuff.
I cannot take any credit for the design, running, or reporting of the experiment - I just wrote about it for my newsletter.
If you believe there is no basis on which to estimate answers, then I think you are right: deliberation is not for you, and you are better off asking AI for your answers.
True. If I wanted to know the height of the Eiffel Tower, I would Google it and assess the answers that emerged, looking for consistency from likely reliable sources. E.g. https://www.toureiffel.paris/en/news/history-and-culture/300-330-meters-story-towers-height
Which agrees with the summary pages I found above it. Took a minute or so to review.
I knew the answer on butter (I'm a triathlete and pay attention to things like calories per gram of fat) and the roulette wheel question is basic math. So, I could have persuaded my group to give the right answer on those two, and a pretty good answer on some of the others. Does this count as wisdom of crowds?
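For anyone checking the "basic math": assuming the standard wheel numbered 0 through 36 (an American wheel adds a 00 pocket, which doesn't change the sum), it's just the triangular-number sum:

```python
# The pockets on a standard roulette wheel run 0 through 36,
# so the sum is n * (n + 1) / 2 with n = 36.
total = sum(range(37))   # 0 + 1 + ... + 36
print(total)             # 666
```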
The interesting thing is that persuading your group wouldn't count as the "wisdom of crowds" in the sense this literature uses the term - that is restricted to the answers you get by averaging individual answers. I would call persuading the group that you know the answer deliberation, and we have other studies showing that the exchange of reasons, not just information, is key to this. In your example you gave one explicit reason, "I know because I am a triathlete", and one implicit, "it's basic math", which implies your level of math is good enough to consider some math basic!