Tuesday, May 8, 2012

Application of Pascal’s Triangular Plus Monte Carlo Analysis to Appraise the Wisdom of Crowds


Following to the article of “Application of Pascal’s Triangular Plus Monte Carlo Analysis to Calculate the Risk of Expected Utility” on link: http://emfps.blogspot.com/2012/05/application-of-pascals-triangular-plus.html, the purpose of this article is to appraise the wisdom of crowds by using of a simulation model which is the combination of Pascal’s Triangular method and Monte Carlo simulation model. The most important finding is to be proved the research study conducted in 1906 by the British scientist Sir Francis Galton (1822 – 1911) in the case of the Wisdom of Crowds in which we can say there is a very good  adaptability between the mathematical models and Human’s sense. I think that we can also use from this simulation model for Market Predictions.
According the book of “The Wisdom of Crowds, Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economics, Societies, and Nations” by James Surowiecki (2004), he stated the story of research study conducted by Sir Francis Galton in 1906 in related to the wisdom Crowds. I have uploaded the details of this story for your convenience as follows:

                                   British Scientist Sir Francis Galton (1822 – 1911)   





In the meanwhile, you can also read this story on below link:
http://www.hsph.harvard.edu/student-life/orientation/files/the_wisdom_of_crowds_reading.pdf


Methodology

Let me explain you the methodology and the procedure of making this simulation model step by step as follows:
1) I used the data included in the story of the research study done by Sir Francis Galton in 1906 as my example to make this simulation model where the real weight of ox was equal to 1198 pounds and average weight of ox announced by 787 people (participants) was equal to 1197.
2) I assumed the percentage of errors equal to 1% (plus/minus), 10% (plus/minus), 20% (plus/minus), and 30% (plus/minus) where we have:
                    1% error         10% error         20% error       30% error
Plus             1209.98                  1317.8                  1437.60               1557.40
Minus         1186.02                  1078.2                   958.40                 838.60

3) The difference between Plus and Minus will be equal to the total trials (the possible outcomes) as follows:

                       1% error            10% error                20% error                30% error
Trials (n)              24                          240                            480                           719

As the matter of fact, each number of trials between 0 and 24, 0 and 240, 0 and 480, 0 and 719 has been assigned to each weight of ox between 1210 and 1186, 1318 and 1078, 1438 and 958, 1557 and 838. There are two situations in which we assign zero to the least weight; in this case, we will have all variances from right side of real weight (all percentages are negative). Another situation is that we assign zero to the most weight, in this case, we will have all variances from left side of real weight (all percentages are positive). It does not make any different.
4) According to the total trials (the possible outcomes), I used from Pascal’s Triangular to find nominal coefficient (C (n, r)) respectively.
5) To calculate the total amount of outcomes for each possible outcome
6) To calculate the total sum of outcomes for the total trials
7) To divide step 5 to step 6 for each possible outcome to obtain cumulative probability
8) To find out the consequence of all possible outcomes (which is the same the Binomial Probability Distribution) by using of accumulative probability (step 7)
9) Using of the Binomial Probability Distribution to make the basic feeds for Monte Carlo Simulation Model. In this case, we should make all consequences as the assigned probability or cumulative probability (cut –offs) for using into our Monte Carlo Simulation Model. If we want to make very exact Cut – Offs, we can utilize from excel formulas of
“= NORMINV (Probability, mean, STDEV) and = NORMDIST (x, mean, STDEV, true)”
But, at the first we should normalize our Binomial Probability Distribution by using of below formula instead of “mean and STDEV” as follows:
mean = n / 2  ,  n = trials
STDEV = ((n) ^0.5) / 2
10) After making Cut – Offs by using of step (9), Using of the Rand formula = Rand () on our excel spreadsheet for all participants (787 people)
11) Using of the VLOOKUP formula = VLOOKUP (Rand cell, Cut-offs, 2) for all participants (787 people)
12) To calculate the average of all participants (787 people), 100 people, 50 people, and 10 people
13) The increase iterative calculations by using of a two –way Table just like Sensitivity analysis in which row is the numbers of 1 to 10; column is the numbers of 1 to 400 or 1000 and independent variable is the result of step 12
13) To calculate the average of ox’s weight, Standard deviation (STDEV), Coefficient of Variation (CV) or Risk percentage and Variance of the real weight (%) for 1% error (plus/minus), 10% error (plus/minus), 20% error (plus/minus), 30% error (plus/minus) and also for 787 participants, 100 participants, 50 participants, and 10 participants.

Finding and discussion

 According to the step 1 to 9, I calculated all Cut – Offs as follows:

1% error
Cut offs
Weight
0
1186
3.55E-06
1192
1%
1195
10%
1197
40%
1201
90%
1210
100.00%
1210


10% error
Cut offs
Weight
0
1078
1.45E-53
1180
1%
1188
10%
1196
40%
1208
90%
1257
100.00%
1318


20% error
Cut offs
Weight
0%
958
1.2E-105
1172
1%
1184
10%
1195
40%
1212
90%
1284
1
1438

30% error
Cut offs
Weight
0
838
1.5E-157
1166
1%
1180
10%
1194
40%
1215
90%
1304
100%
1557

Regarding to the step 10 to 13, the final results are below cited:

   30% Error
787 people
Average
1213.951188
STDEV
1.096048697
CV
0.000902877
CV (max)
0.10%
CV (Ave.)
0.000954337
STDEV (ave.)
1.158530439
STDEV (max)
1.216094929
Weight (Ave.)
1213.96
Real weight
1198
Error
-1.33%

100 people
Average
1213.854
STDEV
3.335317
CV
0.002748
CV (max)
0.28%
Cv (average)
0.26%

50people
Average
1214.281

STDEV
4.601357

CV
0.003789

CV (max)
0.39%

Cv (average)
0.38%


10people

Average
1213.519
STDEV
10.28537
CV
0.008476
CV (max)
0.87%
Cv (average)
0.84%

20% Error
787 people
Average
1209.92
STDEV
0.792926
CV
0.000655
CV (max)
0.07%
CV (Ave.)
0.0007
STDEV (ave.)
0.846342
STDEV (max)
0.891621
Weight (Ave.)
1209.892
Real weight
1198
Error
-0.99%
100 people
Average
1211.146
STDEV
2.659586
CV
0.002196
CV (max)
0.22%
Cv (average)
0.21%
50 people
Average
1211.209
STDEV
3.708314
CV
0.003062
CV (max)
0.32%
Cv (average)
0.31%
10 people
Average
1210.929
STDEV
7.793489
CV
0.006436
CV (max)
0.70%
Cv (average)
0.67%


10% Error
787 people
Average
1207.204
STDEV
0.655779
CV
0.000543
CV (max)
0.06%
CV (Ave.)
0.000531
STDEV (ave.)
0.640865
STDEV (max)
0.6879
Weight (Ave.)
1207.215
Real weight
1198
Error
-0.77%
100 people
Average
1207.214
STDEV
1.74593
CV
0.001446
CV (max)
0.16%
Cv (average)
0.15%
50 people
Average
1207.27
STDEV
2.682693
CV
0.002222
CV (max)
0.23%
Cv (average)
0.21%
10 people
Average
1206.959
STDEV
5.516085
CV
0.00457
CV (max)
0.51%
Cv (average)
0.47%


1% Error
787 people
Average
1200.064927
STDEV
0.144784429
CV
0.000120647
CV (max)
0.01%
CV (Ave.)
0.01%
STDEV (ave.)
0.141034486
STDEV (max)
0.145644979
Weight (Ave.)
1200.072824
Real weight
1198
Error
-0.17%
100 people
Average
1200.078
STDEV
0.399969
CV
0.000333
CV (max)
0.04%
Cv (average)
0.033%
50 people
Average
1200.086
STDEV
0.552151
CV
0.00046
CV (max)
0.05%
Cv (average)
0.047%
10 people
Average
1200.144
STDEV
1.314832
CV
0.001096
CV (max)
0.11%
Cv (average)
0.107%

I compared the changes of Error (plus/minus) to Risk (CV) on below graph:



As you can see, there is a significant increase on Risk from 1% error to 10% error then it will go up slowly.

In the case of the changes on error to variance of real weight and average weight, I can say that the results are the same above mentioned. Please see below graphs:






The most important finding is about the changes of Risk to increase of participants (people) on below graph:






This graph shows that the Risk will sharply decrease, if the participants (people) increase more than 50 people. This result is always the constant for the changes on error’s percentages. I can say that the research study conducted by Sir Francis Galton is approximately approved, if the range of error does not increase more than 1%.
 
Note:  “All spreadsheets and calculation notes are available. The people, who are interested in having my spreadsheets of this method as a template for further practice, do not hesitate to ask me by sending an email to: soleimani_gh@hotmail.com or call me on my cellphone: +989109250225. Please be informed these spreadsheets are not free of charge.”