2 votes

Suppose I survey 10 people, asking each to rate a movie from 0 to 4 stars. Allowable answers are 0, 1, 2, 3, and 4.

The mean is 2.0 stars.

How do I calculate the certainty (or uncertainty) about this 2.0 star rating? Ideally, I would like a number between 0 and 1, where 0 represents complete uncertainty and 1 represents complete certainty.

It seems clear that the case where the 10 people choose ( 2, 2, 2, 2, 2, 2, 2, 2, 2, 2 ) would be the most certain, while the case where the 10 people choose ( 0, 0, 0, 0, 0, 4, 4, 4, 4, 4 ) would be the least certain. ( 0, 1, 1, 2, 2, 2, 2, 3, 3, 4 ) would be somewhere in the middle.

4 Answers

6 votes

The standard deviation does not, by itself, have the properties you asked for. It is zero when everyone chooses the same answer, and can be as great as sqrt(40/9) ≈ 2.11 when there are five 0s and five 4s.

I suggest you use 1 - stdev(x)/sqrt(40/9), which takes the value 1 when everyone agrees and the value 0 when there are five 0s and five 4s.
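A minimal sketch of this in Python (the helper name `certainty` is mine; it assumes the sample standard deviation and the 10-vote, 0-to-4 setup from the question):

    import math
    import statistics

    def certainty(votes, max_sd=math.sqrt(40 / 9)):
        # 1 = complete agreement, 0 = maximal disagreement.
        # max_sd is the largest sample standard deviation possible
        # for 10 votes on a 0-4 scale (five 0s and five 4s).
        return 1 - statistics.stdev(votes) / max_sd

    print(certainty([2] * 10))                        # 1.0
    print(certainty([0] * 5 + [4] * 5))               # 0.0
    print(certainty([0, 1, 1, 2, 2, 2, 2, 3, 3, 4]))  # ~0.45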

3 votes

The statistic you're after here is the standard deviation.

The standard deviations of your three examples are 0 (no deviation at all), 2.11 (the maximum possible), and 1.15 (somewhere in between).
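A quick check in Python, assuming the sample (n-1) standard deviation, which is what these figures correspond to:

    import statistics

    examples = [
        [2] * 10,                        # complete agreement
        [0] * 5 + [4] * 5,               # maximal disagreement
        [0, 1, 1, 2, 2, 2, 2, 3, 3, 4],  # in between
    ]
    for votes in examples:
        print(round(statistics.stdev(votes), 2))  # 0.0, 2.11, 1.15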

0 votes

You should consider whether or not the mean is an appropriate statistic for this kind of information, i.e. is a movie rated 4 stars twice as good as one rated 2 stars?

You may be better served by a percentile measure (such as the median) to represent the central tendency, and a percentile range (such as the interquartile range, IQR) to measure 'certainty'. As in the answers above, a value of 0 would indicate the greatest certainty, since you are really measuring dispersion around the central tendency.
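A sketch of that approach in Python (the 'inclusive' quantile method is one choice among several; other conventions give slightly different IQRs):

    import statistics

    examples = [
        [2] * 10,
        [0] * 5 + [4] * 5,
        [0, 1, 1, 2, 2, 2, 2, 3, 3, 4],
    ]
    for votes in examples:
        median = statistics.median(votes)
        q1, _, q3 = statistics.quantiles(votes, n=4, method='inclusive')
        print(f"median={median}, IQR={q3 - q1}")  # IQR: 0.0, 4.0, 1.5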

Incidentally, a survey of 10 people is too small to perform much in the way of meaningful statistical analysis.