12
votes

I'm trying to visualize some data as a pie chart. My data is structured as a list of (Season, Int) tuples, where the integer represents the number of items that are available for a particular season.

The difficulty is that one item can be in multiple seasons: an item can be valid for both fall and winter.

Is it possible to display this information as a pie chart? It's not clear what the denominator should be when calculating percentages, because the sum of the counts can be greater than the total item count.

As an example to make it clearer, let's say I have 10 items. A valid dataset might be:

  • (Fall, 4)
  • (Winter, 5)
  • (Summer, 3)
  • (Spring, 10)

The total number of items represented here is 22, but there are truly only 10 items.

Calculating the fall percentage as 4 / 10 doesn't really make sense, but neither does 4 / 22.

Is this data just not compatible with pie charts (or other percentage-oriented charts)?

1
Why do you think the 4 / 22 percentage does not make sense? It is the percentage of "occurrences" of items for the Fall season. Edit: whether it makes sense depends on what "item availability" for a season means and what is being represented. If, say, the purpose is to compare the number of performances of theater plays per season, 4 / 22 would make sense. - Mehdi
Is this all the data you have, or do you have a way to count how many categories each item appears in? - 7hibault
@7hibault Yes, the original data is of type [(Season, Item)]. - Bill

1 Answer

1
votes

It depends on what business question the plot tries to represent. There are best practices that we use in data visualization. These best practices are guided by statistics and human perception. We want the plot to immediately tell the story.

Reading your post, it looks like the business question you are trying to answer is: what percentage of the company's total items is available in each season? Using the numbers in your post, 100% of the items are available in the spring, and only 50% are available in the winter.
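To make the two possible denominators concrete, here is a minimal Python sketch using only the numbers from the question. The variable names are my own, and Python is just a convenient choice for illustration; nothing here is specific to your setup.

```python
# Counts per season and the total number of distinct items, both from the question.
season_counts = {"Fall": 4, "Winter": 5, "Summer": 3, "Spring": 10}
total_items = 10

# Denominator 1: distinct items. Answers "what share of all items is
# available in this season?" These shares may sum to more than 100%
# because one item can belong to several seasons.
share_of_items = {s: 100 * n / total_items for s, n in season_counts.items()}

# Denominator 2: sum of the counts. Answers "what share of all
# (season, item) occurrences falls in this season?" These sum to 100%.
total_occurrences = sum(season_counts.values())
share_of_occurrences = {
    s: 100 * n / total_occurrences for s, n in season_counts.items()
}

for season in season_counts:
    print(
        f"{season}: {share_of_items[season]:.0f}% of items, "
        f"{share_of_occurrences[season]:.1f}% of occurrences"
    )
```

Only the second set of percentages would fit a pie chart, and it answers a different question from the one above.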

This is a great business question to visualize (if it is indeed the question you are trying to answer), but you can't use a pie chart for it. The slices of a pie chart must add up to 100%, and these seasonal percentages do not. Bar charts are good for comparisons, and I recommend using one here. You can make the y-axis units percent and put four bars along the x-axis, one per season.

The bars in a bar chart do not have to add up to 100%, but if you are concerned that people may wonder about it, you can instead make the y-axis a count of the number of items in each season. The plot will still show the relative number of items available in each season. This is another good reason to use a bar chart in this case.
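As a rough sketch of both variants (percent or raw count on the y-axis), assuming Python with matplotlib installed: the helper names `bar_chart_data` and `plot_availability` are hypothetical, and the data is the example from the question.

```python
# Counts per season and the total number of distinct items, from the question.
season_counts = {"Fall": 4, "Winter": 5, "Summer": 3, "Spring": 10}
total_items = 10


def bar_chart_data(counts, total, as_percent=True):
    """Return (labels, heights): percent of total items, or raw counts."""
    labels = list(counts)
    if as_percent:
        heights = [100 * counts[s] / total for s in labels]
    else:
        heights = [counts[s] for s in labels]
    return labels, heights


def plot_availability(counts, total, as_percent=True):
    """Draw one bar per season and return the matplotlib figure."""
    # matplotlib is a third-party library; it is imported here so that
    # bar_chart_data can be used without it.
    import matplotlib.pyplot as plt

    labels, heights = bar_chart_data(counts, total, as_percent)
    fig, ax = plt.subplots()
    ax.bar(labels, heights)
    ax.set_xlabel("Season")
    ax.set_ylabel("Items available (%)" if as_percent else "Items available")
    if as_percent:
        ax.set_ylim(0, 100)  # each bar is independently bounded by 100%
    return fig
```

Calling `plot_availability(season_counts, total_items)` gives the percent version, and passing `as_percent=False` gives the count version; either way the bars are compared against each other rather than read as parts of a whole.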

Lastly, note that although pie charts look nice, they are also not recommended from a human-perception point of view: it is hard for us to compare the relative sizes of the slices.