I have an example dataframe (df) like the one shown below, and I would like to use pandas to create a series whose labels are the colors and whose values are the number of entries with that color in the dataframe, like a total for each color. I have tried the following, but instead I get the total number of rows as the count for every color:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('data_set.txt', index_col=0)
total_count = {_: len(df['color']) for _ in df['color'].unique()}
total_count
Current Output:
{'red': 12,
'green': 12,
'yellow': 12,
'blue': 12}
However, clearly there are not 12 entries for each of the 4 colors in the dataframe. What am I doing wrong?
| number | date | color | weight | temperature | size |
|---|---|---|---|---|---|
| 0 | 1/1/2021 | red | 0.2 | 0.2 | big |
| 1 | 1/1/2021 | red | 0.6 | 0.6 | small |
| 2 | 1/1/2021 | red | 0.4 | 0.6 | small |
| 3 | 1/1/2021 | green | 0.2 | 0.4 | big |
| 4 | 1/1/2021 | green | 1 | 1 | small |
| 5 | 1/1/2021 | yellow | 0.4 | 0.4 | big |
| 6 | 1/1/2021 | yellow | 0.1 | 0.2 | big |
| 7 | 1/1/2021 | yellow | 1.3 | 0.5 | big |
| 8 | 1/1/2021 | yellow | 1.5 | 0.5 | small |
| 9 | 1/1/2021 | yellow | 1.5 | 0.5 | small |
| 10 | 1/1/2021 | blue | 0.4 | 0.3 | big |
| 11 | 1/1/2021 | blue | 0.8 | 0.2 | small |
`df['color'].value_counts().to_dict()` - Anurag Dabas
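
To expand on the comment above, here is a minimal sketch of why the attempt returns 12 for every key and how `value_counts()` avoids that. It assumes the column is named `color`, as in the table, and rebuilds only that column by hand instead of reading `data_set.txt`. The names `total_rows`, `color_counts`, and `per_color` are made up for illustration.

```python
import pandas as pd

# Rebuild only the 'color' column from the example table (12 rows).
df = pd.DataFrame({
    "color": ["red"] * 3 + ["green"] * 2 + ["yellow"] * 5 + ["blue"] * 2
})

# The attempt from the question: len(df["color"]) never uses the loop
# variable, so every key is assigned the total number of rows.
total_rows = {c: len(df["color"]) for c in df["color"].unique()}

# value_counts() counts the rows for each distinct value and returns a
# Series indexed by color.
color_counts = df["color"].value_counts()

# A dict comprehension can work too, as long as it filters by color
# before counting.
per_color = {c: int((df["color"] == c).sum()) for c in df["color"].unique()}

print(color_counts)
print(color_counts.to_dict())
```

With the example table this should give 3 for red, 2 for green, 5 for yellow, and 2 for blue. If a Series is what you want, `color_counts` already is one; `.to_dict()` is only needed for a plain dictionary.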