1
votes

I need to get the frequency of each element of the lists stored in the columns of a pandas DataFrame.

It is like data.groupby(["element in a", "element in b"]).size(), but columns 'a' and 'b' contain lists.

I need the size of each combination of an element in 'a' and an element in 'b'.

Input data:
        a           b
0   [17, 21, 22]    [zhinan, shejiyuanze, fankui]
1   [17, 21, 23]    [zhinan, shejiyuanze]
2   [17, 21]        [zhinan, shejiyuanze, fankui]
3   [17, 21, 22]    [zhinan, shejiyuanze, fankui]
4   [17, 21]        [zhinan, shejiyuanze, yizhi]

Desired Output:

              17 21 22 23 
zhinan        5  5  2  1
shejiyuanze   .  .  .  . 
fankui        .  .  .  . 
yizhi         .  .  .  .

For example, when a=17 and b=zhinan, the number is 5. When a=17 and b=fankui, the number is 3. When a=23 and b=fankui or b=yizhi, the number is 0.

I was wondering if there is an efficient/direct way to do this.

Thanks.

1
You can technically get your rows and columns by using pd.data.pivot(index='b', columns='a'), but I fail to see how you are counting frequencies here. Could you elaborate? I lack the information to reproduce your example. – Uralan
It's like a groupby on the elements in column 'a' and the elements in column 'b'. I need the size of each combination of 'a' and 'b'. – hanson james
OK, I see. You want to count how many times each combination of 17, 21, 22, 23 x zhinan, shejiyuanze, fankui, yizhi occurs? – Uralan

1 Answer

2
votes

Use explode to expand the lists into one row per element. Remember to call reset_index before the second explode.

Then use groupby to count the number of occurrences of each combination.

Finally, use unstack to convert the resulting Series to a DataFrame.

df.explode('a').reset_index(drop=True).explode('b').groupby(['b', 'a']).size().unstack(fill_value=0)
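
For reference, here is a minimal end-to-end sketch of the steps above, run on the sample data from the question. The variable names `data`, `pairs` and `counts` are only illustrative. The sketch assumes you want 0 rather than NaN for combinations that never occur, so it counts with `size()` and passes `fill_value=0` to `unstack`. The `astype` call is an optional extra, since `explode` leaves the column as object dtype.

```python
import pandas as pd

# Sample data copied from the question
data = pd.DataFrame({
    "a": [[17, 21, 22], [17, 21, 23], [17, 21], [17, 21, 22], [17, 21]],
    "b": [
        ["zhinan", "shejiyuanze", "fankui"],
        ["zhinan", "shejiyuanze"],
        ["zhinan", "shejiyuanze", "fankui"],
        ["zhinan", "shejiyuanze", "fankui"],
        ["zhinan", "shejiyuanze", "yizhi"],
    ],
})

# Expand both list columns so that each row holds one (a, b) pair
pairs = (
    data.explode("a")
    .reset_index(drop=True)    # give every row a unique index before the second explode
    .explode("b")
    .astype({"a": "int64"})    # optional: explode leaves 'a' as object dtype
)

# Count each (b, a) pair, then move 'a' into the columns;
# fill_value=0 puts 0 where a combination never occurs
counts = pairs.groupby(["b", "a"]).size().unstack(fill_value=0)
print(counts)

# Spot checks against the numbers given in the question
print(counts.loc["zhinan", 17])   # 5
print(counts.loc["fankui", 17])   # 3
print(counts.loc["yizhi", 23])    # 0
```

The rows come out sorted by label (fankui, shejiyuanze, yizhi, zhinan) rather than in the order shown in the question. If a one-call alternative is preferred, `pd.crosstab(pairs["b"], pairs["a"])` should produce the same table of counts from the exploded frame.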