I have a dataframe with 3 columns. I am trying to find distinct and aggregated values for each group in the row.
I am using a jupyter notebook for running this. Also using python libraries like numpy
For example my original dataframe 'df' is like:
Name Subject Grade
0 Tom Math 20
1 Tom Sci 30
2 Tom Eng 10
3 Tom Math 40
4 John Math 30
I want to count the number of times a student gave a particular exam and the sum of grades of both the exam.
I applied the groupby query as follows:
new_df = df.groupby(['Name', 'Subject']).agg({ 'Grade': np.sum, 'Subject': np.size})
But this resulted in only 2 columns as expected. It did demonstrate the aggregate score for each subject for each student but in the form of rows.
I also tried using crosstab:
pd.crosstab(df.Name,df.Subject)
But I am not sure how can I incorporate the aggregation function of Sum.
My expected output is as follows:
Name Eng Math Sci GradeEng GradeMath GradeSci
0 John 0 30 0 0 1 0
1 Tom 10 60 30 1 2 1