0
votes

Hi I use groupby on two columns, product and city as it is displayed in the image below, with the quantity order of which product in a specific City. Now, i want only to see the city that had the maximum value of quantity order for each Product.

A screenshot of my data is here.

here is the code that I used:

city_grp = df.groupby(["Product","City"])
p_c_df = pd.DataFrame(city_grp["Quantity Ordered"].sum())
1

1 Answers

0
votes

Since your example is a screenshot I had to create some of my own data, but this should be a good enough example for you to replicate on your own data

Since your example is a screenshot I had to create some of my own data, but this should be a good enough example for you to replicate on your own data

#Sample Data
df = pd.DataFrame({
    'Column1' : ['A', 'A', 'A', 'B', 'B', 'B'],
    'Column2' : ['AA', 'AB', 'AB', 'BA', 'BA', 'BB'],
    'Column3' : [100, 200, 100, 300, 200, 100]
})

#Set sample data to a multi-index as in your screenshot
df = df.set_index(['Column1', 'Column2'])

#reset the index to work with the data
df = df.reset_index()

#Group by multiple columns finding only the max of each grouping and reseting the index to the multi-index it was before
df.loc[df.groupby(['Column1', 'Column2'])['Column3'].idxmax()].set_index(['Column1', 'Column2'])