I'm new to Pandas. To simplify, I have a data frame with two columns: product_id and rating. Each row is a new review for the given product. Now I want to get a new data frame in which the rows for products that received fewer than 20 reviews (i.e. that appear fewer than 20 times in the original data frame) are removed. I can count the number of occurrences with:
a = data.groupby('product_id').count()
b = a.loc[a['rating'] >= 20]
but that gives me back a data frame with a single column. When displayed, each product_id is shown with its count, but I'm unable to access the actual product_ids to use them to filter the original table. For instance,
b.values
gives back an array of the counts, but not the product_ids.
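
To make the setup concrete, here is a minimal sketch of what I'm after, assuming the product_ids end up in the index of the grouped result (which is why .values doesn't show them). The toy data and the min_reviews threshold are made up for illustration; min_reviews stands in for the 20 above. This is one possible approach, not necessarily the idiomatic one.

```python
import pandas as pd

# Hypothetical toy data: product 1 has three reviews, product 2 has one.
data = pd.DataFrame({
    "product_id": [1, 1, 1, 2],
    "rating": [5, 4, 3, 2],
})
min_reviews = 3  # illustrative threshold, stands in for 20

# After groupby(...).count(), the product_ids are the index of the result.
counts = data.groupby("product_id").count()
kept_ids = counts.index[counts["rating"] >= min_reviews]

# Keep only the rows of the original table whose product_id survived.
filtered = data[data["product_id"].isin(kept_ids)]

# Alternative without the intermediate table: transform("count") returns
# the per-group count aligned with the original rows.
review_counts = data.groupby("product_id")["rating"].transform("count")
filtered_alt = data[review_counts >= min_reviews]
```

Both filtered and filtered_alt should contain only the rows for products with at least min_reviews reviews, with the original row index preserved.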