I am trying to take a pandas DataFrame and return a new DataFrame with an added column 'Size_Category', whose value is either small, medium or large based on some conditions.

mod_df = df.copy(deep=True)
mod_df.loc[(mod_df['Length'] <= 300 , 'Size_Category')] = 'small' # condition, new_column
mod_df.loc[(mod_df['Length'] <= 300 | mod_df['Length'] > 450), 'Size_Category'] = 'medium' # condition, new_column
mod_df.loc[(mod_df['Length'] >= 450, 'Size_Category')] = 'large' # condition, new_column

When I do this, it gives me an error saying

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I handle this?

If my answer was helpful, don't forget to accept it. Thanks. - jezrael

1 Answer

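You are missing parentheses around each comparison. The | operator binds more tightly than <= and >, so each condition must be wrapped in ():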

mod_df.loc[(mod_df['Length'] <= 300) | (mod_df['Length'] > 450), 'Size_Category']
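
To see why the parentheses matter: without them, Python evaluates 300 | mod_df['Length'] first and then treats the rest as a chained comparison, which needs the truth value of a Series. The sketch below reproduces the error and then assigns all three categories. The sample data is made up for illustration, and the medium range is assumed to mean "greater than 300 and less than 450", so it uses & rather than the | from the question.

```python
import pandas as pd

# Sample data, assumed for illustration only
mod_df = pd.DataFrame({'Length': [0, 10, 300, 400, 449, 450, 500]})

# Without parentheses, | is evaluated before the comparisons,
# and the resulting chained comparison raises ValueError.
try:
    mod_df['Length'] <= 300 | mod_df['Length'] > 450
    raised = False
except ValueError:
    raised = True

length = mod_df['Length']
mod_df.loc[length <= 300, 'Size_Category'] = 'small'
# Assumption: medium means strictly between 300 and 450
mod_df.loc[(length > 300) & (length < 450), 'Size_Category'] = 'medium'
mod_df.loc[length >= 450, 'Size_Category'] = 'large'
print(mod_df)
```

Binding the column to a local name (length here, a name chosen for this example) keeps each mask short and makes the parentheses easier to check.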

Another solution is to use pd.cut:

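import numpy as np
import pandas as pd

df = pd.DataFrame({'Length': [0, 10, 300, 400, 449, 450, 500]})

# Bins are right-closed by default: (-inf, 300], (300, 449], (449, inf)
bins = [-np.inf, 300, 449, np.inf]
labels = ['small', 'medium', 'large']
df['Size_Category'] = pd.cut(df['Length'], bins=bins, labels=labels)
print(df)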
   Length Size_Category
0       0         small
1      10         small
2     300         small
3     400        medium
4     449        medium
5     450         large
6     500         large
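
Since the question asks for something that takes a DataFrame and returns a new one, the pd.cut approach could be wrapped in a small function. This is only a sketch: the name add_size_category is hypothetical, and the bins assume integer lengths (a non-integer value such as 449.5 would be labelled large).

```python
import numpy as np
import pandas as pd


def add_size_category(df):
    """Return a copy of df with a 'Size_Category' column (hypothetical helper)."""
    out = df.copy(deep=True)  # leave the caller's DataFrame untouched
    # Right-closed bins: (-inf, 300], (300, 449], (449, inf)
    bins = [-np.inf, 300, 449, np.inf]
    labels = ['small', 'medium', 'large']
    out['Size_Category'] = pd.cut(out['Length'], bins=bins, labels=labels)
    return out


df = pd.DataFrame({'Length': [0, 10, 300, 400, 449, 450, 500]})
result = add_size_category(df)
print(result)
```

The resulting column has categorical dtype; call .astype(str) on it if plain strings are needed.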