
I am currently working on building a regressor model to predict the food delivery time.

This is the dataframe with a few observations:

[Image 1: sample rows of the dataframe]

As you can see, each entry in the Cuisines column holds several comma-separated strings. I used this code:

pd.get_dummies(data.Cuisines.str.split(',',expand=True),prefix='c')

This split the strings and one-hot encoded them; however, it introduced a new issue.

I merged the dataframe with the dummies. fastfood appears in the 1st and 3rd rows. The expected output was a single fastfood column with value 1 in the first and third rows; instead, two fastfood columns are created: one (4th column) for the first row and another (15th column) for the third row.

[Image 2: merged dataframe showing the two fastfood columns]
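
For readers without the screenshots, the behaviour described above can be reproduced with a small sketch. The three-row dataframe below is made up for illustration (the cuisine values are assumptions, not the original data); only the `get_dummies` call mirrors the one in the question.

```python
import pandas as pd

# Hypothetical stand-in for the real data: "Fast Food" is listed first
# in row 0 and second in row 2.
data = pd.DataFrame({
    "Cuisines": ["Fast Food, Rolls", "Ice Cream, Desserts", "Italian, Fast Food"],
})

# Same approach as in the question: split into positional columns, then encode.
parts = data["Cuisines"].str.split(",", expand=True)
dummies = pd.get_dummies(parts, prefix="c")

# Collect every dummy column that refers to "Fast Food" once the prefix
# and surrounding whitespace are ignored.
fast_food_cols = [
    col for col in dummies.columns
    if col.split("_", 1)[1].strip() == "Fast Food"
]
print(fast_food_cols)
```

Each positional column is encoded separately, and the second item keeps the space that followed the comma, so more than one Fast Food column shows up.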

Can someone help me get a single fastfood column with value 1 in the first and third rows, and similarly for the other cuisines?

It still is the same. This code again creates two different fastfood columns. - Ranjini

1 Answer


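The two Fast Food values differ by whitespace: splitting on ',' alone leaves the space after each comma attached to the next item. As far as I can tell, Series.str.get_dummies treats its separator as a literal string rather than a regex, so you probably want to normalize the separators first:

data.Cuisines.str.replace(r'\s*,\s*', ',', regex=True).str.strip().str.get_dummies(sep=',')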
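
A minimal sketch of the idea on made-up data: strip the whitespace around the commas up front, then encode with a plain ',' separator so each cuisine gets exactly one column. The dataframe contents, the `Restaurant` column, and the `c_` prefix are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample; the real dataframe has more rows and columns.
data = pd.DataFrame({
    "Restaurant": ["ID_1", "ID_2", "ID_3"],
    "Cuisines": ["Fast Food, Rolls", "Ice Cream, Desserts", "Italian, Fast Food"],
})

# Remove whitespace around every comma and at both ends of the string.
cleaned = data["Cuisines"].str.replace(r"\s*,\s*", ",", regex=True).str.strip()

# One indicator column per distinct cuisine, regardless of its position.
dummies = cleaned.str.get_dummies(sep=",").add_prefix("c_")

# Attach the indicator columns to the original dataframe (index-aligned).
result = data.join(dummies)
print(result)
```

Here `c_Fast Food` is a single column that is 1 for the first and third rows. Joining on the index avoids a separate merge step, assuming the index of `data` is unique.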