I have a data frame with a 'genre' column containing strings like 'drama, comedy, action'.
I want to split each string into separate elements like 'drama', 'comedy', 'action', so I've used:
Genre=[]
for genre_type in books['genre'].astype('str'):
    Genre.append(genre_type.split(','))
books['genres_1']=Genre
But the result contains a leading space before every genre other than the first one listed, like 'drama', '_comedy', '_action'. (I used an underscore to represent the space because otherwise it's hard to see.)
So I tried:
Genre_clean=[]
for x in books['genres_1'].astype('str'):
    Genre_clean.append(x.strip(' '))
Genre_clean
But the space remains. What am I doing wrong?
My full code is below:
import pandas as pd
# Creating sample dataframes
books = pd.DataFrame()
books['genre']=['drama, comedy, action', 'romance, sci-fi, drama','horror']
# Splitting genre
Genre=[]
for genre_type in books['genre'].astype('str'):
    Genre.append(genre_type.split(','))
books['genres_1']=Genre
# trying to remove the space
Genre_clean=[]
for x in books['genres_1'].astype('str'):
    Genre_clean.append(x.strip(' '))
Genre_clean
strip only removes spaces from the beginning and end of a string. What you have after your first step is a string representation of a list (books['genres_1'].astype('str')), so there aren't any outer spaces to remove from "['romance', 'sci-fi', 'drama']" ... – BeRT2me
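To make the comment's point concrete, here is a minimal sketch that reuses the sample data from the question. It first shows why strip(' ') had nothing to remove, then shows two possible ways to get clean genres. The names as_text, unchanged, genres_loop and genres_vectorized are only illustrative and not from the original post. The second option assumes that pandas treats a multi-character split pattern as a regular expression, which is its documented default.

```python
import pandas as pd

# Sample data from the question
books = pd.DataFrame()
books['genre'] = ['drama, comedy, action', 'romance, sci-fi, drama', 'horror']

# Why strip(' ') appeared to do nothing: converting a list to text gives a
# string that starts with '[' and ends with ']', so it has no outer spaces.
as_text = str(['romance', ' sci-fi', ' drama'])
unchanged = as_text.strip(' ') == as_text

# Option 1: strip each piece right after splitting, while it is still
# an individual string rather than a whole list.
genres_loop = [
    [piece.strip() for piece in genre_type.split(',')]
    for genre_type in books['genre'].astype('str')
]

# Option 2: let pandas split on a comma followed by optional whitespace.
# A pattern longer than one character is interpreted as a regex by default.
genres_vectorized = books['genre'].str.split(r',\s*')

books['genres_1'] = genres_vectorized
print(books['genres_1'].tolist())
```

Both options should give the same lists. The loop stays closest to the code in the question, while the str.split version avoids building the intermediate list by hand.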