Removing right part of string from pandas column if equal to another pandas column

Question

I get NaN values when trying to take the left part of a string in a pandas DataFrame, where the cut-off point depends on the length of the cell in another column of the same DataFrame:

Example df:

Phrase	Color
Paul like red	red
Mike like green	green
John like blue	blue

My objective is to obtain a Series holding the first part of each phrase, i.e. everything before "like {Color}". Here it would be:

First Name
Paul
Mike
John

I tried the following:

df["First Name"] = df["Phrase"].str[:-df["Color"].str.len() - 6]

But I keep getting NaN results. It seems the per-row length of the colors can't be passed into the str[:-x] slice.

Can someone help me understand what is happening here and find a solution?

Thanks a lot. Have a nice day.

Mayank Porwal · Accepted Answer · 2021-02-25T17:36:32

Consider the df below:

In [128]: df = pd.DataFrame({'Phrase':['Paul like red', 'Mike like green', 'John like blue', 'Mark like black'], 'Color':['red', 'green', 'blue', 'brown']})

In [129]: df
Out[129]: 
            Phrase  Color
0    Paul like red    red
1  Mike like green  green
2   John like blue   blue
3  Mark like black  brown

Use numpy.where:

In [134]: import numpy as np

In [132]: df['First Name'] = np.where(df.apply(lambda x: x['Color'] in x['Phrase'], axis=1), df.Phrase.str.split().str[0], np.nan)

In [133]: df
Out[133]: 
            Phrase  Color First Name
0    Paul like red    red       Paul
1  Mike like green  green       Mike
2   John like blue   blue       John
3  Mark like black  brown        NaN
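
As for why the original attempt returns NaN: as far as I can tell, .str[...] applies one and the same slice to every row, so its bounds need to be plain integers rather than a Series of per-row lengths. If you want to cut off exactly the " like {Color}" suffix instead of taking the first word, a row-wise version is one option. The sketch below is only an illustration: the helper name strip_like_color is made up for this example, and it assumes the suffix is always a space, the word "like", a space, and the color.

```python
import pandas as pd

df = pd.DataFrame({
    "Phrase": ["Paul like red", "Mike like green", "John like blue", "Mark like black"],
    "Color": ["red", "green", "blue", "brown"],
})

def strip_like_color(phrase, color):
    """Hypothetical helper: drop a trailing ' like <color>' from phrase."""
    suffix = f" like {color}"
    if phrase.endswith(suffix):
        # Slice with this row's own suffix length
        return phrase[: -len(suffix)]
    # The phrase does not end with this row's color: mark as missing
    return None

# Pair each phrase with the color from the same row
df["First Name"] = [
    strip_like_color(p, c) for p, c in zip(df["Phrase"], df["Color"])
]
print(df)
```

Unlike the split-based version, this keeps multi-word names intact (for example "Mary Ann like red" would give "Mary Ann"), at the cost of a Python-level loop over the rows.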
