I have a pandas dataframe df as illustrated below:
BrandName Specialty
A H
B I
ABC J
D K
AB L
I want to replace 'ABC' and 'AB' in column BrandName by A. Can someone help with this?
The easiest way is to use the replace
method on the column. The arguments are a list of the things you want to replace (here ['ABC', 'AB']
) and what you want to replace them with (the string 'A'
in this case):
>>> df['BrandName'].replace(['ABC', 'AB'], 'A')
0 A
1 B
2 A
3 D
4 A
This creates a new Series of values so you need to assign this new column to the correct column name:
df['BrandName'] = df['BrandName'].replace(['ABC', 'AB'], 'A')
DataFrame
object has powerful and flexible replace
method:
DataFrame.replace(
to_replace=None,
value=None,
inplace=False,
limit=None,
regex=False,
method='pad',
axis=None)
Note, if you need to make changes in place, use inplace
boolean argument for replace
method:
inplace: boolean, default
False
IfTrue
, in place. Note: this will modify any other views on this object (e.g. a column form a DataFrame). Returns the caller if this isTrue
.
df['BrandName'].replace(
to_replace=['ABC', 'AB'],
value='A',
inplace=True
)
You could also pass a dict
to the pandas.replace
method:
data.replace({
'column_name': {
'value_to_replace': 'replace_value_with_this'
}
})
This has the advantage that you can replace multiple values in multiple columns at once, like so:
data.replace({
'column_name': {
'value_to_replace': 'replace_value_with_this',
'foo': 'bar',
'spam': 'eggs'
},
'other_column_name': {
'other_value_to_replace': 'other_replace_value_with_this'
},
...
})
Just wanted to show that there is no performance difference between the 2 main ways of doing it:
df = pd.DataFrame(np.random.randint(0,10,size=(100, 4)), columns=list('ABCD'))
def loc():
df1.loc[df1["A"] == 2] = 5
%timeit loc
19.9 ns ± 0.0873 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
def replace():
df2['A'].replace(
to_replace=2,
value=5,
inplace=True
)
%timeit replace
19.6 ns ± 0.509 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)