data['BUILDING CLASS CATEGORY'] = np.where(
    data['BUILDING CLASS CATEGORY'] != '01 ONE FAMILY DWELLINGS' or '02 TWO FAMILY DWELLINGS ',
    'OTHERS', data['BUILDING CLASS CATEGORY'])

Neither that nor the following works:

data['BUILDING CLASS CATEGORY'] = np.where(
    data['BUILDING CLASS CATEGORY'] != '01 ONE FAMILY DWELLINGS' or
    data['BUILDING CLASS CATEGORY'] != '02 TWO FAMILY DWELLINGS',
    'OTHERS', data['BUILDING CLASS CATEGORY'])

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Use & for and, and | for or (^ is xor). - foxyblue
And wrap the comparisons in () so they are evaluated first, e.g. (a<0) | (b>3). - hpaulj
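
To illustrate the two comments above, here is a minimal sketch on a hypothetical integer Series `s` (not from the question). It assumes a reasonably recent pandas and shows that `or` raises the ambiguity error, that dropping the parentheses fails as well, and that the parenthesized `|` form gives an element-wise mask.

```python
import pandas as pd

s = pd.Series([-1, 2, 5])  # hypothetical toy data, for illustration only

# `or` needs one truth value for the whole Series, which pandas refuses to give.
try:
    (s < 0) or (s > 3)
    or_raised = False
except ValueError:
    or_raised = True

# Without parentheses, `|` binds tighter than the comparisons, so this parses
# as the chained comparison s < (0 | s) > 3, which again needs a single bool.
try:
    s < 0 | s > 3
    unparenthesized_raised = False
except ValueError:
    unparenthesized_raised = True

# Parenthesized comparisons combined with `|` work element by element.
mask = (s < 0) | (s > 3)

print(or_raised)               # True
print(unparenthesized_raised)  # True
print(mask.tolist())           # [True, False, True]
```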

1 Answer


Your second try is very close. numpy.where needs an element-wise condition, so combine the comparisons with the bitwise operators (& | ~) instead of and/or, and wrap each comparison in parentheses.
Putting everything together, we have the following:

import pandas as pd
import numpy as np

data = pd.DataFrame({'COL': ['01 thing','02 thing','03 thing']})

print(data)
>>>    COL
>>> 0  01 thing
>>> 1  02 thing
>>> 2  03 thing

data['COL'] = np.where((data['COL'] != '01 thing') | 
                       (data['COL'] != '02 thing'), 'other', data['COL'])

print(data)
>>>    COL
>>> 0  other
>>> 1  other
>>> 2  other
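
Every row becomes 'other' in the output above because any value differs from at least one of the two strings, so the `|` of two `!=` tests is always true. A small sketch on the same toy column, comparing the `|` and `&` masks (the names `not_01`, `not_02`, `either` and `both` are only illustrative):

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'COL': ['01 thing', '02 thing', '03 thing']})

not_01 = data['COL'] != '01 thing'
not_02 = data['COL'] != '02 thing'

# OR: true when a value differs from at least one of the strings, i.e. always.
either = not_01 | not_02
# AND: true only when a value differs from both strings.
both = not_01 & not_02

print(either.tolist())  # [True, True, True]
print(both.tolist())    # [False, False, True]

# Only the row matching neither string is replaced.
result = np.where(both, 'other', data['COL'])
print(result.tolist())  # ['01 thing', '02 thing', 'other']
```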

(Suggestion:) If you want to replace every record that is neither '01 thing' nor '02 thing', use & instead of |. I would also consider using str.startswith.
Substituting that into your np.where condition, we have:

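# Start again from the original data, since the previous step overwrote COL.
data = pd.DataFrame({'COL': ['01 thing','02 thing','03 thing']})

data['COL'] = np.where(~data['COL'].str.startswith('01') &
                       ~data['COL'].str.startswith('02'), 'other', data['COL'])

print(data)
>>>    COL
>>> 0  01 thing
>>> 1  02 thing
>>> 2  other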
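
As a possible alternative, pandas' `Series.isin` expresses "is one of these values" directly and avoids chaining comparisons. This is only a sketch: it assumes the column holds exactly the category strings from the question (no trailing whitespace), and the third sample value and the names `col` and `keep` are made up for illustration.

```python
import numpy as np
import pandas as pd

col = 'BUILDING CLASS CATEGORY'

# Hypothetical sample rows; the first two strings come from the question,
# the third is invented for the example.
data = pd.DataFrame({col: [
    '01 ONE FAMILY DWELLINGS',
    '02 TWO FAMILY DWELLINGS',
    '03 THREE FAMILY DWELLINGS',
]})

# Categories to leave untouched; everything else becomes 'OTHERS'.
keep = ['01 ONE FAMILY DWELLINGS', '02 TWO FAMILY DWELLINGS']

data[col] = np.where(data[col].isin(keep), data[col], 'OTHERS')

print(data[col].tolist())
# ['01 ONE FAMILY DWELLINGS', '02 TWO FAMILY DWELLINGS', 'OTHERS']
```

If the real data has stray spaces around the category names, stripping them first with `data[col].str.strip()` would be needed for the exact match to work.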