OneHotEncoder, TypeError: __init__() got an unexpected keyword argument 'drop'

Question

In the sklearn 0.20.3 documentation, https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html, __init__ has a drop parameter, but when I use it, it throws a TypeError.

I didn't find any examples using the "drop" keyword; most of the examples I have seen use an older version of sklearn. In some cases they used ColumnTransformer (even that is for the older version of OneHotEncoder, as it gives a FutureWarning).

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()
X[:, 0] = labelencoder_X.fit_transform(X[:, 0])

onehotencoder = OneHotEncoder(categories = [0],handle_unknown='ignore',drop=[0])

Expected results: the above code should run without errors. Actual results: TypeError (__init__() got an unexpected keyword argument 'drop')

Ganesh Jadhav · Accepted Answer · 2019-05-13T06:00:27

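drop has to be passed as a keyword argument (drop=[0], not drop[0]), and it was only added in scikit-learn 0.21, so on 0.20.3 the constructor rejects it. After upgrading scikit-learn, try with this:

onehotencoder = OneHotEncoder(categories = [0],handle_unknown='ignore',drop=[0])

Some releases also refuse handle_unknown='ignore' together with drop when fit is called; if you hit that error, use handle_unknown='error' instead.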
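A quick way to tell whether an installation is new enough for drop (the parameter arrived in scikit-learn 0.21) is to compare its version string. The sketch below uses only the standard library; the helper name supports_drop is hypothetical and not part of scikit-learn.

```python
import re


def supports_drop(version: str) -> bool:
    """Return True if a scikit-learn version string is 0.21 or newer."""
    # Only the leading "major.minor" numbers matter here; suffixes such as
    # ".3", "rc1" or ".dev0" are ignored.
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (0, 21)


print(supports_drop("0.20.3"))  # False
print(supports_drop("0.21.0"))  # True
```

With scikit-learn installed, the string to pass in would be sklearn.__version__.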

From the docs, the possible values for drop are:
None : retain all features (the default).
‘first’ : drop the first category in each feature. If only one category is present, the feature will be dropped entirely.
array : drop[i] is the category in feature X[:, i] that should be dropped.
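To make the three options concrete, here is a plain Python sketch for a single column. It is not scikit-learn's implementation: the function name one_hot_column is hypothetical, and ordering the categories by sorting is an assumption made for this illustration. For the array form, drop[i] would be the value passed as drop for column i.

```python
def one_hot_column(values, drop=None):
    """One-hot encode one column, returning (kept_categories, rows).

    drop=None    keeps every category,
    drop='first' removes the first category in sorted order,
    any other value is taken as the category to remove.
    """
    categories = sorted(set(values))
    if drop is None:
        kept = categories
    elif drop == 'first':
        kept = categories[1:]  # empty if the column has a single category
    else:
        if drop not in categories:
            raise ValueError(f"{drop!r} is not a category of this column")
        kept = [c for c in categories if c != drop]
    # One output column per kept category, 1 where the value matches.
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows


colours = ['red', 'green', 'blue', 'green']
print(one_hot_column(colours))                # 3 output columns
print(one_hot_column(colours, drop='first'))  # 'blue' removed
print(one_hot_column(colours, drop='red'))    # 'red' removed
```

A row whose value is the dropped category comes out as all zeros, which is how the dropped category is still represented.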

# Own implementation of One Hot Encoding - Data Transformation
def convert_to_binary(df, column_to_convert):
    categories = list(df[column_to_convert].drop_duplicates())

    for category in categories:
        cat_name = str(category).replace(" ", "_").replace("(", "").replace(")", "").replace("/", "_").replace("-", "").lower()
        col_name = column_to_convert[:5] + '_' + cat_name[:10]
        df[col_name] = 0
        df.loc[(df[column_to_convert] == category), col_name] = 1

    return df

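# One Hot Encoding
print("One Hot Encoding categorical data...")
# Enter the names of the columns you want to one-hot encode here.
columns_to_convert = ['col1', 'col2']

# df_all is your existing pandas DataFrame. Iterate over a copy of the column
# list because the loop adds and drops columns; use columns_to_convert instead
# to restrict the conversion to the columns listed above.
for column in list(df_all.columns):
    if df_all[column].dtype.name == 'category':
        df_all = convert_to_binary(df=df_all, column_to_convert=column)
        df_all.drop(column, axis=1, inplace=True)
print("One Hot Encoding categorical data...completed")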

If you don't want all categorical variables to be converted, enter the columns you do want in columns_to_convert and loop over that list instead.
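One thing to watch in convert_to_binary is how the new column names are built: the first 5 characters of the source column plus the first 10 characters of the cleaned category. The sketch below pulls that naming step into a hypothetical helper, make_column_name, with made-up sample values, to show that two long categories can end up with the same name.

```python
def make_column_name(column_to_convert, category):
    """Build a column name the same way convert_to_binary does."""
    cat_name = (
        str(category)
        .replace(" ", "_")
        .replace("(", "")
        .replace(")", "")
        .replace("/", "_")
        .replace("-", "")
        .lower()
    )
    # Truncation keeps names short but can make distinct categories collide.
    return column_to_convert[:5] + '_' + cat_name[:10]


print(make_column_name('Payment Method', 'Credit Card (auto)'))    # Payme_credit_car
print(make_column_name('Payment Method', 'Credit Card (manual)'))  # Payme_credit_car
```

When two categories collide like this, the later one writes its 1s into the column the earlier one created, so both categories share a single merged indicator column; lengthening the slices or checking for duplicate names avoids that.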
