
I am a beginner in Python. I merged two columns, and after that I tried to replace the 'Not assigned' values in one column with the values from another column, but I can't get it to work. If I use the dataframe as it was before the modification, the replacement works.

I scraped a table from a web page and am now modifying the data in the resulting dataframe.

import pandas as pd
import numpy as np
import requests

!pip install lxml  # notebook shell command, not Python code; pd.read_html uses lxml to parse the HTML

toronto_url='https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'

toronto_df1 = pd.read_html(toronto_url)[0]

toronto_df1.head()

toronto_df1.drop(toronto_df1.loc[toronto_df1['Borough']=="Not assigned"].index, inplace=True)

toronto_df2=toronto_df1.groupby(['Postcode','Borough'],sort=False).agg(lambda x: ','.join(x))

toronto_df2.loc[toronto_df2['Neighbourhood'] == "Not assigned", 'Neighbourhood'] = toronto_df2['Borough']

This is the code I used.

I expected the Neighbourhood value to be replaced with the Borough value.

Instead, I got this error:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Borough'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
9 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2657                 return self._engine.get_loc(key)
   2658             except KeyError:
-> 2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2660         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2661         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Borough'

Check whether the column 'Borough' is present in your dataset exactly as you have written it. Even an extra space at either end of the name can result in a KeyError. - moys
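
A quick way to run the check suggested in this comment is to print the column labels and strip stray whitespace from them. This is only a sketch on a made-up one-row dataframe (the padded ' Borough ' label is an assumption for illustration, not something seen in the scraped table):

```python
import pandas as pd

# Hypothetical frame whose 'Borough' label carries stray spaces
df = pd.DataFrame({" Borough ": ["North York"], "Postcode": ["M3A"]})

# Printing the labels as a list makes extra whitespace visible
print(df.columns.tolist())  # [' Borough ', 'Postcode']

# Strip leading/trailing whitespace from every column label
df.columns = df.columns.str.strip()
print("Borough" in df.columns)  # True
```

If the label already prints as 'Borough' with no padding, whitespace is not the cause and the problem lies elsewhere, for example in the column having been moved into the index.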

1 Answer


The reason for your KeyError is that after the groupby, Borough (like Postcode) is no longer a column but an index level. One solution is to add reset_index:

toronto_df1= pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')[0]

#boolean indexing        
toronto_df1 = toronto_df1.loc[toronto_df1['Borough']!="Not assigned"]
toronto_df2 = toronto_df1.groupby(['Postcode','Borough'],sort=False)['Neighbourhood'].agg(','.join).reset_index()
toronto_df2.loc[toronto_df2['Neighbourhood'] == "Not assigned", 'Neighbourhood'] = toronto_df2['Borough']

Or pass the parameter as_index=False to groupby:

toronto_df1= pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')[0]

#boolean indexing  
toronto_df1 = toronto_df1.loc[toronto_df1['Borough']!="Not assigned"]
toronto_df2=toronto_df1.groupby(['Postcode','Borough'],sort=False, as_index=False)['Neighbourhood'].agg(','.join)
toronto_df2.loc[toronto_df2['Neighbourhood'] == "Not assigned", 'Neighbourhood'] = toronto_df2['Borough']
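
To see why the original code raised the KeyError without hitting Wikipedia, here is a minimal sketch on a small hand-made dataframe. The rows are made up for illustration and only mimic the Postcode / Borough / Neighbourhood layout from the question; the names df, grouped, flat and mask are likewise just for this example.

```python
import pandas as pd

# Hypothetical stand-in for the scraped table (values are illustrative only)
df = pd.DataFrame({
    "Postcode": ["M3A", "M4A", "M5A", "M5A", "M7A"],
    "Borough": ["North York", "North York", "Downtown Toronto",
                "Downtown Toronto", "Queen's Park"],
    "Neighbourhood": ["Parkwoods", "Victoria Village", "Harbourfront",
                      "Regent Park", "Not assigned"],
})

# Same grouping as in the question: the keys end up in the index
grouped = df.groupby(["Postcode", "Borough"], sort=False).agg(lambda x: ",".join(x))
print(list(grouped.index.names))  # ['Postcode', 'Borough']
print(list(grouped.columns))      # ['Neighbourhood']
# grouped['Borough'] would therefore raise KeyError: 'Borough'

# Move the index levels back into ordinary columns
flat = grouped.reset_index()

# Replace 'Not assigned' neighbourhoods with the borough from the same row
mask = flat["Neighbourhood"] == "Not assigned"
flat.loc[mask, "Neighbourhood"] = flat.loc[mask, "Borough"]
print(flat)
```

Printing index.names and columns right after the groupby is a cheap way to confirm where each label lives before indexing with it.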