I have two data frames, "base_level" and "raw_inventory", with the following columns:
"base_level" columns -> "a", "b", "c", "inventory_id", ...
"raw_inventory" columns -> "1", "2", "3", "inventoryparentid", ...
When I call pd.merge directly, as shown below, everything works as expected.
level = pd.merge(base_level, raw_inventory, left_on='inventory_id', right_on='inventoryparentid', how='left')
print(level)
But when I wrap the merge in a function and call it as shown below:
def inv_level(child_inv, parent_inv, lefton, righton, how):
    level_inv = pd.merge(parent_inv, child_inv, left_on=lefton, right_on=righton, how=how)
    return level_inv
level = inv_level(base_level, raw_inventory, 'inventory_id', 'inventoryparentid', 'left')
print(level)
it throws the following error:
  File "C:\temp\env\3.8.6\lib\site-packages\pandas\core\reshape\merge.py", line 652, in __init__
    ) = self._get_merge_keys()
  File "C:\temp\env\3.8.6\lib\site-packages\pandas\core\reshape\merge.py", line 1005, in _get_merge_keys
    right_keys.append(right._get_label_or_level_values(rk))
  File "C:\temp\env\3.8.6\lib\site-packages\pandas\core\generic.py", line 1563, in _get_label_or_level_values
    raise KeyError(key)
KeyError: 'inventoryparentid'
I am not able to identify the reason. Any input on this issue is appreciated.
Edit:
I put together the following sample code to showcase what I am trying to do and make it easier to understand. I get the same error.
import numpy as np
import pandas as pd
def inv_level(child_inv, parent_inv, lefton, righton, how):
    level_inv = pd.merge(parent_inv, child_inv, left_on=lefton, right_on=righton, how=how)
    return level_inv

def main(event, context):
    np.random.seed(0)
    # transactions
    left = pd.DataFrame({'transaction_id': ['A', 'B', 'C', 'D'],
                         'user_id': ['Peter', 'John', 'John', 'Anna'],
                         'value': np.random.randn(4),
                         })
    # users
    right = pd.DataFrame({'new_id': ['Paul', 'Mary', 'John', 'Anna'],
                          'favorite_color': ['blue', 'blue', 'red', np.nan],
                          })
    '''
    test = inv_level(left, right, 'user_id', 'new_id', 'left') #left.merge(right, on='user_id', how='left')
    The above throws an error
    '''
    test = pd.merge(left, right, left_on='user_id', right_on='new_id', how='left')
    print(test)

if __name__ == "__main__":
    main("", "")
Error:
  File "C:\temp\env\3.8.6\lib\site-packages\pandas\core\reshape\merge.py", line 1005, in _get_merge_keys
    right_keys.append(right._get_label_or_level_values(rk))
  File "C:\temp\env\3.8.6\lib\site-packages\pandas\core\generic.py", line 1563, in _get_label_or_level_values
    raise KeyError(key)
KeyError: 'new_id'
Here is the intended output:
  transaction_id user_id     value new_id favorite_color
0              A   Peter  1.764052    NaN            NaN
1              B    John  0.400157   John            red
2              C    John  0.978738   John            red
3              D    Anna  2.240893   Anna            NaN
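One check that may help narrow this down is sketched below. It assumes the same sample frames as in the edit and reuses inv_level unchanged; describe_merge_sides is a hypothetical helper written only for this check, not part of my real code. It reports whether each key column exists on the side where pd.merge(parent_inv, child_inv, ...) will look for it, and then repeats the call with keyword arguments so that the mapping of frames to parameters is explicit.

```python
import numpy as np
import pandas as pd


def inv_level(child_inv, parent_inv, lefton, righton, how):
    # Same body as in the question: parent_inv is passed as the LEFT frame,
    # child_inv as the RIGHT frame.
    return pd.merge(parent_inv, child_inv, left_on=lefton, right_on=righton, how=how)


def describe_merge_sides(child_inv, parent_inv, lefton, righton):
    # Hypothetical diagnostic: does each key exist in the frame that
    # pd.merge(parent_inv, child_inv, ...) will search for it?
    return {
        "left_key_in_left_frame": lefton in parent_inv.columns,
        "right_key_in_right_frame": righton in child_inv.columns,
    }


np.random.seed(0)
left = pd.DataFrame({'transaction_id': ['A', 'B', 'C', 'D'],
                     'user_id': ['Peter', 'John', 'John', 'Anna'],
                     'value': np.random.randn(4)})
right = pd.DataFrame({'new_id': ['Paul', 'Mary', 'John', 'Anna'],
                      'favorite_color': ['blue', 'blue', 'red', np.nan]})

# Positional call, as in the question: left -> child_inv, right -> parent_inv.
print(describe_merge_sides(left, right, 'user_id', 'new_id'))

# Keyword call: the frame holding 'user_id' is named as parent_inv (left side).
test = inv_level(child_inv=right, parent_inv=left,
                 lefton='user_id', righton='new_id', how='left')
print(test)
```

If the first print reports False for either key, the frames are reaching pd.merge in a different order than the key names assume, which would be consistent with the KeyError raised from right_keys in the traceback.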
Thanks,