I'm stuck on a problem with manipulating dataframes in pandas.

These are the dataframes used in the code below:

[Three screenshots of the dataframes; images not available]

I'm trying to replace the values in the 'exercise' column of the 'data' dataframe with the matching values from the 'name' column of the 'exercise' dataframe.

For example, the acronym 'Dl' in the 'exercise' column of 'data' should become 'Dead lifts', taken from the 'name' column of 'exercise'.

I have tried several methods, but all of them fail with the same error.

Here is my code with the methods I tried:

### Method 1 ###

# Rename the 'label' column in 'exercise' to 'exercise'
exercise = exercise.rename(columns={'label': 'exercise'})

# Merge 'exercise' into 'data' on the shared 'exercise' column
data = pd.merge(data, exercise, how='left', on='exercise')

### Method 2 ###
# merge() returns a new dataframe, so the result has to be assigned
data = data.merge(exercise, left_on='exercise', right_on='label')

### Method 3 ###

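# Map each label to its full name; rename_categories accepts a dict,
# and its `inplace` argument was removed in pandas 2.0
data['exercise'] = data['exercise'].astype('category')
EXERCISE_NAMES = dict(zip(exercise['label'], exercise['name']))
data['exercise'] = data['exercise'].cat.rename_categories(EXERCISE_NAMES)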
                
### Same Error, New dataset ###

# Rename the 'description' column in 'area' to 'area'
area = area.rename(columns={'description': 'area'})

# Merge 'area' into 'data' on the shared 'area' column
data = pd.merge(data, area, how='left', on='area')

This is the error I get:

Traceback (most recent call last):

File "---", line 232, in <module>
data.to_frame().merge(exercise, left_on='exercise', right_on='label')

File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/frame.py", line 8192, in merge
return merge(

File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/reshape/merge.py", line 74, in merge
op = _MergeOperation(

File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/reshape/merge.py", line 668, in __init__
) = self._get_merge_keys()

File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/reshape/merge.py", line 1046, in _get_merge_keys
left_keys.append(left._get_label_or_level_values(lk))

File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/generic.py", line 1683, in _get_label_or_level_values
raise KeyError(key)

KeyError: 'exercise'

Is someone able to help me with this? Thank you very much in advance.

Call me crazy, but the line of code causing the error in the traceback doesn't seem to be in the code samples you submitted. - Chris
Also, I see you are using Python 3.9. I would double-check whether pandas supports 3.9 yet. - zerecees
On the data dataframe, run this and copy-paste the results: print(data.columns) - zerecees
It's just a merge() between the three dataframes. A classic join. - Rob Raymond
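
The checks suggested in these comments can be sketched as follows. This is only one possible explanation, not a confirmed diagnosis: the `data.to_frame()` call in the traceback suggests `data` may have been a Series at that point, and an unnamed Series turned into a dataframe has no column called 'exercise'. The frames and the `codes` Series below are made-up stand-ins, since the real data was only posted as images.

```python
import pandas as pd

# Hypothetical stand-ins for the asker's data (the originals were screenshots)
exercise = pd.DataFrame({"label": ["Jr", "Dl"],
                         "name": ["Jump rope", "Dead lifts"]})
codes = pd.Series(["Dl", "Jr", "Dl"])  # assumed: an unnamed Series of codes

# Step 1: inspect the actual column labels before merging
left = codes.to_frame()
print(left.columns.tolist())

# Step 2: merging on a label that is not among the columns raises KeyError
try:
    left.merge(exercise, left_on="exercise", right_on="label")
    merge_failed = False
except KeyError as err:
    merge_failed = True
    print("KeyError:", err)

# Step 3: name the column explicitly, then the same merge goes through
fixed = codes.to_frame(name="exercise")
merged = fixed.merge(exercise, how="left", left_on="exercise", right_on="label")
print(merged)
```

If `print(data.columns)` on the real data does show 'exercise', the cause is something else (for example stray whitespace in the column name), and this sketch does not apply.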

1 Answer

  1. Merge data with area, then drop the code columns and rename the description column.
  2. Merge the result of step 1 with exercise, then drop and rename columns in the same way.
import pandas as pd

# Sample data modelled on the dataframes in the question
area = pd.DataFrame({"arealabel":["AGI","BAL"],
                    "description":["Agility","Balance"]})
exercise = pd.DataFrame({"description":["Jump rope","Dead lifts"],
                        "label":["Jr","Dl"]})
data = pd.DataFrame({"exercise":["Dl","Dl"],
                    "area":["AGI","BAL"],
                    "level":[0,3]})

# Step 1: swap the area code for its description
(data.merge(area, left_on="area", right_on="arealabel")
 .drop(columns=["arealabel","area"])
 .rename(columns={"description":"area"})
 # Step 2: swap the exercise code for its description
 .merge(exercise, left_on="exercise", right_on="label")
 .drop(columns=["exercise","label"])
 .rename(columns={"description":"exercise"})
)
   level     area    exercise
0      0  Agility  Dead lifts
1      3  Balance  Dead lifts
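
As an alternative to the merge chain, and not part of the answer as originally written, the codes could be replaced in place with `Series.map`. This is a minimal sketch that reuses the sample frames above and assumes each label appears only once in its lookup table; `exercise_names`, `area_names`, and `result` are names introduced here for illustration.

```python
import pandas as pd

# Same sample data as in the answer above
area = pd.DataFrame({"arealabel": ["AGI", "BAL"],
                     "description": ["Agility", "Balance"]})
exercise = pd.DataFrame({"description": ["Jump rope", "Dead lifts"],
                         "label": ["Jr", "Dl"]})
data = pd.DataFrame({"exercise": ["Dl", "Dl"],
                     "area": ["AGI", "BAL"],
                     "level": [0, 3]})

# Build lookup Series indexed by the short code (labels assumed unique)
exercise_names = exercise.set_index("label")["description"]
area_names = area.set_index("arealabel")["description"]

# Replace each code with its description; codes without a match become NaN
result = data.assign(
    exercise=data["exercise"].map(exercise_names),
    area=data["area"].map(area_names),
)
print(result)
```

Compared with the merge chain, this keeps the original column order and needs no drop or rename step, but it only works when each lookup table maps one code to one description.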