I'm attempting to use an if statement to match the country of origin of marathon winners to their countries' GDP data. I am getting the error 'Can only compare identically-labeled Series objects'.
if df['Winner Country'] == gdp_data['Country']:
    if df['YEAR'] == 1970:
        df['gdp'] = gdp_data['1970 gdp/cap']
gdp_data example:
Country 1970 gdp/cap
Kenya 98
df example:
YEAR Winner_Name Winner_Country Time Gender
1977 Dan Cloeter USA 2:17:52 M
I intend to assign a GDP value to df based on both country and year. (I only included partial data; there are extra columns for each year in the gdp_data DataFrame.)
If I opt to merge I run into this issue:
data example:
YEAR Winner_Name Winner_Country Time Gender Marathon_City Country 1970 1971
1977 Dan Cloeter USA 2:17:52 M Chicago USA 5247.0 5687.0
1978 Mark Stanforth USA 2:19:20 M Chicago USA 5247.0 5687.0
As shown, 1970 is a column name but is also a possible value of YEAR. How can I create a gdp variable based on the year the race occurred?
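One possible way to approach this is to reshape the GDP table from wide to long before merging, so that the year becomes a value rather than a column name. The sketch below is only an illustration under assumptions: the year columns are assumed to be strings such as "1970", the GDP figures are the USA values shown above, and the race rows ("Runner A", "Runner B") are placeholders rather than real data.

```python
import pandas as pd

# Toy frames: GDP values are the USA figures from the example above;
# the race rows are hypothetical placeholders chosen to match those years.
gdp_data = pd.DataFrame({"Country": ["USA"], "1970": [5247.0], "1971": [5687.0]})
df = pd.DataFrame({
    "YEAR": [1970, 1971],
    "Winner_Name": ["Runner A", "Runner B"],
    "Winner_Country": ["USA", "USA"],
})

# Wide -> long: one row per (Country, YEAR) pair with a single 'gdp' column.
gdp_long = gdp_data.melt(id_vars="Country", var_name="YEAR", value_name="gdp")

# Assumption: year column names are strings, so convert them to integers
# to match the dtype of df['YEAR'].
gdp_long["YEAR"] = gdp_long["YEAR"].astype(int)

# Use the same key name on both sides, then match on country AND year.
gdp_long = gdp_long.rename(columns={"Country": "Winner_Country"})
result = df.merge(gdp_long, how="left", on=["Winner_Country", "YEAR"])
```

With a left merge, races whose country or year has no GDP entry keep their row and get NaN in the gdp column.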
What I initially tried:
YEAR = df_gdp['YEAR']
df_gdp['gdp'] = df[YEAR]
This results in the following error:
KeyError: "None of [Int64Index([1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986,\n ...\n 2009, 2010, 2011, 2013, 2014, 2015, 2016, 2017, 2018, 2019],\n dtype='int64', length=258)] are in the [columns]"
A simplified example of the desired result:
Take this example data set:
letter a b c d
a 1 3 4 2
b 4 3 2 1
c 2 1 4 3
d 3 4 2 1
Desired result:
letter a b c d correct answer
a 1 3 4 2 1
b 4 3 2 1 3
c 2 1 4 3 4
d 3 4 2 1 1
How can I create the 'correct answer' column?
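For the simplified letter example, one way this might be done is a row-wise lookup, where each row reads the value from the column named in its own 'letter' field. This is a minimal sketch that rebuilds the example table by hand; it is not necessarily the fastest approach for large frames.

```python
import pandas as pd

# The simplified data set from the question.
df = pd.DataFrame({
    "letter": ["a", "b", "c", "d"],
    "a": [1, 4, 2, 3],
    "b": [3, 3, 1, 4],
    "c": [4, 2, 4, 2],
    "d": [2, 1, 3, 1],
})

# For each row, pick the value stored in the column that 'letter' names.
df["correct answer"] = df.apply(lambda row: row[row["letter"]], axis=1)
```

The same idea would carry over to the marathon data if the year columns were looked up per row using the value of YEAR, provided the column labels and the YEAR values have the same type.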
df['Winner_Country'] == gdp_data['Country'] will return a pandas Series of True and False values, so you wouldn't do this iteratively. Can you give more of an explanation of what you're trying to achieve? Are you trying to join DataFrames on their country? - A Poor
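To illustrate the point in the comment, here is a small sketch with made-up Series (the names winners and countries are hypothetical, not the real data). It shows that == between two Series is element-wise, that the result cannot be used directly in an if statement, and that comparing Series with different index labels raises a ValueError.

```python
import pandas as pd

# Hypothetical stand-ins for df['Winner_Country'] and gdp_data['Country'].
winners = pd.Series(["USA", "Kenya", "USA"])
countries = pd.Series(["USA", "Kenya", "Japan"])

# Same index labels: the comparison is element-wise and yields a boolean Series.
mask = winners == countries

# A boolean Series has no single truth value, so 'if mask:' raises ValueError.
try:
    if mask:
        pass
except ValueError as exc:
    truth_error = str(exc)

# Different index labels (here, different lengths) cannot be compared with ==.
try:
    winners == pd.Series(["Kenya"])
    label_error = ""
except ValueError as exc:
    label_error = str(exc)

# isin tests membership and does not depend on the two indexes matching.
known = winners.isin(countries)
```

For attaching values from one frame to another, a merge on the shared key is usually a better fit than any row-by-row comparison.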