
I am trying to plot data where the x variable is the year and the y variable is the number of games the Philadelphia Phillies won that season. I have tried several ways of plotting these two variables from my data set, but none of them works.

The first column of my file is the year and the third column is the number of wins (aka columns 0 and 2).

I've tried setting x and y to the columns themselves; the code below is my most recent attempt.

import csv
import numpy
import random
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize']=(10,6)

phillies_data = pd.read_csv('/Users/hannahbeegle/Desktop/Teams/PHILLIEScsv.csv', header=None)

phillies_data.plot(x='Year',y='W')
plt.xlabel('Year')
plt.ylabel('Wins')
plt.title('Amount of Wins in Phillies History (1871-2018)')
plt.xlim(1870, 2020)
plt.ylim(0, 170)

plt.show()

ERROR MESSAGE:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2657, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 129, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index_class_helper.pxi", line 91, in pandas._libs.index.Int64Engine._check_type
KeyError: 'Year'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/hannahbeegle/Desktop/Text Files/TeamDataBase.py", line 121, in <module>
    phillies_data.plot(x="Year", y="W")
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/plotting/_core.py", line 2942, in __call__
    sort_columns=sort_columns, **kwds)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/plotting/_core.py", line 1973, in plot_frame
    **kwds)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/plotting/_core.py", line 1763, in _plot
    elif not isinstance(data[x], ABCSeries):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py", line 2927, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2659, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 129, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index_class_helper.pxi", line 91, in pandas._libs.index.Int64Engine._check_type
KeyError: 'Year'

There is a misunderstanding of the parameters. The x and y arguments take the names of the columns in your dataframe, not the columns themselves: df.plot(x="column1", y="column3") - ImportanceOfBeingErnest
Try using .iloc instead of .loc? - Youcef Benyettou
@ImportanceOfBeingErnest I did what you said and got the error that I have edited into the question above. - Hannah
We have no idea what the dataframe looks like. See "minimal reproducible example" or "How to make good reproducible pandas examples". - ImportanceOfBeingErnest
@ImportanceOfBeingErnest file:///Users/hannahbeegle/Desktop/Screen%20Shot%202019-06-18%20at%2012.30.27%20PM.png - Hannah

1 Answer


Try the following example in your Jupyter session:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(1000, 2), columns=['BB', 'CC']).cumsum()
df['AA'] = pd.Series(list(range(len(df))))
df.plot(x='AA', y='BB')

You'll see a plot of BB against AA (a column that increases one step at a time), and the other column, CC, is not plotted. I hope this translates easily to your example. If you need to check your column names, try:

df.columns
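
For the file in the question, the likely cause of KeyError: 'Year' is header=None: with that option pandas labels the columns 0, 1, 2, ... instead of reading names from the file, which matches the Int64Engine frames in the traceback. The sketch below is only an illustration under assumptions: the in-memory CSV is a hypothetical stand-in for PHILLIEScsv.csv (no header row, year in column 0, wins in column 2, illustrative values), and the names raw, named and sample_csv are not from the original post.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the real file: no header row,
# year in column 0, wins in column 2. Values are illustrative only.
sample_csv = io.StringIO(
    "2016,PHI,71\n"
    "2017,PHI,66\n"
    "2018,PHI,80\n"
)

# header=None tells pandas there is no header row, so the column
# labels become the integers 0, 1, 2 and 'Year' is not a valid key.
raw = pd.read_csv(sample_csv, header=None)
print(raw.columns.tolist())

# Option 1: plot using the integer labels directly.
raw.plot(x=0, y=2)

# Option 2: rename the columns of interest, then plot by name.
named = raw.rename(columns={0: "Year", 2: "W"})
ax = named.plot(x="Year", y="W")
ax.set_xlabel("Year")
ax.set_ylabel("Wins")
ax.set_title("Phillies wins per season")
```

Call plt.show() afterwards to display the figures. If the real file does contain a header row with Year and W in it, dropping header=None from read_csv should make the original plot(x='Year', y='W') call work without any renaming.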