
How can I draw a time-series chart in Python when the time in the data set is split into a year column and a period column (the month, coded as M01, M02, and so on)?

I am using matplotlib, but I don't know how to combine the year and the period into a single time axis.

The code I wrote to get the data:

import pandas as pd
from pandas import DataFrame

data1 = pd.read_csv('CUUR0000SA0.txt', header = None)
data2 = pd.read_csv('SUUR0000SA0.txt', header = None)
data = pd.concat([data1, data2])
data.columns = ["a"]
data = DataFrame(data)
print(data.head())

However, the resulting dataframe has only one column.

Part of the data set looks like this:

+-------------+------+--------+---------+-----------+
|  series id  | year | period |  value  | footnotes |
+-------------+------+--------+---------+-----------+
| CUUR0000SA0 | 2014 |  M12   | 234.812 |           |
| CUUR0000SA0 | 2014 |  M11   | 236.151 |           |
| CUUR0000SA0 | 2014 |  M10   | 237.433 |           |
| CUUR0000SA0 | 2014 |  M09   | 238.031 |           |
| CUUR0000SA0 | 2014 |  M08   | 237.852 |           |
+-------------+------+--------+---------+-----------+

The chart should show the trend of the values over the time periods, but I don't know how to get the data into the right format first.

Hi Sandy, are you using pandas or numpy? You will need to use the datetime package but the answer may be dependent on how your data is structured. - BenT
Oh, I am using a pandas data frame to store the above data. - Sandy
It is not clear to me from the question exactly what you would like the resulting chart to look like. What have you tried so far? For example does df.plot() get what you need? How about df.sort_values(['year', 'period']).plot(x='period', y='value')? - johnchase
The thing is that I want to show the value trend based on year and period, but I don't know how to convert this one-column dataset into a dataframe with several columns. - Sandy

1 Answer


These are the steps to get a solution:

  1. transform the period values (M01 to M12) into month numbers
  2. add a date column that combines year and period
  3. plot the time series against that date column

And this is the code:

import pandas as pd

df = {0: {"series id":"CUUR0000SA0", "year":2014, "period":"M12", "value":234.812},
      1: {"series id":"CUUR0000SA0", "year":2014, "period":"M11", "value":236.151},
      2: {"series id":"CUUR0000SA0", "year":2014, "period":"M10", "value":237.433},
      3: {"series id":"CUUR0000SA0", "year":2014, "period":"M09", "value":238.031},
      4: {"series id":"CUUR0000SA0", "year":2014, "period":"M08", "value":237.852},
      }

# Map each period code to its month number
d = {'M01':1,
     'M02':2,
     'M03':3,
     'M04':4,
     'M05':5,
     'M06':6,
     'M07':7,
     'M08':8,
     'M09':9,
     'M10':10,
     'M11':11,
     'M12':12,}

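# Build the DataFrame from the row dictionaries
df = pd.DataFrame.from_dict(df, orient="index")
# Step 1: turn period codes such as 'M12' into month numbers
df.period = df.period.map(d)

# Step 2: combine year and month into a datetime column (first day of each month)
df['date'] = pd.to_datetime(df.year.astype(str) + '/' + df.period.astype(str) + '/01')

# Step 3: plot the values in chronological order
df.sort_values('date').plot(x='date', y='value')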
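
The one-column result in the question probably comes from read_csv splitting on commas while the file uses a different delimiter. Below is a minimal sketch of loading such a file into separate columns and building the date column without the lookup dictionary. It assumes the file is tab-separated with a header row, which is a guess about CUUR0000SA0.txt; the in-memory sample `raw` and the helper name `load_series` are hypothetical and only stand in for the real file.

```python
import io
import pandas as pd

# Hypothetical sample standing in for CUUR0000SA0.txt. The tab delimiter
# and the padded header name are assumptions about the real file.
raw = (
    "series id  \tyear\tperiod\tvalue\tfootnotes\n"
    "CUUR0000SA0\t2014\tM12\t234.812\t\n"
    "CUUR0000SA0\t2014\tM11\t236.151\t\n"
    "CUUR0000SA0\t2014\tM10\t237.433\t\n"
    "CUUR0000SA0\t2014\tM09\t238.031\t\n"
    "CUUR0000SA0\t2014\tM08\t237.852\t\n"
)


def load_series(source):
    """Read one tab-separated series file and add a 'date' column."""
    frame = pd.read_csv(source, sep="\t")
    # Remove padding around the column names and the period codes
    frame.columns = frame.columns.str.strip()
    frame["period"] = frame["period"].str.strip()
    # Keep only monthly rows (M01..M12); any other period code is dropped
    monthly = frame["period"].str.fullmatch(r"M(0[1-9]|1[0-2])")
    frame = frame[monthly].copy()
    # 'M08' -> '08', then build e.g. '2014-08-01' and parse it
    frame["date"] = pd.to_datetime(
        frame["year"].astype(str) + "-" + frame["period"].str[1:] + "-01",
        format="%Y-%m-%d",
    )
    return frame.sort_values("date").reset_index(drop=True)


data = load_series(io.StringIO(raw))
print(data[["date", "value"]])
```

With the real files, something like load_series('CUUR0000SA0.txt') should work if the delimiter assumption holds, and data.plot(x='date', y='value') then draws the trend. If the columns still come out merged, check the actual separator in the file and adjust sep accordingly.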