My Data Frame
The data frame below consists of the columns "Year", "Month" and "Data":
import numpy as np
import pandas as pd
np.random.seed(0)
df = pd.DataFrame(dict(
Year = [2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003],
Month = [1, 2, 3, 4, 5, 6, 7, 8, 9,10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9,10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9,10, 11, 12],
Data = np.random.randint(21,100,size=36)))
df
I want a Pythonic way to convert it to time series data, so that I have "Date" and "Data" as a time series instead of a data frame.
What I Tried
I have tried:
import pandas as pd
timeseries = df.assign(Date=pd.to_datetime(df[['Year', 'Month']].assign(day=1)))
columns = ['Year','Month']
df.drop(columns, inplace=True, axis=1) # I don't need day but year and month timeseries
but this only adds a column called "Date" to the data frame.
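For reference, the difference between a "Date" column and a date index can be sketched as follows. This is a minimal illustration, not necessarily the best approach: it rebuilds the same frame with np.repeat and np.tile for brevity, and the names dates and ts are hypothetical. The key step is moving the dates into the index with set_index rather than adding them as a column with assign.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
# Same layout as the frame above: 3 years x 12 months.
df = pd.DataFrame(dict(
    Year=np.repeat([2001, 2002, 2003], 12),
    Month=np.tile(np.arange(1, 13), 3),
    Data=np.random.randint(21, 100, size=36)))

# to_datetime needs year, month and day columns, so a constant day=1 is added.
dates = pd.to_datetime(df[['Year', 'Month']].assign(day=1))

# Use the dates as the index and keep only "Data".
# The result is a Series with a DatetimeIndex, not a DataFrame.
ts = df.set_index(dates)['Data'].rename_axis('Date')
```

Here ts carries no "Year" or "Month" columns; the date information lives only in the index.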
What I Want
I want time series data consisting of only a "Date" (2001-1, for instance) and a "Data" column, so that I can make a time plot, do time series analysis and forecast with the data.
Specifically, how do I index such time series data so that when I plot with this code:
import matplotlib.pyplot as plt
plt.figure(figsize=(5.5, 5.5))
data1['Data'].plot(color='b')
plt.title('Monthly Data')
plt.xlabel('Date')
plt.ylabel('Data')
plt.xticks(rotation=30)
my x-axis is graduated in dates, not in numbers?
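To make the desired outcome concrete, here is a rough sketch of the kind of object I am after, assuming data1 is a frame whose index is a monthly DatetimeIndex. The index here is built directly with pd.date_range purely for illustration, and the name monthly is hypothetical. The Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(0)
# First-of-month timestamps for 2001 through 2003, used as the index.
index = pd.date_range("2001-01-01", periods=36, freq="MS", name="Date")
data1 = pd.DataFrame({"Data": np.random.randint(21, 100, size=36)}, index=index)

# With a date index, pandas puts dates on the x-axis instead of row numbers.
fig, ax = plt.subplots(figsize=(5.5, 5.5))
data1["Data"].plot(ax=ax, color="b")
ax.set_title("Monthly Data")
ax.set_xlabel("Date")
ax.set_ylabel("Data")
plt.xticks(rotation=30)

# Optional: a period index displays as year-month (e.g. 2001-01) without a day.
monthly = data1["Data"].to_period("M")
```

The to_period step is only needed if the labels should read as year-month; the plot itself works from the DatetimeIndex.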