1
votes

Say I'm looking at the Rdatasets file acme.csv (URL in the code below). How do I import it with an appropriately coarse date? With parse_dates, pandas sets the day to the current day (today being the 18th of July), since the data specify no day. Can I make it handle just month/year, as in the table, while still using pandas' date functionality?

import pandas as pd
url = 'http://vincentarelbundock.github.io/Rdatasets/csv/boot/acme.csv'
df = pd.read_csv(url, parse_dates=[1])
df.drop('Unnamed: 0', axis=1, inplace=True)
You could add a few rows from that file and the expected result. – furas
There's no way to have just a month; you will always have a day as well. – Michael WS
BTW: you can use two columns: the first with the date as datetime (to use its functionality), the second with the date as an m/y string (to show in results). – furas

1 Answer

3
votes

Don't parse the dates in read_csv(); use to_datetime() with an explicit format instead:

df['month'] = pd.to_datetime(df['month'], format='%m/%y')
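To try this without downloading anything, here is a minimal sketch on a few inline rows. The rows only mimic the column layout of acme.csv; the values are made up for illustration and are not taken from the dataset.

```python
import io
import pandas as pd

# Hypothetical sample in the same layout as acme.csv (values are invented).
csv_text = """,month,market,acme
1,1/86,-0.06,0.03
2,2/86,0.07,-0.17
3,12/90,0.01,0.02
"""

# Read the month column as plain strings first, using the unnamed column as index.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Convert with an explicit month/two-digit-year format.
df['month'] = pd.to_datetime(df['month'], format='%m/%y')

# The missing day is filled in as the 1st, not as today's day.
days = df['month'].dt.day.tolist()
months = df['month'].dt.month.tolist()
years = df['month'].dt.year.tolist()
print(df['month'])
```

Because the format contains no day directive, every parsed value lands on the first day of its month.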

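Alternatively, pass the same conversion to read_csv() as a lambda via date_parser (deprecated since pandas 2.0, where date_format='%m/%y' does the same job):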

df = pd.read_csv(url, parse_dates=['month'], date_parser=lambda x:pd.to_datetime(x, format='%m/%y'))

Either way, a datetime always carries a day number; with this format it defaults to the 1st of the month.
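If the day really should not exist, one option that goes beyond this answer is pandas' Period type with monthly frequency. This is a sketch of that alternative, not what the answer itself proposes; the input strings are illustrative.

```python
import pandas as pd

# Illustrative month/year strings in the same style as the dataset.
raw = pd.Series(['1/86', '2/86', '12/90'])

# Parse to timestamps first, then drop the day by converting to monthly periods.
stamps = pd.to_datetime(raw, format='%m/%y')
periods = stamps.dt.to_period('M')

# Each value now represents a whole month and still exposes date accessors.
labels = [str(p) for p in periods]
period_months = periods.dt.month.tolist()
print(periods)
```

A monthly period stands for the whole month rather than a particular day, while still supporting accessors such as .dt.year and .dt.month.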

BTW: a datetime always has a time component too, but pandas doesn't display it when every value is at midnight.

print(df['month'].head())             # shows dates only
print(df['month'].dt.time.head())     # the hidden time part: 00:00:00
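The two-column idea from the comments could look roughly like this: keep the datetime column for computation and add a string column purely for display. The column name month_label is a hypothetical choice, and the input strings are illustrative.

```python
import pandas as pd

# Illustrative frame with a parsed month column.
df = pd.DataFrame({'month': pd.to_datetime(['1/86', '2/86', '12/90'], format='%m/%y')})

# Hypothetical display column: month/year text, zero-padded by strftime.
df['month_label'] = df['month'].dt.strftime('%m/%y')

# The datetime column is still available for date arithmetic and sorting.
labels = df['month_label'].tolist()
times = df['month'].dt.time.astype(str).tolist()
print(df)
```

Note that strftime zero-pads the month, so 1/86 is shown as 01/86.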