I would like to read several CSV files from a directory into pandas and concatenate them into one big DataFrame, but I have not been able to figure out how. Here is what I have so far:
```python
import glob
import pandas as pd

# get data file names
path = r'C:\DRO\DCL_rawdata_files'
filenames = glob.glob(path + "/*.csv")

# read each file into its own DataFrame
dfs = []
for filename in filenames:
    dfs.append(pd.read_csv(filename))

# concatenate all data into one DataFrame
big_frame = pd.concat(dfs, ignore_index=True)
```
I guess I need some help within the for loop?
Comments:

`dfs` is a list; don't you want to replace the line `data = pd.read_csv(filename)` with `dfs.append(pd.read_csv(filename))`? You would then need to loop over the list and `concat`; I don't think `concat` will work on a list of `df`s. – EdChum

`big_frame = pd.concat(dfs, ignore_index=True)`? Anyway, once you have a list of DataFrames, you will need to iterate over the list and `concat` to `big_frame`. – EdChum

You have `dfs` now, so something like `for df in dfs: big_frame.concat(df, ignore_index=True)` should work; you could also try `append` instead of `concat`. – EdChum
`concat` should handle a list of DataFrames just fine, like you did. I think this is a very good approach. – joris
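For reference, here is a rough sketch of the iterative approach EdChum describes. Note that DataFrames have no `.concat` method and `DataFrame.append` was removed in pandas 2.0, so the loop body below substitutes the top-level `pd.concat` pairwise; it also recopies the accumulated frame on every step, so it is slower than a single `concat` over the whole list:

```python
import glob
import pandas as pd

path = r'C:\DRO\DCL_rawdata_files'
dfs = [pd.read_csv(f) for f in glob.glob(path + "/*.csv")]

# pairwise concatenation: each iteration copies everything
# accumulated so far, so total cost grows quadratically
big_frame = dfs[0]  # assumes at least one CSV was found
for df in dfs[1:]:
    big_frame = pd.concat([big_frame, df], ignore_index=True)
```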
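And since `pd.concat` accepts any iterable of DataFrames, not only a list, the whole script can be collapsed into a single call. A minimal sketch, assuming the same directory and glob pattern as in the question:

```python
import glob
import pandas as pd

path = r'C:\DRO\DCL_rawdata_files'
filenames = glob.glob(path + "/*.csv")

# pass a generator of DataFrames straight to pd.concat;
# ignore_index=True gives the result a fresh 0..n-1 index
big_frame = pd.concat((pd.read_csv(f) for f in filenames), ignore_index=True)
```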