Import multiple excel files into python pandas and concatenate them into one dataframe

Question

I would like to read several excel files from a directory into pandas and concatenate them into one big dataframe. I have not been able to figure it out though. I need some help with the for loop and building a concatenated dataframe: Here is what I have so far:

import sys
import csv
import glob
import pandas as pd

# get data file names
path =r'C:\DRO\DCL_rawdata_files\excelfiles'
filenames = glob.glob(path + "/*.xlsx")

dfs = []

for df in dfs: 
    xl_file = pd.ExcelFile(filenames)
    df=xl_file.parse('Sheet1')
    dfs.concat(df, ignore_index=True)

Your code in the other question was just fine, just replace read_csv with read_excel. — joris
Your code here is not really correct (it was in the other question). You cannot loop over the empty list dfs you just created, so loop iver the filenames, then dfs.append(df) in the loop, and after that pd.concat(dfs, ignore_index=True) — joris

ericmjl ericmjl · Accepted Answer · 2014-01-03T16:33:41

As mentioned in the comments, one error you are making is that you are looping over an empty list.

Here is how I would do it, using an example of having 5 identical Excel files that are appended one after another.

(1) Imports:

import os
import pandas as pd

(2) List files:

path = os.getcwd()
files = os.listdir(path)
files

Output:

['.DS_Store',
 '.ipynb_checkpoints',
 '.localized',
 'Screen Shot 2013-12-28 at 7.15.45 PM.png',
 'test1 2.xls',
 'test1 3.xls',
 'test1 4.xls',
 'test1 5.xls',
 'test1.xls',
 'Untitled0.ipynb',
 'Werewolf Modelling',
 '~$Random Numbers.xlsx']

(3) Pick out 'xls' files:

files_xls = [f for f in files if f[-3:] == 'xls']
files_xls

Output:

['test1 2.xls', 'test1 3.xls', 'test1 4.xls', 'test1 5.xls', 'test1.xls']

(4) Initialize empty dataframe:

df = pd.DataFrame()

(5) Loop over list of files to append to empty dataframe:

for f in files_xls:
    data = pd.read_excel(f, 'Sheet1')
    df = df.append(data)

(6) Enjoy your new dataframe. :-)

df

Output:

  Result  Sample
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10

Import multiple excel files into python pandas and concatenate them into one dataframe

6 Answers