0
votes

I have been using the following code from another StackOverflow answer to concatenate data from multiple excel sheets in the same workbook into one sheet.

This works great when the column names is uniform across all sheets in a workbook. However, I'm running into an issue with one specific workbook where only the first column is named differently (or not named at all.. so is blank) but the rest of the columns are the same.

How do I merge such sheets? Is there a way to rename the first column of each sheet into one name so that I can then use the steps from the answer linked above?

1

1 Answers

0
votes

Yes, you can rename all the columns as:

# read excel
dfs = pd.read_excel('tmp.xlsx', sheetname=None, ignore_index=True)

# rename columns
column_names = ['col1', 'col2', ...]
for df in dfs.values(): df.columns = column_names

# concat
total_df = pd.concat(dfs.values())

Or, you can ignore the header in read_excel so that the columns are labeled as 0,1,2,...:

# read ignore header
dfs = pd.read_excel('tmp.xlsx', sheet_name=None,
                    header=None, skiprows=1)

total_df = pd.concat(dfs.values)

# rename
total_df.columns = column_names