1
votes

I have a multi header excel sheet without any index column. When I read the excel in pandas, it treats first column as an index. I want pandas to create an index instead of treating 1st column as an index. Any help would be appreciated.

I tried below code:

df = pd.read_excel(file, header=[1,2], sheetname= "Ratings Inputs", parse_cols ="A:AA", index_col=None)
2

2 Answers

0
votes

From my tests, read_csv seems broken with a multi_line header: when index_col is absent or None, it behaves as is it was 0.

You have 2 possible workarounds here:

  1. reset_index as suggested by @mounaim:

    df = pd.read_excel(file, header=[1,2], sheetname= "Ratings Inputs",
                       parse_cols ="A:AA", index_col=None).reset_index()
    

    It is almost correct except that the header for first columns are used to name the MultiIndex df.columns and the first column is named `('index', ''). So you must re-create it:

    df.columns = pd.MultiIndex.from_tuples([tuple(df.columns.names)]
                                       + list(df.columns)[1:])
    
  2. Read separetely the headers

    head = pd.read_excel('3x3.xlsx', header=None, sheetname= "Ratings Inputs",
                       parse_cols ="A:AA", skiprows=1, nrows=2)
    df = pd.read_excel(file, header=2, sheetname= "Ratings Inputs",
                       parse_cols ="A:AA", index_col=None).reset_index()
    df.columns = pd.MultiIndex.from_tuples(list(head.transpose().to_records(index=False)))
    
0
votes

Have you tried reset_index() :
your_data_frame.reset_index(drop=True,inplace=True)