Pandas : Concat rows of a dataframe with same index to form custom string in pairs

Question

Say I have a dataframe

import pandas as pd

df = pd.DataFrame({'colA' : ['ABC', 'JKL', 'STU', '123'],
                   'colB' : ['DEF', 'MNO', 'VWX', '456'],
                   'colC' : ['GHI', 'PQR', 'YZ', '789']}, index = [0,0,1,1])

  colA colB colC
0  ABC  DEF  GHI
0  JKL  MNO  PQR
1  STU  VWX   YZ
1  123  456  789

It's guaranteed that the rows of every pair share the same index, so we would like the end result to be:

        colA       colB       colC
0  ABC_JKL_0  DEF_MNO_0  GHI_PQR_0
1  STU_123_1  VWX_456_1   YZ_789_1

where the suffix _number is the index of that group.

I tried doing this by iterating over the rows, but that takes a lot of time. I was thinking of something like .groupby(level=0), but I can't figure out the aggregation/apply step that should follow.

df_out=df.groupby(level=0).agg(lambda x: '_'.join(x)+'_'+str(x.index[0])) — Scott Boston
@ScottBoston Is it possible to apply multiple functions in one aggregate? Say, list first and then tuple. Obviously no one would want that; I'm just wondering whether multiple functions can be applied or not. — Himanshu Poddar
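
On the multiple-functions follow-up: .agg on a groupby also accepts a list of functions, which gives one output column per (column, function) pair, and named aggregation gives flat output columns. A minimal sketch, assuming the question's df; the output names joined_A and first_B are made up for illustration and are not part of the thread:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "colA": ["ABC", "JKL", "STU", "123"],
        "colB": ["DEF", "MNO", "VWX", "456"],
        "colC": ["GHI", "PQR", "YZ", "789"],
    },
    index=[0, 0, 1, 1],
)

# A list of functions: the result has MultiIndex columns such as
# ("colA", "first") and ("colA", "last").
multi = df.groupby(level=0).agg(["first", "last"])
print(multi)

# Named aggregation: each keyword is an output column (hypothetical names),
# each value is a (source column, function) pair.
named = df.groupby(level=0).agg(
    joined_A=("colA", lambda s: "_".join(s)),
    first_B=("colB", "first"),
)
print(named)
```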

Scott Boston · Accepted Answer · 2022-07-28T17:42:50

IIUC, you can do this with .agg and a lambda function, or you can append the suffix to the dataframe after the groupby:

df_out=df.groupby(level=0).agg(lambda x: '_'.join(x)+'_'+str(x.index[0]))

Output:

        colA       colB       colC
0  ABC_JKL_0  DEF_MNO_0  GHI_PQR_0
1  STU_123_1  VWX_456_1   YZ_789_1

Or

df_out=df.groupby(level=0).agg('_'.join)
df_out = df_out.add('_'+df_out.index.to_series().astype(str), axis=0)
print(df_out)
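
For anyone who wants to run both variants side by side, here is a self-contained sketch that rebuilds the question's df and checks that the two approaches produce the same strings. The variable names out_lambda, out_joined, suffix and out_add are just for this illustration:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "colA": ["ABC", "JKL", "STU", "123"],
        "colB": ["DEF", "MNO", "VWX", "456"],
        "colC": ["GHI", "PQR", "YZ", "789"],
    },
    index=[0, 0, 1, 1],
)

# Variant 1: per group and column, join the strings and append the
# group's index label (x is a Series holding one column of one group).
out_lambda = df.groupby(level=0).agg(
    lambda x: "_".join(x) + "_" + str(x.index[0])
)

# Variant 2: join first, then add the index as a string suffix.
# axis=0 aligns the suffix Series with the rows of the joined frame.
out_joined = df.groupby(level=0).agg("_".join)
suffix = "_" + out_joined.index.to_series().astype(str)
out_add = out_joined.add(suffix, axis=0)

print(out_lambda)
print(out_add)
print(out_lambda.values.tolist() == out_add.values.tolist())
```

The second variant keeps the index handling out of the per-group function, so the aggregation itself is only a plain string join.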
