1
votes

I have two Excel sheets, sheet1, and sheet2. Sheet1 has the row id, First name, Last name, Description columns, etc. Sheet2 has also a column that stores the First name, Last name, and also two other columns, column D, and column E, that need to be merged in the Description column.

The combination of First name, Last name, exists only once in both sheets.

How could I merge the contents of column D, E from sheet 2, in column named Description, in sheet 1, based on the matching criteria First name and Last name are equal in row from sheet 1, and from sheet 2, using Python Pandas?

Sheet 1:

ID | columnB  | column C | Column D 
1  | John     | Hingins  | Somedescription

Sheet 2:

ID | column Z | column X | Column Y     | Column W
1  | John     | Hingins  | description2 | Somemoredescription

Output: Sheet 1:

ID | columnB  | column C | Column D
1  | John     | Hingins  | description2-separator-Someotherdescription-separator-Somedescription
1
We need an example. See minimal reproducible example.IanS

1 Answers

1
votes

I think you should look at this. But that's mostly for context.

http://pbpython.com/excel-file-combine.html

I think your issue actually boils down to this.

>>> !cat scores3.csv
ID,JanSales,FebSales
1,100,200
2,200,500
3,300,400
>>> !cat scores4.csv
ID,CreditScore,EMMAScore
2,good,Watson
3,okay,Thompson
4,not-so-good,NA

We could read these into objects called DataFrames (think of them sort of like Excel sheets):

>>> import pandas as pd
>>> s3 = pd.read_csv("scores3.csv")
>>> s4 = pd.read_csv("scores4.csv")
>>> s3
   ID  JanSales  FebSales
0   1       100       200
1   2       200       500
2   3       300       400
>>> s4
   ID  CreditScore EMMAScore
0   2         good    Watson
1   3         okay  Thompson
2   4  not-so-good       NaN

And then we can merge them on the ID column:

>>> merged = s3.merge(s4, on="ID", how="outer")
>>> merged
   ID  JanSales  FebSales  CreditScore EMMAScore
0   1       100       200          NaN       NaN
1   2       200       500         good    Watson
2   3       300       400         okay  Thompson
3   4       NaN       NaN  not-so-good       NaN

After which we could save it to a csv file or to an Excel file:

>>> merged.to_csv("merged.csv")
>>> merged.to_excel("merged.xlsx")

From...here...

Looking to merge two Excel files by ID into one Excel file using Python 2.7