0
votes

Is it possible to add a VLOOKUP formula to a column with pandas python without using a for loop to loop through each row?

This is the loop I have which works but is a bit slow on an Excel file with over 60,000 rows

for i, row in df1.iterrows():
    df1.loc[i, 'OutcomeGroupCodeName'] = "=VLOOKUP(N" + str(i + 2) + ",OUTCOMES!A:O,3,FALSE)"

The ID to use for the vlookup is column N on each row in the Excel sheet. This is the equivalent of opening Excel and adding it to the cells i.e.

=VLOOKUP(N2,OUTCOMES!A:O,3,FALSE)
=VLOOKUP(N3,OUTCOMES!A:O,3,FALSE)

I know you can use pandas to join data which does the same as VLOOKUP but I need the formula in the final Excel file that pandas writes so that if the value is ever changed in the exported Excel sheet later on it will automatically apply the VLOOKUP. So applying the merge/VLOOKUP in pandas before saving to Excel won't work for me.

This will write the formula to each cell in the column but I don't know how to add the index/row number to part of the string

df1['OutcomeGroupCodeName'] = "=VLOOKUP(N,OUTCOMES!A:O,3,FALSE)"
1

1 Answers

1
votes

Using .apply can solve this

df.apply(lambda row: "=VLOOKUP(N{},OUTCOMES!A:O,3,FALSE)".format(row.name), axis=1)