0
votes

Dataframes:

df1:

col1        col2        col3        col4        col5        col6        col7        col8 
7865                                                                                abc
                                    7269        689                                 def
            8726                                                                    ghi
                                                                        986         jkl
                                                7689                                mno
                        8762                                                        pqr
                                                                                    stu
                        9698                                                        vwx
            3568                                                                    yz

df2:

Scientific value         mapping_value
1                        8726
2                        9698
3                        3568
4                        986
5                        7269

I want to match "col1, col2, col3, col4, col5, col6, col7" column values in df1 with "mapping_value" column in df2 and create a new column called "Scientific value" in df1 which would have entries from "Scientific value" column in df2.

Output:

col1        col2        col3        col4        col5        col6        col7        col8   Scientific value
7865                                                                                abc    
                                    7269        689                                 def    5
            8726                                                                    ghi    1
                                                                        986         jkl    4
                                                7689                                mno
                        8762                                                        pqr
                                                                                    stu
                        9698                                                        vwx    2
            3568                                                                    yz     3

Thanks!

2

2 Answers

0
votes
#Data
df=pd.DataFrame({'col1':[9698, '',''],'col2':['', 7269, ''], 'col3':['','',986]})

enter image description here

df2=pd.DataFrame({'Scientific value':[1,2,3,4],'mapping_value':[9698,7269,986,304]})

enter image description here

#Create dict
d=dict(zip(df2['mapping_value'],df2['Scientific value']))
#Apply dict to dataframe
df.apply(lambda x: x.map(d))

Result

enter image description here

0
votes

I noticed that:

  • all columns in df1 are of object type (actually string),
  • but all columns in df2 (including mapping_value) are of int type.

Because something like join has to be performed between mapping_value and col... columns, they must be brought to a common type, e.g. using some auxiliary object.

So the first thing to do is to create a "mapper" Series:

  • values taken from df2['Scientific value'],
  • index taken from df2.mapping_value, but coverted to string.

The code to do it is:

mapper = df2.set_index('mapping_value')['Scientific value']
mapper.index = mapper.index.astype(str); mapper

Then define a function to get the Scientific value (either a "true" value or an empty string in the "negative" case):

def getScVal(row):
    rw = row[:-1]    # without "col8" 
    rw = rw[rw != ''].to_frame('code').set_index('code')
    res = rw.join(mapper, how='inner')
    return res.iloc[0,0] if res.size > 0 else ''

And to generate the new column, run:

df1['Scientific value'] = df1.apply(getScVal, axis=1)