2
votes

df1:

col1        col2        col3        col4        col5        col6        col7        col8 
7865                                                                                abc
                                    7269                                            def
            8726                                                                    ghi
                                                                        986         jkl
                                                7689                                mno
                        8762                                                        pqr
                                                                                    stu
                        9698                                                        vwx
            3568                                                                    yz

df2:

Scientific value         mapping_value
1                        8726
2                        9698
3                        3568
4                        986
5                        7269

I want to match "col1, col2, col3, col4, col5, col6, col7" column values in df1 with "mapping_value" column in df2 and create a new column called "Scientific value" in df1 which would have entries from "Scientific value" column in df2.

Output:

col1        col2        col3        col4        col5        col6        col7        col8   Scientific value
7865                                                                                abc    
                                    7269                                            def    5
            8726                                                                    ghi    1
                                                                        986         jkl    4
                                                7689                                mno
                        8762                                                        pqr
                                                                                    stu
                        9698                                                        vwx    2
            3568                                                                    yz     3

Would merge work in this case..or would be happy to learn any other efficient method as well!!

Thanks!

1

1 Answers

1
votes

If there are only one number per rows for any of column without last solution should be simplify by sum, max, min function for one column, so is possible use Series.map:

s = df2.set_index('mapping_value')['Scientific value']
df1['Scientific value'] = df1.iloc[:, :-1].max(axis=1).map(s)
#if empty string are not missing values
#df1['Scientific value'] = df1.iloc[:, :-1].replace('',np.nan).max(axis=1).map(s)

If possible multiple values first select by position by DataFrame.iloc, then reshape by DataFrame.stack, mapping by Series.map, remove possible missing values for non match values and last aggregate lists:

s = df2.set_index('mapping_value')['Scientific value']
df1['Scientific value'] = (df1.iloc[:, :-1]
                              .stack()
                              .map(s)
                              .dropna()
                              .groupby(level=0)
                              .agg(list))