I am a beginner with PySpark/Spark SQL and have the following requirement.
I have a configuration table (DataFrame: Config) like this:
Config:

| Dataframe | Col1  | Col2  | Col3    |
|:----------|:-----:|------:|--------:|
| Emp       | Name1 | Name2 | Address |
| Job       | Doj   | Role  | DOB     |

I iterate over the rows of this DataFrame and assign the values to variables. I then need to pass those variable values as column names when selecting from another DataFrame, as shown below.
Example:

    # i is the row number of the Config row currently being processed in the loop
    First_Name = Config.alias('a').select('a.col1').filter("Rownumber = '" + str(i) + "'").first()[0]
    print("First_Name :" + First_Name)
    Last_Name = Config.alias('a').select('a.col2').filter("Rownumber = '" + str(i) + "'").first()[0]
    print("Last_Name :" + Last_Name)

Now the variables First_Name and Last_Name hold column names of the Emp DataFrame.
I need to build a DataFrame from those variables so that the result is equivalent to:

    DF = Emp.select(col('Name1'), col('Name2'), col('Address'))
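
To make the goal concrete, here is a minimal sketch of the kind of thing I am after. The helper names `columns_for` and `select_configured` and the list `config_rows` are hypothetical and only for illustration. `config_rows` is hard-coded here to mirror the Config table; in Spark I assume it could be produced with something like `[row.asDict() for row in Config.collect()]`.

```python
def columns_for(config_rows, dataframe_name):
    """Return the configured column names for one target DataFrame.

    config_rows is a list of dicts, one per row of the Config table
    (hypothetical stand-in for the collected Config DataFrame).
    """
    for row in config_rows:
        if row["Dataframe"] == dataframe_name:
            return [row["Col1"], row["Col2"], row["Col3"]]
    raise KeyError("No configuration row for " + dataframe_name)


def select_configured(df, column_names):
    """Select columns from a Spark DataFrame using names held in variables."""
    # Imported here so the name lookup above can be tried without Spark installed.
    from pyspark.sql.functions import col

    return df.select(*[col(name) for name in column_names])


# Hard-coded copy of the Config table shown above (assumption for the sketch).
config_rows = [
    {"Dataframe": "Emp", "Col1": "Name1", "Col2": "Name2", "Col3": "Address"},
    {"Dataframe": "Job", "Col1": "Doj", "Col2": "Role", "Col3": "DOB"},
]

emp_columns = columns_for(config_rows, "Emp")
print(emp_columns)
```

With an existing Emp DataFrame, the call would then be `DF = select_configured(Emp, emp_columns)`. Since `select` also accepts plain strings, `Emp.select(First_Name, Last_Name)` with the variables from my loop might already be enough.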