
I am creating a DataFrame and applying a modification to the data of a particular column.

Modification requirement -

  • If there is any row with a NULL (or empty) value, then replace it with "unknown" (of type String).

So my code -

val rawDF = reader.readFrmDatabase(DatabaseQueries.rawQuery, ConfigUtils.getDatabaseReadProps)
                  .withColumn("osrelease", when(col("osrelease").isNull || col("osrelease") === "", "unknown")
                  .otherwise("osrelease"))

The function readFrmDatabase takes a query (String) and configurations (Map[String, String]) as parameters and returns a DataFrame. Example -

import org.apache.spark.sql.{DataFrame, SparkSession}
import scala.util.{Failure, Success, Try}

// logInfo / logError are assumed to come from the surrounding class's logging trait
@throws[Exception]
def readFrmDatabase(query: String, dbProps: Map[String, String], optionalArgs: Option[Map[String, String]] = None)(implicit spark: SparkSession): DataFrame = {
  logInfo("Reading From Database")

  val outDF = Try {
    spark.read.format("jdbc")
      .options(dbProps)
      .options(optionalArgs.getOrElse(Map.empty))
      .option("dbTable", s"""(${query})""")
      .load()
  } match {
    case Success(df) => df
    case Failure(error) =>
      logError(s"Error while reading Database table $query", error)
      throw new Exception(s"""Error while reading Database table : $query""", error)
  }
  outDF
}

The issue is that withColumn accepts its first parameter as colName: String, but it does not accept what I am passing as the second argument. I have tried both col("<column_name>") and $"<column_name>", but neither of them worked.

I am getting errors like the following (shown in red in the IDE):

  • When using withColumn("<column_name>", when(col("<column_name>").isNull, "unknown").otherwise("<column_name>")), the errors are: Cannot resolve symbol when, Cannot resolve symbol col, etc.
  • When using withColumn("<column_name>", when($"<column_name>".isNull, "unknown").otherwise("<column_name>")), the errors are: Cannot resolve symbol when, value $ is not a member of StringContext.

Please help me figure out what the issue is here. Thanks in advance.


1 Answer


You are missing an import:

import org.apache.spark.sql.functions._
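
With that import in scope, when and col resolve. Separately from the compile errors: .otherwise("osrelease") passes a string literal, so every non-null row would be overwritten with the text "osrelease". To keep the existing value, pass the column instead. A minimal sketch of the corrected transformation, reusing the rawDF from your question:

val cleanedDF = rawDF.withColumn(
  "osrelease",
  when(col("osrelease").isNull || col("osrelease") === "", "unknown")
    .otherwise(col("osrelease")) // col(...), not the string literal "osrelease"
)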

If you are looking to use the $ syntax, then you also need the implicits in scope:

val spark = SparkSession.builder.master("local[*]").getOrCreate
import spark.implicits._
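
For example, here is a complete runnable sketch using the $ syntax (the sample rows are made up for illustration; in your case rawDF comes from readFrmDatabase):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder.master("local[*]").getOrCreate()
import spark.implicits._ // enables $"colName" and toDF below

// Made-up sample data standing in for the JDBC read in the question
val rawDF = Seq("18.04", null, "").toDF("osrelease")

val cleanedDF = rawDF.withColumn(
  "osrelease",
  when($"osrelease".isNull || $"osrelease" === "", "unknown")
    .otherwise($"osrelease")
)

cleanedDF.show()
// +---------+
// |osrelease|
// +---------+
// |    18.04|
// |  unknown|
// |  unknown|
// +---------+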