0
votes

My Scala code returns a DataFrame wrapped in an Either, and I want to extract the Spark DataFrame from it.

I have an Azure SQL connection in Databricks that I connect to using Scala. I can connect to the database and run the query. The result is a Spark DataFrame wrapped in the Scala type shown below, which I am new to. Can someone help me retrieve the DataFrame so I can save it as a Hive table?

Result of my scala code:

outputData: Either[org.apache.spark.sql.DataFrame,Boolean] = Left([Product: string, OrderNumber: string ... 27 more fields])

outputData holds the Spark DataFrame, which contains:

Product: string, OrderNumber: string ... 27 more fields..

I am not sure how to get the DF from Either.

2
What do you want to happen when there is no dataframe in the result? (i.e. when the return value is a boolean instead of dataframe). - Shaido

2 Answers

0
votes

Without knowing exactly how you retrieve your DataFrame, or how and why the Right value gets assigned, let me write a toy function that returns Left(df) with a DataFrame if the input is even, and Right(false) otherwise:

import org.apache.spark.sql.DataFrame

def readDfIfEven(n: Int): Either[DataFrame, Boolean] = {
  if (n % 2 == 0) {
    val df = spark.read.format("json").load("/databricks-datasets/definitive-guide/data/flight-data/json/2015-summary.json")
    Left(df)
  } else {
    Right(false)
  }
}

Now to answer your question:

What I would do in this case is apply pattern matching to get the DataFrame:

readDfIfEven(2) match {
  case Left(df: DataFrame) => df.show() // note that type annotation here is just for illustration
  case Right(status) => println(s"Could not get DataFrame with status ${status}") // s"..." is the string interpolator; $"..." would build a Spark Column
}
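
Since the goal in the question is to save the result as a Hive table, the same match can do the write directly. This is only a minimal sketch: the helper name saveIfPresent and the table name in the usage line are made up for illustration, and the overwrite mode is an assumption about what you want to happen if the table already exists.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical helper: writes the DataFrame to a table when the Either is a Left,
// and reports the Boolean status when it is a Right.
// Returns true only if a write was attempted and completed.
def saveIfPresent(result: Either[DataFrame, Boolean], tableName: String): Boolean =
  result match {
    case Left(df) =>
      // Overwrite is an assumption; use SaveMode.Append or ErrorIfExists if that fits better.
      df.write.mode(SaveMode.Overwrite).saveAsTable(tableName)
      true
    case Right(status) =>
      println(s"No DataFrame returned (status = $status); nothing was saved")
      false
  }
```

With the value from the question this would be called as saveIfPresent(outputData, "my_orders_table"), where the table name is a placeholder. Handling the Right branch explicitly also answers the comment above about what should happen when there is no DataFrame.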

0
votes

Suppose you have a function that is returning Either as below.

def abc(): Either[DataFrame, Boolean] = {...}

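To get the DataFrame, call get on the left projection:

abc().left.get

This returns the DataFrame when the result is a Left. Note that it throws a NoSuchElementException when the result is a Right, so only use it if you are sure a DataFrame was returned.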
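
If you would rather not risk an exception, the standard library offers a few non-throwing ways to get at the Left side. The sketch below uses a String as a stand-in for the DataFrame so that it runs without Spark; the value names are made up for illustration, and the calls should be the same on an Either[DataFrame, Boolean].

```scala
// Stand-in values: a String plays the role of the DataFrame here.
val ok: Either[String, Boolean] = Left("pretend this is a DataFrame")
val failed: Either[String, Boolean] = Right(false)

// Option 1: turn the left side into an Option (None when the value is a Right).
val maybeDf: Option[String] = ok.left.toOption
val missing: Option[String] = failed.left.toOption

// Option 2: swap the sides, then use the ordinary right-biased toOption.
val viaSwap: Option[String] = ok.swap.toOption

// Option 3: fold handles both cases and produces a single result.
val described: String = failed.fold(
  df => s"got: $df",
  status => s"no DataFrame, status = $status"
)
```

From there, maybeDf.foreach(df => ...) runs your code only when a DataFrame is actually present. Newer Scala versions also flag .left.get as deprecated, which is another reason to prefer one of these.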