Suppose I have a trait with read operations wrapped inside a Try block:
import scala.util.Try
trait ReadingProvider[T] {
  def readTable(tableName: String): Try[T]
}
I also have a class that provides methods for reading with Spark, plus an implicit class with methods for recovering from a failed read:
import org.apache.spark.sql._
import org.apache.spark.sql.types.StructType
import scala.util.{Try, Success, Failure}
class SparkReadingProvider(spark: SparkSession) extends ReadingProvider[DataFrame] {
  override def readTable(tableName: String): Try[DataFrame] = Try(spark.read.table(tableName))
  def createEmptyDF(schema: StructType): DataFrame =
    spark.createDataFrame(spark.sparkContext.emptyRDD[Row], schema)
  // Nested inside the provider so that it can call createEmptyDF
  implicit class ReadingHandler(tryDF: Try[DataFrame]) {
    def recoverWithEmptyDF(schema: StructType): DataFrame = tryDF match {
      case Failure(ex) => // Log something
        createEmptyDF(schema)
      case Success(df) => // Log something
        df
    }
  }
}
Now I have an object which contains the reading and some transformation:
object MyObject {
  def readSomeTable(tableName: String): SparkReadingProvider => DataFrame = provider => {
    import provider.ReadingHandler
    // schema: a StructType assumed to be defined elsewhere
    provider.readTable(tableName).recoverWithEmptyDF(schema)
  }
  def transform: DataFrame => DataFrame = ???
  def mainMethod(tableName: String)(implicit provider: SparkReadingProvider): DataFrame =
    (readSomeTable(tableName) andThen transform)(provider)
}
I want to unit test the methods inside MyObject. I don't want to work with real files or tables, so my goal is to use mocking.
In my test I was trying to mock the SparkReadingProvider:
describe("reading") {
  it("should return empty dataframe when reading failed") {
    val provider: SparkReadingProvider = mock[SparkReadingProvider]
    val tableName: String = "no_table"
    // Expect exactly one call to readTable and make it fail
    (provider.readTable _).expects(tableName).returning(Failure(new Exception("Table does not exist")))
    MyObject.readSomeTable(tableName)(provider) shouldBe empty
  }
}
However, it fails with the following error:
Unexpected call: < mock-1> SparkReadingProvider.ReaderHandler(Failure(java.lang.Exception: table does not exist))
Expected: inAnyOrder { < mock-1> SparkReadingProvider.readTable(no_table) once (called once) }
Actual: < mock-1> SparkReadingProvider.readTable(no_table) < mock-1> SparkReadingProvider.ReaderHandler(Failure(java.lang.Exception: table does not exist))
My questions are:
- Is it possible to achieve what I want in the current setup?
- If not, how should I refactor my code?
- If I test the methods of the implicit class in a separate test class, does it still make sense to test readSomeTable and mainMethod inside MyObject?
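To make the second question concrete, here is a minimal sketch of one refactoring I have been considering. It has not been verified against Spark or ScalaMock. The idea is to move the recovery logic out of the provider into a standalone object, so that it is no longer a member of the class being mocked. The names ReadingSyntax, recoverWithDefault, MyObjectSketch and FailingProvider are hypothetical, and Seq[String] stands in for DataFrame so that the snippet runs without Spark:

```scala
import scala.util.{Failure, Success, Try}

// Same trait as in the question.
trait ReadingProvider[T] {
  def readTable(tableName: String): Try[T]
}

// Hypothetical helper: the recovery logic lives in a standalone object,
// so it is not a member of the provider that gets mocked.
object ReadingSyntax {
  implicit class ReadingHandler[T](tried: Try[T]) {
    // The fallback is passed in by name and only evaluated on failure.
    def recoverWithDefault(default: => T): T = tried match {
      case Failure(_)     => default // log the failure here
      case Success(value) => value   // log the success here
    }
  }
}

// Hypothetical variant of MyObject that depends only on the trait.
object MyObjectSketch {
  import ReadingSyntax._

  def readSomeTable[T](tableName: String, empty: => T): ReadingProvider[T] => T =
    provider => provider.readTable(tableName).recoverWithDefault(empty)
}

// Hand-written stub standing in for a mock; Seq[String] stands in for DataFrame.
object FailingProvider extends ReadingProvider[Seq[String]] {
  def readTable(tableName: String): Try[Seq[String]] =
    Failure(new Exception("Table does not exist"))
}
```

With this shape, MyObjectSketch.readSomeTable("no_table", Seq.empty[String])(FailingProvider) falls back to the supplied empty value. In the Spark version I assume the fallback would be something like createEmptyDF(schema), and that only readTable would need an expectation on the mock.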