This is the code I am writing using SparkListener. I am using Spark 2.4.4.

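import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

class CustomListener extends SparkListener {

    // Running totals over all tasks that have finished so far.
    var recordsReadCount = 0L
    var recordsWrittenCount = 0L

    override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
        synchronized {
            // In Spark 2.x inputMetrics is a plain object, not an Option,
            // so this comparison with None is always true.
            if (taskEnd.taskMetrics.inputMetrics != None) {
                recordsReadCount += taskEnd.taskMetrics.inputMetrics.recordsRead
            }

            // Likewise, outputMetrics is never None.
            if (taskEnd.taskMetrics.outputMetrics != None) {
                recordsWrittenCount += taskEnd.taskMetrics.outputMetrics.recordsWritten
            }

            println(s"WRITTEN : $recordsWrittenCount READ : $recordsReadCount")
        }
    }
}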

I get a non-zero result for the input metrics, but the output metrics stay at zero. I am writing the data in Delta format, and the output is always "WRITTEN : 0". This is how I register the listener in main (sc is a SparkSession):

val myListener=new CustomListener
sc.sparkContext.addSparkListener(myListener)

// my write operation goes here

sc.sparkContext.removeSparkListener(myListener)
I helped format the code, please check so it looks correct (you can click the edit button to change/add something to the question). Are you getting non-zero result for the recordsReadCount? - Shaido
Thank you. Yes, I am getting a non-zero result for recordsReadCount. Although I have a write.save operation, recordsWrittenCount is always zero. Also, taskEnd.taskMetrics.outputMetrics is never equal to None, i.e., the condition is always true. - sk_211

1 Answer

Well, Databricks does not provide input/output metrics for Delta as of now.
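
One possible workaround is to count the rows yourself with a LongAccumulator just before the write. This is only a sketch under my own assumptions, not something taken from the Databricks or Delta documentation. The helper name withRowCounter, the accumulator name and the output path are hypothetical, and the example assumes the Delta Lake library is on the classpath. Accumulators updated inside a transformation can overcount when tasks are retried or speculatively re-executed, so treat the number as approximate.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.util.LongAccumulator

object WriteCounter {

  // Hypothetical helper: returns a DataFrame with the same rows and schema as `df`,
  // plus an accumulator that is incremented once for every row that is processed.
  def withRowCounter(spark: SparkSession, df: DataFrame): (DataFrame, LongAccumulator) = {
    val counter = spark.sparkContext.longAccumulator("recordsWritten")
    val countedRows = df.rdd.map { row =>
      counter.add(1L) // may overcount if a task is retried
      row
    }
    (spark.createDataFrame(countedRows, df.schema), counter)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("write-counter")
      .master("local[2]")
      .getOrCreate()

    // Sample data standing in for the real DataFrame being written.
    val (counted, counter) = withRowCounter(spark, spark.range(1000).toDF("id"))

    // Placeholder path; replace with the real Delta table location.
    counted.write.format("delta").mode("overwrite").save("/tmp/write-counter-demo")

    // The accumulator is only populated after the write action has run.
    println(s"WRITTEN : ${counter.value}")
    spark.stop()
  }
}
```

Going through df.rdd and createDataFrame keeps the sketch independent of encoder internals, at the cost of an extra conversion between internal and external rows. If an exact count matters more than performance, calling count() on the written table afterwards is a simpler check.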