The official documentation says that processing times can be obtained through StreamingListener: "The progress of a Spark Streaming program can also be monitored using the StreamingListener interface, which allows you to get receiver status and processing times." http://spark.apache.org/docs/latest/streaming-programming-guide.html#monitoring-applications

I know Spark exposes some metrics through its REST API, but they do not contain the processing time and scheduling delay. http://spark.apache.org/docs/latest/monitoring.html#rest-api

I read the source code of StreamingListener. It contains a method like this:

def printStats() {
    showMillisDistribution("Total delay: ", _.totalDelay)
    showMillisDistribution("Processing time: ", _.processingDelay)
}

I think it should be possible to get these metrics, but I have not managed to do it. I need these metrics for my research. How can I get them? Thanks very much.

1 Answer

I found the solution.

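import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

class MyListener extends StreamingListener {
  // Called by Spark Streaming each time a batch finishes processing.
  override def onBatchCompleted(batchCompleted: StreamingListenerBatchCompleted): Unit = {
    // Both delays are Option[Long] values in milliseconds.
    println("Total delay: " + batchCompleted.batchInfo.totalDelay)
    println("Processing time: " + batchCompleted.batchInfo.processingDelay)
  }
}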
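
A listener only receives events once it is registered with the StreamingContext. The following is a minimal sketch of that wiring, not a definitive setup: the class name DelayListener, the object name ListenerDemo, the local[2] master, the 1-second batch interval, and the socket source on localhost:9999 are all assumptions made for illustration. It also prints the scheduling delay asked about in the question, and unwraps the Option[Long] values with a fallback of -1.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

// Hypothetical listener: reports the per-batch delays in milliseconds.
class DelayListener extends StreamingListener {
  override def onBatchCompleted(batchCompleted: StreamingListenerBatchCompleted): Unit = {
    val info = batchCompleted.batchInfo
    // Each delay is an Option[Long]; -1 marks a value that is not available.
    val scheduling = info.schedulingDelay.getOrElse(-1L)
    val processing = info.processingDelay.getOrElse(-1L)
    val total = info.totalDelay.getOrElse(-1L)
    println(s"Batch ${info.batchTime}: scheduling=$scheduling ms, " +
      s"processing=$processing ms, total=$total ms")
  }
}

object ListenerDemo {
  def main(args: Array[String]): Unit = {
    // Assumed configuration: local mode with two threads, 1-second batches.
    val conf = new SparkConf().setAppName("ListenerDemo").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Register the listener before starting the context.
    ssc.addStreamingListener(new DelayListener)

    // Assumed input: a text socket; at least one output operation is required.
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

For research purposes, the println calls could be replaced by appending the values to a file or an in-memory buffer, so that the per-batch numbers can be analyzed after the job finishes.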