1 vote

I'm new to Scala and large-dataset programming. I need to use a profiler in a local environment to find out which operation or function in my Scala code is too slow. I tried the Spark UI, both in local mode and in a cluster environment, but it isn't sufficient. The problem is that my code is a Scala "script", or rather just a sequence of code lines executed directly in a spark-shell.

All the common profilers seem to require well-structured Scala code that can be packaged into a jar, which you then run with the profiler agent attached. I don't know where to look. Any ideas? Is this possible or not?


1 Answer

2 votes

You can attach Java profilers (e.g. the free jvisualvm that comes with the JDK) to any running JVM. I have not tried it, but I believe that you should be able to profile code that gets executed by Spark.
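For the local case, here is a minimal sketch of finding the right process to select in jvisualvm's local application list. Run it inside your spark-shell; it assumes the runtime MXBean name has the conventional `pid@hostname` format:

```scala
// Run inside spark-shell: print this JVM's PID so you know which
// process to select in jvisualvm. Assumes the runtime name has the
// usual "pid@hostname" format.
import java.lang.management.ManagementFactory

val jvmName = ManagementFactory.getRuntimeMXBean.getName // e.g. "12345@myhost"
val pid = jvmName.split("@")(0)
println(s"Attach the profiler to PID $pid")
```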

Of course, you have to connect to the JVM where the code actually runs. If it is executed remotely, connecting to the local JVM that runs your Spark shell won't help.
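To find out which JVMs actually execute your tasks, you can run a tiny job that reports each executor's `pid@hostname`. This is only a sketch, assuming you are in a spark-shell where `sc` is already defined:

```scala
// Sketch: discover the host/PID of every executor JVM that runs tasks,
// so the profiler can be attached to the right remote process.
import java.lang.management.ManagementFactory

val executorJvms = sc
  .parallelize(1 to 100, numSlices = 100) // spread dummy tasks over executors
  .map(_ => ManagementFactory.getRuntimeMXBean.getName) // "pid@hostname"
  .distinct()
  .collect()

executorJvms.foreach(println)
```

In local mode this prints the same PID as the driver, which confirms that attaching to the shell's own JVM is enough there.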

You also have to make sure to profile at the right moment, i.e. while the slow code is actually running.
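One way to guarantee that: sleep briefly before triggering the action, which gives you a window to attach the profiler before the hot code starts. The workload below is a hypothetical stand-in for your real slow job:

```scala
// Sketch: open a window to attach the profiler before the heavy work starts.
println("Attach the profiler now; the job starts in 30 seconds...")
Thread.sleep(30000)

// Hypothetical stand-in for the slow job you actually want to profile.
val result = sc.parallelize(1 to 10000000)
  .map(x => math.sqrt(x.toDouble))
  .sum()
println(s"Done: $result")
```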