2
votes

I have set up Stackdriver, installed the monitoring agent, and followed this guide for JVM monitoring: https://cloud.google.com/monitoring/agent/plugins/jvm

However I cannot access the JMX metrics from Dataproc, specifically HeapMemoryUsage.

I downloaded jvm-sun-hotspot.conf from the GitHub configuration repository and placed it in the directory /opt/stackdriver/collectd/etc/collectd.d/

The guide asks me to edit the downloaded configuration file and replace JMX_PORT with the port on which the JVM is configured to allow JMX connections.

Where do I find this port? Do I need to set up an application to monitor JMX metrics?

1
@NathanGriffiths Yes and no. The main answer suggests using a random JMX port, but the problem is then detecting it among the ports the Spark process opens. Also, since no Spark driver or executor processes run by default, the Stackdriver agent configuration needs to be updated and the service restarted for each submitted job: two actions that require root access, and yarn is not a sudoer. Having a static port for the executor means that several executor processes will try to bind to the same JMX port on the same host. - David Rabinowitz

1 Answer

0
votes

Unfortunately, the JVM monitoring that Stackdriver provides is JMX-based, which poses a problem when Spark is involved: multiple executors can share the same host and compete for the same JMX port.
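
To make the conflict concrete, here is a minimal sketch that uses plain TCP sockets as a stand-in for JMX listeners. It is not Spark or Stackdriver code, and `try_bind` is a hypothetical helper introduced only for illustration. It shows that a second process configured with the same fixed port cannot bind on the same host, while asking the OS for any free port (port 0) works but yields a port that is not known in advance.

```python
import socket


def try_bind(port, host="127.0.0.1"):
    """Bind a listening TCP socket; return it, or None if the port is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock
    except OSError:
        # Typically "address already in use" when another listener holds the port.
        sock.close()
        return None


# First "executor": port 0 asks the OS to pick any free port.
first = try_bind(0)
fixed_port = first.getsockname()[1]

# Second "executor" configured with that same fixed port: the bind fails.
second = try_bind(fixed_port)

# Using port 0 again succeeds, but the assigned port differs and is unpredictable.
third = try_bind(0)
random_port = third.getsockname()[1]
```

Here `second` ends up as `None`, which mirrors the static-port case, while `random_port` differs from `fixed_port`, which is why a monitoring agent with a single hard-coded JMX_PORT cannot follow executors that use random ports.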

However, an alternative is to use Spark's built-in monitoring: