I am building a log analysis platform to monitor Spark jobs on a YARN cluster, and I want to get a clear picture of how Spark/YARN logging works. I have searched a lot about this, and these are the points I am still confused about.
Does the directory specified in spark.eventLog.dir or spark.history.fs.logDirectory store all the application master logs, and can those logs be customized through log4j.properties in the Spark conf directory?
By default, all data nodes write their executor logs to a folder under /var/log/. With log aggregation enabled, can those executor logs be collected in the spark.eventLog.dir location as well?
I've managed to set up a 3-node virtual Hadoop YARN cluster, with Spark installed on the master node. When I run Spark in client mode, I assume this node becomes the application master node. Is that correct? I'm a beginner in big data and would appreciate any help with these questions.