0 votes

Purpose - Store custom logs from a Spark Streaming application to an HDFS or UNIX directory.

I am running a Spark Streaming program in cluster mode, but logs are not getting written to the given log path; I checked both the HDFS and local directories. With the log4j debug property enabled I can see the files being rolled. Am I missing something?

--files log4j_driver.properties
--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j_driver.properties -Dlog4j.debug=true "
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j_driver.properties -Dlog4j.debug=true"

My Log4j properties file -

# Base directory for the custom log
log=/tmp/cc

log4j.rootLogger=INFO,rolling

# Rolling file appender: 2KB per file, up to 10 backups
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.File=${log}/abc.log
log4j.appender.rolling.Append=true
log4j.appender.rolling.ImmediateFlush=true
log4j.appender.rolling.Threshold=debug
log4j.appender.rolling.maxFileSize=2KB
log4j.appender.rolling.maxBackupIndex=10
log4j.appender.rolling.encoding=UTF-8
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.conversionPattern=[%d] %p %m (%c)%n

# Keep Spark and Jetty at INFO
log4j.logger.org.apache.spark=INFO
log4j.logger.org.eclipse.jetty=INFO

Cluster Driver Log

log4j: Renaming file /tmp/cc/abc.log.2 to /tmp/cc/abc.log.3
log4j: Renaming file /tmp/cc/abc.log.1 to /tmp/cc/abc.log.2
log4j: Renaming file /tmp/cc/abc.log to /tmp/cc/abc.log.1
log4j: setFile called: /tmp/cc/abc.log, false
log4j: setFile ended
log4j: rolling over count=5141
log4j: maxBackupIndex=10
log4j: Renaming file /tmp/cc/abc.log.9 to /tmp/cc/abc.log.10
log4j: Renaming file /tmp/cc/abc.log.8 to /tmp/cc/abc.log.9
log4j: Renaming file /tmp/cc/abc.log.7 to /tmp/cc/abc.log.8
log4j: Renaming file /tmp/cc/abc.log.6 to /tmp/cc/abc.log.7

I read that we can specify ${spark.yarn.app.container.log.dir}/app.log in log4j, but I am not sure what the default path for this property is, or whether we need to set it manually as well. When I was running this application in client mode, logs were written to the local directory perfectly.
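
The Spark-on-YARN documentation suggests pointing the appender at the container log directory; a minimal sketch, keeping the rest of the appender configuration above unchanged:

# ${spark.yarn.app.container.log.dir} is set by Spark on YARN inside each
# container JVM; it resolves to that container's log directory under
# yarn.nodemanager.log-dirs, so the files are picked up by YARN log aggregation.
log4j.appender.rolling.File=${spark.yarn.app.container.log.dir}/app.log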

Comments
"I am running spark streaming program in cluster mode" <-- Can you show the command line that you use to execute the Spark app? - Jacek Laskowski
spark2-submit --master yarn --deploy-mode cluster - Elvish_Blade

3 Answers

0 votes

In my YARN cluster, the logs of a Spark Streaming application are written on the node that runs the application container. There is a directory for the application's logs, configured by a property named something like yarn.log.directory? I don't remember the precise name, so you can check it out.
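
The half-remembered property is likely yarn.nodemanager.log-dirs, which sets the NodeManager's local container-log location. One way to check it on a cluster node, assuming the common /etc/hadoop/conf configuration path:

# /etc/hadoop/conf is an assumed location; adjust for your distribution.
grep -A1 "yarn.nodemanager.log-dirs" /etc/hadoop/conf/yarn-site.xml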

0 votes

When you start a Spark application in cluster mode (--deploy-mode cluster), log=/tmp/cc points to a /tmp/cc directory under the root of the "containers" that run the driver and the executors. Those containers live on machines in the cluster.

In your case, you have to find the machines that ran the driver and the executors, and look for the directory there.
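
One way to locate the driver's node on YARN is the application status report, whose "AM Host" field names the machine running the ApplicationMaster (and, in cluster mode, the driver); the application id below is a placeholder:

# Placeholder application id; look for the "AM Host" field in the output.
yarn application -status application_1234567890123_0042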

Since it is very cumbersome to manage logs in a distributed environment like Spark, the cluster managers supported by Spark (i.e. Hadoop YARN, Apache Mesos, Kubernetes) can collect the logs from the machines and make them available through a web UI or a command line for download. In YARN, it'd be yarn logs -applicationId.
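
A minimal sketch of fetching the aggregated logs, assuming log aggregation is enabled on the cluster (the application id is a placeholder):

# Requires yarn.log-aggregation-enable=true; works once the app has finished.
yarn logs -applicationId application_1234567890123_0042 > app_logs.txt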

-2 votes

The best option for finding where Spark logs are written is the Spark UI; in cluster mode, the driver logs are on one of the cluster nodes.

The Spark UI gives lots of info. The post at http://ashkrit.blogspot.com/2018/11/insights-from-spark-ui.html has some of the details.