Use a custom Log4J appender when running Spark in AWS EMR

Question

I'm trying to run spark-submit on AWS EMR to execute a simple project that uses a custom log4j appender that I wrote.
I'm able to pass my log4j properties by providing the following configuration in the cluster's software settings:

[
    {
        "classification": "spark-log4j",
        "properties": {
            "log4j.appender.S": "CustomLog4JAppender",
            "log4j.rootLogger": "DEBUG,S"
        }
    }
]

But when I run the cluster step, I get the following in the cluster stderr:
log4j:ERROR Could not instantiate class [CustomLog4JAppender].
java.lang.ClassNotFoundException: CustomLog4JAppender

The jar that I'm executing is located in S3 and it contains the Main class, my appender class and all the dependencies.

I'm running the step with command-runner.jar, using the following command:
spark-submit --deploy-mode client --class Main s3://{path_to_jar}.jar

So a few questions here:

Which component in the cluster loads the log4j logger and properties? Does it happen on the master node or on the core nodes?
What can I do to solve this issue? Should I run the step differently? How do I make it recognize my custom appender class?

Thanks!

sticky_elbows · Accepted Answer · 2018-11-19T15:08:08

I also developed a custom log4j appender class and used it as follows in my log4j.properties file with no problem:

log4j.rootLogger=ERROR, defaultLog
log4j.appender.defaultLog=com.my.package.CustomLog4jFileAppender

So my guess is that the property "log4j.appender.S": "CustomLog4JAppender" is not enough for log4j to locate your custom appender, and you probably need to give the fully qualified class name, including its package. Try this:

"log4j.appender.S": "com.yourPackage.CustomLog4JAppender",
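To illustrate why the package matters: as far as I know, log4j 1.x loads the appender class reflectively from the string in the properties, so a bare class name only resolves if the class really lives in the default package. Here is a minimal sketch using only the JDK. ClassLookupDemo and canLoad are names I made up for the example, and java.util.ArrayList merely stands in for your appender class:

```java
public class ClassLookupDemo {

    // Returns true if a class with exactly this name can be loaded
    // from the current classpath, false if the lookup fails.
    public static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // Same exception type that log4j reports for the appender.
            return false;
        }
    }

    public static void main(String[] args) {
        // Bare name: looked up in the default package, so it is not found
        // (assuming no class called ArrayList exists there).
        System.out.println("ArrayList -> " + canLoad("ArrayList"));

        // Fully qualified name: package plus class name resolves.
        System.out.println("java.util.ArrayList -> " + canLoad("java.util.ArrayList"));
    }
}
```

If your appender is declared in a package, the value of log4j.appender.S has to include that package in the same way. Note that the lookup also fails if the class is not on the classpath at all, so this sketch only covers the naming side of the problem.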
