3 votes

I am trying to use log4j2.xml instead of Spark's default log4j logging.

My log4j2.xml is as below:

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE log4j:configuration PUBLIC
  "-//APACHE//DTD LOG4J 1.2//EN" "http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/xml/doc-files/log4j.dtd">

<Configuration status="WARN" name="MyApp" monitorInterval="30">

    <Properties>
        <Property name="appName">MyApp</Property>
        <Property name="appenderPatternLayout">%d{yyyy-MM-dd HH:mm:ss} %c{1} [%p] %m%n</Property>
        <Property name="fileName">/app/vodip/logs/${appName}.log</Property>
    </Properties>

    <Appenders>
        <RollingFile name="RollingFile"
                     fileName="${fileName}"
                     filePattern="${appName}-%d{yyyy-MM-dd-HH}-%i.log">
            <PatternLayout>
                <Pattern>${appenderPatternLayout}</Pattern>
            </PatternLayout>
            <Policies>
                <TimeBasedTriggeringPolicy interval="4" modulate="true"/>
                <SizeBasedTriggeringPolicy size="250 MB"/>
            </Policies>
        </RollingFile>
    </Appenders>

    <Loggers>
        <Logger name="xyz.abcs.MyApp" level="debug" additivity="false">
            <AppenderRef ref="RollingFile"/>
        </Logger>
        <Root level="debug">
            <AppenderRef ref="RollingFile"/>
        </Root>
    </Loggers>

</Configuration>

I have placed my log4j2.xml in the spark/conf folder on all nodes, restarted Spark, and submitted my Spark program as below:

spark-submit --master spark://xyzzz.net:7077 \
--class abcd.myclass \
--deploy-mode cluster --executor-memory 2G --total-executor-cores 4  \
--conf spark.network.timeout=150 \
--files /app/spark/spark-1.6.1-bin-hadoop2.6/conf/log4j2.xml \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j2.xml" \
--driver-java-options "-Dlog4j.configuration=file:/app/spark/spark-1.6.1-bin-hadoop2.6/conf/log4j2.xml" \
/app/spark/my.jar

I am seeing the following in my worker stderr log, which means that my logs are not using log4j2 functionality:

log4j:WARN Continuable parsing error 10 and column 78
log4j:WARN Document root element "Configuration", must match DOCTYPE root "null".
log4j:WARN Continuable parsing error 10 and column 78
log4j:WARN Document is invalid: no grammar found.
log4j:ERROR DOM element is - not a <log4j:configuration> element.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties

Can anyone advise what is wrong with my configuration?

Comments:

VladoDemcak: I think you are missing the <!DOCTYPE; check stackoverflow.com/questions/5000884/…

AKC: I have added this line to the file: <!DOCTYPE log4j:configuration PUBLIC "-//APACHE//DTD LOG4J 1.2//EN" "logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/xml/…"> but it gave: log4j:WARN Continuable parsing error 13 and column 78 log4j:WARN Document root element "Configuration", must match DOCTYPE root "log4j:configuration". log4j:WARN Continuable parsing error 13 and column 78 log4j:WARN Element type "Configuration" must be declared. log4j:WARN Continuable parsing error 15 and column 17

VladoDemcak: It seems log4j2 is not friendly with Spark; check this and this if you have not already.

2 Answers

1 vote

At least one mistake in your command line can lead to this error:

-Dlog4j.configuration=... must actually be -Dlog4j.configurationFile=... when using log4j2.

log4j.configuration is parsed by the old log4j, which obviously doesn't understand the new configuration format and throws parsing errors.
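For example, a corrected version of the asker's submit command would keep everything else the same and change only the property name. This is a sketch, assuming the log4j2 jars are already on the driver and executor classpaths:

spark-submit --master spark://xyzzz.net:7077 \
--class abcd.myclass \
--deploy-mode cluster --executor-memory 2G --total-executor-cores 4 \
--conf spark.network.timeout=150 \
--files /app/spark/spark-1.6.1-bin-hadoop2.6/conf/log4j2.xml \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configurationFile=log4j2.xml" \
--driver-java-options "-Dlog4j.configurationFile=file:/app/spark/spark-1.6.1-bin-hadoop2.6/conf/log4j2.xml" \
/app/spark/my.jar

Note also that a log4j2 XML configuration needs no DOCTYPE at all: <Configuration> is its root element and there is no DTD, so the log4j 1.2 DOCTYPE line in the question's file can simply be removed.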

0 votes

Are you also putting a log4j file in the resources folder of your project? If it is there, remove it. To log a Spark application with log4j for the driver and executors, you should also provide the path of the log4j file for both, as follows:

spark-submit --class MAIN_CLASS \
--driver-java-options "-Dlog4j.configuration=file:PATH_OF_LOG4J" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:PATH_OF_LOG4J" \
--master MASTER_IP:PORT JAR_PATH
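For reference, PATH_OF_LOG4J here points at a plain log4j 1.x properties file. A minimal sketch of such a file (the contents below are illustrative, not from the original answer):

# Minimal log4j 1.x configuration: route everything at INFO and above to the console
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n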

You can also refer to this blog for more details: https://blog.knoldus.com/2016/02/23/logging-spark-application-on-standalone-cluster/