I have a long-running Spark Streaming job that runs on a Kerberized Hadoop cluster. It fails every few days with the following error:
Diagnostics: token (token for XXXXXXX: HDFS_DELEGATION_TOKEN owner=XXXXXXX@XXXXXXX, renewer=yarn, realUser=, issueDate=XXXXXXXXXXXXXXX, maxDate=XXXXXXXXXX, sequenceNumber=XXXXXXXX, masterKeyId=XXX) can't be found in cache
I tried adding the --keytab and --principal options to spark-submit, but we already pass options that should do the same thing:
For the second option, we already pass in the keytab and principal with the following: 'spark.driver.extraJavaOptions=-Djava.security.auth.login.config=kafka_client_jaas.conf -Djava.security.krb5.conf=krb5.conf -XX:+UseCompressedOops -XX:+UseG1GC -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark -XX:InitiatingHeapOccupancyPercent=35 -XX:ConcGCThreads=12' \
The same goes for spark.executor.extraJavaOptions. If we also add the --principal and --keytab options, we get an "attempt to add file (keytab) multiple times to distributed cache" error.
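To make the conflict concrete, here is a rough sketch of the kind of invocation I mean. It is not our actual command: the keytab file name, principal, class, and jar are placeholders, the JVM tuning flags are omitted, and it assumes the keytab is also being shipped through --files, which is my guess at why the distributed cache sees the same file twice.

```shell
# Sketch only: user.keytab, user@EXAMPLE.COM, com.example.StreamingJob and
# streaming-job.jar are hypothetical placeholders, not our real values.
# Assumption: the keytab is already listed under --files, so passing the
# same file again via --keytab registers it a second time.
spark-submit \
  --master yarn \
  --files kafka_client_jaas.conf,krb5.conf,user.keytab \
  --principal user@EXAMPLE.COM \
  --keytab user.keytab \
  --conf 'spark.driver.extraJavaOptions=-Djava.security.auth.login.config=kafka_client_jaas.conf -Djava.security.krb5.conf=krb5.conf' \
  --conf 'spark.executor.extraJavaOptions=-Djava.security.auth.login.config=kafka_client_jaas.conf -Djava.security.krb5.conf=krb5.conf' \
  --class com.example.StreamingJob \
  streaming-job.jar
```

If that assumption holds, the same keytab path appears in both --files and --keytab, which would match the error message.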