I have a Gradle Apache Beam Java project that reads events from Pub/Sub and writes to Cloud Firestore. I can run this streaming job using DirectRunner. When executing on DataflowRunner, the worker startup logs show the following error lines:

Line 1: "A JNI error has occurred, please check your installation and try again"
Line 2: "java.lang.NoClassDefFoundError: org/apache/beam/vendor/guava/v26_0_jre/com/google/common/graph/Network
    at java.lang.Class.getDeclaredMethods0(Native Method) 
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2701) 
    at java.lang.Class.privateGetMethodRecursive(Class.java:3048) 
    at java.lang.Class.getMethod0(Class.java:3018) 
    at java.lang.Class.getMethod(Class.java:1784) 
    at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544) 
    at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526) 
 Caused by: java.lang.ClassNotFoundException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.graph.Network 
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381) 
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424) 
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) 
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)"

Line 3: "Error syncing pod 8db0cbe852963b21f84ca2ee52e431a4 ("dataflow-streamprocessorapplicatio-06011028-n4lo-harness-sz6k_default(8db0cbe852963b21f84ca2ee52e431a4)"), skipping: failed to "StartContainer" for "java-streaming" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=java-streaming pod=dataflow-streamprocessorapplicatio-06011028-n4lo-harness-sz6k_default(8db0cbe852963b21f84ca2ee52e431a4)"

I also tried adding the beam-vendors-guava and beam-vendors-grpc dependencies, but that didn't help when running on DataflowRunner.

Here is the dependencies section of my build.gradle file. I am using the Gradle Shadow plugin to build an uber jar, which I then either execute directly or use to create a Dataflow template and run:

dependencies {
    testCompile('org.junit.jupiter:junit-jupiter-api:5.4.0')
    testCompile('org.junit.jupiter:junit-jupiter-params:5.4.0')
    testCompile('org.junit.jupiter:junit-jupiter-engine:5.4.0')
    testCompile(group: 'org.mockito', name: 'mockito-junit-jupiter', version: '3.2.4')
    testCompile(group: 'org.powermock', name: 'powermock-module-junit4', version: '2.0.4')
    testCompile(group: 'org.mockito', name: 'mockito-core', version: '2.28.2') { force true }
    testCompile(group: 'org.powermock', name: 'powermock-api-mockito2', version: '2.0.4')
    testCompile(group: 'org.hamcrest', name: 'hamcrest-all', version: '1.3')

//    compile(group: 'org.apache.beam', name: 'beam-runners-google-cloud-dataflow-java', version: '2.19.0')
//    runtimeClasspath(group: 'org.apache.beam', name: 'beam-runners-google-cloud-dataflow-java', version: '2.19.0')
    runtime(group: 'org.apache.beam', name: 'beam-runners-google-cloud-dataflow-java', version: '2.19.0')

    compile(group: 'org.apache.beam', name: 'beam-sdks-java-io-google-cloud-platform', version: '2.19.0') {
        exclude(group: 'io.grpc', module: 'grpc-all')
    }

    // https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-java-core
    compile group: 'org.apache.beam', name: 'beam-sdks-java-core', version: '2.19.0'

    compile group: 'org.apache.beam', name: 'beam-sdks-java-io-redis', version: '2.19.0'
    compile(group: 'org.apache.beam', name: 'beam-sdks-java-maven-archetypes-starter', version: '2.19.0')
//    runtime(group: 'org.apache.beam', name: 'beam-sdks-java-maven-archetypes-starter', version: '2.19.0')
    compileOnly group: 'org.apache.beam', name: 'beam-runners-direct-java', version: '2.19.0'
    compile(group: 'io.grpc', name: 'grpc-all', version: '1.25.0') {
        force true
    }

    compile group: 'org.projectlombok', name: 'lombok', version: '1.18.10'
    annotationProcessor 'org.projectlombok:lombok:1.18.8'

    compile group: 'org.slf4j', name: 'slf4j-api', version: '1.7.25'
    compile group: 'org.slf4j', name: 'slf4j-simple', version: '1.7.25'

    compile group: 'com.google.cloud', name: 'google-cloud-firestore', version: '1.32.2'
    compile group: 'org.apache.commons', name: 'commons-collections4', version: '4.0'
}

Any help/suggestions in this case would be helpful. I am using Apache Beam SDK 2.19.0 for Java.

Is org/apache/beam/vendor/guava/v26_0_jre/com/google/common/graph/Network in your uber jar? - Kenn Knowles
Assuming it is, can you share how you are launching the job? Make sure your uber jar is being staged. By default, everything on the classpath is staged. If there is a problem finding the uber jar, you can specify it with --filesToStage (but this overwrites all files to stage so you have to include everything in the uber jar) - Kenn Knowles

1 Answer

Since your question says "any help suggestions", I will also write my comments as an answer to "how can I try to debug this?":

  1. Check that org/apache/beam/vendor/guava/v26_0_jre/com/google/common/graph/Network is in the uber jar. If not, it is a shadow plugin configuration issue.
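  2. Check that the uber jar is being staged. If not, there may be a problem with the runtime classpath. You can also specify --filesToStage on the command line or call setFilesToStage in Java code. Note that this overrides the default list of files to stage, so the uber jar must contain everything the job needs.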
  3. Check that the uber jar has downloaded properly on the worker.
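
As a rough illustration of step 1, here is a minimal sketch that opens a jar and reports whether it contains the class named in the stack trace. It uses only the JDK's java.util.jar.JarFile. The class name UberJarCheck and the method containsClass are hypothetical names chosen for this example, and the path to the uber jar is assumed to be passed as the first command-line argument.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

// Hypothetical helper for illustration only; not part of Beam or the Shadow plugin.
public class UberJarCheck {

    /** Returns true if the jar has a .class entry for the given fully qualified class name. */
    static boolean containsClass(Path jar, String className) throws IOException {
        // Jar entries use '/' separators and end in ".class".
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java UberJarCheck <path-to-uber-jar>");
            System.exit(2);
        }
        Path jar = Paths.get(args[0]);
        // The class reported as missing in the worker startup log.
        String missing =
            "org.apache.beam.vendor.guava.v26_0_jre.com.google.common.graph.Network";
        boolean present = containsClass(jar, missing);
        System.out.println(missing + (present ? " is present in " : " is MISSING from ") + jar);
    }
}
```

If the class turns out to be missing, the Shadow plugin configuration is the likely place to look. If it is present, steps 2 and 3 (staging and download on the worker) are the more probable suspects.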