I have written a Java application that runs as a Spark Streaming job. It needs some text resources, which I have included in the JAR under a resources directory (using the default Maven directory structure). In unit tests I can access these files without any problem, but when I run the program with spark-submit I get a FileNotFoundException. How do I access files on the classpath inside my JAR when running with spark-submit?

The code I am currently using to access my file looks roughly like this:

    InputStream input;

    try {
        URL url = this.getClass().getClassLoader().getResource("my file");
        if (url == null) {
            throw new IOException("file does not exist");
        }
        String path = url.getPath();
        input = new FileInputStream(path);
    } catch(IOException e) {
        throw new RuntimeException(e);
    }

Thanks.

Note that this is not a duplicate of "Reading a resource file from within jar" (which was suggested), because this code works when run locally. It only fails when run on a Spark cluster.

This is not related to Spark or Streaming; this is plain Java code. - Shankar
No. The code above works when run normally. It fails when run with spark-submit, hence a Spark question. - Peter

1 Answer

I fixed this by accessing the resources directory a different (and significantly less silly) way:

    input = MyClass.class.getResourceAsStream("/my file");
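The likely reason the original code fails: once a resource is packaged inside a JAR, `getResource` returns a `jar:` URL, and `url.getPath()` is no longer a path that `FileInputStream` can open. In unit tests the resources are usually plain files under the build output directory, so the same code happens to work there. Below is a slightly fuller sketch of the stream-based approach. The class name `ResourceReader`, the helper `readAll`, and the UTF-8 encoding are assumptions for illustration, not part of the original code.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name is illustrative only.
public class ResourceReader {

    // Reads a classpath resource fully into a String.
    // A leading "/" makes the name relative to the classpath root.
    public static String readResource(String name) throws IOException {
        // try-with-resources closes the stream; a null resource is simply skipped on close.
        try (InputStream in = ResourceReader.class.getResourceAsStream(name)) {
            if (in == null) {
                throw new FileNotFoundException("resource not found on classpath: " + name);
            }
            return readAll(in);
        }
    }

    // Drains the stream into memory and decodes it as UTF-8 (an assumption about the file's encoding).
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With this in place, `ResourceReader.readResource("/my file")` should behave the same whether the resource is a loose file on disk or an entry inside the JAR, because it never converts the resource to a filesystem path.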