I am trying to use the s3a file system in a Flink application that uses the Table API. I am on EMR 6.1, which comes prepackaged with Flink 1.11.0.

I am creating a table over an s3 location as part of my application. When I run the program, I get the following error:

org.apache.flink.core.fs.UnsupportedFileSystemSchemeException

This indicates that no file system is registered for either s3:// or s3a:// URIs.

Per Flink's documentation on using S3, on EMR I should not need to configure any file system as a plugin.

I have double-checked that I am not including the flink-s3-fs-hadoop dependency in my application.

Oddly enough, when I do follow the documentation's directions for installing the file system as a plugin, it works for s3:// URIs but not for s3a:// ones.

It seems that the flink run command is not honoring the Hadoop classpath that EMR pre-configures.

I also want to use a custom S3 credentials provider in my application, but it does not work when loaded through the plugin system, and it is not picked up from the Hadoop classpath either.

1 Answer

Use s3:// URLs

  1. The s3a connector is the open-source Apache one, shipped in hadoop-aws.
  2. The Flink documentation is open source and refers to s3a.
  3. The s3 connector is EMR's own.
  4. EMR does not contain or support the s3a connector,
  5. which leads to the stack trace you see.
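
As a rough illustration of the advice above, here is a minimal sketch that rewrites an s3a:// path to s3:// before it goes into a table definition. The helper names (to_emr_s3_uri, filesystem_table_ddl), the table name, the column and the bucket are all made up for the example; only the 'connector', 'path' and 'format' options of Flink's filesystem connector are taken from Flink itself. It only builds the DDL string and does not talk to Flink or S3.

```python
def to_emr_s3_uri(uri):
    """Rewrite an s3a:// URI to the s3:// scheme; leave anything else unchanged."""
    prefix = "s3a://"
    if uri.startswith(prefix):
        return "s3://" + uri[len(prefix):]
    return uri


def filesystem_table_ddl(table, columns, path, fmt="csv"):
    """Build a CREATE TABLE statement for Flink's filesystem connector.

    `columns` is a list of (name, sql_type) pairs. All values here are
    hypothetical placeholders, not taken from the original question.
    """
    cols = ", ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE `{table}` ({cols}) WITH ("
        f"'connector' = 'filesystem', "
        f"'path' = '{to_emr_s3_uri(path)}', "
        f"'format' = '{fmt}')"
    )


if __name__ == "__main__":
    # Hypothetical bucket and schema, for illustration only.
    print(filesystem_table_ddl("events", [("id", "BIGINT")], "s3a://my-bucket/input/"))
```

The resulting string would then be handed to the Table API, for example tableEnv.executeSql(...) in Java, assuming the rest of the job is already set up.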