I'm managed to setup Hadoop with 3 datanodes as a small cluster and everything work ok. When trying to access AWS bucket on S3A protocol I get this error:
hadoop fs -ls s3a://my-bucket/
-ls: Fatal internal error
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
org.apache.hadoop.fs.s3a.S3AFileSystem not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2395)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3208)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3240)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3291)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3259)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:470)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325)
at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:245)
at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:228)
at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:103)
at org.apache.hadoop.fs.shell.Command.run(Command.java:175)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:317)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:380)
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2299)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2393)
... 16 more
What I did wrong ? How do fix that ?
P.S. Bucket on Amazon if fully public. Anyone can download from it.
Amazon credentials was configured in hadoop/core-site.xml as described here: Hadoop-AWS module: Integration with Amazon Web Services