Hadoop 2.4.0 depends on two different versions of beanutils, causing the following error with sbt-assembly:
[error] (*:assembly) deduplicate: different file contents found in the following:
[error] .ivy2/cache/commons-beanutils/commons-beanutils/jars/commons-beanutils-1.7.0.jar:org/apache/commons/beanutils/BasicDynaBean.class
[error] .ivy2/cache/commons-beanutils/commons-beanutils-core/jars/commons-beanutils-core-1.8.0.jar:org/apache/commons/beanutils/BasicDynaBean.class
Both of these dependencies are transitive dependencies of Hadoop 2.4.0, as confirmed by inspecting the Ivy resolution reports (see: How to access Ivy directly, i.e. access dependency reports or execute Ivy commands?).
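Another way to trace where the conflicting jars come from is the sbt-dependency-graph plugin; a sketch (the plugin version below is an assumption, check the plugin's README for the current release):

```scala
// project/plugins.sbt — adds reverse-dependency inspection commands to sbt
// (plugin version 0.7.4 is an assumption)
addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.7.4")
```

With the plugin loaded, `sbt "whatDependsOn commons-beanutils commons-beanutils 1.7.0"` prints the chain of dependencies that pulls in the 1.7.0 jar, which makes it clear which top-level dependency to exclude from.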
How can I build an assembly with sbt-assembly that includes Hadoop 2.4.0?
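One common workaround is a merge strategy that tells sbt-assembly which copy of the duplicated classes to keep. A sketch, assuming the sbt-assembly 0.11.x syntax current for Spark 1.0-era builds (newer plugin versions use an `assemblyMergeStrategy` setting instead):

```scala
// build.sbt — assumes assemblySettings from sbt-assembly 0.11.x are already applied
import sbtassembly.Plugin._
import AssemblyKeys._

// Keep the first copy of any class under org/apache/commons/beanutils;
// delegate every other path to the plugin's default strategy.
mergeStrategy in assembly <<= (mergeStrategy in assembly) { old =>
  {
    case PathList("org", "apache", "commons", "beanutils", xs @ _*) =>
      MergeStrategy.first
    case x => old(x)
  }
}
```

Note that `MergeStrategy.first` keeps whichever jar happens to be resolved first, so whether the 1.7.0 or 1.8.0 classes end up in the assembly depends on classpath order.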
UPDATE: As requested, here are the build.sbt dependencies:
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.4.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0" % "provided" exclude("org.apache.hadoop", "hadoop-client")
resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "1.7.8"
libraryDependencies += "commons-io" % "commons-io" % "2.4"
libraryDependencies += "javax.servlet" % "javax.servlet-api" % "3.0.1" % "provided"
libraryDependencies += "com.sksamuel.elastic4s" %% "elastic4s" % "1.1.1.0"
The hadoop-client exclude is needed because, out of the box, Spark depends on Hadoop 1, which conflicts with Hadoop 2.
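An alternative to a merge strategy is to exclude both transitive beanutils artifacts from hadoop-client and pin a single copy explicitly. A sketch (the 1.8.3 version pin is an assumption; pick whichever single version your code actually needs):

```scala
// build.sbt — drop both transitive beanutils artifacts
// (commons-beanutils and commons-beanutils-core share the same organization)
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.4.0" excludeAll(
  ExclusionRule(organization = "commons-beanutils")
)

// re-add exactly one beanutils jar (version pin is an assumption)
libraryDependencies += "commons-beanutils" % "commons-beanutils" % "1.8.3"
```

This keeps the assembly deterministic: only one beanutils jar is ever on the classpath, instead of relying on resolution order.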
Comments:
Could you post your build.sbt with your dependencies? - lpiepiora
There are also duplicates coming from hadoop-yarn-common-2.4.0.jar. I think you could resolve all of them, but that feels like a descent into dependency hell. One option would be to include Spark as a project ref, which sbt would build automatically from git. Spark can be downloaded pre-built for Hadoop 2, which is why I said they support Hadoop 2 in their build process, but I don't think they publish that version to the Maven repo. - lpiepiora
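The project-ref idea from the comments can be sketched in an sbt 0.13 `Build.scala` like this (the git URI and project names are assumptions, and sbt will clone and build Spark from source on first run, which is slow):

```scala
// project/Build.scala — depend on Spark built from its git repo
// instead of the Maven artifact (URI is an assumption)
import sbt._

object MyBuild extends Build {
  lazy val spark = RootProject(uri("https://github.com/apache/spark.git"))

  lazy val root = Project("root", file(".")).dependsOn(spark)
}
```

Building Spark from git sidesteps the published artifact's Hadoop 1 dependency, at the cost of a much longer first build.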