Hadoop: hadoop-2.6.4

Spark: spark-1.6.0-bin-without-hadoop

JAVA_HOME is set, and the Hadoop bin folder is on $PATH.

In conf/spark-env.sh:

    export SPARK_DIST_CLASSPATH=$(/hadoop-2.6.4/bin/hadoop classpath)
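
Before running anything, I sanity-check the environment with commands like these (just a sketch of the checks; hadoop classpath is the same command used above):

    # Confirm Java and Hadoop are visible before launching Spark
    echo $JAVA_HOME
    which hadoop
    /hadoop-2.6.4/bin/hadoop classpath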

When I run an example from Spark (bin/run-example SparkPi), I get the following exception:

16/03/19 20:44:09 INFO spark.SparkContext: Running Spark version 1.6.0
16/03/19 20:44:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/19 20:44:10 INFO spark.SecurityManager: Changing view acls to: Etude
16/03/19 20:44:10 INFO spark.SecurityManager: Changing modify acls to: Etude
16/03/19 20:44:10 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Etude); users with modify permissions: Set(Etude)
16/03/19 20:44:10 INFO util.Utils: Successfully started service 'sparkDriver' on port 57408.
16/03/19 20:44:11 INFO slf4j.Slf4jLogger: Slf4jLogger started
16/03/19 20:44:11 INFO Remoting: Starting remoting
16/03/19 20:44:11 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.1.16:57409]
16/03/19 20:44:11 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 57409.
16/03/19 20:44:11 INFO spark.SparkEnv: Registering MapOutputTracker
16/03/19 20:44:11 INFO spark.SparkEnv: Registering BlockManagerMaster
16/03/19 20:44:11 INFO storage.DiskBlockManager: Created local directory at /private/var/folders/8q/y95qhldn6m5bn6yrg07nx11c0000gn/T/blockmgr-a48b77b6-0acf-45cd-8036-3ce70b712016
16/03/19 20:44:11 INFO storage.MemoryStore: MemoryStore started with capacity 511.1 MB
16/03/19 20:44:11 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/03/19 20:44:11 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/19 20:44:11 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
16/03/19 20:44:11 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
16/03/19 20:44:11 INFO ui.SparkUI: Started SparkUI at http://192.168.1.16:4040
16/03/19 20:44:11 INFO spark.HttpFileServer: HTTP File server directory is /private/var/folders/8q/y95qhldn6m5bn6yrg07nx11c0000gn/T/spark-b6353e82-d3c0-4641-85e2-7fd0fc8e08d6/httpd-dcc1f420-5e4a-4836-9324-b6cf2b618c54
16/03/19 20:44:11 INFO spark.HttpServer: Starting HTTP Server
16/03/19 20:44:11 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/03/19 20:44:11 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:57410
16/03/19 20:44:11 INFO util.Utils: Successfully started service 'HTTP file server' on port 57410.
16/03/19 20:44:11 INFO spark.SparkContext: Added JAR file:/Users/Etude/devlib/spark-1.6.0-bin-without-hadoop/lib/spark-examples-1.6.0-hadoop2.2.0.jar at http://192.168.1.16:57410/jars/spark-examples-1.6.0-hadoop2.2.0.jar with timestamp 1458445451967
16/03/19 20:44:12 INFO executor.Executor: Starting executor ID driver on host localhost
16/03/19 20:44:12 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 57411.
16/03/19 20:44:12 INFO netty.NettyBlockTransferService: Server created on 57411
16/03/19 20:44:12 INFO storage.BlockManagerMaster: Trying to register BlockManager
16/03/19 20:44:12 INFO storage.BlockManagerMasterEndpoint: Registering block manager localhost:57411 with 511.1 MB RAM, BlockManagerId(driver, localhost, 57411)
16/03/19 20:44:12 INFO storage.BlockManagerMaster: Registered BlockManager
16/03/19 20:44:12 INFO spark.SparkContext: Starting job: reduce at SparkPi.scala:36
16/03/19 20:44:12 INFO scheduler.DAGScheduler: Got job 0 (reduce at SparkPi.scala:36) with 2 output partitions
16/03/19 20:44:12 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:36)
16/03/19 20:44:12 INFO scheduler.DAGScheduler: Parents of final stage: List()
16/03/19 20:44:12 INFO scheduler.DAGScheduler: Missing parents: List()
16/03/19 20:44:12 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:32), which has no missing parents
java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:497)
  at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:317)
  at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
  at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
  at org.apache.spark.io.SnappyCompressionCodec.<init>(CompressionCodec.scala:154)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
  at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:72)
  at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:65)
  at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$setConf(TorrentBroadcast.scala:73)
  at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:80)
  at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
  at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63)
  at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1326)
  at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1006)
  at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921)
  at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:861)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1607)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Caused by: java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path
  at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1864)
  at java.lang.Runtime.loadLibrary0(Runtime.java:870)
  at java.lang.System.loadLibrary(System.java:1122)
  at org.xerial.snappy.SnappyNativeLoader.loadLibrary(SnappyNativeLoader.java:52)
  ... 26 more
16/03/19 20:44:12 INFO scheduler.TaskSchedulerImpl: Cancelling stage 0
16/03/19 20:44:12 INFO scheduler.DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:36) failed in Unknown s
16/03/19 20:44:12 INFO scheduler.DAGScheduler: Job 0 failed: reduce at SparkPi.scala:36, took 0.056206 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: java.lang.reflect.InvocationTargetException
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.lang.reflect.Constructor.newInstance(Constructor.java:422)
org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:72)
org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:65)
org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$setConf(TorrentBroadcast.scala:73)
org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:80)
org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63)
org.apache.spark.SparkContext.broadcast(SparkContext.scala:1326)
org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1006)
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921)
org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:861)
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1607)
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

  at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
  at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
  at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1016)
  at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921)
  at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:861)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1607)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
  at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:1952)
  at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:1025)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
  at org.apache.spark.rdd.RDD.reduce(RDD.scala:1007)
  at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:36)
  at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:497)
  at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
  at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
  at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
  at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:72)
  at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:65)
  at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$setConf(TorrentBroadcast.scala:73)
  at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:80)
  at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
  at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63)
  at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1326)
  at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1006)
  at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921)
  at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:861)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1607)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Caused by: java.lang.IllegalArgumentException: org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null
  at org.apache.spark.io.SnappyCompressionCodec.<init>(CompressionCodec.scala:156)
  ... 18 more
Caused by: org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null
  at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:229)
  at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
  at org.apache.spark.io.SnappyCompressionCodec.<init>(CompressionCodec.scala:154)
  ... 18 more
16/03/19 20:44:12 INFO spark.SparkContext: Invoking stop() from shutdown hook
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
16/03/19 20:44:12 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
16/03/19 20:44:12 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.1.16:4040
16/03/19 20:44:12 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/03/19 20:44:12 INFO storage.MemoryStore: MemoryStore cleared
16/03/19 20:44:12 INFO storage.BlockManager: BlockManager stopped
16/03/19 20:44:12 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
16/03/19 20:44:12 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/03/19 20:44:12 INFO spark.SparkContext: Successfully stopped SparkContext
16/03/19 20:44:12 INFO util.ShutdownHookManager: Shutdown hook called
16/03/19 20:44:12 INFO util.ShutdownHookManager: Deleting directory /private/var/folders/8q/y95qhldn6m5bn6yrg07nx11c0000gn/T/spark-b6353e82-d3c0-4641-85e2-7fd0fc8e08d6/httpd-dcc1f420-5e4a-4836-9324-b6cf2b618c54
16/03/19 20:44:12 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/03/19 20:44:12 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/03/19 20:44:12 INFO util.ShutdownHookManager: Deleting directory /private/var/folders/8q/y95qhldn6m5bn6yrg07nx11c0000gn/T/spark-b6353e82-d3c0-4641-85e2-7fd0fc8e08d6

1 Answer

The error says it cannot load the snappy-java native library ("no snappyjava in java.library.path"). Download snappy-java and add it to Spark's lib directory: https://github.com/xerial/snappy-java
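
A minimal sketch of that fix, assuming a standard layout ($SPARK_HOME and the snappy-java version are illustrative; pick a release compatible with your Spark build):

    # Fetch the snappy-java jar from Maven Central (version shown is illustrative)
    wget https://repo1.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.2.1/snappy-java-1.1.2.1.jar

    # Put it in the lib directory of the "without hadoop" Spark distribution
    cp snappy-java-1.1.2.1.jar $SPARK_HOME/lib/

Alternatively, you can hand the jar to spark-submit explicitly with --driver-class-path instead of copying it into lib.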

Snappy is the codec Spark uses by default to compress data before transferring it between nodes, and from the data source into Spark.
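
If you want to verify the rest of the setup while sorting out the jar, you can also switch Spark to a codec that does not need the Snappy native library (a sketch; spark.io.compression.codec is the standard setting, and lz4 and lzf are the built-in alternatives in Spark 1.6):

    # conf/spark-defaults.conf -- avoid the Snappy native library while debugging
    spark.io.compression.codec   lz4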