I installed an HDP 2.5 Hadoop/Spark cluster using Cloudbreak on Azure.

Everything works except the Spark history server. According to the log, the default URI for the event log, hdfs:///spark-history, is invalid because the hostname is missing. So I replaced it with a direct reference to the actual location on Azure blob storage: wasb://<host>:<port>/spark-history. This URI works with hdfs dfs -ls, but the Spark history server still won't start. Now it complains about a missing class: Caused by: java.lang.NoClassDefFoundError: com/microsoft/azure/storage/blob/BlobListingDetails.

So it seems some driver isn't loaded at startup. I did find /usr/hdp/current/hadoop-client/lib/azure-storage-2.2.0.jar, which might be it. But I'm not sure how to make the history server load the jar at startup using the Ambari config editor, or whether this is even the right solution to the original problem. The strangest thing is that Azure HDInsight also uses blob storage, and there the Spark history server simply runs with the default hdfs:///spark-history setting.

Any suggestions on how to load the azure-storage driver, or any other approach to this problem?

Thanks

Could you post the solution in the description as an answer? Thanks. - Peter Pan
Moved the solution to an answer... - oneman

1 Answer

I'll answer my own question. Someone on the Hortonworks community forum had the answer: the Spark assembly jar contains invalid storage jars. Updating the assembly jar with the classes from azure-storage-2.2.0.jar solves the issue:

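# Work in a scratch directory
mkdir -p /tmp/jarupdate && cd /tmp/jarupdate
# Locate the azure-storage jar shipped with HDP
find /usr/hdp/ -name "azure-storage*.jar"
# Copy the azure-storage jar and the Spark assembly jar into the scratch directory
cp /usr/hdp/2.5.0.1-210/hadoop/lib/azure-storage-2.2.0.jar .
cp /usr/hdp/current/spark-historyserver/lib/spark-assembly-1.6.3.2.5.0.1-210-hadoop2.7.3.2.5.0.1-210.jar .
# Extract the azure-storage classes (this creates the com/ directory)
unzip azure-storage-2.2.0.jar
# Add the extracted classes to the Spark assembly jar
jar uf spark-assembly-1.6.3.2.5.0.1-210-hadoop2.7.3.2.5.0.1-210.jar com/
# Put the updated assembly jar back in place
mv -f spark-assembly-1.6.3.2.5.0.1-210-hadoop2.7.3.2.5.0.1-210.jar /usr/hdp/current/spark-historyserver/lib/spark-assembly-1.6.3.2.5.0.1-210-hadoop2.7.3.2.5.0.1-210.jar
# Clean up
cd .. && rm -rf /tmp/jarupdate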
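
Before restarting the history server, it may be worth checking that the class named in the NoClassDefFoundError is actually present in the updated assembly jar. The following is only a sketch, not part of the original fix: class_in_listing is a hypothetical helper name, and it assumes the jar contents are piped in as one entry per line, which is what jar tf prints.

```shell
# Hypothetical helper (not part of HDP or Spark): succeeds if the given
# fully qualified class name appears in a jar listing read from stdin.
class_in_listing() {
    # Turn com.example.Foo into com/example/Foo.class
    class_path="$(printf '%s' "$1" | tr '.' '/').class"
    # -q: quiet, -x: match the whole line, -F: treat the path as a fixed string
    grep -qxF "$class_path"
}

# Example usage against the assembly jar from the steps above:
#   jar tf /usr/hdp/current/spark-historyserver/lib/spark-assembly-1.6.3.2.5.0.1-210-hadoop2.7.3.2.5.0.1-210.jar \
#     | class_in_listing com.microsoft.azure.storage.blob.BlobListingDetails \
#     && echo "class found" || echo "class missing"
```

If the check reports the class as missing, the jar uf step most likely ran from a directory that did not contain the extracted com/ tree.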