I have set up a Spark cluster within HDInsight on Azure. I have a service that pushes data into HDInsight blob storage on a regular basis, and I have created a Hive external table on top of that data. I am able to use Jupyter to execute Spark SQL queries and see the results.

Now I have an ASP.NET web site that needs to run a similar Spark SQL query job on user request and display the result on the web page.

Is there a library to facilitate this, or can someone share a sample showing how to accomplish it?

I see that the HDInsight Spark cluster comes with Livy, but I can't find a sample that shows how to use it from a .NET environment. BTW, I am assuming this is the route I need to take to address my issue.

I am really new to all this, so any pointers would really help.

Thanks, Kiran

1 Answer

Sorry, we don't currently have an HDInsight Spark SDK. You can always send REST calls to the Livy APIs as described here: https://azure.microsoft.com/en-us/documentation/articles/hdinsight-apache-spark-livy-rest-interface/.
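To give a rough idea of what such a REST call looks like, here is a minimal sketch in Python that builds (but does not send) a Livy batch submission request. It assumes the usual HDInsight endpoint form `https://<cluster>.azurehdinsight.net/livy/batches`, basic authentication with the cluster login, and an `X-Requested-By` header; the function name, the jar path, and the class name are hypothetical placeholders, not part of any SDK.

```python
import base64
import json
import urllib.request


def build_livy_batch_request(cluster_name, user, password, jar_path, class_name, args=None):
    """Build (but do not send) a POST request submitting a Spark batch job to Livy.

    Assumptions: the Livy endpoint lives under /livy/batches on the cluster's
    public hostname, and the cluster accepts HTTP basic authentication.
    """
    url = f"https://{cluster_name}.azurehdinsight.net/livy/batches"

    # Livy batch payload: "file" is the application jar, "className" its entry point.
    payload = {"file": jar_path, "className": class_name}
    if args:
        payload["args"] = list(args)

    # HTTP basic auth header built from the cluster login credentials.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Basic {token}",
        "X-Requested-By": user,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_livy_batch_request(
    "mycluster", "admin", "secret",
    "wasb:///example/jars/app.jar", "com.example.App", ["2016-01-01"],
)
```

Passing `request` to `urllib.request.urlopen` would actually submit the job. From ASP.NET the same call should be expressible with `HttpClient`: the URL, headers, and JSON body are what matter, not the client library.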

However, if you want to get the query results back through the Livy APIs, you will need a workaround. The reason is that HDInsight runs Spark on YARN in cluster mode, where the driver's output is not written back to Livy. You need to open the Spark driver's container logs and read the stdout/stderr there manually.
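As a rough sketch of the first step of that workaround, the snippet below parses the JSON that Livy returns for `GET /batches/{batchId}` to decide whether the job has finished and to pick up the YARN application ID, which is what you would use to locate the driver's container logs. The set of terminal states and the presence of the `appId` field are assumptions that depend on the Livy version, and the sample response is hypothetical.

```python
import json

# Assumption: these are the Livy batch states after which no further progress happens.
TERMINAL_STATES = {"success", "dead", "killed", "error"}


def summarize_batch(response_body):
    """Extract the fields needed to track a Livy batch from its JSON description."""
    info = json.loads(response_body)
    state = info.get("state")
    return {
        "id": info.get("id"),
        "state": state,
        # May be None if this Livy version does not report the YARN application ID.
        "app_id": info.get("appId"),
        "finished": state in TERMINAL_STATES,
    }


# Hypothetical response body, for illustration only.
sample = '{"id": 0, "state": "success", "appId": "application_0000000000000_0001", "log": []}'
summary = summarize_batch(sample)
```

Once `finished` is true and `app_id` is known, the driver's stdout/stderr can be looked up in the YARN logs for that application, for example with `yarn logs -applicationId <app_id>` on a cluster node.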

If you have more questions you can drop me an email at xiaoyzhu at microsoft dot com and I can help route to the right owner.

Xiaoyong Zhu from Microsoft HDInsight