I followed this tutorial on submitting MapReduce jobs to HDInsight from a .NET console app.
It works fine, but I am wondering about this part:
var jobDefinition = new MapReduceJobCreateParameters()
{
    JarFile = "wasb:///example/jars/hadoop-examples.jar",
    ClassName = "wordcount"
};
"wasb:///example/jars/hadoop-examples.jar" refers to a jar in my Azure storage account that was automatically put there when I connected the account to my new HDInsight cluster.
Moving beyond the examples (I want to use Mahout): can I reference a jar that I have added to the cluster node itself? I installed Mahout into the apps/dist directory over RDP. I can run Mahout jobs from there just fine, but I can't work out how to combine that with submitting the job from the .NET app.
It feels like I shouldn't have to add jar files to blob storage to use them.