We are running an M/R job, written in C#, on a 4-node HDInsight cluster. One of the Mapper classes uses Azure Table storage to apply business-specific rules.

The M/R job runs correctly if no CloudTable, CloudTableClient & CloudStorageAccount objects are created.

However, as soon as references to these objects are added, the job fails and execution stops. Part of the code is given below:

public class TopProgMapper : MapperBase
{
    CloudTable table = null;
    CloudStorageAccount storageAccount = null;
    CloudTableClient tableClient = null;

    //The above objects are instantiated and queried in the Mapper ctor

    public TopProgMapper()
    {
        // instantiation code here, currently commented out
    }
}

The code in the Mapper's ctor has been commented out because execution errors out even when the objects are never instantiated, as described above.

The exit code received from the MapReduceResult object (Info.ExitCode) is 1, indicating an issue with the M/R code. However, the rest of the code runs fine and produces correct output when the above object references are not present.

Any help on this will be highly appreciated. Will provide additional details, if required.

Thanks & regards, Subho

1 Answer

It sounds like the assembly containing CloudTable, CloudStorageAccount and CloudTableClient is not available on the cluster where the mapper is running. This should be Microsoft.WindowsAzure.Storage.dll or Microsoft.WindowsAzure.StorageClient.dll, depending on the version of the API that you are using.

Try adding config.FilesToInclude.Add("Microsoft.WindowsAzure.Storage.dll"); in your Configure method.
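For context, here is a rough sketch of where that line might go. It assumes the job is defined by deriving from HadoopJob<TMapper> in the Microsoft .NET SDK for Hadoop. The class name TopProgJob and the input/output paths are hypothetical placeholders, not taken from your code, and the DLL name should match whichever storage assembly your project actually references.

```csharp
using Microsoft.Hadoop.MapReduce;

// Hypothetical job class; TopProgMapper is the mapper from the question.
public class TopProgJob : HadoopJob<TopProgMapper>
{
    public override HadoopJobConfiguration Configure(ExecutorContext context)
    {
        var config = new HadoopJobConfiguration();

        // Placeholder paths (assumptions for illustration only).
        config.InputPath = "input/topprog";
        config.OutputFolder = "output/topprog";

        // Ship the storage client assembly with the job so that it is
        // available on the nodes where the mapper runs.
        config.FilesToInclude.Add("Microsoft.WindowsAzure.Storage.dll");

        return config;
    }
}
```

If the storage assembly has dependencies of its own that are not present on the cluster, they would presumably need to be added to FilesToInclude in the same way.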

If this doesn't help, share the command line & output or code & exception details from your attempt to launch the job.