I'm trying to run this mapper and reducer code (disclaimer: it is part of the solution to a training course).

mapper.py

import sys

for line in sys.stdin:
    data = line.strip().split("\t")
    if len(data) == 6:
        date, time, store, item, cost, payment = data
        print "{0}\t{1}".format(1, cost)

reducer.py

import sys

sTotal = 0
trans = 0

for line in sys.stdin:
    data_mapped = line.strip().split("\t")
    if len(data_mapped) != 2:
        continue

    sTotal += float(data_mapped[1])
    trans += 1

print transactions, "\t", salesTotal

It keeps throwing this error:

UNDEF/bin/hadoop job  -Dmapred.job.tracker=0.0.0.0:8021 -kill job_201404041914_0012
14/04/04 23:13:53 INFO streaming.StreamJob: Tracking URL: http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201404041914_0012
14/04/04 23:13:53 ERROR streaming.StreamJob: Job not successful. Error: NA
14/04/04 23:13:53 INFO streaming.StreamJob: killJob...
Streaming Command Failed!

I've tried both calling python explicitly on the scripts and specifying the Python interpreter (i.e. /usr/bin/env python).

Any idea where it is going wrong?

Jay Setti: Can you tell us the command you executed? - USB
The information provided is not sufficient to debug; check the log files for any additional information. - Praveen Sripati

1 Answer

The job is failing because your reducer.py raises a NameError when it runs (it is a runtime error, not a syntax error).

The problem is with this line:

print transactions, "\t", salesTotal

No variables named transactions or salesTotal are defined anywhere in the script; the totals are accumulated in trans and sTotal.

If I execute it locally, I get this error:

Traceback (most recent call last):
  File "r.py", line 14, in <module>
    print transactions, "\t", salesTotal
NameError: name 'transactions' is not defined
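
Note that Python only looks up a name when the line using it actually runs, so the script starts normally, consumes all of its input, and fails on the very last statement. A minimal sketch of that behaviour, using a made-up two-line snippet rather than your reducer (written with print() so it runs on both Python 2 and 3):

```python
# Hypothetical snippet: 'transactions' is never assigned, only 'trans' is.
source = "trans = 0\nprint(transactions)\n"

# Compiling succeeds, because an undefined name is not a syntax error.
code = compile(source, "r.py", "exec")

# The failure only appears once the offending line is executed.
try:
    exec(code, {})
    error = None
except NameError as exc:
    error = str(exc)

print(error)
```

This is why running the reducer by hand on a few lines of input is a quick way to catch the problem before submitting the job.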

The correct line should be:

print trans, "\t", sTotal
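
For a quick local check, the reducer logic can be wrapped in a function and fed a few lines by hand. This is only a sketch: reduce_lines is a hypothetical helper name, the sample records are made up, and print() is used so the code runs on both Python 2 and 3.

```python
def reduce_lines(lines):
    """Count the mapper records and sum their cost field."""
    trans = 0
    s_total = 0.0
    for line in lines:
        data_mapped = line.strip().split("\t")
        if len(data_mapped) != 2:
            continue  # skip malformed records, as the original reducer does
        s_total += float(data_mapped[1])
        trans += 1
    return trans, s_total


# Made-up mapper output for illustration; the last line is deliberately malformed.
sample = ["1\t10.50\n", "1\t4.25\n", "bad line\n"]
trans, s_total = reduce_lines(sample)
print("{0}\t{1}".format(trans, s_total))
```

In the real job you would add import sys and pass sys.stdin instead of sample. Formatting the output with a single "\t" also avoids the extra spaces that print trans, "\t", sTotal puts around the tab.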