This question is about the TeraSort example. Is there a parameter that changes the number of output records TeraSort produces? The input generated with TeraGen has 65,536,000 records, but we are asked to run TeraSort and output 10,000,000 records. The request is part of a practice exercise on a Cloudera distribution: not a real workload, just a benchmarking exercise.

Teragen:

time hadoop jar /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/hadoop-0.20-mapreduce/hadoop-examples.jar teragen -Dmapreduce.job.maps=12 -Ddfs.blocksize=33554432 -Dmapreduce.map.memory.mb=512 -Dyarn.app.mapreduce.am.containerlauncher.threadpool-initial-size=512 65536000 /user/haley/tgen

Result:

17/12/20 10:31:00 INFO terasort.TeraSort: starting
17/12/20 10:31:02 INFO hdfs.DFSClient: Created token for haley: HDFS_DELEGATION_TOKEN [email protected], renewer=yarn, realUser=, issueDate=1513776662042, maxDate=1514381462042, sequenceNumber=6, masterKeyId=14 on 172.31.10.43:8020
17/12/20 10:31:02 INFO security.TokenCache: Got dt for hdfs://ip-172-31-10-43.us-west-2.compute.internal:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 172.31.10.43:8020, Ident: (token for haley: HDFS_DELEGATION_TOKEN [email protected], renewer=yarn, realUser=, issueDate=1513776662042, maxDate=1514381462042, sequenceNumber=6, masterKeyId=14)
17/12/20 10:31:02 INFO input.FileInputFormat: Total input paths to process : 12
Spent 330ms computing base-splits.
Spent 4ms computing TeraScheduler splits.
Computing input splits took 335ms
Sampling 10 splits of 204
Making 12 from 100000 sampled records
Computing parititions took 522ms
Spent 858ms computing partitions.
17/12/20 10:31:02 INFO client.RMProxy: Connecting to ResourceManager at ip-172-31-15-85.us-west-2.compute.internal/172.31.15.85:8032
17/12/20 10:31:03 INFO mapreduce.JobSubmitter: number of splits:204
17/12/20 10:31:03 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1513773980733_0002
17/12/20 10:31:03 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: 172.31.10.43:8020, Ident: (token for haley: HDFS_DELEGATION_TOKEN [email protected], renewer=yarn, realUser=, issueDate=1513776662042, maxDate=1514381462042, sequenceNumber=6, masterKeyId=14)
17/12/20 10:31:03 INFO impl.YarnClientImpl: Submitted application application_1513773980733_0002
17/12/20 10:31:03 INFO mapreduce.Job: The url to track the job: http://ip-172-31-15-85.us-west-2.compute.internal:8088/proxy/application_1513773980733_0002/
17/12/20 10:31:03 INFO mapreduce.Job: Running job: job_1513773980733_0002
17/12/20 10:31:11 INFO mapreduce.Job: Job job_1513773980733_0002 running in uber mode : false
17/12/20 10:31:11 INFO mapreduce.Job:  map 0% reduce 0%
17/12/20 10:31:19 INFO mapreduce.Job:  map 1% reduce 0%
17/12/20 10:31:20 INFO mapreduce.Job:  map 2% reduce 0%
17/12/20 10:31:23 INFO mapreduce.Job:  map 4% reduce 0%
17/12/20 10:31:26 INFO mapreduce.Job:  map 5% reduce 0%
17/12/20 10:31:27 INFO mapreduce.Job:  map 6% reduce 0%
17/12/20 10:31:29 INFO mapreduce.Job:  map 11% reduce 0%
17/12/20 10:31:30 INFO mapreduce.Job:  map 12% reduce 0%
17/12/20 10:31:33 INFO mapreduce.Job:  map 13% reduce 0%
17/12/20 10:31:34 INFO mapreduce.Job:  map 14% reduce 0%
17/12/20 10:31:36 INFO mapreduce.Job:  map 15% reduce 0%
17/12/20 10:31:37 INFO mapreduce.Job:  map 16% reduce 0%
17/12/20 10:31:40 INFO mapreduce.Job:  map 17% reduce 0%
17/12/20 10:31:41 INFO mapreduce.Job:  map 22% reduce 0%
17/12/20 10:31:43 INFO mapreduce.Job:  map 23% reduce 0%
17/12/20 10:31:44 INFO mapreduce.Job:  map 24% reduce 0%
17/12/20 10:31:47 INFO mapreduce.Job:  map 25% reduce 0%
17/12/20 10:31:50 INFO mapreduce.Job:  map 26% reduce 0%
17/12/20 10:31:51 INFO mapreduce.Job:  map 27% reduce 0%
17/12/20 10:31:54 INFO mapreduce.Job:  map 31% reduce 0%
17/12/20 10:31:55 INFO mapreduce.Job:  map 33% reduce 0%
17/12/20 10:31:58 INFO mapreduce.Job:  map 34% reduce 0%
17/12/20 10:31:59 INFO mapreduce.Job:  map 35% reduce 0%
17/12/20 10:32:02 INFO mapreduce.Job:  map 37% reduce 0%
17/12/20 10:32:05 INFO mapreduce.Job:  map 38% reduce 0%
17/12/20 10:32:06 INFO mapreduce.Job:  map 43% reduce 0%
17/12/20 10:32:08 INFO mapreduce.Job:  map 44% reduce 0%
17/12/20 10:32:09 INFO mapreduce.Job:  map 45% reduce 0%
17/12/20 10:32:11 INFO mapreduce.Job:  map 46% reduce 0%
17/12/20 10:32:12 INFO mapreduce.Job:  map 47% reduce 0%
17/12/20 10:32:16 INFO mapreduce.Job:  map 49% reduce 0%
17/12/20 10:32:17 INFO mapreduce.Job:  map 50% reduce 0%
17/12/20 10:32:18 INFO mapreduce.Job:  map 52% reduce 0%
17/12/20 10:32:19 INFO mapreduce.Job:  map 54% reduce 0%
17/12/20 10:32:20 INFO mapreduce.Job:  map 55% reduce 0%
17/12/20 10:32:23 INFO mapreduce.Job:  map 56% reduce 0%
17/12/20 10:32:24 INFO mapreduce.Job:  map 57% reduce 0%
17/12/20 10:32:26 INFO mapreduce.Job:  map 58% reduce 0%
17/12/20 10:32:27 INFO mapreduce.Job:  map 59% reduce 0%
17/12/20 10:32:29 INFO mapreduce.Job:  map 60% reduce 0%
17/12/20 10:32:30 INFO mapreduce.Job:  map 64% reduce 0%
17/12/20 10:32:31 INFO mapreduce.Job:  map 65% reduce 0%
17/12/20 10:32:33 INFO mapreduce.Job:  map 66% reduce 0%
17/12/20 10:32:34 INFO mapreduce.Job:  map 67% reduce 0%
17/12/20 10:32:36 INFO mapreduce.Job:  map 68% reduce 0%
17/12/20 10:32:37 INFO mapreduce.Job:  map 69% reduce 0%
17/12/20 10:32:39 INFO mapreduce.Job:  map 70% reduce 0%
17/12/20 10:32:42 INFO mapreduce.Job:  map 73% reduce 0%
17/12/20 10:32:43 INFO mapreduce.Job:  map 75% reduce 0%
17/12/20 10:32:45 INFO mapreduce.Job:  map 76% reduce 0%
17/12/20 10:32:47 INFO mapreduce.Job:  map 77% reduce 0%
17/12/20 10:32:48 INFO mapreduce.Job:  map 78% reduce 0%
17/12/20 10:32:51 INFO mapreduce.Job:  map 80% reduce 0%
17/12/20 10:32:52 INFO mapreduce.Job:  map 81% reduce 0%
17/12/20 10:32:53 INFO mapreduce.Job:  map 82% reduce 0%
17/12/20 10:32:54 INFO mapreduce.Job:  map 84% reduce 0%
17/12/20 10:32:55 INFO mapreduce.Job:  map 86% reduce 0%
17/12/20 10:32:58 INFO mapreduce.Job:  map 88% reduce 0%
17/12/20 10:33:02 INFO mapreduce.Job:  map 89% reduce 0%
17/12/20 10:33:05 INFO mapreduce.Job:  map 90% reduce 0%
17/12/20 10:33:06 INFO mapreduce.Job:  map 91% reduce 0%
17/12/20 10:33:07 INFO mapreduce.Job:  map 92% reduce 0%
17/12/20 10:33:11 INFO mapreduce.Job:  map 92% reduce 3%
17/12/20 10:33:12 INFO mapreduce.Job:  map 93% reduce 10%
17/12/20 10:33:13 INFO mapreduce.Job:  map 94% reduce 10%
17/12/20 10:33:14 INFO mapreduce.Job:  map 95% reduce 13%
17/12/20 10:33:15 INFO mapreduce.Job:  map 95% reduce 26%
17/12/20 10:33:17 INFO mapreduce.Job:  map 96% reduce 26%
17/12/20 10:33:18 INFO mapreduce.Job:  map 98% reduce 26%
17/12/20 10:33:20 INFO mapreduce.Job:  map 98% reduce 27%
17/12/20 10:33:22 INFO mapreduce.Job:  map 99% reduce 27%
17/12/20 10:33:23 INFO mapreduce.Job:  map 100% reduce 27%
17/12/20 10:33:24 INFO mapreduce.Job:  map 100% reduce 30%
17/12/20 10:33:26 INFO mapreduce.Job:  map 100% reduce 33%
17/12/20 10:33:27 INFO mapreduce.Job:  map 100% reduce 45%
17/12/20 10:33:28 INFO mapreduce.Job:  map 100% reduce 51%
17/12/20 10:33:30 INFO mapreduce.Job:  map 100% reduce 62%
17/12/20 10:33:32 INFO mapreduce.Job:  map 100% reduce 64%
17/12/20 10:33:33 INFO mapreduce.Job:  map 100% reduce 72%
17/12/20 10:33:34 INFO mapreduce.Job:  map 100% reduce 80%
17/12/20 10:33:36 INFO mapreduce.Job:  map 100% reduce 89%
17/12/20 10:33:37 INFO mapreduce.Job:  map 100% reduce 91%
17/12/20 10:33:38 INFO mapreduce.Job:  map 100% reduce 95%
17/12/20 10:33:39 INFO mapreduce.Job:  map 100% reduce 96%
17/12/20 10:33:40 INFO mapreduce.Job:  map 100% reduce 99%
17/12/20 10:33:43 INFO mapreduce.Job:  map 100% reduce 100%
17/12/20 10:33:43 INFO mapreduce.Job: Job job_1513773980733_0002 completed successfully
17/12/20 10:33:43 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=2907421533
                FILE: Number of bytes written=5786194509
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=6553630192
                HDFS: Number of bytes written=6553600000
                HDFS: Number of read operations=648
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=24
        Job Counters
                Launched map tasks=204
                Launched reduce tasks=12
                Data-local map tasks=204
                Total time spent by all maps in occupied slots (ms)=1572044
                Total time spent by all reduces in occupied slots (ms)=441827
                Total time spent by all map tasks (ms)=1572044
                Total time spent by all reduce tasks (ms)=441827
                Total vcore-milliseconds taken by all map tasks=1572044
                Total vcore-milliseconds taken by all reduce tasks=441827
                Total megabyte-milliseconds taken by all map tasks=1609773056
                Total megabyte-milliseconds taken by all reduce tasks=452430848
        Map-Reduce Framework
                Map input records=65536000
                Map output records=65536000
                Map output bytes=6684672000
                Map output materialized bytes=2846244178
                Input split bytes=30192
                Combine input records=0
                Combine output records=0
                Reduce input groups=65536000
                Reduce shuffle bytes=2846244178
                Reduce input records=65536000
                Reduce output records=65536000
                Spilled Records=131072000
                Shuffled Maps =2448
                Failed Shuffles=0
                Merged Map outputs=2448
                GC time elapsed (ms)=27275
                CPU time spent (ms)=950620
                Physical memory (bytes) snapshot=117459451904
                Virtual memory (bytes) snapshot=345340637184
                Total committed heap usage (bytes)=125787176960
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=6553600000
        File Output Format Counters
                Bytes Written=6553600000
17/12/20 10:33:43 INFO terasort.TeraSort: done

real    2m43.996s
user    0m7.229s
sys     0m0.361s

Terasort (I tried setting mapred.map.output.records, with no luck so far):

time hadoop jar /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/hadoop-0.20-mapreduce/hadoop-examples.jar terasort -D mapred.map.output.records=10000000 /user/haley/tgen /user/haley/tsort1

Result:

17/12/20 10:56:12 INFO terasort.TeraSort: starting
17/12/20 10:56:13 INFO hdfs.DFSClient: Created token for haley: HDFS_DELEGATION_TOKEN [email protected], renewer=yarn, realUser=, issueDate=1513778173455, maxDate=1514382973455, sequenceNumber=7, masterKeyId=14 on 172.31.10.43:8020
17/12/20 10:56:13 INFO security.TokenCache: Got dt for hdfs://ip-172-31-10-43.us-west-2.compute.internal:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 172.31.10.43:8020, Ident: (token for haley: HDFS_DELEGATION_TOKEN [email protected], renewer=yarn, realUser=, issueDate=1513778173455, maxDate=1514382973455, sequenceNumber=7, masterKeyId=14)
17/12/20 10:56:13 INFO input.FileInputFormat: Total input paths to process : 12
Spent 295ms computing base-splits.
Spent 4ms computing TeraScheduler splits.
Computing input splits took 299ms
Sampling 10 splits of 204
Making 12 from 100000 sampled records
Computing parititions took 558ms
Spent 860ms computing partitions.
17/12/20 10:56:14 INFO client.RMProxy: Connecting to ResourceManager at ip-172-31-15-85.us-west-2.compute.internal/172.31.15.85:8032
17/12/20 10:56:14 INFO mapreduce.JobSubmitter: number of splits:204
17/12/20 10:56:14 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1513773980733_0003
17/12/20 10:56:14 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: 172.31.10.43:8020, Ident: (token for haley: HDFS_DELEGATION_TOKEN [email protected], renewer=yarn, realUser=, issueDate=1513778173455, maxDate=1514382973455, sequenceNumber=7, masterKeyId=14)
17/12/20 10:56:15 INFO impl.YarnClientImpl: Submitted application application_1513773980733_0003
17/12/20 10:56:15 INFO mapreduce.Job: The url to track the job: http://ip-172-31-15-85.us-west-2.compute.internal:8088/proxy/application_1513773980733_0003/
17/12/20 10:56:15 INFO mapreduce.Job: Running job: job_1513773980733_0003
17/12/20 10:56:22 INFO mapreduce.Job: Job job_1513773980733_0003 running in uber mode : false
17/12/20 10:56:22 INFO mapreduce.Job:  map 0% reduce 0%
17/12/20 10:56:30 INFO mapreduce.Job:  map 1% reduce 0%
17/12/20 10:56:31 INFO mapreduce.Job:  map 2% reduce 0%
17/12/20 10:56:34 INFO mapreduce.Job:  map 4% reduce 0%
17/12/20 10:56:37 INFO mapreduce.Job:  map 5% reduce 0%
17/12/20 10:56:38 INFO mapreduce.Job:  map 6% reduce 0%
17/12/20 10:56:40 INFO mapreduce.Job:  map 7% reduce 0%
17/12/20 10:56:41 INFO mapreduce.Job:  map 12% reduce 0%
17/12/20 10:56:44 INFO mapreduce.Job:  map 13% reduce 0%
17/12/20 10:56:45 INFO mapreduce.Job:  map 14% reduce 0%
17/12/20 10:56:48 INFO mapreduce.Job:  map 16% reduce 0%
17/12/20 10:56:51 INFO mapreduce.Job:  map 17% reduce 0%
17/12/20 10:56:52 INFO mapreduce.Job:  map 18% reduce 0%
17/12/20 10:56:53 INFO mapreduce.Job:  map 22% reduce 0%
17/12/20 10:56:56 INFO mapreduce.Job:  map 24% reduce 0%
17/12/20 10:56:58 INFO mapreduce.Job:  map 25% reduce 0%
17/12/20 10:57:02 INFO mapreduce.Job:  map 27% reduce 0%
17/12/20 10:57:05 INFO mapreduce.Job:  map 28% reduce 0%
17/12/20 10:57:06 INFO mapreduce.Job:  map 33% reduce 0%
17/12/20 10:57:09 INFO mapreduce.Job:  map 34% reduce 0%
17/12/20 10:57:10 INFO mapreduce.Job:  map 35% reduce 0%
17/12/20 10:57:12 INFO mapreduce.Job:  map 36% reduce 0%
17/12/20 10:57:13 INFO mapreduce.Job:  map 37% reduce 0%
17/12/20 10:57:16 INFO mapreduce.Job:  map 38% reduce 0%
17/12/20 10:57:17 INFO mapreduce.Job:  map 42% reduce 0%
17/12/20 10:57:18 INFO mapreduce.Job:  map 43% reduce 0%
17/12/20 10:57:19 INFO mapreduce.Job:  map 44% reduce 0%
17/12/20 10:57:20 INFO mapreduce.Job:  map 45% reduce 0%
17/12/20 10:57:24 INFO mapreduce.Job:  map 47% reduce 0%
17/12/20 10:57:26 INFO mapreduce.Job:  map 48% reduce 0%
17/12/20 10:57:27 INFO mapreduce.Job:  map 49% reduce 0%
17/12/20 10:57:28 INFO mapreduce.Job:  map 50% reduce 0%
17/12/20 10:57:29 INFO mapreduce.Job:  map 51% reduce 0%
17/12/20 10:57:30 INFO mapreduce.Job:  map 54% reduce 0%
17/12/20 10:57:31 INFO mapreduce.Job:  map 55% reduce 0%
17/12/20 10:57:33 INFO mapreduce.Job:  map 56% reduce 0%
17/12/20 10:57:34 INFO mapreduce.Job:  map 57% reduce 0%
17/12/20 10:57:37 INFO mapreduce.Job:  map 58% reduce 0%
17/12/20 10:57:38 INFO mapreduce.Job:  map 59% reduce 0%
17/12/20 10:57:40 INFO mapreduce.Job:  map 61% reduce 0%
17/12/20 10:57:41 INFO mapreduce.Job:  map 64% reduce 0%
17/12/20 10:57:42 INFO mapreduce.Job:  map 65% reduce 0%
17/12/20 10:57:45 INFO mapreduce.Job:  map 66% reduce 0%
17/12/20 10:57:46 INFO mapreduce.Job:  map 67% reduce 0%
17/12/20 10:57:48 INFO mapreduce.Job:  map 68% reduce 0%
17/12/20 10:57:49 INFO mapreduce.Job:  map 69% reduce 0%
17/12/20 10:57:51 INFO mapreduce.Job:  map 70% reduce 0%
17/12/20 10:57:52 INFO mapreduce.Job:  map 72% reduce 0%
17/12/20 10:57:53 INFO mapreduce.Job:  map 73% reduce 0%
17/12/20 10:57:54 INFO mapreduce.Job:  map 74% reduce 0%
17/12/20 10:57:55 INFO mapreduce.Job:  map 75% reduce 0%
17/12/20 10:57:56 INFO mapreduce.Job:  map 76% reduce 0%
17/12/20 10:57:59 INFO mapreduce.Job:  map 78% reduce 0%
17/12/20 10:58:01 INFO mapreduce.Job:  map 79% reduce 0%
17/12/20 10:58:02 INFO mapreduce.Job:  map 80% reduce 0%
17/12/20 10:58:03 INFO mapreduce.Job:  map 82% reduce 0%
17/12/20 10:58:05 INFO mapreduce.Job:  map 84% reduce 0%
17/12/20 10:58:06 INFO mapreduce.Job:  map 86% reduce 0%
17/12/20 10:58:09 INFO mapreduce.Job:  map 87% reduce 0%
17/12/20 10:58:12 INFO mapreduce.Job:  map 88% reduce 0%
17/12/20 10:58:14 INFO mapreduce.Job:  map 89% reduce 0%
17/12/20 10:58:15 INFO mapreduce.Job:  map 90% reduce 0%
17/12/20 10:58:19 INFO mapreduce.Job:  map 91% reduce 0%
17/12/20 10:58:20 INFO mapreduce.Job:  map 91% reduce 5%
17/12/20 10:58:21 INFO mapreduce.Job:  map 92% reduce 5%
17/12/20 10:58:22 INFO mapreduce.Job:  map 92% reduce 10%
17/12/20 10:58:23 INFO mapreduce.Job:  map 93% reduce 15%
17/12/20 10:58:24 INFO mapreduce.Job:  map 94% reduce 15%
17/12/20 10:58:25 INFO mapreduce.Job:  map 94% reduce 18%
17/12/20 10:58:26 INFO mapreduce.Job:  map 95% reduce 26%
17/12/20 10:58:28 INFO mapreduce.Job:  map 96% reduce 26%
17/12/20 10:58:29 INFO mapreduce.Job:  map 97% reduce 26%
17/12/20 10:58:30 INFO mapreduce.Job:  map 98% reduce 26%
17/12/20 10:58:32 INFO mapreduce.Job:  map 98% reduce 27%
17/12/20 10:58:33 INFO mapreduce.Job:  map 99% reduce 27%
17/12/20 10:58:34 INFO mapreduce.Job:  map 100% reduce 27%
17/12/20 10:58:37 INFO mapreduce.Job:  map 100% reduce 30%
17/12/20 10:58:38 INFO mapreduce.Job:  map 100% reduce 44%
17/12/20 10:58:40 INFO mapreduce.Job:  map 100% reduce 52%
17/12/20 10:58:41 INFO mapreduce.Job:  map 100% reduce 58%
17/12/20 10:58:43 INFO mapreduce.Job:  map 100% reduce 64%
17/12/20 10:58:44 INFO mapreduce.Job:  map 100% reduce 73%
17/12/20 10:58:46 INFO mapreduce.Job:  map 100% reduce 81%
17/12/20 10:58:47 INFO mapreduce.Job:  map 100% reduce 85%
17/12/20 10:58:48 INFO mapreduce.Job:  map 100% reduce 94%
17/12/20 10:58:49 INFO mapreduce.Job:  map 100% reduce 98%
17/12/20 10:58:50 INFO mapreduce.Job:  map 100% reduce 100%
17/12/20 10:58:51 INFO mapreduce.Job: Job job_1513773980733_0003 completed successfully
17/12/20 10:58:51 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=2906318809
                FILE: Number of bytes written=5785091778
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=6553630192
                HDFS: Number of bytes written=6553600000
                HDFS: Number of read operations=648
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=24
        Job Counters
                Launched map tasks=204
                Launched reduce tasks=12
                Data-local map tasks=204
                Total time spent by all maps in occupied slots (ms)=1548516
                Total time spent by all reduces in occupied slots (ms)=443076
                Total time spent by all map tasks (ms)=1548516
                Total time spent by all reduce tasks (ms)=443076
                Total vcore-milliseconds taken by all map tasks=1548516
                Total vcore-milliseconds taken by all reduce tasks=443076
                Total megabyte-milliseconds taken by all map tasks=1585680384
                Total megabyte-milliseconds taken by all reduce tasks=453709824
        Map-Reduce Framework
                Map input records=65536000
                Map output records=65536000
                Map output bytes=6684672000
                Map output materialized bytes=2846244178
                Input split bytes=30192
                Combine input records=0
                Combine output records=0
                Reduce input groups=65536000
                Reduce shuffle bytes=2846244178
                Reduce input records=65536000
                Reduce output records=65536000
                Spilled Records=131072000
                Shuffled Maps =2448
                Failed Shuffles=0
                Merged Map outputs=2448
                GC time elapsed (ms)=26251
                CPU time spent (ms)=946520
                Physical memory (bytes) snapshot=117397381120
                Virtual memory (bytes) snapshot=345217998848
                Total committed heap usage (bytes)=123740356608
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=6553600000
        File Output Format Counters
                Bytes Written=6553600000
17/12/20 10:58:51 INFO terasort.TeraSort: done

real    2m40.756s
user    0m7.248s
sys     0m0.378s

Thanks in advance!!!

1 Answer

Is there any parameter to change the amount of output records using terasort?

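As far as I can tell from the source code of TeraSort.java, it implements a custom partitioner and always partitions and sorts the full input, so there is no parameter that changes this behavior. Your second run is consistent with that: despite -D mapred.map.output.records=10000000, the counters still report Map output records=65536000 and Reduce output records=65536000.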
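
The counters in the question bear this out with simple arithmetic. The sketch below is only an illustration: it assumes TeraGen's fixed 100-byte row (10-byte key plus 90-byte payload), and the helper names records_in and bytes_for are made up for this example. It checks that the bytes written by the sort correspond to exactly the 65,536,000 input rows, and computes how large a 10,000,000-row data set would be.

```python
# Figures copied from the job counters in the question.
RECORD_BYTES = 100  # assumption: TeraGen row = 10-byte key + 90-byte payload

input_records = 65_536_000          # "Map input records"
reduce_output_records = 65_536_000  # "Reduce output records"
hdfs_bytes_written = 6_553_600_000  # "HDFS: Number of bytes written"


def records_in(num_bytes, record_bytes=RECORD_BYTES):
    """Return how many fixed-size records fit in num_bytes.

    Raises ValueError if num_bytes is not a whole number of records.
    """
    records, remainder = divmod(num_bytes, record_bytes)
    if remainder:
        raise ValueError("byte count is not a whole number of records")
    return records


def bytes_for(records, record_bytes=RECORD_BYTES):
    """Return the size in bytes of a data set with the given row count."""
    return records * record_bytes


# The sorted output holds exactly as many rows as the input: nothing was dropped.
assert records_in(hdfs_bytes_written) == input_records == reduce_output_records

# Size a 10,000,000-row data set would have under the same assumption.
target_records = 10_000_000
print(bytes_for(target_records))
```

If the exercise really requires 10,000,000 sorted records, one option (an assumption about what the exercise wants) is to give TeraSort an input of that size, for instance by passing 10000000 as the row count to teragen in place of 65536000.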