Distributed Data Processing with Hadoop, Part 1: Getting Started
2010-06-08

Listing 6. Running the MapReduce job that computes word frequency
# hadoop-0.20 jar /usr/lib/hadoop-0.20/hadoop-0.20.2+228-examples.jar \
wordcount input output
10/04/29 17:36:49 INFO input.FileInputFormat: Total input paths to process : 2
10/04/29 17:36:49 INFO mapred.JobClient: Running job: job_201004291628_0009
10/04/29 17:36:50 INFO mapred.JobClient: map 0% reduce 0%
10/04/29 17:37:00 INFO mapred.JobClient: map 100% reduce 0%
10/04/29 17:37:06 INFO mapred.JobClient: map 100% reduce 100%
10/04/29 17:37:08 INFO mapred.JobClient: Job complete: job_201004291628_0009
10/04/29 17:37:08 INFO mapred.JobClient: Counters: 17
10/04/29 17:37:08 INFO mapred.JobClient: Job Counters
10/04/29 17:37:08 INFO mapred.JobClient: Launched reduce tasks=1
10/04/29 17:37:08 INFO mapred.JobClient: Launched map tasks=2
10/04/29 17:37:08 INFO mapred.JobClient: Data-local map tasks=2
10/04/29 17:37:08 INFO mapred.JobClient: FileSystemCounters
10/04/29 17:37:08 INFO mapred.JobClient: FILE_BYTES_READ=47556
10/04/29 17:37:08 INFO mapred.JobClient: HDFS_BYTES_READ=111598
10/04/29 17:37:08 INFO mapred.JobClient: FILE_BYTES_WRITTEN=95182
10/04/29 17:37:08 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=30949
10/04/29 17:37:08 INFO mapred.JobClient: Map-Reduce Framework
10/04/29 17:37:08 INFO mapred.JobClient: Reduce input groups=2974
10/04/29 17:37:08 INFO mapred.JobClient: Combine output records=3381
10/04/29 17:37:08 INFO mapred.JobClient: Map input records=2937
10/04/29 17:37:08 INFO mapred.JobClient: Reduce shuffle bytes=47562
10/04/29 17:37:08 INFO mapred.JobClient: Reduce output records=2974
10/04/29 17:37:08 INFO mapred.JobClient: Spilled Records=6762
10/04/29 17:37:08 INFO mapred.JobClient: Map output bytes=168718
10/04/29 17:37:08 INFO mapred.JobClient: Combine input records=17457
10/04/29 17:37:08 INFO mapred.JobClient: Map output records=17457
10/04/29 17:37:08 INFO mapred.JobClient: Reduce input records=3381
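Once the job reports complete, the word counts are written to the HDFS directory named as the second argument (output). A minimal sketch of how you might inspect the result with the same hadoop-0.20 command used above; note that the part-r-00000 file name is an assumption based on the default output naming of a single-reducer job using the newer MapReduce API (older-API jobs write part-00000 instead):

# hadoop-0.20 fs -ls output
# hadoop-0.20 fs -cat output/part-r-00000

Each line of the output file contains a word followed by the number of times it appeared across the input files.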