I have 6 node datastax cassandra cluster(3 cassandra- 3 analytics). I am using hive to generate reports.
The issue is when I run a hive job with count(*) or group by query, mappers get completed but the reducers get stuck to a particular percent. And they never change.
(All the nodes have 60GB of data. The column family on which I am running hive has around 40GB on each node and has 15 columns, each with data size of 50,60 characters. I am using default hive settings.)
Also is their information available on time taken vs data size of hive job(normal query). I could find on net.
This is what happens
Query is very simple - select count(*) from table; It has 300k rows.
Starting Job = job_201305210728_0005, Tracking URL = http://something:50030/jobdetails.jsp?jobid=job_201305210728_0005
Kill Command = ./bin/dse hadoop job -Dmapred.job.tracker=10.70.41.62:8012 -kill job_201305210728_0005
Hadoop job information for Stage-1: number of mappers: 7; number of reducers: 1
2013-05-22 03:52:58,811 Stage-1 map = 0%, reduce = 0%
2013-05-22 03:53:02,841 Stage-1 map = 14%, reduce = 0%, Cumulative CPU 1.87 sec
2013-05-22 03:53:03,863 Stage-1 map = 14%, reduce = 0%, Cumulative CPU 1.87 sec
2013-05-22 03:53:04,874 Stage-1 map = 29%, reduce = 0%, Cumulative CPU 2.59 sec
2013-05-22 03:53:05,882 Stage-1 map = 29%, reduce = 0%, Cumulative CPU 2.59 sec
2013-05-22 03:53:06,894 Stage-1 map = 43%, reduce = 0%, Cumulative CPU 3.42 sec
2013-05-22 03:53:07,901 Stage-1 map = 43%, reduce = 0%, Cumulative CPU 3.42 sec
2013-05-22 03:53:08,908 Stage-1 map = 57%, reduce = 0%, Cumulative CPU 5.13 sec
2013-05-22 03:53:09,919 Stage-1 map = 57%, reduce = 0%, Cumulative CPU 5.13 sec
2013-05-22 03:53:10,929 Stage-1 map = 71%, reduce = 0%, Cumulative CPU 5.51 sec
2013-05-22 03:53:11,936 Stage-1 map = 86%, reduce = 0%, Cumulative CPU 7.79 sec
2013-05-22 03:53:12,943 Stage-1 map = 86%, reduce = 0%, Cumulative CPU 7.79 sec
2013-05-22 03:53:13,951 Stage-1 map = 86%, reduce = 0%, Cumulative CPU 7.79 sec
2013-05-22 03:53:14,958 Stage-1 map = 100%, reduce = 19%, Cumulative CPU 10.25 sec
2013-05-22 03:53:15,965 Stage-1 map = 100%, reduce = 19%, Cumulative CPU 10.25 sec
2013-05-22 03:53:16,972 Stage-1 map = 100%, reduce = 19%, Cumulative CPU 10.25 sec
2013-05-22 03:53:17,979 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:18,986 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:19,993 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:20,999 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:22,007 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:23,014 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:24,021 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:25,028 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:26,035 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:27,042 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:28,049 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:29,055 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:30,063 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:31,070 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:32,077 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:33,084 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:34,091 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:35,097 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:36,105 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:37,112 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:38,118 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:39,125 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:40,132 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:41,139 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:42,145 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:43,153 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:44,160 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:45,166 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:46,172 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:47,180 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:48,186 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:49,193 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:50,201 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:51,208 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:52,214 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:53,221 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:54,227 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:55,235 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:56,241 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:57,249 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
2013-05-22 03:53:58,256 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 10.25 sec
and continues....
I am attaching hive logs
Counters plan="{"queryId":"root_20130523051919_5e968691-9119-47fa-9b45-f61a90e86452","queryType":null,"queryAttributes":{"queryString":"select count(*) from suborder_issue"},"queryCounters":"null","stageGraph":{"nodeType":"STAGE","roots":"null","adjacencyList":"]"},"stageList":[{"stageId":"Stage-1","stageType":"MAPRED","stageAttributes":"null","stageCounters":{"CNTR_NAME_Stage-1_REDUCE_PROGRESS":"14","CNTR_NAME_Stage-1_MAP_PROGRESS":"100"},"taskList":[{"taskId":"Stage-1_MAP","taskType":"MAP","taskAttributes":"null","taskCounters":"null","operatorGraph":{"nodeType":"OPERATOR","roots":"null","adjacencyList":[{"node":"TS_0","children":["SEL_1"],"adjacencyType":"CONJUNCTIVE"},{"node":"SEL_1","children":["GBY_2"],"adjacencyType":"CONJUNCTIVE"},{"node":"GBY_2","children":["RS_3"],"adjacencyType":"CONJUNCTIVE"}]},"operatorList":[{"operatorId":"TS_0","operatorType":"TABLESCAN","operatorAttributes":"null","operatorCounters":"}","done":"true","started":"true"},{"operatorId":"SEL_1","operatorType":"SELECT","operatorAttributes":"null","operatorCounters":"null","done":"true","started":"true"},{"operatorId":"GBY_2","operatorType":"GROUPBY","operatorAttributes":"null","operatorCounters":"null","done":"true","started":"true"},{"operatorId":"RS_3","operatorType":"REDUCESINK","operatorAttributes":"null","operatorCounters":"null","done":"true","started":"true"}],"done":"true","started":"true"},{"taskId":"Stage-1_REDUCE","taskType":"REDUCE","taskAttributes":"null","taskCounters":"null","operatorGraph":{"nodeType":"OPERATOR","roots":"null","adjacencyList":[{"node":"GBY_4","children":["SEL_5"],"adjacencyType":"CONJUNCTIVE"},{"node":"SEL_5","children":["FS_6"],"adjacencyType":"CONJUNCTIVE"}]},"operatorList":[{"operatorId":"GBY_4","operatorType":"GROUPBY","operatorAttributes":"null","operatorCounters":"}","done":"false","started":"true"},{"operatorId":"SEL_5","operatorType":"SELECT","operatorAttributes":"null","operatorCounters":"null","done":"false","started":"true"},{"operatorId":"FS_6","operatorType":"FILESINK","operatorAttributes":"null","operatorCounters":"null","done":"false","started":"true"}],"done":"false","started":"true"}],"done":"false","started":"true"}],"done":"false","started":"true"}" TIME="1369303703813"
TaskProgress TASK_HADOOP_PROGRESS="2013-05-23 06:08:24,818 Stage-1 map = 100%, reduce = 14%, Cumulative CPU 23.04 sec" TASK_NUM_REDUCERS="1" TASK_NAME="org.apache.hadoop.hive.ql.exec.MapRedTask" TASK_NUM_MAPPERS="7" TASK_COUNTERS="org.apache.hadoop.hive.ql.exec.Operator$ProgressCounter.CREATED_FILES:0,Job Counters .Launched reduce tasks:1,Job Counters .SLOTS_MILLIS_MAPS:41820,Job Counters .Rack-local map tasks:8,Job Counters .Launched map tasks:8,File Input Format Counters .Bytes Read:0,FileSystemCounters.FILE_BYTES_WRITTEN:327828,FileSystemCounters.CFS_BYTES_READ:3527,org.apache.hadoop.hive.ql.exec.MapOperator$Counter.DESERIALIZE_ERRORS:0,Map-Reduce Framework.Map output materialized bytes:203,Map-Reduce Framework.Map input records:351783,Map-Reduce Framework.Reduce shuffle bytes:78,Map-Reduce Framework.Spilled Records:6,Map-Reduce Framework.Map output bytes:79,Map-Reduce Framework.Total committed heap usage (bytes):2057568256,Map-Reduce Framework.CPU time spent (ms):23040,Map-Reduce Framework.Map input bytes:351783,Map-Reduce Framework.SPLIT_RAW_BYTES:3527,Map-Reduce Framework.Combine input records:0,Map-Reduce Framework.Reduce input records:0,Map-Reduce Framework.Reduce input groups:0,Map-Reduce Framework.Combine output records:0,Map-Reduce Framework.Physical memory (bytes) snapshot:2163638272,Map-Reduce Framework.Reduce output records:0,Map-Reduce Framework.Virtual memory (bytes) snapshot:7556251648,Map-Reduce Framework.Map output records:6" TASK_ID="Stage-1" QUERY_ID="root_20130523051919_5e968691-9119-47fa-9b45-f61a90e86452" TASK_HADOOP_ID="job_201305220529_0009" TIME="1369303704818"
Counters plan="{"queryId":"root_20130523051919_5e968691-9119-47fa-9b45-f61a90e86452","queryType":null,"queryAttributes":{"queryString":"select count(*) from suborder_issue"},"queryCounters":"null","stageGraph":{"nodeType":"STAGE","roots":"null","adjacencyList":"]"},"stageList":[{"stageId":"Stage-1","stageType":"MAPRED","stageAttributes":"null","stageCounters":{"CNTR_NAME_Stage-1_REDUCE_PROGRESS":"14","CNTR_NAME_Stage-1_MAP_PROGRESS":"100"},"taskList":[{"taskId":"Stage-1_MAP","taskType":"MAP","taskAttributes":"null","taskCounters":"null","operatorGraph":{"nodeType":"OPERATOR","roots":"null","adjacencyList":[{"node":"TS_0","children":["SEL_1"],"adjacencyType":"CONJUNCTIVE"},{"node":"SEL_1","children":["GBY_2"],"adjacencyType":"CONJUNCTIVE"},{"node":"GBY_2","children":["RS_3"],"adjacencyType":"CONJUNCTIVE"}]},"operatorList":[{"operatorId":"TS_0","operatorType":"TABLESCAN","operatorAttributes":"null","operatorCounters":"}","done":"true","started":"true"},{"operatorId":"SEL_1","operatorType":"SELECT","operatorAttributes":"null","operatorCounters":"null","done":"true","started":"true"},{"operatorId":"GBY_2","operatorType":"GROUPBY","operatorAttributes":"null","operatorCounters":"null","done":"true","started":"true"},{"operatorId":"RS_3","operatorType":"REDUCESINK","operatorAttributes":"null","operatorCounters":"null","done":"true","started":"true"}],"done":"true","started":"true"},{"taskId":"Stage-1_REDUCE","taskType":"REDUCE","taskAttributes":"null","taskCounters":"null","operatorGraph":{"nodeType":"OPERATOR","roots":"null","adjacencyList":[{"node":"GBY_4","children":["SEL_5"],"adjacencyType":"CONJUNCTIVE"},{"node":"SEL_5","children":["FS_6"],"adjacencyType":"CONJUNCTIVE"}]},"operatorList":[{"operatorId":"GBY_4","operatorType":"GROUPBY","operatorAttributes":"null","operatorCounters":"}","done":"false","started":"true"},{"operatorId":"SEL_5","operatorType":"SELECT","operatorAttributes":"null","operatorCounters":"null","done":"false","started":"true"},{"operatorId":"FS_6","operatorType":"FILESINK","operatorAttributes":"null","operatorCounters":"null","done":"false","started":"true"}],"done":"false","started":"true"}],"done":"false","started":"true"}],"done":"false","started":"true"}" TIME="1369303704818"