0
votes

I created the table sample_emp and loaded data into it, but when I run the ANALYZE command I don't see any output from it:

hive> analyze table sample_emp COMPUTE STATISTICS FOR COLUMNS;

Query ID = cloudera_20160323042222_18ef699e-9ba1-4da9-9fff-84c9f2fa3925
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=
In order to set a constant number of reducers:
  set mapreduce.job.reduces=
Starting Job = job_1458726033020_0002, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1458726033020_0002/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1458726033020_0002
Hadoop job information for Stage-0: number of mappers: 1; number of reducers: 1
2016-03-23 04:22:35,984 Stage-0 map = 0%, reduce = 0%
2016-03-23 04:23:05,861 Stage-0 map = 100%, reduce = 0%, Cumulative CPU 1.02 sec
2016-03-23 04:23:16,705 Stage-0 map = 100%, reduce = 100%, Cumulative CPU 2.3 sec
MapReduce Total cumulative CPU time: 2 seconds 300 msec
Ended Job = job_1458726033020_0002
MapReduce Jobs Launched:
Stage-Stage-0: Map: 1 Reduce: 1 Cumulative CPU: 2.3 sec HDFS Read: 13245 HDFS Write: 72 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 300 msec
OK
Time taken: 63.787 seconds


2 Answers

0
votes

The ANALYZE command gathers statistics for a table, its columns, and its partitions.

For existing tables and/or partitions, you can issue the ANALYZE command to gather statistics and write them into the Hive metastore. It does not display the table's data or the statistics themselves, which is why the command prints only the job log followed by OK.

Source: https://cwiki.apache.org/confluence/display/Hive/StatsDev
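Since the statistics end up in the metastore, one way to look at them is to describe a single column. This is only a sketch: emp_id is a hypothetical column name (the question does not list the table's columns), and it assumes a Hive version in which DESCRIBE FORMATTED accepts a column name and reports column statistics.

```sql
-- Recompute column statistics (same command as in the question).
ANALYZE TABLE sample_emp COMPUTE STATISTICS FOR COLUMNS;

-- Show the stored statistics for one column.
-- emp_id is a placeholder; substitute a real column of sample_emp.
DESCRIBE FORMATTED sample_emp emp_id;
```

If statistics were collected, the output should include fields such as the minimum, maximum, number of nulls, and distinct count for that column; the exact layout depends on the Hive version.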

0
votes

When you compute statistics in Hive, you don't get any explicit output telling you that it finished successfully, so you have to go by the logs and the fact that the job reported no failures. One way to validate that the stats are up to date is to show the table properties, which will report a value of true if the column stats are accurate.

show tblproperties yourTableName;
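To narrow the output down to the one property that matters here, you can pass a property name to SHOW TBLPROPERTIES. A small sketch, assuming the table from the question (sample_emp) and that your Hive version records the flag under the name COLUMN_STATS_ACCURATE:

```sql
-- List every property stored for the table.
SHOW TBLPROPERTIES sample_emp;

-- Show only the statistics-accuracy flag.
-- Assumption: the property is named COLUMN_STATS_ACCURATE in your Hive version.
SHOW TBLPROPERTIES sample_emp("COLUMN_STATS_ACCURATE");
```

Depending on the Hive version, the value may be a plain true/false or a longer structured string, so treat the exact format as version-specific.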