Because I've noticed this question is quite popular, I've decided to answer it myself, as I've come to understand the topic quite well since I asked it.
So, first of all, since Hadoop 2.0 HCatalog and Hive are treated as one product. Hive creates tables in HCatalog by default, which means that the natural interface for HCatalog is Hive. So you can use the SQL-92-style DDL and DML statements that Hive supports, starting from create/alter/drop database, through create/alter/drop table, and ending with select, insert into, etc. The only exception is that insert works only as insert into ... select ... from, i.e. inserting the result of a query.
For a typical insert of new data we have to use:
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
Tables can have partitions and indexes (although in my experience indexes don't work well), but it is not a relational database, so you cannot use foreign keys.
With HBase it is quite different. It is one of the NoSQL databases (although, as answered in a previous post, Hive can serve as an interface to HBase for SQL queries).
Its tables are organized as key -> value pairs.
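To make the key -> value idea a bit more concrete, here is a rough toy model in plain Python. It is only a sketch of how I picture the layout (row key -> "family:qualifier" -> value), not the HBase client API; the class name ToyTable and its methods are made up for illustration, and real HBase also keeps timestamps and versions per cell, which I leave out here.

```python
# Toy model of an HBase-style table: row key -> "family:qualifier" -> value.
# Plain Python for illustration only; ToyTable is a hypothetical name,
# not part of any HBase library.

class ToyTable:
    def __init__(self, *column_families):
        # Column families are fixed when the table is created.
        self.column_families = set(column_families)
        self.rows = {}  # row key -> {"family:qualifier": value}

    def put(self, row_key, column, value):
        # The part before ':' must be a known column family.
        family = column.split(":", 1)[0]
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        self.rows.setdefault(row_key, {})[column] = value

    def get(self, row_key):
        # Return a copy of every cell stored under the row key.
        return dict(self.rows.get(row_key, {}))


table = ToyTable("cf")
table.put("row1", "cf:a", "value1")
table.put("row1", "cf:b", "value2")  # qualifiers need not be declared up front
print(table.get("row1"))
```

The point of the sketch is that there is no fixed list of columns per table, only column families; each row is looked up by its key and can carry its own set of qualifiers.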
Let's compare a few commands (create table, insert into table, select from table, drop table), first in Hive and then in the HBase shell:
create table table_name (
id int,
value1 string,
value2 string
)
partitioned by (date string)
LOAD DATA INPATH 'filepath' INTO TABLE table_name [PARTITION (partcol1=val1, partcol2=val2 ...)]
INSERT INTO TABLE table_name SELECT * FROM othertable
SELECT * FROM table_name
DROP TABLE table_name
hbase> create 'test', 'cf'
hbase> put 'test', 'row1', 'cf:a', 'value1'
hbase> get 'test', 'row1'
hbase> disable 'test'
hbase> drop 'test'
As you can see, the syntax is completely different. For SQL users, working with HCatalog is natural; those who work with NoSQL databases will feel comfortable with HBase.