I am running a Cassandra cluster with 10 nodes and uploading a huge TSV (tab-separated values) file daily. I now want to move my project to Google Bigtable for better performance and lower latency.
I set up a 3-node Cloud Bigtable cluster and installed the HBase client on a Compute Engine server (1 node), but I don't know how to start uploading these TSV files into Bigtable.

Below is my TSV format:
col1 col2 col3 col4 col5 col6 . .
Here col1 is the primary key, and col2 and col3 are clustering keys in the Cassandra table.
How can I create a similar table in Bigtable, and what methods are available for uploading TSV files into Bigtable?


1 Answer


In Bigtable, you have a single row key, which serves as a fast lookup key. Bigtable stores all data in sorted order by row key. Bigtable "columns" belong to column families: you configure the column families up front, and you can add arbitrary column qualifiers when you send a mutation. Here's more info: https://cloud.google.com/bigtable/docs/schema-design.
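Since Bigtable has no separate clustering columns, a common pattern is to fold the Cassandra partition key and clustering keys into one composite row key. A minimal sketch (the `#` delimiter and the example values are assumptions, not anything your data dictates):

```python
def make_row_key(col1, col2, col3, sep="#"):
    # Fold the Cassandra partition key (col1) and clustering keys
    # (col2, col3) into a single Bigtable row key. The "#" delimiter
    # is an assumption; pick one that never appears in the values.
    return sep.join([col1, col2, col3]).encode("utf-8")

# Rows sort lexicographically by this key, so prefix scans over
# (col1) or (col1, col2) behave much like reading a Cassandra
# partition in clustering order.
key = make_row_key("user42", "2016-07-01", "click")
```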

You can also use Google Cloud Dataflow to import any type of data: https://cloud.google.com/bigtable/docs/dataflow-hbase. You write a small amount of Java code, Google creates a cluster of machines and executes your code on it, and a UI lets you view your progress and logs.
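The per-record transform that pipeline runs is small. A rough sketch of the shape of it (the `cf1` column family, the `#` delimiter, and the column numbering are assumptions; a real Dataflow pipeline would emit HBase `Put` mutations rather than a dict):

```python
def tsv_to_cells(line):
    # Split one TSV record and map it to (row_key, {qualifier: value})
    # -- the shape a Dataflow DoFn would turn into a Bigtable mutation.
    # First three columns form the composite row key; the rest become
    # cells under an assumed "cf1" column family.
    fields = line.rstrip("\n").split("\t")
    row_key = "#".join(fields[:3]).encode("utf-8")
    cells = {"cf1:col%d" % (i + 4): v for i, v in enumerate(fields[3:])}
    return row_key, cells
```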

Bigtable is also accessible through an HBase-compatible API. That allows tools like HBase's Hadoop-based import to work out of the box: https://cloud.google.com/bigtable/docs/exporting-importing
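For example, HBase ships an `ImportTsv` MapReduce job; with the Bigtable HBase client on the classpath, the same job writes into Bigtable. A small helper that assembles the invocation (the table name and file path below are illustrative assumptions):

```python
def importtsv_cmd(table, columns, tsv_path):
    # Build the argument list for HBase's ImportTsv MapReduce job.
    # HBASE_ROW_KEY marks which TSV column becomes the row key; the
    # remaining entries are family:qualifier targets for the cells.
    return [
        "hbase",
        "org.apache.hadoop.hbase.mapreduce.ImportTsv",
        "-Dimporttsv.columns=" + ",".join(columns),
        table,
        tsv_path,
    ]

cmd = importtsv_cmd(
    "mytable",                                   # assumed table name
    ["HBASE_ROW_KEY", "cf1:col4", "cf1:col5"],   # assumed mapping
    "/data/input.tsv",                           # assumed input path
)
```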

My preference has been Dataflow.