0
votes

I'm wondering whether I should use an update query to modify my row data, or set maxversions, enable versioning, and just insert.

I understand it may depend on what kind of data I need to store, but I wanted to know whether there is a performance difference between querying (selecting) data that has versioning and data that does not, and whether there is a performance difference between insert and update.

1
What do you mean by insert? Would you set an explicit version on every cell that you write to Cloud Bigtable? - Solomon Duskis
By putting a row in the table without any explicit version. Just like the one here! - kei
Putting a row without an explicit version means "now()" is used as the timestamp. - Solomon Duskis

1 Answer

2
votes

Performance is impacted by the size of the row and the amount of data returned from the server.

Bigtable has to read an entire row for every request, and that is a limiting factor on reads. Once a row reaches hundreds of MB, overall system performance degrades any time the tablet holding that row is loaded. When the row size reaches GBs, you'll have major problems.

At query time, performance is also impacted by how much data is returned from the server. You can still get decent performance in the lower range of "large rows" if you limit your Get or Scan to a small subset of the row. Limiting the number of cells per row and/or retrieving only a few qualifiers helps with the network costs.
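To make the effect of those limits concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not the Bigtable or HBase client API: `read_row`, its parameters, and the sample row are all hypothetical names chosen for illustration. It only shows how restricting qualifiers and cells per column shrinks what would be sent back.

```python
def read_row(row, qualifiers=None, cells_per_column=None):
    """Return a trimmed copy of a row (hypothetical helper, not a real API).

    `row` maps qualifier -> list of (timestamp, value) cells, newest first.
    `qualifiers` keeps only the named columns; None keeps all of them.
    `cells_per_column` keeps only the N newest cells; None keeps all of them.
    """
    result = {}
    for qualifier, cells in row.items():
        if qualifiers is not None and qualifier not in qualifiers:
            continue
        # Slicing with None as the upper bound copies the whole list.
        result[qualifier] = cells[:cells_per_column]
    return result


# A row with several versions per column and one bulky column.
row = {
    "status": [(3, "done"), (2, "running"), (1, "queued")],
    "payload": [(3, "x" * 1000), (1, "y" * 1000)],
}

trimmed = read_row(row, qualifiers={"status"}, cells_per_column=1)
print(trimmed)  # {'status': [(3, 'done')]}
```

In this model the bulky `payload` column and the older `status` versions never leave the "server", which is the saving the answer describes.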

In general, it's better to keep your rows smaller, if you can. That is generally done with a combination of "insert" and some sort of age/version restriction on the column family.
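The "insert plus version restriction" pattern can be sketched with a small in-memory model. This is an assumption-laden toy, not Bigtable itself: `ToyColumnFamily` and its methods are hypothetical, and it trims old versions eagerly on every write, whereas a real system may apply such rules in the background.

```python
import time


class ToyColumnFamily:
    """Hypothetical stand-in for a column family with a max-versions rule."""

    def __init__(self, max_versions):
        self.max_versions = max_versions
        # (row_key, qualifier) -> list of (timestamp_micros, value), newest first
        self.cells = {}

    def put(self, row_key, qualifier, value, timestamp_micros=None):
        # With no explicit timestamp, fall back to "now".
        if timestamp_micros is None:
            timestamp_micros = time.time_ns() // 1000
        versions = self.cells.setdefault((row_key, qualifier), [])
        versions.append((timestamp_micros, value))
        versions.sort(key=lambda cell: cell[0], reverse=True)
        # Drop everything beyond the newest max_versions cells.
        del versions[self.max_versions:]

    def versions(self, row_key, qualifier):
        return list(self.cells.get((row_key, qualifier), []))

    def latest(self, row_key, qualifier):
        versions = self.cells.get((row_key, qualifier), [])
        return versions[0][1] if versions else None


family = ToyColumnFamily(max_versions=2)
for ts, value in [(1, "a"), (2, "b"), (3, "c")]:
    # Every write is a plain insert; nothing is updated in place.
    family.put("r1", "status", value, timestamp_micros=ts)

print(family.latest("r1", "status"))    # c
print(family.versions("r1", "status"))  # [(3, 'c'), (2, 'b')]
```

Each write is just an insert, and the version limit keeps the row from growing without bound, so reads of the latest value stay cheap.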