Performance is impacted by the size of the row and the amount of data returned from the server.
Bigtable has to read an entire row for every request, so row size is a limiting factor on reads. At some size (hundreds of MB or more), overall performance will degrade any time the tablet containing that row is loaded. When the row size reaches GBs, you'll have major problems.
At query time, performance is also impacted by how much data is returned from the server. You can still get decent performance at the lower end of the "large row" range if you limit your Get or Scan to a small subset of the row. Limits such as cells per row, and/or retrieving only a few qualifiers, will help with the network costs.
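As a rough sketch using the Python client (the project/instance/table names, row key, and qualifier pattern here are all placeholders), a read that restricts qualifiers and caps cells per row might look like:

```python
from google.cloud import bigtable
from google.cloud.bigtable import row_filters

# Placeholder identifiers -- substitute your own project/instance/table.
client = bigtable.Client(project="my-project")
instance = client.instance("my-instance")
table = instance.table("my-table")

# Chain filters: only a couple of qualifiers, only the latest cell per
# column, and a hard cap on the total cells returned for the row.
read_filter = row_filters.RowFilterChain(filters=[
    row_filters.ColumnQualifierRegexFilter(b"^(status|updated_at)$"),
    row_filters.CellsColumnLimitFilter(1),   # latest version of each column
    row_filters.CellsRowLimitFilter(100),    # at most 100 cells for the row
])

row = table.read_row(b"my-row-key", filter_=read_filter)
if row is not None:
    for family, columns in row.cells.items():
        for qualifier, cells in columns.items():
            print(family, qualifier, cells[0].value)
```

Even though the server still has to load the row, filtering like this keeps the response payload small, which is where the query-time savings come from.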
In general, it's better to keep your rows smaller if you can. That is typically done by writing new cells rather than letting rows accumulate indefinitely, combined with an age and/or version garbage-collection policy on the column family so old cells get cleaned up automatically.
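For example (again a sketch with the Python client; the family name and the specific limits are illustrative), a garbage-collection rule that bounds both versions and age could be set up like this:

```python
import datetime

from google.cloud import bigtable
from google.cloud.bigtable import column_family

# Admin access is needed to modify column families.
client = bigtable.Client(project="my-project", admin=True)
instance = client.instance("my-instance")
table = instance.table("my-table")

# Keep at most 2 versions per cell, and drop anything older than 30 days.
# GCRuleUnion collects a cell if EITHER rule matches.
gc_rule = column_family.GCRuleUnion(rules=[
    column_family.MaxVersionsGCRule(2),
    column_family.MaxAgeGCRule(datetime.timedelta(days=30)),
])

cf = table.column_family("cf1", gc_rule=gc_rule)
cf.create()  # or cf.update() for an existing family
```

With a policy like this in place, each write effectively replaces older data over time instead of growing the row without bound.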