1
votes

According to the Google Cloud Datastore documentation, writing at a high rate to a new entity kind is bad practice. It says:

Cloud Datastore prepends the namespace and the kind of the root entity group to the Bigtable row key. You can hit a hotspot if you start to write to a new namespace or kind without gradually ramping up traffic.

What about writing an initial set of records at a fast rate to an event kind in a namespace when neither the kind nor the namespace is new? The number of records will be between 5 and 10 million.

For example, assume I have a namespace "ns1" and an entity kind "ek1". (Mentally, it is the "ns1.ek1" entity to me.) If I already have many other entity kinds populated in that namespace ("ns1.ek2", "ns1.ek3", ..., "ns1.ekX"), and I already have entities of this kind in other namespaces ("ns2.ek1", "ns3.ek1", ..., "nsX.ek1"), will I still run into performance problems due to hotspotting if I write fast into "ns1.ek1"?

I plan to prime a brand new entity kind for fast writes into the customer-specific namespaces by first inserting a ton of artificial records with random keys into that kind in a private namespace. I am wondering if that is a valid technique.


1 Answer

0
votes

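That won't work. Cloud Datastore splits its storage by key range to cope with increasing load, and the namespace is prefixed to the key. Records written to a private namespace therefore fall into a different key range from the customer-specific namespaces, and the splits they trigger do not carry over.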
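To see why, here is a minimal sketch in Python of how prefixed keys sort. It is a simplified model, not Datastore's real row-key encoding: the `row_key` helper and the "private" namespace name are made up for illustration. The only property it relies on is the one the documentation states, that the namespace and kind come first in the key.

```python
# Simplified model of key ordering. Real Bigtable row keys are encoded
# differently; this only mimics "namespace, then kind, then id".
def row_key(namespace, kind, entity_id):
    """Hypothetical helper: build a sortable key with namespace and kind first."""
    return (namespace, kind, entity_id)


# Artificial records written to a private namespace to "prime" the kind.
primed = [row_key("private", "ek1", i) for i in range(3)]
# The real customer records we want to write fast later.
target = [row_key("ns1", "ek1", i) for i in range(3)]

# The key range that the customer writes will land in.
target_low, target_high = min(target), max(target)

# None of the primed keys falls inside that range, so any splits they
# caused happened somewhere else in the key space.
primed_inside_target = [k for k in primed if target_low <= k <= target_high]

for key in sorted(primed + target):
    print(key)
print("primed keys inside the ns1.ek1 range:", len(primed_inside_target))
```

All of the "ns1" keys sort together and all of the "private" keys sort together, so the two sets never share a key range.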

For example, if you have ns1.ek1 and ns2.ek2 and sustain a high write rate to ns1.ek1, the splits Datastore makes for ns1.ek1 do nothing to prepare ns2.ek2 for new load.
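What the documentation asks for instead is to ramp up traffic gradually. Below is a rough Python sketch of such a schedule, so you can estimate how long a bulk load would take. The numbers are assumptions: it uses the commonly cited guideline of starting at about 500 writes per second and increasing by 50% every 5 minutes, so check the current best-practices page before relying on them. The function names are made up for this example, and no Datastore API is called.

```python
def allowed_rate(minutes_elapsed, base_rate=500.0, growth=1.5, step_minutes=5):
    """Writes per second permitted after `minutes_elapsed` minutes of ramp-up.

    Defaults are assumed guideline values, not limits enforced by Datastore.
    """
    steps = minutes_elapsed // step_minutes
    return base_rate * growth ** steps


def minutes_to_write(total_records, **schedule):
    """Whole minutes needed to write `total_records` while following the ramp."""
    written = 0.0
    minutes = 0
    while written < total_records:
        # Write at the currently allowed rate for one more minute.
        written += allowed_rate(minutes, **schedule) * 60
        minutes += 1
    return minutes


for total in (5_000_000, 10_000_000):
    print(total, "records:", minutes_to_write(total), "minutes")
```

Because the rate grows geometrically, going from 5 million to 10 million records adds only a few minutes under this schedule.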