2
votes

After reading about the GAE Datastore API, I am still unsure if I need to duplicate key names and parents as properties for an entity.

Let's say there are two kinds of entities: Employee and Division. Each employee has a division as its parent, and is identified by an account name. I use the account name as the key name for employees. But when modeling Employee, I would still keep these two as properties:

division = db.ReferenceProperty(Division)
account_name = db.StringProperty()

Obviously I have to manually keep division consistent with its parent, and account_name with its key name. The reasons I am doing this extra work are:

  1. I am afraid GQL and the Datastore API may not support parent and key name as well as they support normal properties. Is there anything I can do with a property that I cannot do with a parent or key name (or are they essentially reference properties)? How do I use key names in GQL queries?
  2. The meaning of key name and parent is not particularly clear. As the names are not self-descriptive, I have to inform other contributors that we use account name as key name...

But this is really unnecessary work that wastes time and storage space. I cannot shake the SQL way of thinking: why doesn't Google just let us define one property to be the key and another to be the parent? Then we could name them and use them as normal properties...

What's the best practice here?

2
It seems it is not possible to query key names in GQL like a property when you don't have the key itself (i.e. you don't know the ancestor). See link, especially its tip in English (removed in the linked Google documentation, but still present in some translations). Still, I can't see the point of this design... - klkh

2 Answers

5
votes

Keep in mind that in the GAE Datastore you can never change the parent or key_name of an entity once it has been created. These values are permanent for the life of the entity.

If there is even a small chance that the account_name of an Employee could change, then you cannot use it as a key_name. If it never changes, then it could be a very good key_name and will allow you to do cheap gets for Employees using Employee.get_by_key_name() instead of expensive queries.
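To make the cost of a changing key name concrete, here is a minimal sketch using a plain dictionary as a stand-in for the datastore. It is not the App Engine API; `store`, `put`, and `rename` are hypothetical names used only for illustration. The point it shows is that a "rename" is really a copy to a new key followed by a delete of the old one.

```python
# Toy in-memory store: maps a key path, a tuple of (kind, name) pairs,
# to a dict of properties. This only mimics the idea of immutable keys;
# it is NOT the App Engine Datastore API.
store = {}


def put(key_path, properties):
    """Save properties under the given key path."""
    store[key_path] = dict(properties)


def rename(old_path, new_name):
    """'Rename' an entity by moving its data to a new key and deleting the old key."""
    kind = old_path[-1][0]
    new_path = old_path[:-1] + ((kind, new_name),)
    store[new_path] = store.pop(old_path)
    return new_path


old_key = (("Division", "sales"), ("Employee", "jdoe"))
put(old_key, {"full_name": "J. Doe"})

# The account name changes, so the entity has to move to a different key.
new_key = rename(old_key, "jane.doe")
```

Anything that stored the old key elsewhere (for example a reference from another entity) would now point at nothing, which is why a value that might change makes a poor key name.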

Parent is not meant to be equivalent to a foreign key. A better equivalent to a foreign key is a reference property.

The main reason you use parent is so that the parent and child entities are in the same entity group, which allows you to operate on them both in a single transaction. If you just need a reference to the division from the Employee, then just use a reference property. I suggest getting familiar with how entity groups work, as this is very important in GAE data modeling.

Using parent can also cause write performance issues as there is a limit to how quickly you can write to a single entity group (approximately one write per second). When deciding whether to use parent or a reference property you need to think about which entities need to be modified in the same transaction. In many cases you can use Cross Group (XG) transactions instead. It is all about which trade-offs you want to make.
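As a rough illustration of why the choice of parent matters for write throughput, the sketch below models a key as a path of (kind, name) pairs and treats the first pair as the entity group root. This is a simplified model, not the App Engine API, and `entity_group` is a hypothetical helper.

```python
def entity_group(key_path):
    """Return the root (kind, name) pair of a key path.

    In this toy model, entities that share a root share an entity group.
    """
    return key_path[0]


# Two employees whose parent is the same division.
emp_a = (("Division", "sales"), ("Employee", "jdoe"))
emp_b = (("Division", "sales"), ("Employee", "asmith"))
# An employee created without a parent is the root of its own group.
emp_c = (("Employee", "bnguyen"),)

# emp_a and emp_b compete for the same group's write budget.
same_group = entity_group(emp_a) == entity_group(emp_b)
# emp_c can be written independently of the sales division.
separate_group = entity_group(emp_c) != entity_group(emp_a)
```

Under this model, every employee of a busy division would share one group's write limit, while parentless employees would each get their own.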

So my suggestions are:

  • If your account_name for an employee will absolutely never change then use it as a key_name. Otherwise just make it a basic property.
  • If you need to modify the Employee and the Division in the same transaction (and you can't get this to work with XG transactions) and you will never change the Division of an Employee then make the Division the parent of the Employee. Otherwise just model this relationship with a reference property.
3
votes

When you create a new Employee object with a Division as a parent, it would go something like:

div = Division()
# ... set the division properties here
div.put()
# account_name holds the employee's account name (a string)
emp = Employee(key_name=account_name, parent=div)
# ... set the employee properties here
emp.put()

Then, when you want to get a reference to the Division an Employee is part of:

div = emp.parent()
# Get the Employee account_name (which is the employee's key name):
account_name = emp.key().name()

You don't have to store a ReferenceProperty to the Division an Employee is part of, since that information is already held in the parent. Additionally, you can get the account_name from the Employee entity's key as needed.

To query on the key:

emp = Employee.get_by_key_name(account_name, parent=div)
# Or fetch a division by its own key name
div = Division.get_by_key_name(division_key_name)

# Get all employees in a division
emps = Employee.all().ancestor(div)
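One way to picture why the key alone is enough here: a key can be thought of as a path of (kind, name) pairs, so the key name is the last element, the parent is everything before it, and an ancestor query is a prefix match. The sketch below is a plain-Python model of that idea, not the App Engine API; `key_name`, `parent_path`, and `has_ancestor` are hypothetical helpers.

```python
def key_name(key_path):
    """The key name is the name part of the last (kind, name) pair."""
    return key_path[-1][1]


def parent_path(key_path):
    """The parent is the path without its last pair, or None for a root."""
    return key_path[:-1] or None


def has_ancestor(key_path, ancestor_path):
    """An entity is under an ancestor if the ancestor's path is a prefix of its own."""
    return key_path[:len(ancestor_path)] == ancestor_path


sales = (("Division", "sales"),)
keys = [
    (("Division", "sales"), ("Employee", "jdoe")),
    (("Division", "sales"), ("Employee", "asmith")),
    (("Division", "support"), ("Employee", "bnguyen")),
]

# Rough equivalent of "all employees in the sales division".
in_sales = [
    key_name(k)
    for k in keys
    if k[-1][0] == "Employee" and has_ancestor(k, sales)
]
```

In this model nothing extra has to be stored on the employee to recover its division or account name; both come straight from the key.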