So I'm working on a project in Java, but really the language doesn't matter here.

I want to create and store users in the datastore, and I'm trying to work out the best way to ensure an email is not used more than once. The normal way to do this on a relational ACID database would be inside a transaction: lock the table, look up whether the email exists, and if it does, unlock and fail; otherwise insert and unlock.
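
For illustration, the lock / check / insert sequence described above can be sketched in plain Java, with an in-memory map standing in for the users table. `LockedUserTable` and `insertIfEmailUnused` are made-up names, and the explicit lock is only a stand-in for whatever locking or unique constraint a real relational database would provide.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical in-memory stand-in for a users table with a unique email column. */
final class LockedUserTable {
    private final ReentrantLock lock = new ReentrantLock();
    private final Map<String, Long> idByEmail = new HashMap<>();
    private long nextId = 1;

    /** Returns the new user's id, or null if the email is already taken. */
    Long insertIfEmailUnused(String email) {
        lock.lock();                      // "lock the database"
        try {
            if (idByEmail.containsKey(email)) {
                return null;              // email exists: fail
            }
            long id = nextId++;
            idByEmail.put(email, id);     // otherwise insert
            return id;
        } finally {
            lock.unlock();                // unlock on both paths
        }
    }
}
```

The check and the insert happen while the same lock is held, which is the property the rest of this question is trying to recover on the datastore.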

As a concept this would work on App Engine as well, since the datastore supports transactions. However, because an entry might have been inserted only milliseconds before, a non-ancestor query might not see it yet, as such queries are only eventually consistent.

So things I've thought about:

  • using a global parent for all users so that I can do an ancestor query in my transaction, forcing it to read the latest data. However, this puts every user in a single entity group, which runs into the limit of about 1 write per second per entity group. (I have decided not to go with this approach.)

  • keeping a separate list in memcache of the emails that have been inserted. Even if the cache were cleared, that probably wouldn't happen before the entry becomes visible in the datastore, so we could search both the cache and the datastore and, if the email is in neither, assume it is not taken. However, the memcache lookup wouldn't be part of the transaction, so there would still be a race.

So my main issue is that, without an ancestor query, the lookup cannot be done as part of a transaction.

Thanks

Edit: After thinking about the structure, I am considering something like this. Will have to test it when I get home later, and will mark this as my accepted answer if it works.

UserBean
    @Id Long id;
    //All child elements will use UserBean as their parent


Login
    @Id String id; //This will be a hashed/base64 version of the email address
    @Parent UserBean user;
    String emailAddress;
    String hashedPassword;


start transaction

    Login login = ofy()
        .load()
        .type(Login.class)
        .id(hashEmail(emailAddress)).now();
    if (login != null) {
        fail transaction - email already in use
    }
    Insert UserBean and Login objects into datastore
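
The pseudocode leaves `hashEmail` undefined. One possible shape for it, as a sketch only: trim and lower-case the address, hash it with SHA-256, and encode the digest as URL-safe base64 so it can be used as a String id. The class name `EmailKeys`, the choice of SHA-256, and the normalisation rules are all assumptions rather than anything stated in the question.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.Locale;

/** Hypothetical helper; not part of Objectify or App Engine. */
final class EmailKeys {
    private EmailKeys() {}

    /** Derives a fixed-length, key-safe id from an email address. */
    static String hashEmail(String emailAddress) {
        // Assumption: addresses differing only in case or outer whitespace are the same user.
        String normalised = emailAddress.trim().toLowerCase(Locale.ROOT);
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(normalised.getBytes(StandardCharsets.UTF_8));
            // URL-safe alphabet without padding keeps the id free of '/', '+' and '='.
            return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}
```

Whatever normalisation is chosen has to be applied identically at signup and at login, otherwise the same address maps to two different ids.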
Thought #1: for the reason you named, this would indeed be a bad idea. Thought #2: however unlikely it is that memcache is evicted at just the wrong moment, if it does happen, the s*** really hits the fan. So don't make usual-case assumptions where an unusual case will break your neck. My thought: have you considered creating a table where the email is the @Id? If I got my facts straight, lookups by id are always strongly consistent. - konqi
But they cannot be done within a transaction, unless I had a child entity as well that ALSO had the email, and then I could do an ancestor query to verify whether the child element is present. But this causes a lot of db operations if the user wants to change their email address, as it would require all child elements to be updated because the parent's id has changed, no? - branks
Have a look at stackoverflow.com/questions/24085216/… In that question we used "unique" usernames, but the approach applies to email as well. - Tim Hoffman
I have added an edit to my original post with a preliminary design for my objects that I am considering - branks
If the @Id is the email, you can use this single entity within a transaction: get by key, then put if it does not exist. - Igor Artamonov

2 Answers

0
votes

I use the Python flavor of App Engine, but I suspect there is similar functionality in Java.

You can do the following:

  1. Use the email address as the key_name of the entity, and
  2. Create the entity using get_or_insert(key_name).

get_or_insert performs the get and the insert in a single transaction, so the same entity is never created more than once.
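
One caveat worth illustrating: get-or-insert hands back the existing entity silently when the key is already taken, so signup code still has to work out whether it actually created the record. The sketch below mimics only those semantics with an in-memory map; `KeyedStore` and `getOrInsert` are hypothetical names, not an App Engine API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory store keyed by name; illustrates get-or-insert semantics only. */
final class KeyedStore<V> {
    private final ConcurrentMap<String, V> entities = new ConcurrentHashMap<>();

    /** Returns the entity stored under keyName, inserting candidate first if none exists. */
    V getOrInsert(String keyName, V candidate) {
        // putIfAbsent is atomic: it stores candidate only when the key is free
        // and returns the previous value (or null if there was none).
        V existing = entities.putIfAbsent(keyName, candidate);
        return existing != null ? existing : candidate;
    }
}
```

A caller can detect a duplicate by comparing what comes back with what it tried to insert, for example by storing a per-request token and checking whether the returned entity carries the same token.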

0
votes

So I have a working solution which I will happily share here.

My two POJOs:

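import com.googlecode.objectify.annotation.Entity;
import com.googlecode.objectify.annotation.Id;

@Entity
public class UserAccount {
    // Left null before saving so the datastore allocates a numeric id.
    @Id Long _id;

    // Objectify needs a no-arg constructor.
    public UserAccount() {
    }

    public Long get_id() {
        return _id;
    }
}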


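import com.googlecode.objectify.Ref;
import com.googlecode.objectify.annotation.Entity;
import com.googlecode.objectify.annotation.Id;

@Entity
public class LoginBean {
    // The hashed email is the whole key, so a get by id enforces uniqueness.
    @Id String emailHash;
    //I don't make this an actual @Parent because this would affect the Id
    Ref<UserAccount> parent;
    String email;
    String hashedPassword;

    // Objectify needs a no-arg constructor.
    public LoginBean() {
    }

    public LoginBean(String emailHash, Ref<UserAccount> parent, String email, String hashedPassword) {
        this.parent = parent;
        this.emailHash = emailHash;
        this.email = email;
        this.hashedPassword = hashedPassword;
    }
    //All the rest of the getters and setters you want
}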

And then inside my utility class:

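// email and password must be final (or effectively final) to be used inside run().
final String emailHash = getEmailHash(email);
final String passwordHash = getPasswordHash(password);

// Yields the new UserAccount, or null if the email is already registered.
UserAccount savedUser = ofy().transact(new Work<UserAccount>() {
    public UserAccount run() {
        // Get by key inside the transaction, so the read is strongly consistent.
        if (lookupByEmail(email) != null) {
            return null;
        }

        // Save the user first so its allocated key can go into the Ref below.
        UserAccount user = new UserAccount();
        Key<UserAccount> userKey = ofy().save().entity(user).now();

        LoginBean login = new LoginBean(emailHash, Ref.create(userKey), email, passwordHash);
        ofy().save().entity(login).now();

        return user;
    }
});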

And further down:

public LoginBean lookupByEmail(String email) {
    String emailHash = getEmailHash(email);
    // Load by id (a key lookup), not a query; returns null if no such login exists.
    return ofy().load().type(LoginBean.class).id(emailHash).now();
}
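
On the concern raised in the comments about changing an email address: because LoginBean only holds a Ref to the UserAccount instead of being keyed under it, a change of email should only need the login record to be re-keyed, not the user or its children. The following is an in-memory sketch of that idea under stated assumptions: `LoginDirectory` is a hypothetical class, the map stands in for LoginBean entities keyed by email hash, and `synchronized` stands in for the datastore transaction. In the real datastore the old and new login records would be separate entity groups, so the transaction would have to span both; that part is not shown.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for LoginBean entities keyed by email hash. */
final class LoginDirectory {
    // emailHash -> id of the owning user account
    private final Map<String, Long> userIdByEmailHash = new HashMap<>();

    /** Registers a login; returns false if the email hash is already in use. */
    synchronized boolean register(String emailHash, long userId) {
        return userIdByEmailHash.putIfAbsent(emailHash, userId) == null;
    }

    /** Moves a login to a new email hash; the user id (and anything under it) is untouched. */
    synchronized boolean changeEmail(String oldHash, String newHash) {
        Long userId = userIdByEmailHash.get(oldHash);
        if (userId == null || userIdByEmailHash.containsKey(newHash)) {
            return false;                       // unknown login, or new email already taken
        }
        userIdByEmailHash.put(newHash, userId); // insert the re-keyed login
        userIdByEmailHash.remove(oldHash);      // delete the old one
        return true;
    }

    /** Returns the owning user id, or null if no login exists for this hash. */
    synchronized Long lookup(String emailHash) {
        return userIdByEmailHash.get(emailHash);
    }
}
```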