1
votes

I have an application with a model named Appointment. This model has a column named event_uid with the following validation:

validates :event_uid, uniqueness: true, allow_nil: true

The uniqueness validation exists only in the Rails application, not in the database (PostgreSQL).

I am using background jobs with Sidekiq on Heroku to sync some remote calendars. I am not sure what happened, but I ended up with multiple records that have duplicate event_uid values. They were created in the exact same second.

My guess is that something happened on the workers: either they were invoked at the same time, or the queue froze and, when it came back, ran the same job twice. I don't understand why Rails let this through (maybe the fact that workers run on different threads plays a role?). I added the following migration:

add_index :appointments, [:event_uid], unique: true

I hope this will keep it from happening again. So now the questions:

  • What do you think, will this be enough?
  • Is it dangerous to keep uniqueness / presence validations only at the application level if you create / update records from background jobs?
  • Any guess what could have caused the workers to run the same job more than once, in exactly the same second?

2 Answers

2
votes

The Rails uniqueness validation has been a source of confusion for a long time.

When you persist a user instance, Rails will validate your model by running a SELECT query to see if any user records already exist with the provided email. Assuming the record proves to be valid, Rails will run the INSERT statement to persist the user.

https://thoughtbot.com/blog/the-perils-of-uniqueness-validations

This means that if several workers / threads run the SELECT at the same time, none of them finds an existing record, so they all pass the validation and all insert the record.
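To make that interleaving concrete, here is a minimal plain-Ruby sketch. FakeTable and simulate_race are hypothetical stand-ins I made up for illustration (this is not ActiveRecord): exists? plays the role of the validation's SELECT and insert plays the role of the INSERT, on a table with no unique index.

```ruby
# Hypothetical in-memory stand-in for a table WITHOUT a unique index.
class FakeTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Plays the role of the SELECT issued by the uniqueness validation.
  def exists?(event_uid)
    @rows.any? { |row| row[:event_uid] == event_uid }
  end

  # Plays the role of the INSERT; nothing here rejects a duplicate.
  def insert(event_uid)
    @rows << { event_uid: event_uid }
  end
end

# Replays the problematic ordering by hand: both workers check first,
# then both insert. Returns how many rows share the same event_uid.
def simulate_race(table, event_uid)
  worker_a_valid = !table.exists?(event_uid) # worker A: "no duplicate"
  worker_b_valid = !table.exists?(event_uid) # worker B: "no duplicate"

  table.insert(event_uid) if worker_a_valid
  table.insert(event_uid) if worker_b_valid

  table.rows.count { |row| row[:event_uid] == event_uid }
end

puts simulate_race(FakeTable.new, "event-123") # prints 2
```

Both checks are honest at the moment they run; the problem is only that nothing ties the check and the insert together.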

Most of the time it is desirable to also have a unique index at the database level to avoid these race conditions. However, you then also need to handle the ActiveRecord::RecordNotUnique exception that the database-level violation raises.

What do you think, will this be enough?

Yes, adding a unique index is a good idea, but now you also need to handle ActiveRecord::RecordNotUnique.
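One way to handle it, as a rough sketch rather than the definitive approach: look the record up first, try to create it, and fall back to fetching the existing row if the database rejects the insert. find_or_create_appointment is a hypothetical helper name; Appointment and event_uid come from the question, and the unique index on event_uid is assumed to be in place.

```ruby
# Hypothetical helper; assumes the Appointment model from the question
# and a unique index on appointments.event_uid.
def find_or_create_appointment(event_uid, attributes = {})
  # Common case: the row is already there, or we are the only writer.
  Appointment.find_by(event_uid: event_uid) ||
    Appointment.create!(attributes.merge(event_uid: event_uid))
rescue ActiveRecord::RecordNotUnique
  # Another worker inserted the same event_uid between our lookup and
  # our INSERT. The database rejected ours, so reuse the existing row.
  Appointment.find_by!(event_uid: event_uid)
end
```

Note that on PostgreSQL a failed INSERT aborts the surrounding transaction, so this sketch assumes it is not called inside an already open transaction block.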

Is it dangerous to keep uniqueness / presence validations only at the application level if you create / update records from background jobs?

This depends on the application, but most of the time you want the constraint enforced at the DB level too.

Any guess what could have caused the workers to run the same job more than once, in exactly the same second?

Most background job libraries guarantee only at-least-once execution, not exactly-once, so the same job can run more than once (for example when it is retried). Your jobs should therefore always be idempotent: running a job several times must leave the same result as running it once. A good read is this guide about ActiveJob design, especially the part about idempotency.
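As a small illustration of what idempotent means here, a plain-Ruby sketch follows. AppointmentStore and SyncEventJob are hypothetical names, and the in-memory hash only stands in for a table keyed by a unique event_uid. The job writes by key instead of blindly appending, so running it again changes nothing.

```ruby
# Hypothetical in-memory store keyed by event_uid; it stands in for a
# table that has a unique index on that column.
class AppointmentStore
  def initialize
    @rows = {}
  end

  # Insert-or-update by key: a repeated call overwrites the same entry
  # instead of adding a second one.
  def upsert(event_uid, attributes)
    @rows[event_uid] = (@rows[event_uid] || {}).merge(attributes)
  end

  def fetch(event_uid)
    @rows.fetch(event_uid)
  end

  def size
    @rows.size
  end
end

# Hypothetical job: performing it twice with the same arguments leaves
# the store in the same state as performing it once.
class SyncEventJob
  def initialize(store)
    @store = store
  end

  def perform(event_uid, attributes)
    @store.upsert(event_uid, attributes)
  end
end

store = AppointmentStore.new
job = SyncEventJob.new(store)
2.times { job.perform("event-123", { title: "Dentist" }) }
puts store.size # prints 1
```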

1
votes

Rails validations run only in the application, as part of the save callback chain, before the record is written to the DB. And yes, if you added a unique index this will not happen again, because the DB now enforces uniqueness itself: even if you run into the same flow/issue again, the result will be an error saying that the value violates the unique index, not a duplicate row.

Because validators run in the application during callbacks and are not thread-safe, they can run into race conditions. How often this happens depends on your application, but you should always add the constraint in the DB as well.

As for your workers, I ran into the same issue a few months ago because of Sidekiq's retry flow. The solution was to enforce uniqueness on the DB side as well and to enqueue the workers/jobs from an after_commit callback. The after_commit callback works whichever job library you use; in my case the job ran after a certain operation had taken place on a particular object.

Hope the above helps! 👍