I have a system where we need to run a simple workflow. Example:
- On Jan 1st 08:15 trigger task A for object Z
- When triggered then run some code (implementation details not important)
- Schedule task B for object Z to run at Jan 3rd 10:25 (and so on)
The workflow itself is simple, but I need to run 500.000+ instances and that's the tricky part.
I know Windows Workflow Foundation and for that very same reason I have chosen not to use that.
My initial design would be to use Azure Table Storage and I would really appreciate some feedback on the design.
The system will consist of two tables
Table "Jobs"
PartitionKey: ObjectId
Rowkey: ProcessOn (UTC Ticks in reverse so that newest are on top)
Attributes: State (Pending, Processed, Error, Skipped), etc...
Table "Timetable"
PartitionKey: YYYYMMDD
Rowkey: YYYYMMDDHHMM_<GUID>
Attributes: Job_PartitionKey, Job_RowKey
The idea is that the runs table will have the complete history of jobs per object and the Timetable will have a list of all jobs to run in the future.
Some assumptions:
- A job will never span more than one Object
- There will only ever be one pending job per Object
- The "job" is very lightweight e.g. posting a message to a queue
The system must be able to perform these tasks:
Execute pending jobs
- Query for all records in "Timetable" with a "partition <= Today" and "RowKey <= today"
- For each record (in parallel)
- Lookup job in Jobs table via PartitionKey and RowKey
- If "not exists" or State != Pending then skip
- Execute "logic". If fails => log and maybe do some retry logic
- Submit "Next run date in Timetable"
- Submit "Update State = Processed" and "New Job Record (next run)" as a single transaction
- When all are finished => Delete all processed Timetable records
Concern: Only two of the three records modifications are in a transaction. Could this be overcome in any way?
Stop workflow Stop/pause workflow for Object Z
- Query top 1 jobs in Jobs table by PartitionKey
- If any AND State == Pending then update to "Cancelled"
- (No need to bother cleaning Timetable it will clean itself up "when time comes")
Start workflow
- Create Pending record in Jobs table
- Create record in Timetable
In terms of "executing the thing" I would be using a Azure Function or Scheduler-thing to execute the pending jobs every 5 minutes or so.
Any comments or suggestions would be highly appreciated.
Thanks!