I have a system where we need to run a simple workflow. Example:

  1. On Jan 1st 08:15 trigger task A for object Z
  2. When triggered then run some code (implementation details not important)
  3. Schedule task B for object Z to run at Jan 3rd 10:25 (and so on)

The workflow itself is simple, but I need to run 500,000+ instances, and that's the tricky part.

I know Windows Workflow Foundation, and precisely for that reason I have chosen not to use it.

My initial design uses Azure Table Storage, and I would really appreciate some feedback on it.

The system will consist of two tables:

Table "Jobs"
  PartitionKey: ObjectId
  RowKey: ProcessOn (UTC ticks in reverse, so that the newest sort first)
  Attributes: State (Pending, Processed, Error, Skipped), etc...

Table "Timetable"
  PartitionKey: YYYYMMDD
  RowKey: YYYYMMDDHHMM_<GUID>
  Attributes: Job_PartitionKey, Job_RowKey

The idea is that the "Jobs" table will hold the complete history of jobs per object, and the "Timetable" table will list all jobs to run in the future.
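To make the key layout concrete, here is a minimal Python sketch of how the two sets of keys could be built. The helper names (`to_ticks`, `jobs_row_key`, `timetable_keys`) are made up for illustration, the zero-padding to 19 digits is my assumption so that string order matches numeric order, and the GUID is simply a `uuid4`. It assumes timezone-aware datetimes.

```python
from datetime import datetime, timezone
from typing import Optional
import uuid

# .NET DateTime.MaxValue.Ticks; one tick is 100 ns, counted from 0001-01-01T00:00:00.
MAX_TICKS = 3_155_378_975_999_999_999
_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)


def to_ticks(when: datetime) -> int:
    """Convert a timezone-aware datetime to .NET-style ticks (integer arithmetic only)."""
    delta = when.astimezone(timezone.utc) - _EPOCH
    return (delta.days * 86_400 + delta.seconds) * 10_000_000 + delta.microseconds * 10


def jobs_row_key(process_on: datetime) -> str:
    """RowKey for "Jobs": reversed ticks, zero-padded so newer jobs sort first."""
    return f"{MAX_TICKS - to_ticks(process_on):019d}"


def timetable_keys(process_on: datetime, guid: Optional[str] = None) -> tuple[str, str]:
    """(PartitionKey, RowKey) for "Timetable": YYYYMMDD and YYYYMMDDHHMM_<GUID>."""
    utc = process_on.astimezone(timezone.utc)
    guid = guid or uuid.uuid4().hex  # hypothetical choice of GUID format
    return utc.strftime("%Y%m%d"), f"{utc.strftime('%Y%m%d%H%M')}_{guid}"
```

With these keys, a plain ascending scan of an object's partition in "Jobs" returns the most recent job first, which is what the "top 1" lookup relies on.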

Some assumptions:

  • A job will never span more than one Object
  • There will only ever be one pending job per Object
  • The "job" is very lightweight e.g. posting a message to a queue

The system must be able to perform these tasks:

  • Execute pending jobs

    1. Query for all records in "Timetable" with "PartitionKey <= today" and "RowKey <= now"
    2. For each record (in parallel)
      1. Lookup job in Jobs table via PartitionKey and RowKey
      2. If "not exists" or State != Pending then skip
      3. Execute "logic". If it fails => log and maybe do some retry logic
      4. Submit "Next run date in Timetable"
      5. Submit "Update State = Processed" and "New Job Record (next run)" as a single transaction
    3. When all are finished => Delete all processed Timetable records

    Concern: Only two of the three record modifications are in a transaction. Could this be overcome in any way?

  • Stop/pause workflow for Object Z

    1. Query top 1 jobs in Jobs table by PartitionKey
    2. If any AND State == Pending then update to "Cancelled"
    3. (No need to clean up the Timetable; it will clean itself up "when the time comes")

  • Start workflow

    1. Create Pending record in Jobs table
    2. Create record in Timetable
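The three operations above can be sketched against two in-memory dicts standing in for the tables. This is only an illustration under simplifying assumptions: the function names are hypothetical, timestamps are plain "YYYYMMDDHHMM" strings (also used as the Jobs RowKey instead of reversed ticks), records are processed sequentially rather than in parallel, and no real Table Storage calls are made. It writes the Timetable record before the Jobs update, so a crash in between leaves at worst a stale Timetable entry, which the "not exists or State != Pending" check then skips.

```python
from dataclasses import dataclass
from uuid import uuid4


@dataclass
class Job:
    state: str  # "Pending", "Processed", "Cancelled", ...


# Stand-ins for the two tables, keyed by (PartitionKey, RowKey).
jobs: dict[tuple[str, str], Job] = {}
timetable: dict[tuple[str, str], tuple[str, str]] = {}  # value = key of the job


def start_workflow(object_id: str, run_at: str) -> None:
    """Create the Pending job and its Timetable record."""
    jobs[(object_id, run_at)] = Job("Pending")
    timetable[(run_at[:8], f"{run_at}_{uuid4().hex}")] = (object_id, run_at)


def stop_workflow(object_id: str) -> None:
    """Cancel the newest job of the object if it is still Pending."""
    keys = [k for k in jobs if k[0] == object_id]
    if keys:
        latest = max(keys, key=lambda k: k[1])  # "top 1" in the real table
        if jobs[latest].state == "Pending":
            jobs[latest].state = "Cancelled"
    # Timetable is left alone; the stale entry is skipped and deleted later.


def execute_pending(now: str, logic, next_run) -> None:
    """Run every due job; `logic` and `next_run` are caller-supplied callables."""
    # Compare only the timestamp prefix: "202401010815_<guid>" sorts AFTER
    # "202401010815", so a plain RowKey <= now would miss the current minute.
    due = [k for k in timetable if k[0] <= now[:8] and k[1][:12] <= now]
    finished = []
    for entry in due:
        object_id, run_at = timetable[entry]
        job = jobs.get((object_id, run_at))
        if job is not None and job.state == "Pending":
            try:
                logic(object_id)
            except Exception:
                continue  # keep the Timetable entry so the next poll retries
            following = next_run(run_at)
            # Non-transactional write first ...
            timetable[(following[:8], f"{following}_{uuid4().hex}")] = (object_id, following)
            # ... then the two Jobs changes, which share a partition (ObjectId).
            job.state = "Processed"
            jobs[(object_id, following)] = Job("Pending")
        finished.append(entry)
    for entry in finished:
        del timetable[entry]
```

For example, `start_workflow("Z", "202401010815")` followed by `execute_pending("202401010820", print, lambda _: "202401031025")` runs task A for Z and leaves one Pending job plus one Timetable record for Jan 3rd 10:25.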

In terms of "executing the thing", I would use an Azure Function or some kind of scheduler to execute the pending jobs every 5 minutes or so.

Any comments or suggestions would be highly appreciated.

Thanks!

1 Answer

How about using Service Bus instead? The BrokeredMessage class has a property called ScheduledEnqueueTimeUtc. You can just schedule when you want your jobs to run via the ScheduledEnqueueTimeUtc property, and then fuggedabouddit. You can then have a triggered WebJob that monitors the Service Bus queue; it will be triggered shortly after the job message is enqueued. I'm a big fan of relying on existing services to minimize the coding needed.
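To illustrate the idea without tying it to a particular SDK, here is a small in-memory Python stand-in for a queue with a scheduled enqueue time. It is not the Service Bus API: `ScheduledQueue`, `schedule` and `receive` are hypothetical names, and `now` is passed in explicitly so the behaviour is easy to follow. A message stays invisible until its scheduled time has passed, and the handler schedules the next task before it finishes.

```python
import heapq
from datetime import datetime, timezone
from itertools import count


class ScheduledQueue:
    """Toy queue: a message becomes receivable only at or after its enqueue time."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker so message bodies are never compared

    def schedule(self, body: dict, enqueue_at: datetime) -> None:
        heapq.heappush(self._heap, (enqueue_at, next(self._seq), body))

    def receive(self, now: datetime) -> list:
        """Return every message whose scheduled time is <= now, oldest first."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[2])
        return ready


def utc(*args) -> datetime:
    return datetime(*args, tzinfo=timezone.utc)


queue = ScheduledQueue()
queue.schedule({"object": "Z", "task": "A"}, utc(2024, 1, 1, 8, 15))

early = queue.receive(utc(2024, 1, 1, 8, 0))    # nothing is visible yet
on_time = queue.receive(utc(2024, 1, 1, 8, 15))  # task A for object Z

for message in on_time:
    # ... run task A here, then schedule task B for the same object ...
    queue.schedule({"object": message["object"], "task": "B"}, utc(2024, 1, 3, 10, 25))
```

With the real service, the queue does the waiting and there is no Timetable to poll or clean up. Stopping a workflow would then mean cancelling or ignoring the one outstanding message for that object.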