I'm looking to implement a process that will occasionally pull all "new" records from a DocumentDb, where new is "all documents added or modified since the last time the process was run."

SQL Server has rowversion for this, which is guaranteed unique and monotonically increasing across all rows and columns in a database.

I see DocumentDb has _ts, which (according to the documentation) is used as a high water mark for Azure Search indexing, but how does that work? If multiple documents are inserted at the same time as a read takes place, they may all have the same _ts value, and only some of them may be visible to that read. On the next read, if the comparison against _ts is strictly greater-than, some documents will be missed; if it's greater-than-or-equal, some documents will be pulled a second time.
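To make the concern concrete, here is a small Python simulation. It uses an in-memory list in place of a real collection; the `poll` helper and the sample documents are made up for illustration, and only the `_ts` property name comes from DocumentDb.

```python
# In-memory stand-in for a collection; "_ts" is whole seconds since the epoch.
docs = [
    {"id": "a", "_ts": 100},
    {"id": "b", "_ts": 101},
]

def poll(collection, last_ts, inclusive):
    """Return ids of documents at or past the stored high water mark."""
    if inclusive:
        return [d["id"] for d in collection if d["_ts"] >= last_ts]
    return [d["id"] for d in collection if d["_ts"] > last_ts]

# First run: read everything and remember the highest _ts seen.
first = poll(docs, 0, inclusive=False)
high_water = max(d["_ts"] for d in docs)

# Another document lands in the same second, just after the read finished.
docs.append({"id": "c", "_ts": 101})

strict = poll(docs, high_water, inclusive=False)      # "c" is never returned
with_equal = poll(docs, high_water, inclusive=True)   # "b" is returned again
```

With the strict comparison the second run returns nothing, so "c" is lost; with the inclusive comparison it returns both "b" and "c", so "b" is processed twice.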

Is _ts safe to use for this?

1 Answer

The _ts property is specific to a document, not a collection of documents. It represents the time that a particular document was last updated (in whole seconds since Jan 1, 1970).

The _ts property will not give you a high water mark across all documents in a collection. Each document has its own independent _ts property (which may have the same value as another document's _ts property).
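If you poll on _ts anyway, one way to cope with shared values is to compare with greater-than-or-equal and skip the document versions you have already handed out. The sketch below is only an illustration over an in-memory list, not an official pattern: `poll_new` and its parameters are hypothetical names, it relies on `_etag` changing whenever a document is updated, and it assumes no document becomes visible later with a `_ts` lower than the stored mark.

```python
def poll_new(collection, last_ts, seen_at_last_ts):
    """Inclusive poll on _ts that skips versions already returned.

    seen_at_last_ts is a set of (id, _etag) pairs previously returned
    with _ts == last_ts. Returns (fresh_docs, new_last_ts, new_seen).
    """
    candidates = [d for d in collection if d["_ts"] >= last_ts]
    if not candidates:
        return [], last_ts, seen_at_last_ts
    # Drop versions that an earlier run already processed.
    fresh = [d for d in candidates
             if (d["id"], d["_etag"]) not in seen_at_last_ts]
    new_last_ts = max(d["_ts"] for d in candidates)
    # Remember every version sitting exactly on the new mark.
    new_seen = {(d["id"], d["_etag"])
                for d in candidates if d["_ts"] == new_last_ts}
    return fresh, new_last_ts, new_seen


docs = [
    {"id": "a", "_ts": 100, "_etag": "e1"},
    {"id": "b", "_ts": 101, "_etag": "e2"},
]
run1, mark, seen = poll_new(docs, 0, set())

# A write arrives in the same second as the previous mark.
docs.append({"id": "c", "_ts": 101, "_etag": "e3"})
run2, mark, seen = poll_new(docs, mark, seen)

# Nothing changed since the last run.
run3, mark, seen = poll_new(docs, mark, seen)
```

The process has to persist both the mark and the set of versions seen at that mark between runs; the set stays small because it only covers a single second.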

See this answer for a bit more detail.