I'm using Azure Stream Analytics to stream events from Event Hubs to DocumentDB. I configured the input, query and output as documented and tested the query with sample data, and it returned the expected results.

But when I start the streaming job and send the same payload as the earlier sample data, I get this error message:

There was a problem formatting the document [id] column as per DocumentDB constraints for DocumentDB db:[my-database-name], and collection:[my-collection-name].

My sample data is a JSON array:

[
 { "Sequence": 1, "Tenant": "T1", "Status": "Started" },
 { "Sequence": 2, "Tenant": "T1", "Status": "Ended" }
]

I've configured the input as follows:

  • Input alias: eventhubs-events
  • Source Type: Data stream
  • Source: Event Hub
  • Subscription: the same subscription in which I created the Stream Analytics job
  • Service bus namespace: an existing Event Hub namespace
  • Event hub name: events (existing event hub in the namespace)
  • Event hub policy name: a policy with read access
  • Event hub consumer group: blank
  • Event serialization format: JSON
  • Encoding: UTF-8

And the output as follows:

  • Output alias: documentdb-events
  • Sink: DocumentDB
  • Subscription: the same subscription in which I created the Stream Analytics job
  • Account id: an existing DocumentDB account
  • Database: records (an existing database in the account)
  • Collection name pattern: collection (an existing collection in the database)
  • Document id: id

My query is as simple as:

SELECT
    event.Sequence AS id,
    event.Tenant,
    event.Status
INTO [documentdb-events]
FROM [eventhubs-events] AS event

1 Answer

It turns out that all field names in the output are automatically lower-cased.

My DocumentDB collection is configured in Partitioned mode, with "/Tenant" as the Partition Key.

Since the case of the Partition Key didn't match the lower-cased field name in the output ("tenant"), the document failed the constraint.

Changing the Partition Key to "/tenant" fixed the issue.
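
The mismatch can be illustrated with a small Python sketch. It does not call Stream Analytics or DocumentDB. `lowercase_field_names` and `resolve_partition_key` are hypothetical helpers that only mimic the behavior described above, assuming the output field names arrive lower-cased and the partition key path is matched case-sensitively.

```python
def lowercase_field_names(event):
    """Mimic the described output behavior: every field name is lower-cased."""
    return {name.lower(): value for name, value in event.items()}


def resolve_partition_key(document, partition_key_path):
    """Look up a single-level partition key path such as "/Tenant".

    The lookup is case-sensitive; returns None when the property is missing.
    """
    property_name = partition_key_path.lstrip("/")
    return document.get(property_name)


# One event from the sample payload, shaped like the query's SELECT projection.
event = {"id": 1, "Tenant": "T1", "Status": "Started"}
document = lowercase_field_names(event)

print(document)
print(resolve_partition_key(document, "/Tenant"))  # property not found
print(resolve_partition_key(document, "/tenant"))  # matches the lower-cased name
```

In this model the document only carries a "tenant" property, so a partition key of "/Tenant" finds nothing while "/tenant" resolves to the tenant value.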

I hope sharing my findings saves some trouble for people who run into this.

2nd Option

Instead of changing the partition key to lower case, we can now change the compatibility level of the Stream Analytics job.

Compatibility level 1.0: field names are changed to lower case when processed by the Azure Stream Analytics engine.

Compatibility level 1.1: the case of field names is preserved when they are processed by the Azure Stream Analytics engine.
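
As a rough illustration of the difference between the two levels, here is a Python sketch. It is only a model of the behavior listed above, not the engine itself. `output_field_names` is a hypothetical helper, and treating every level other than "1.0" as case-preserving is an assumption of the sketch.

```python
def output_field_names(event, compatibility_level):
    """Return the event as it would be written, per the behavior listed above.

    "1.0" lower-cases field names; any other level is treated like "1.1" here
    (an assumption of this sketch) and preserves their case.
    """
    if compatibility_level == "1.0":
        return {name.lower(): value for name, value in event.items()}
    return dict(event)


event = {"id": 2, "Tenant": "T1", "Status": "Ended"}

for level in ("1.0", "1.1"):
    written = output_field_names(event, level)
    # With a partition key of "/Tenant", only a document that still has a
    # "Tenant" property can satisfy it.
    print(level, sorted(written), "Tenant" in written)
```

Under this model, only the 1.1 output still has a "Tenant" property, which is why a "/Tenant" partition key can be kept at that level.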