2
votes

Hi I wonder if it is possible to select certain items from the array property of a JSON input of Stream Analytics and return them as an array property of a JSON output.

My example to make it more clear - I send a list of OSGI bundles running on a device with name, version and state of the bundle. (I leave out rest of the content.) Sample message:

{"bundles":[{"name":"org.eclipse.osgi","version":"3.5.1.R35x_v20090827","state":32},{"name":"slf4j.log4j12","version":"1.6.1","state":4}]}

Via Stream Analytics I want to create one JSON output (event hub) for active bundle (state == 32) and put the rest in the different output. Content of those event hubs will be processed later. But in the processing I also need the original Device ID so I fetch it from the IoTHub message properties.

So my query looks like this:

WITH Step1 AS
(
SELECT
    IoTHub.ConnectionDeviceId AS deviceId,
    bundles as bundles
FROM 
    iotHubMessages
)

SELECT
    messages.deviceId AS deviceId,
    bundle.ArrayValue.name AS name,
    bundle.ArrayValue.version AS version
INTO
    active
FROM 
    Step1 as messages
CROSS APPLY GetArrayElements(messages.bundles) AS bundle
WHERE 
    bundle.ArrayValue.state = 32

SELECT
    messages.deviceId AS deviceId,
    bundle.ArrayValue.name AS name,
    bundle.ArrayValue.version AS version
INTO
    other
FROM 
    Step1 as messages
CROSS APPLY GetArrayElements(messages.bundles) AS bundle
WHERE 
    bundle.ArrayValue.state != 32

This way there is a row for every item of the original array containing deviceId, name and version properties in the active output. So the deviceId property is copied several times, which means additional data in a message. I'd prefer a JSON with one deviceId property and one array property bundles, similar to the original JSON input.

Like active:

{"deviceid":"javadevice","bundles":[{"name":"org.eclipse.osgi","version":"3.5.1.R35x_v20090827"}]}

And other:

{"deviceid":"javadevice","bundles":[{"name":"slf4j.log4j12","version":"1.6.1"}]}

Is there any way to achieve this? - To filter items of array and return it back as an array in the same format as is in the input. (In my code I change number of properties, but that is not necessary.)

Thanks for any ideas!

1

1 Answers

0
votes

I think you can achieve this using the Collect() aggregate function. The only issue I see is that the deviceId property will be outputted in the bundle array as well.

WITH Step1 AS
(
SELECT
    IoTHub.ConnectionDeviceId AS deviceId,
    bundles as bundles
FROM 
    iotHubMessages
),
Step2 AS 
(
SELECT
    messages.deviceId AS deviceId,
    bundle.ArrayValue.name AS name,
    bundle.ArrayValue.version AS version
    bundle.ArrayValue.state  AS state
FROM 
    Step1 as messages
CROSS APPLY GetArrayElements(messages.bundles) AS bundle
)

SELECT deviceId, Collect() AS bundles
FROM Step2
GROUP BY deviceId, state, System.Timestamp
WHERE state = 32