1
votes

I have a documentDB collection that looks like this sample:

{
    "data1": "hello",
    "data2": [ 
        {
            "key": "key1",
            "value": "value1"
        },
        {
            "key": "key2",
            "value": "value2"
        }
}

In reality the data has a lot of other fields and the embedded array has some fields where the data is quite large. I need to query the data and I care about the small "key" field in the data2 array but I do not need the large "value". I am finding returning all the value data is causing performance problems, but if I exclude the array data from the SELECT all together it is fast (so the data size is the issue).

I cannot figure out a way to return only the "key" but exclude the "value" in the embedded array.

I basically want SELECT r.data1, r.data2.key and to have it return as:

{
    "data1": "hello",
    "data2": [ 
        {
            "key": "key1"
        },
        {
            "key": "key2"
        }
}

but it doesn't seem possible to SELECT r.data2.key because it is in an array

A JOIN will cause it to return a copy of each document for each "data2" array element, which does not work for me. My only other option would be to migrate the data and put the data I want into its own array so I can select the whole object.

Is this possible some how that I have not been able to figure out?

1

1 Answers

2
votes

Mike,

As you have surmised, this is not possible without a custom UDF until DocumentDB supports sub-queries. If you would like to go down that route, see the following answer for an example of how the UDF may have to look:

Good luck!