4
votes

I have a Snowflake table with one variant column (raw).

Every row in this table is complex (both dictionaries and arrays) and nested (multiple hierarchies).

What I want to do is to be able to update a specific item in some array.

It will be easier to understand it using an example so consider this as a row in the table:

{
  "id": "1234"
  "x_id": [
    {
      "y_id": "790437306684007491",
      "y_state": "some_state"
    }
  ],
  "comments": {
    "1": [
      {
        "comment_id": "bb288743-3b73-4423-b76b-f26b8c37f7d4",
        "comment_timestamp": "2021-02-10 14:53:25.667564",
        "comment_text": "Hey"
      },
      {
        "comment_id": "7378f332-93c4-4522-9f73-3b6a8a9425ce",
        "comment_text": "You",
        "comment_timestamp": "2021-02-10 14:54:21.337046"
      }
    ],
    "2": [
      {
        "comment_id": "9dd0cbb0-df80-4b0f-b399-9ee153161462",
        "comment_text": "Hello",
        "comment_timestamp": "2021-02-09 09:26:17.987386"
      },
      {
        "comment_id": "1a3bf1e8-82b5-4a9c-a959-a1da806ce7e3",
        "comment_text": "World",
        "comment_timestamp": "2021-02-09 09:28:32.144175"
      }
    ]
  }
}

And what I want is to update the comment text of a specific comment.

I know that I can update the whole JSON programmatically and update the whole object using PARSE_JSON, but this approach isn't sufficient because there could be other updates that will override other comments so this approach will fail (because these update will override each other).

So first, I've tried the naive approach (which I knew wouldn't work but I had to try):

update table1
set raw['comments']['1'][0]["comment_text"] = 'please work'

And not surprisingly I'm getting the following error:

SQL compilation error: syntax error line 2 at position 7 unexpected '['.

Next, I tried OBJECT_INSERT which should allow a way to update an object but it fails because of the nested key ('1'):

UPDATE table1
SET raw = OBJECT_INSERT(raw:comments:1, "comment_test", 'please work') 

with the error

SQL compilation error: syntax error line 1 at position 99 unexpected '1'.

(I've also tried several permutations of this approach with raw:comments:"1" or raw:comments:1[0] or raw['comments']['1'] and some others)

I also tried to refactor the object so instead of having the comments as dictionary, to flat the comments into an array, something like:

{
  "id": "1234"
  "x_id": [
    {
      "y_id": "790437306684007491",
      "y_state": "some_state"
    }
  ],
  "comments": [
      {
        "comment_id": "bb288743-3b73-4423-b76b-f26b8c37f7d4",
        "comment_timestamp": "2021-02-10 14:53:25.667564",
        "comment_text": "Hey"
        "comment_key": "1"
      },
      {
        "comment_id": "7378f332-93c4-4522-9f73-3b6a8a9425ce",
        "comment_text": "You",
        "comment_timestamp": "2021-02-10 14:54:21.337046"
        "comment_key": "1"
      }
      {
        "comment_id": "9dd0cbb0-df80-4b0f-b399-9ee153161462",
        "comment_text": "Hello",
        "comment_timestamp": "2021-02-09 09:26:17.987386",
        "comment_key": "2"
      },
      {
        "comment_id": "1a3bf1e8-82b5-4a9c-a959-a1da806ce7e3",
        "comment_text": "World",
        "comment_timestamp": "2021-02-09 09:28:32.144175",
        "comment_key": "2"
      }
    ]
}

But this doesn't get me any closer to a solution. I've looked for some ARRAY_REPLACE function that replace an item in array, but it doesn't look that such function exists (all semi-structured related functions)

I've also considered using JavaScript UDF's to do it, but I didn't find any source to UDF's that can actually update a row (they're all used to get data and not update it, as far from what I saw).

Is there any way to achieve what I want?

Thanks a lot!

2
You're using wrong syntax, there are basically 2 ways to traverse semi-structured data, see here. So you can do raw:"comments"."1"."comment_text" as example.Sergiu

2 Answers

2
votes

You can update complex JSON structures using JavaScript UDFs. Here's a sample. Note that both of your JSON samples have errors. I used the second one and fixed the missing commas.

-- Create a temp table with a sigle variant. By convention, I uses "v" as the name of any
-- column in a single-column table. You can change to "raw" in your code.
create or replace temp table foo(v variant);

-- Create a UDF that updates the exact key you want to update.
-- Unfortunately, JavaScript treats the object path as a constant so you can't make this 
-- a string that you pass in dynamically. There are ways around this possibly, but 
-- library restrictions would require a raw JavaScript parser function. Just update the
-- path you need in the UDF.
create or replace function update_json("v" variant, "newValue" string)
returns variant
language javascript
as
$$
   v.comments[0].comment_text = newValue;
   return v;
$$;

-- Insert the corrected JSON into the variant field
insert into foo select parse_json('{
    "id": "1234",
    "x_id": [{
        "y_id": "790437306684007491",
        "y_state": "some_state"
    }],
    "comments": [{
            "comment_id": "bb288743-3b73-4423-b76b-f26b8c37f7d4",
            "comment_timestamp": "2021-02-10 14:53:25.667564",
            "comment_text": "Hey",
            "comment_key": "1"
        },
        {
            "comment_id": "7378f332-93c4-4522-9f73-3b6a8a9425ce",
            "comment_text": "You",
            "comment_timestamp": "2021-02-10 14:54:21.337046",
            "comment_key": "1"
        },
        {
            "comment_id": "9dd0cbb0-df80-4b0f-b399-9ee153161462",
            "comment_text": "Hello",
            "comment_timestamp": "2021-02-09 09:26:17.987386",
            "comment_key": "2"
        },
        {
            "comment_id": "1a3bf1e8-82b5-4a9c-a959-a1da806ce7e3",
            "comment_text": "World",
            "comment_timestamp": "2021-02-09 09:28:32.144175",
            "comment_key": "2"
        }
    ]
}');

-- Show how the change works without updating the row
select update_json(v, 'please work') from foo;

-- Now update the row using the output. Note that this is updating the 
-- whole variant field, not a portion of it.
update foo set v = update_json(v, 'please work');

-- Show the updated key
select v:comments[0].comment_text::string from foo;

Finally, if you want to modify a property that you have to look through the keys to find what you need first, you can do that in a loop in JavaScript. For example, if it's not the 1st comment you need but the one with a particular UUID or comment_text, etc., you can loop through to find it and update the comment_key on the same iteration of the loop.

0
votes

Thanks, it works!

I've kinda managed to get it to work using built-in functions -

Assuming we know the position of the comment (in this example, position=3):

UPDATE table1 SET 
raw = object_construct(
  'id', raw:id,
  'x_id', raw:x_id,
  'comments', array_cat(array_append(array_slice(raw:comments ,0 ,2), parse_json('{"id": "3", "comment_text": "please work"}')) , ARRAY_SLICE(raw:comments,3,array_size(raw:comments)))
)
WHERE raw['id'] = 'some_id' 

But I'm still thinking which approach will do the work better.

Anyway, thanks, helped a lot.