0
votes

I have a nested JSON array which is a string object which has been stored into variant type stage table and I want to extract particular string object value and populate with pipe separated values if more than one object found. Can someone help me to achieve the desired output format please.

Sample JSON data

    {"issues": [
    {
    "expand": "",
    "fields": {
        "customfield_10010": [
        "com.atlassian.xxx.yyy.yyyy.Sprint@xyz456[completeDate=2020-07-20T20:19:06.163Z,endDate=2020-07-17T21:48:00.000Z,goal=,id=1234,name=SPR-SPR 8,rapidViewId=239,sequence=1234,startDate=2020-06-27T21:48:00.000Z,state=CLOSED]",
        "com.atlassian.xxx.yyy.yyyy.Sprint@abc123[completeDate=<null>,endDate=2020-08-07T20:33:00.000Z,goal=,id=1239,name=SPR-SPR 9,rapidViewId=239,sequence=1239,startDate=2020-07-20T20:33:26.364Z,state=ACTIVE]"
        ],
        "customfield_10011": "obcd",
        "customfield_10024": null,
        "customfield_10034": null,
        "customfield_10035": null,
        "customfield_10037": null,
    },
    "id": "123456",
    "key": "SUE-1234",
    "self": "xyz"
    }]}

I don't have any idea on how to separate the string objects inside an array with snowflake. By using the below query I can get whole string converted into pipe separated values.

select
    a.value:id::number as ISSUE_ID,
    a.value:key::varchar as ISSUE_KEY,
    array_to_string(a.value:fields.customfield_10010, '|') as CF_10010_Data
from
    ABC.VARIANT_TABLE,
    lateral flatten( input => payload_json:issues) as a;

But I need to extract particular string object value. Say for example id value such as 1234 & 1239 to be populated as pipe separated as shown below.

ISSUE_ID    ISSUE_KEY   SPRINT_ID
123456      SUE-1234    1234|1239

Any idea on this to get desired result is much appreciated. Thanks..

1
If I'm reading your sample data correctly, it doesn't look like the values inside your customfield_10010 array are JSON structured, so you'll lose the ability to use Snowflake's JSON notation to get those values. You may need to either get the data into a real JSON format or parse it using SQL.Mike Walton
Thanks for your suggestion @MikeWalton, I'm getting the JSON data from API response and copying the same into variant Stage table. Is there any chance we can customize the response to get the desired output else need to check with Source to modify the API config to have JSON structured properly? Appreciate your help on this.Maran

1 Answers

0
votes

It looks like the data within [...] for your sprints are just details about that sprint. I think it would be easiest for you to actually populate a separate sprints table with data on each sprint, and then you can join that table to the Sprint ID values parsed from the API response you showed with issues data.

with
    jira_responses as (
        select
            $1 as id,
            $2 as body
        from (values
            (1, '{"issues":[{"expand":"","fields":{"customfield_10010":["com.atlassian.xxx.yyy.yyyy.Sprint@xyz456[completeDate=2020-07-20T20:19:06.163Z,endDate=2020-07-17T21:48:00.000Z,goal=,id=1234,name=SPR-SPR 8,rapidViewId=239,sequence=1234,startDate=2020-06-27T21:48:00.000Z,state=CLOSED]","com.atlassian.xxx.yyy.yyyy.Sprint@abc123[completeDate=<null>,endDate=2020-08-07T20:33:00.000Z,goal=,id=1239,name=SPR-SPR 9,rapidViewId=239,sequence=1239,startDate=2020-07-20T20:33:26.364Z,state=ACTIVE]"],"customfield_10011":"obcd","customfield_10024":null,"customfield_10034":null,"customfield_10035":null,"customfield_10037":null},"id":"123456","key":"SUE-1234","self":"xyz"}]}')
        )
    )
select
    issues.value:id::integer as issue_id,
    issues.value:key::string as issue_key,
    get(split(sprints.value::string, '['), 0)::string as sprint_id
from jira_responses,
lateral flatten(input => parse_json(body):issues) issues,
lateral flatten(input => parse_json(issues.value):fields:customfield_10010) sprints

Based on your sample data, the results would look like the following.

lateral flatten twice to get Sprint ID

See Snowflake reference docs below.