0
votes

Experts,

Need recommendations for the below scenario:

We are dynamically parsing JSON records from the JSON files stored in the S3 by reading through External tables using stored procedure ( set of logic using Lateral Flatten query). I need recommendations on the below.

  1. Store multiple JSON files as is from Source Separately in S3 without merging and access through external table.

  2. Merge Multiple JSON files from Source into single JSON (each 500 JSON to Single JSON) and access through external table.

  3. Same as step 2 - however zip and store in S3 and and access through external table.

Also the same old questions External Stage Vs Internal Stage - Which is recommended for above scenario.

Thanks

1

1 Answers

0
votes

External tables are not materialized tables hence the data is fetched from stage when queried.

Choosing between Internal and External Stage: It totally depends on your requirement. If you already have a S3 stage where you're keeping the files then better create an external stage on top of it. If you don't have the need to keep history files for auditing then internal stage is a good option for better performance by using PUT command to place files in stage.

Since data from files will only be fetched when external table is queried so there is no impact by having multiple files or consolidating all. However, zipping will incur you less storage cost.