2
votes

I have a staging table with around 200 columns in Redshift. I first copy data from S3 to this table and then copy data from this table to another table using a large insert into select from query. Most of the fields in staging table are varchar, which I convert to the proper datatype in the query.

I am getting some field in the staging table which is causing a numeric overflow -

org.postgresql.util.PSQLException: ERROR: Numeric data overflow (addition)
Detail:
-----------------------------------------------
error:  Numeric data overflow (addition)
code:      1058
context:
query:     9620240
location:  numeric.hpp:112
process:   query1_194 [pid=680]

how can I find, which field is causing this overflow, so that I can sanitize my input or correct my query.


1

1 Answers

0
votes

I use Netezza which also can use regex functions to grep out rows. Fortunately redshift supports regexp as well. Please take a look at

http://docs.aws.amazon.com/redshift/latest/dg/REGEXP_COUNT.html

So the idea in your case is to use the regexp in the where clause and in this way you can find which values are exceeding the numeric cast occurring during the insert. The issue will be finding identifying data that allows you to determine which rows in the physical file are causing the issue. You could create another copy of the data and create row numbers in a temporary table. Use the temporary table as your source of analysis. How large is the numeric field you are going into ? You may need to do this analysis against more than 1 column if you have multiple columns being cast to numeric.