Sqoop import job failed, caused by: java.sql.SQLException: Numeric Overflow
I have to load an Oracle table; it has a column of type NUMBER in Oracle, without scale, and it is converted to DOUBLE in Hive. These are the largest possible numeric types in both Oracle and Hive. The question is: how can I overcome this error?
2 Answers
OK, my first answer assumed that your Oracle data was good, and your Sqoop job needed specific configuration to cope with NUMBER values.
But now I suspect that your Oracle data contains garbage, and specifically NaN values, as a result of calculation errors.
See that post for example: When/Why does Oracle adds NaN to a row in a database table
And Oracle even has distinct "Not-a-Number" categories to represent "infinity", to make things even more complicated.
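If you want to confirm that hypothesis before changing anything, a quick sanity check can be run with sqoop eval -- just a sketch, where the connection string, credentials, table and column names are placeholders, and the IS NAN / IS INFINITE conditions assume the troublesome column z is (or is read as) a binary floating-point value (BINARY_FLOAT / BINARY_DOUBLE):
$ sqoop eval --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
    --username scott --password-file /user/scott/.pw \
    --query "SELECT COUNT(*) FROM wtf WHERE z IS NAN OR z IS INFINITE"
If the count comes back non-zero, the NANVL workaround below is the way to go.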
But on the Java side, BigDecimal does not support NaN -- from the documentation, in all conversion methods:
Throws: NumberFormatException - if value is infinite or NaN.
Note that the JDBC driver masks that exception and displays NumericOverflow instead, to make things more complicated to debug...
See also: Solr Numeric Overflow (from Oracle)
In the end, you will have to "mask" these NaN values with the Oracle function NANVL, using a free-form query in Sqoop:
$ sqoop import --query 'SELECT x, y, NANVL(z, Null) AS z FROM wtf WHERE $CONDITIONS'
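For the record, a fuller sketch of such a job, with placeholder connection details and paths (when using --query, Sqoop also needs --target-dir, plus --split-by unless you import with a single mapper):
$ sqoop import \
    --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
    --username scott --password-file /user/scott/.pw \
    --query 'SELECT x, y, NANVL(z, Null) AS z FROM wtf WHERE $CONDITIONS' \
    --split-by x \
    --target-dir /user/scott/wtf \
    --hive-import --hive-table wtf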
Edit: this answer assumed that your Oracle data was good, and your Sqoop job needed specific configuration to cope with NUMBER values. That was not the case; see the alternate answer.
From the Oracle documentation about "Copying Oracle tables to Hadoop" (within their Big Data appliance), section "Creating a Hive table" > "About datatype conversion"...
NUMBER
- INT when the scale is 0 and the precision is less than 10
- BIGINT when the scale is 0 and the precision is less than 19
- DECIMAL when the scale is greater than 0 or the precision is greater than 19
So you must find out the actual range of values in your Oracle table; then you will be able to specify the target Hive column as either a BIGINT, a DECIMAL(38,0), a DECIMAL(22,7), or whatever fits.
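One possible way to check is through sqoop eval -- a sketch with placeholder connection details, keeping in mind that the Oracle dictionary stores table and column names in upper case:
$ sqoop eval --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
    --username scott --password-file /user/scott/.pw \
    --query "SELECT data_precision, data_scale FROM all_tab_columns WHERE table_name = 'WTF' AND column_name = 'Z'"
$ sqoop eval ... --query "SELECT MIN(z), MAX(z) FROM wtf"
A plain NUMBER column will report NULL precision and scale in all_tab_columns, in which case only the actual MIN/MAX values can tell you which Hive type is safe.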
Now, from the Sqoop documentation about "sqoop - import" > "Controlling type mapping"...
Sqoop is preconfigured to map most SQL types to appropriate Java or Hive representatives. However the default mapping might not be suitable for everyone and might be overridden by --map-column-java (for changing mapping to Java) or --map-column-hive (for changing Hive mapping). Sqoop is expecting comma separated list of mappings (...) for example
$ sqoop import ... --map-column-java id=String,value=Integer
Caveat #1: according to SQOOP-2103, you need Sqoop V1.4.7 or above to use that option with Decimal, and you need to "URL Encode" the comma, e.g. for DECIMAL(22,7)
--map-column-hive "wtf=Decimal(22%2C7)"
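Put in context, that could look like this sketch (placeholder connection details; Z stands for the offending column):
$ sqoop import --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
    --username scott --password-file /user/scott/.pw \
    --table WTF \
    --map-column-hive "Z=DECIMAL(22%2C7)" \
    --hive-import --hive-table wtf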
Caveat #2: in your case, it is not clear whether the overflow occurs when reading the Oracle value into a Java variable, or when writing the Java variable into the HDFS file -- or even elsewhere. So maybe --map-column-hive will not be sufficient.
And again, according to that post which points to SQOOP-1493, --map-column-java does not support the Java type java.math.BigDecimal until at least Sqoop V1.4.7 (and it's not even clear whether it is supported in that specific option, and whether it is expected as BigDecimal or java.math.BigDecimal).
So I would advise just hiding the issue by converting your rogue Oracle column to a String at read time.
Cf. documentation about "sqoop - import" > "Free-form Query Imports"...
Instead of using the --table, --columns and --where arguments, you can specify a SQL statement with the --query argument (...) Your query must include the token $CONDITIONS (...) For example:
$ sqoop import --query 'SELECT a.*, b.* FROM a JOIN b ON a.id=b.id WHERE $CONDITIONS' ...
In your case: SELECT x, y, TO_CHAR(z) AS z FROM wtf, plus the appropriate formatting inside TO_CHAR so that you don't lose any information due to rounding.
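For instance, a sketch with an explicit format mask -- the connection details are placeholders, the mask assumes z holds whole numbers of up to 38 digits (add a D and fractional 9s if it doesn't), and since the query is wrapped in double quotes here, $CONDITIONS has to be escaped:
$ sqoop import ... \
    --query "SELECT x, y, TO_CHAR(z, 'FM99999999999999999999999999999999999999') AS z FROM wtf WHERE \$CONDITIONS" \
    --split-by x --target-dir /user/scott/wtf \
    --hive-import --hive-table wtf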
Comment: NUMBER, i.e. NUMBER(38,*)? – Samson Scharfrichter