0 votes

I am reading a Hive table and writing it to a Teradata table (column to column, no transformations).

 try {
   val df = spark.table("Hive Table")
   df.write.mode(SaveMode.Append).jdbc(jdbcURL, "TD Table", properties)
 } catch {
   case ex: java.sql.SQLException =>
     var e: java.sql.SQLException = ex
     while (e != null) { println(e.getMessage); e = e.getNextException }
 }

It runs for a while and fails with: [Teradata Database] [TeraJDBC 16.20.00.06] [Error 6706] [SQLState HY000] The string contains an untranslatable character

If I just insert the date/numeric columns, it works fine.

I have tried making Teradata table columns as UNICODE with no success.

Question is: how do I identify the errant record/column? There are hundreds of millions of rows and hundreds of columns, so running one at a time is not a viable solution. I have to either a) identify the record/column, or b) force a translation using whatever (junk) characters are needed.
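One way to pursue option (a) without Teradata in the loop is to test each string value for encodability before writing. A minimal plain-Scala sketch, assuming the target columns are CHARACTER SET LATIN and that ISO-8859-1 is a close enough stand-in for Teradata's LATIN repertoire (both are assumptions; field names are placeholders):

```scala
import java.nio.charset.Charset

// Report which fields of a record hold characters the target charset
// cannot represent. ISO-8859-1 approximates Teradata's LATIN character
// set here (an assumption, not a documented mapping).
def untranslatableFields(row: Map[String, String],
                         charset: String = "ISO-8859-1"): Seq[String] = {
  val encoder = Charset.forName(charset).newEncoder()
  row.collect {
    case (name, value) if value != null && !encoder.canEncode(value) => name
  }.toSeq
}
```

Wrapped in a UDF, a check like this could flag suspect rows and columns in a single pass over the DataFrame instead of reloading column subsets.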

I'm far from a Spark expert, but can't you encode your string in Spark however you want? In theory, you should be able to encode it as UTF-8. It will probably fail on the rows you can't insert into Teradata, but maybe you can print them out as part of your exception handling or something? - Andrew
You can add a trigger on insert and write a partial record (with an id, for example) to a log table. This will allow you to zoom in on the record in question. - access_granted
In general you cannot identify errors - some things continue without duplicate-key inserts being mentioned, for example - thebluephantom

1 Answer

0 votes

You can at least try column by column. But in my experience this error is caused about 80% of the time by pandas NaN / Null / None values. The other "usual suspect" is a column mix-up, so that you are (unintentionally) trying to put a string into a numeric column.

To check which column is the bad one, it's worth testing via a "binary search"-ish approach.
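The "binary search" idea above can be sketched generically: given an ordered list of columns and a predicate that attempts a test load with just that subset (e.g. `df.select(cols: _*).write.jdbc(...)` on a sample), you can narrow down a single failing column in O(log n) load attempts. This assumes exactly one bad column; the predicate and column names below are hypothetical:

```scala
// Narrow down one failing column by repeatedly test-loading halves of
// the column list. Returns None if every subset loads cleanly.
def findBadColumn(cols: Vector[String],
                  loadSucceeds: Vector[String] => Boolean): Option[String] = {
  if (cols.isEmpty) None
  else if (cols.size == 1) {
    if (loadSucceeds(cols)) None else Some(cols.head)
  } else {
    val (left, right) = cols.splitAt(cols.size / 2)
    if (!loadSucceeds(left)) findBadColumn(left, loadSucceeds)
    else findBadColumn(right, loadSucceeds)
  }
}
```

With several bad columns you would rerun after fixing each one found, or recurse into both halves when both fail.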

Besides this, Teradata Studio delivers a nice GUI to move data from Hadoop to Teradata called Smart Loader. There you can see the column mappings and define some Null handling (http://downloads.teradata.com/tools/articles/smart-loader-for-hadoop).

Horst