2
votes

I am importing data from Teradata (an RDBMS) into Hive using Apache Sqoop. The delimiters usually used for import, such as ",", "|", and "~", all occur in the table data. Is there a way to use multiple characters as a delimiter in Apache Sqoop?

To work around this, I used the --escaped-by "\t" and --fields-terminated-by "," parameters in the sqoop import command. Is there a way to 'unescape' the "\t" that I used in the import?

1
Are you having the issue while importing from Teradata, or while exporting data from Hive to Teradata? - sandeep rawat
--escaped-by \\ --enclosed-by '\"' - sandeep rawat
Is there a specific format you want? This option is for the escape character. Do you have a particular requirement? - Indrajit Swain

1 Answer

3
votes

I use the '\b' delimiter whenever I get challenging tables with large text fields that might contain TAB and CR/LF characters. '\b' is BACKSPACE, which is very difficult to insert into a character field in most databases.

Here is an example of the sqoop command I use:

            # Backspace ('\b') as the field delimiter; trailing backslashes continue the command
            sqoop import \
              --connect "jdbc:sqlserver://myserver;DatabaseName=MyDB;user=MyUser;password=MyPassword;port=1433" \
              --warehouse-dir=/user/MyUser/Import/MyDB \
              --fields-terminated-by '\b' --num-mappers 8 \
              --table training_deficiency \
              --hive-table stage.training_deficiency \
              --hive-import --hive-overwrite \
              --hive-delims-replacement '<newline>' \
              --split-by Training_Deficiency_ID \
              --outdir /home/MyUser/sqoop/java \
              --where "batch_update_dt > '2016-12-09 23:06:44.69'"
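
As a rough local check of why backspace works as a delimiter, the sketch below builds one made-up record whose middle field contains ",", "|", "~", and a TAB, then splits it on backspace alone. The sample values and the variable names (`bs`, `record`, `field_count`, `second_field`) are hypothetical and only for illustration; nothing here touches Sqoop or Hive.

```shell
# Hypothetical sample row: three fields joined by backspace (octal 010).
# The middle field deliberately contains ",", "|", "~", and a TAB.
bs=$(printf '\b')
record=$(printf 'id-1\bhello, world | with ~ and\ttab\b42')

# Count the fields and extract the middle one, splitting on backspace only.
field_count=$(printf '%s\n' "$record" | awk -F "$bs" '{ print NF }')
second_field=$(printf '%s\n' "$record" | cut -d "$bs" -f 2)

echo "fields: $field_count"
# od -c makes the embedded TAB visible as \t in the extracted field.
printf '%s\n' "$second_field" | od -c | head -n 3
```

The same `od -c` trick can be applied to a few lines of an imported file to confirm that `\b` bytes really separate the columns.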