2
votes

I am importing several DB2 database tables with an Oozie workflow that uses Sqoop to import to Hive. Currently I have to map each column with an unsupported data type manually with "--map-column-hive".

Is there any way to permanently store mappings for specific data types? I am importing several tables that contain DB2-"Character" columns which all have to be mapped to HIVE-"STRING" manually. For ~50 tables there are ~200 columns that use the datatype "Character" for FKs which have to be mapped manually.

I want to permanently save that DB2-"Character" is mapped to the datatype HIVE-"STRING". Can this be done?

regards

1

1 Answers

1
votes

As far as I can see, Sqoop does not provide ability to pass type-to-type mappings as parameters.
They are all hardcoded explicitly:

switch (sqlType) {
  case Types.INTEGER:
  case Types.SMALLINT:
      return HIVE_TYPE_INT;
  case Types.VARCHAR:
  case Types.CHAR:
  case Types.LONGVARCHAR:
  case Types.NVARCHAR:
  case Types.NCHAR:
  case Types.LONGNVARCHAR:
  case Types.DATE:
  case Types.TIME:
  case Types.TIMESTAMP:
  case Types.CLOB:
      return HIVE_TYPE_STRING;
  case Types.NUMERIC:
  case Types.DECIMAL:
  case Types.FLOAT:
  case Types.DOUBLE:
  case Types.REAL:
      return HIVE_TYPE_DOUBLE;
  case Types.BIT:
  case Types.BOOLEAN:
      return HIVE_TYPE_BOOLEAN;
  case Types.TINYINT:
      return HIVE_TYPE_TINYINT;
  case Types.BIGINT:
      return HIVE_TYPE_BIGINT;
  default:
  // TODO(aaron): Support BINARY, VARBINARY, LONGVARBINARY, DISTINCT,
  // BLOB, ARRAY, STRUCT, REF, JAVA_OBJECT.
      return null;
}

Also there's a specific case for XML columns in DB2:

if (colTypeName.toUpperCase().startsWith("XML")) {
  return XML_TO_JAVA_DATA_TYPE;
}

If your column type is not recognized by this mapping + user-defined mappings via --map-column-hive parameter, you'll get an exception.

What I'd do in your case (if not considering manual column mapping):

  1. Make sure once again that type mapping for DB2-"Character" does not work
  2. Download the sources of your Sqoop version, add a new if-branch in Db2Manager.toDbSpecificHiveType, build and test with some your tables
  3. Create a PR and wait for the next release OR use the customized version of Sqoop (might be painful when you want to upgrade Sqoop version)