I am trying to transfer data from an Oracle database to an HBase table using Sqoop. I can do this successfully with the Java Sqoop client.
However, at the moment I only do a plain transfer and always use "COL1,COL2" as the hbase_row_key.
What I want now is to decide on the hbase_row_key before the data is put into the HBase table: it should be "COL1,COL2" if COL2 is present, and "COL1,COL3" if COL2 is absent (assuming COL3 is always present).
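To make the rule concrete, here is a minimal sketch of the per-record key selection I have in mind. It is plain Java, not Sqoop API: the class `RowKeyChooser`, the method `chooseRowKey`, and the `"_"` separator are all hypothetical names and assumptions for illustration, and "absent" is assumed to mean null or empty.

```java
import java.util.Objects;

/** Hypothetical helper (not part of Sqoop): picks the row key for one record. */
final class RowKeyChooser {
    // Assumed separator between the two key parts; adjust to whatever the table expects.
    static final String SEPARATOR = "_";

    private RowKeyChooser() {}

    /**
     * Returns COL1 + separator + COL2 when COL2 is present,
     * otherwise COL1 + separator + COL3.
     */
    static String chooseRowKey(String col1, String col2, String col3) {
        Objects.requireNonNull(col1, "COL1 is assumed to be always present");
        Objects.requireNonNull(col3, "COL3 is assumed to be always present");
        // Assumption: "absent" means null or an empty string.
        boolean col2Present = col2 != null && !col2.isEmpty();
        return col1 + SEPARATOR + (col2Present ? col2 : col3);
    }
}
```

For example, `RowKeyChooser.chooseRowKey("a", null, "c")` falls back to COL3, while `RowKeyChooser.chooseRowKey("a", "b", "c")` uses COL2. What I am missing is where logic like this can be hooked into the Sqoop import.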
I think a custom mapper in place of the default mapper should do it, but I am not sure how to set that up with Sqoop. How can I make Sqoop use a custom mapper before it inserts the data into HBase?
Any help in this regard would be highly appreciated.
Thanks again!
Below is my Java Sqoop client code:
    import com.cloudera.sqoop.SqoopOptions;
    import com.cloudera.sqoop.tool.ImportTool;

    public class TestSqoopClient {
        public static void main(String[] args) throws Exception {
            SqoopOptions options = new SqoopOptions();

            // Source database connection
            options.setConnectString("my_database_connection_string");
            options.setUsername("my_user");
            options.setPassword("my_password");
            options.setNumMappers(2); // Default value is 4

            // Source table and row selection
            //options.setSqlQuery("SELECT * FROM user_logs WHERE $CONDITIONS limit 10");
            options.setTableName("my_tablename");
            options.setWhereClause("my_where_condition");
            options.setSplitByCol("my_split_column");

            // HBase options
            options.setHBaseTable("my_hbase_table_name");
            options.setHBaseColFamily("my_column_family");
            options.setCreateHBaseTable(false); // false: do not create the HBase table; it must already exist
            options.setHBaseRowKeyColumn("COL1,COL2"); // Row key is currently always built from COL1 and COL2

            // Run the import; ret is the tool's exit status
            int ret = new ImportTool().run(options);
        }
    }