I am using DataStax Enterprise 4.0.2 and am trying to use the Sqoop that ships with DSE to import data from MySQL into Cassandra. The Sqoop command is:

dse sqoop import --connect jdbc:mysql://192.168.10.98/mydb --username user1 
--password password --outdir /root/dev/output/dir/ --query "SELECT tab1.col1 AS 
COL1, tab1.col2 AS COL2, tab1.col3 AS COL3 FROM table1 AS tab1 WHERE \$CONDITIONS
AND tab1.col1 != 'XYZ' AND tab1.col2 != 2 GROUP BY tab1.col1, tab1.col2" 
--target-dir /root/dev/cassdir --split-by tab1.col1 --cassandra-keyspace csks 
--cassandra-column-family cscf --cassandra-thrift-host localhost 
--cassandra-create-schema --verbose

The keyspace and column family are created, but they contain no data. The column family also has a generic, placeholder-like structure:

cqlsh> DESC KEYSPACE csks

CREATE KEYSPACE csks WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': '1'
};

USE csks;

CREATE TABLE cscf (
  key text,
  column1 text,
  value text,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.100000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='NONE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

There is no error in the Sqoop output. Where should I look for detailed logs?

1 Answer

Because you used --target-dir, the output ends up in the Cassandra File System (CFS), not in a table.

From:
http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/ana/anaSqpVld.html

You can use the "hadoop fs" command to inspect the imported files:

./dse hadoop fs -ls /root/dev/cassdir
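
Listing the directory only shows that files exist; printing them confirms that rows were actually imported. The following is a minimal sketch, not a definitive procedure. The helper names list_import and show_import are hypothetical, the dse launcher is assumed to be on the PATH (set DSE=./dse otherwise), and the part-* file pattern is an assumption based on the usual Hadoop output naming.

```shell
# Path taken from the --target-dir option in the question.
TARGET_DIR="/root/dev/cassdir"

# Assumption: the dse launcher is on PATH; override DSE to use ./dse instead.
DSE="${DSE:-dse}"

# List the files that Sqoop wrote into CFS (hypothetical helper name).
list_import() {
  "$DSE" hadoop fs -ls "$1"
}

# Print the imported rows (hypothetical helper name).
# The part-* pattern is an assumption; it is quoted so that Hadoop,
# not the local shell, expands the glob.
show_import() {
  "$DSE" hadoop fs -cat "$1/part-*"
}
```

After defining the helpers, run list_import "$TARGET_DIR" and then show_import "$TARGET_DIR". If the listing is empty, the import itself did not write anything and the Sqoop job output is the next place to look.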

Because you also specified the Cassandra column family and the --cassandra-create-schema option, the keyspace and column family were created anyway, which is what makes this confusing. In DSE 4.0 the Sqoop table-creation code still uses CQL 2, so the table looks a little strange from cqlsh, which uses CQL 3. To get the data into a CQL 3 table, first import it into CFS, then use Hive to insert it into Cassandra:
http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/ana/anaSqpMgrate.html

Before doing that, drop the CQL 2 table that Sqoop already created.
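
Dropping the table can be scripted along these lines. This is only a sketch under stated assumptions: the helper names drop_stmt and drop_table are hypothetical, cqlsh is assumed to be on the PATH and to connect to the right node with its defaults, and the keyspace and table names are the ones from the question.

```shell
# Names taken from the question's --cassandra-keyspace and
# --cassandra-column-family options.
KEYSPACE="csks"
TABLE="cscf"

# Assumption: cqlsh is on PATH and its default host/port reach the node.
CQLSH="${CQLSH:-cqlsh}"

# Build the CQL statement (hypothetical helper name).
drop_stmt() {
  printf 'DROP TABLE %s.%s;\n' "$1" "$2"
}

# Pipe the statement into cqlsh, which reads commands from standard input
# when it is not attached to a terminal (hypothetical helper name).
drop_table() {
  drop_stmt "$1" "$2" | "$CQLSH"
}
```

Calling drop_table "$KEYSPACE" "$TABLE" removes the table permanently, so check the output of drop_stmt "$KEYSPACE" "$TABLE" first.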