I can successfully do an incremental import from MySQL to HDFS using Sqoop:

    sqoop job --create JOBNAME -- import ... --incremental append --check-column id --last-value LAST
    sqoop job --exec JOBNAME
That finishes with log messages like:

    INFO tool.ImportTool: Saving incremental import state to the metastore
    INFO tool.ImportTool: Updated data for job: JOBNAME
And inspecting the job reveals that incremental.last.value was updated correctly.
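For reference, this is roughly the full procedure (a sketch only: the JDBC URL, username, table name, and job name are placeholders, not my real values):

    # Create a saved job that appends rows whose id exceeds the stored last-value
    sqoop job --create myjob -- import \
        --connect jdbc:mysql://dbhost/mydb \
        --username myuser -P \
        --table mytable \
        --incremental append --check-column id --last-value 0

    # Execute it; on success Sqoop writes the new last-value back to the metastore
    sqoop job --exec myjob

    # Inspect the stored state
    sqoop job --show myjob | grep incremental.last.value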
If I attempt the same procedure but add --hive-import to the job definition, the job executes successfully, yet incremental.last.value is never updated.
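Concretely, the failing variant is the same definition with the Hive option added (again a sketch with placeholder values):

    # Identical job, but landing the data in Hive: it runs without errors,
    # yet incremental.last.value in the metastore keeps its old value
    sqoop job --create myjob_hive -- import \
        --connect jdbc:mysql://dbhost/mydb \
        --username myuser -P \
        --table mytable \
        --hive-import \
        --incremental append --check-column id --last-value 0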
Is this a bug, or intended behavior? Does anyone have a working procedure for incrementally importing data from MySQL and making it available via Hive?
I basically want my Hadoop cluster to be a read slave of my MySQL database for fast analysis. If there's a solution other than Hive (Pig would be fine), I'd love to hear that too.