I need to periodically archive/cold store rows form C* tables to CSV. For example: export Jan-Jun 2016's rows in C* table my_table to a CSV my_table.2016_06-30.csv, export Jul-Dec 2016's rows in my_table to my_table.2016-12-31.csv, and so on.
I considered CQL to do this, but not all my tables have timestamp columns for my rows. It has been suggested that I use Spark Dataframes to do this (so I can, get to metadata like writeTime available from the Spark Cassandra Connector).
I'm new to the Spark Cassandra Connector and this sort of Spark use.
Can anyone point me to a good example of how to write a Cassandra Dataframe to a CSV, where I am 'filtering' the Dataframe on writeTime or something like it?
Thanks in advance for any help, guidance, etc.