I have data in a Cassandra super column family that looks like this:
{super-row-key1 [{ts1 {version-ts1 value, version-ts2 value}}
{ts2 {version-ts1 value}}]
super-row-key2 ...}
The actual keys and values look something like this:
{"4447c9a6-9912-44d7-a6b5-cef40735f92c:2011-06"
[{1291180500000 {1351709255098 -0.008084167000000001}}
{1291184100000 {1351709255098 -0.004395833}}
{1291185000000 {1351709255098 -0.003075}}]
...}
I am trying to figure out whether the ClojureWerkz Cassandra Cascading tap already supports operations across all of the rows. As you can see, the super-row keys, the super-rows, and the super-columns are all generated (UUIDs, dates, timestamps, etc.), so none of them are known ahead of time. The examples and code I have seen lead me to believe that column names, column field names, key column names, and field mappings all have to be specified as fixed names in advance.
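To make the mismatch concrete, here is a minimal sketch of the flattening I would want the tap to do, written in plain Python over in-memory dicts. It is only an illustration of the shape, not the tap's API: the `flatten` function, the `SuperRow` alias, and the field order `(row_key, ts, version_ts, value)` are all names I made up for this example.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical in-memory stand-in for one super row: a list of
# {ts: {version_ts: value}} maps. This is not how the tap represents data.
SuperRow = List[Dict[int, Dict[int, float]]]


def flatten(data: Dict[str, SuperRow]) -> Iterator[Tuple[str, int, int, float]]:
    """Yield one (row_key, ts, version_ts, value) tuple per stored value."""
    for row_key, super_columns in data.items():
        for super_column in super_columns:
            for ts, versions in super_column.items():
                for version_ts, value in versions.items():
                    # The generated names become field values, so the
                    # field names themselves can stay fixed.
                    yield (row_key, ts, version_ts, value)


# Sample taken from the data shown above.
sample = {
    "4447c9a6-9912-44d7-a6b5-cef40735f92c:2011-06": [
        {1291180500000: {1351709255098: -0.008084167000000001}},
        {1291184100000: {1351709255098: -0.004395833}},
        {1291185000000: {1351709255098: -0.003075}},
    ]
}

rows = list(flatten(sample))
```

With tuples like these, only the four field names would need to be fixed in advance, while the generated keys and timestamps travel as ordinary values.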
At the Hadoop level, Cassandra's MapReduce support does appear to allow fetching all rows of data from a given column family. From the documentation:
"Cassandra rows or row fragments (that is, pairs of key + SortedMap of columns) are input to Map tasks for processing by your job, as specified by a SlicePredicate that describes which columns to fetch from each row."
So this seems to be possible at the low level, but it is unclear to me how to accomplish it at the Cascading level.
Does this require adapting the existing tap or creating a variant of it, or can it be done somehow with the existing one as-is?