I’m working on updating some old Thrift-based code to CQL3.

One part of the code is walking through the entire dataset of a table consisting of 20M+ rows. This part was initially crashing the program due to memory usage, so I created a RowIterator class which iterated through the column family using TokenRanges (and Hector).

When trying to rewrite this using CQL3, I’m having trouble paging through the data. I found some info over at http://www.datastax.com/documentation/cql/3.0/cql/cql_using/paging_c.html, but when I try this code for the first "page":

resultSet = session.execute("select * from " + TABLE + " where token(key) <= token(" + offset + ")");

I get the error

com.datastax.driver.core.exceptions.InvalidTypeException: Invalid type for value 0 of CQL type varchar, expecting class java.lang.String but class java.lang.Integer provided

Granted, the example at the link uses numerical keys. Is there a way to do this with varchar (UTF8Type) keys?

It seems that there is now built-in functionality for this (https://issues.apache.org/jira/browse/CASSANDRA-4415), but I can’t find examples that get me going. Besides, I have to solve it for Cassandra 1.2.9 for now.

1 Answer

The easy answer is to upgrade to Cassandra 2.0.x and use the new built-in paging functionality. To get it done on Cassandra 1.2, though, you are on the right path. Your syntax should work; if you run the same query in cqlsh, do you get the same error? When paging like this it is best to use ">" as in the example rather than "<=", and that might be the issue. You want to start with

select * from table limit 100

and then continue with

select * from table where token(key) > token('last key') limit 100

where 'last key' is the last key of the previous page, quoted as a string because the key is a varchar.

Also, I would try it with a prepared statement. The string manipulation may be doing something funny to the offset.
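
To make the loop concrete, here is a minimal sketch in Java of the paging pattern described above. `TokenPager`, `walk`, and the `fetchPage` callback are hypothetical names used for illustration, not driver API. The callback is where you would run the prepared statement with the last key bound as a String and return the keys of the rows you got back. The column name `key` is taken from the query in the question.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of the DataStax driver.
final class TokenPager {

    // CQL for the first page: no token restriction yet.
    static String firstPageQuery(String table, int pageSize) {
        return "SELECT * FROM " + table + " LIMIT " + pageSize;
    }

    // CQL for every later page. The '?' is meant to be bound to the last
    // key of the previous page, as a String because the key is a varchar.
    static String nextPageQuery(String table, int pageSize) {
        return "SELECT * FROM " + table
                + " WHERE token(key) > token(?) LIMIT " + pageSize;
    }

    // Walks the whole table one page at a time.
    // fetchPage receives (lastKey, pageSize) and returns the keys of the
    // next page in token order; lastKey is null for the first page.
    static void walk(BiFunction<String, Integer, List<String>> fetchPage,
                     int pageSize,
                     Consumer<String> handler) {
        String lastKey = null;
        while (true) {
            List<String> page = fetchPage.apply(lastKey, pageSize);
            if (page.isEmpty()) {
                return; // no rows beyond lastKey: the table is exhausted
            }
            for (String key : page) {
                handler.accept(key);
            }
            // Remember where this page ended so the next query starts after it.
            lastKey = page.get(page.size() - 1);
        }
    }
}
```

Only one page is held in memory at a time, which is the point for a 20M+ row table. Note that this assumes one row per partition key; if a key maps to several rows, `token(key) > token(?)` would skip whatever part of the last partition did not fit in the page.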