
I am using spring-boot, datastax-java-cassandra-connector_2.11-2.4.1.jar and Java 8.

I have a scenario where I need to read/load data from a C* table, but this table might have millions of records.

I need to load this data from the C* table. Is there any way in Java/spring-boot, using the datastax-java-cassandra-connector API, to pull the data partition by partition?

SELECT * FROM table? The Java driver (which it uses) returns an iterator that lazily fetches data as you iterate through it. - Chris Lohfink

1 Answer


While select * from table may work, a more effective way is to read the data by token ranges, with a query like select * from table where token(part_key) > beginRange and token(part_key) <= endRange. The Spark Cassandra connector works the same way: it gets the list of all available token ranges and then fetches the data for every token range, sending each query directly to a node that holds that token range (as opposed to select * from table, which retrieves all data via a coordinator node).

You need to be careful when calculating the token boundaries, especially at the beginning and end of the full range. You can find an example of the Java code in my repository (it's too long to paste here).
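To illustrate the boundary handling, here is a minimal sketch of the range-splitting step only. It is not the code from the repository. It assumes the table uses Murmur3Partitioner (tokens are signed 64-bit values), and it splits the full range into equal slices instead of asking the driver's cluster metadata for the real token ranges, so it does not give you the node-local routing described above. The class name TokenRangeSplitter and its methods are made up for this example.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits the full Murmur3 token range into n
// contiguous slices and builds one CQL query string per slice.
final class TokenRangeSplitter {
    // Assumption: Murmur3Partitioner, whose tokens span the signed 64-bit range.
    static final long MIN_TOKEN = Long.MIN_VALUE;
    static final long MAX_TOKEN = Long.MAX_VALUE;

    // Returns n pairs {start, end}. Each slice is meant to be read as
    // (start, end], except the first one, which has no lower bound.
    static List<long[]> split(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        BigInteger min = BigInteger.valueOf(MIN_TOKEN);
        // The full width (2^64 - 1) does not fit in a long, hence BigInteger.
        BigInteger width = BigInteger.valueOf(MAX_TOKEN).subtract(min);
        List<long[]> ranges = new ArrayList<>(n);
        long start = MIN_TOKEN;
        for (int i = 1; i <= n; i++) {
            // Pin the last end to MAX_TOKEN so rounding cannot drop the tail.
            long end = (i == n)
                    ? MAX_TOKEN
                    : min.add(width.multiply(BigInteger.valueOf(i))
                                   .divide(BigInteger.valueOf(n)))
                         .longValueExact();
            ranges.add(new long[] {start, end});
            start = end; // the next slice begins where this one ended
        }
        return ranges;
    }

    // Builds the query for one slice. The first slice drops the lower
    // bound entirely, so the start of the full range cannot be missed.
    static String query(String table, String partitionKey, long start, long end) {
        String token = "token(" + partitionKey + ")";
        if (start == MIN_TOKEN) {
            return "SELECT * FROM " + table + " WHERE " + token + " <= " + end;
        }
        return "SELECT * FROM " + table + " WHERE " + token + " > " + start
                + " AND " + token + " <= " + end;
    }
}
```

Each string returned by query(...) can then be executed as a separate statement, sequentially or in parallel. For a composite partition key, pass the comma-separated column list as partitionKey.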