2
votes

I'm running a query that fetches millions of rows (5,000,000 or so). My nodes seem to be quite busy, as the coordinator returns a com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ONE (1 responses were required but only 0 replica responded) exception. (I don't actually know whether the nodes are busy or something else is going on.)

So far I've tried setting a higher read_request_timeout_in_millis on every Cassandra node, and executing the query like this:

Statement statement = new SimpleStatement("SELECT * FROM mytable WHERE date = ?", param1) // "mytable" is a placeholder table name
    .setFetchSize(pageSize)
    .setConsistencyLevel(ConsistencyLevel.ONE)
    .setReadTimeoutMillis(ONE_DAY_IN_MILLIS);
ResultSet resultSet = this.session.execute(statement);

But the exception is still being thrown. My next move is to try a custom RetryPolicy, but can someone tell me whether a read-timeout retry will execute the whole query again, or will it retry only from the page that failed?

I was trying something like this:

@Override
public RetryDecision onReadTimeout(Statement statement, ConsistencyLevel cl, int requiredResponses, int receivedResponses, boolean dataRetrieved, int nbRetry) {
    if (dataRetrieved) {
        return RetryDecision.ignore();
    } else if (nbRetry < readRetries) {
        LOGGER.info("Retry attempt {} out of {}", nbRetry, readRetries);
        return RetryDecision.retry(cl);
    } else {
        return RetryDecision.rethrow();
    }
}

where readRetries is the number of times I will attempt to fetch the data before giving up.
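Standing alone, the decision logic of that onReadTimeout can be modeled in plain Java. This is only a sketch of the branching above: the Decision enum, READ_RETRIES constant, and method here are stand-ins, not the driver's actual RetryPolicy types.

```java
// Plain-Java sketch of the onReadTimeout decision logic above.
// Decision is a stand-in for the driver's RetryDecision.
public class ReadTimeoutPolicySketch {
    enum Decision { IGNORE, RETRY, RETHROW }

    static final int READ_RETRIES = 3; // placeholder for readRetries

    static Decision onReadTimeout(boolean dataRetrieved, int nbRetry) {
        if (dataRetrieved) {
            // A replica returned data before the timeout fired: keep it.
            return Decision.IGNORE;
        } else if (nbRetry < READ_RETRIES) {
            // Still have retry budget left: retry at the same consistency level.
            return Decision.RETRY;
        } else {
            // Budget exhausted: surface the ReadTimeoutException to the caller.
            return Decision.RETHROW;
        }
    }

    public static void main(String[] args) {
        System.out.println(onReadTimeout(true, 0));  // IGNORE
        System.out.println(onReadTimeout(false, 1)); // RETRY
        System.out.println(onReadTimeout(false, 3)); // RETHROW
    }
}
```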

1
What is your page size? - fuggy_yama
@fuggy_yama I'm working with a page size of 100 rows. - juliccr

1 Answer

4
votes

When you set a fetch size on a query, the driver never issues the whole query up front. Even when you don't specify a fetch size, the driver uses a default of 5000 to avoid overloading memory with too many objects. What happens is that results are fetched in chunks: the driver issues a query limited to the fetch size, and as you iterate over the results and reach the end of a chunk, the driver issues another query for the next chunk, and so on. All in all, if the result count is larger than the fetch size, the driver will issue multiple queries to the cluster. A nice sequence diagram, along with further explanation, can be found in the official DataStax driver documentation on paging.
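The chunked fetching described above can be sketched in plain Java, with no driver involved. This only simulates the round-trip pattern (one request per page of fetchSize rows); the counter and helper names are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation of driver-side paging: the full result set is never
// pulled in one round trip; a page of at most fetchSize rows is requested,
// and the next page is only fetched once the current one is consumed.
public class PagingSketch {
    static int queriesIssued = 0; // counts round trips to the simulated cluster

    // Stand-in for one paged request: returns up to fetchSize rows.
    static List<Integer> fetchPage(int totalRows, int offset, int fetchSize) {
        queriesIssued++;
        List<Integer> page = new ArrayList<>();
        for (int i = offset; i < Math.min(offset + fetchSize, totalRows); i++) {
            page.add(i);
        }
        return page;
    }

    public static void main(String[] args) {
        int totalRows = 12, fetchSize = 5;
        List<Integer> all = new ArrayList<>();
        int offset = 0;
        while (true) {
            List<Integer> page = fetchPage(totalRows, offset, fetchSize);
            all.addAll(page);
            if (page.size() < fetchSize) break; // short page == last page
            offset += fetchSize;
        }
        System.out.println(queriesIssued); // 3 round trips for 12 rows
        System.out.println(all.size());    // 12
    }
}
```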

That being said, the RetryPolicy works on a single statement and knows nothing about the fetch size, so each statement will be retried the number of times you define (meaning only the chunk that timed out gets retried, not the whole query).
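That consequence, a timeout on page N being retried as page N only, with earlier pages never re-fetched, can be sketched in plain Java. This simulation uses invented names and a fake timeout on the second page; it is not driver code.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch: a read timeout on one page is retried as that page only.
// Earlier pages are not re-fetched, because paging position is preserved.
public class PerPageRetrySketch {
    static int page1Fetches = 0; // how many times page index 1 was requested

    // Stand-in paged fetch that "times out" once on page index 1.
    static List<Integer> fetchPage(int pageIndex, int fetchSize) throws Exception {
        if (pageIndex == 1) {
            page1Fetches++;
            if (page1Fetches == 1) throw new Exception("read timeout");
        }
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < fetchSize; i++) rows.add(pageIndex * fetchSize + i);
        return rows;
    }

    public static void main(String[] args) throws Exception {
        int fetchSize = 5, pages = 3, maxRetries = 2;
        List<Integer> all = new ArrayList<>();
        for (int p = 0; p < pages; p++) {
            for (int attempt = 0; ; attempt++) {
                try {
                    all.addAll(fetchPage(p, fetchSize));
                    break; // this page succeeded; move on to the next one
                } catch (Exception e) {
                    if (attempt >= maxRetries) throw e; // rethrow after budget
                }
            }
        }
        System.out.println(all.size());   // 15 rows, each fetched exactly once
        System.out.println(page1Fetches); // 2: one timeout plus one retry
    }
}
```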