I am using the Scroll API to get more than 10,000 documents from our Elasticsearch cluster. However, whenever the code tries to query past 10k, I get the error below:

Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed]

This is my code:

        try {
        // 1. Build Search Request
        final Scroll scroll = new Scroll(TimeValue.timeValueMinutes(1L));
        SearchRequest searchRequest = new SearchRequest(eventId);
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

        searchSourceBuilder.query(queryBuilder);
        searchSourceBuilder.size(limit);

        searchSourceBuilder.profile(true); // used to profile the execution of queries and aggregations for a specific search

        searchSourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS)); // optional parameter that controls how long the search is allowed to take

        if(CollectionUtils.isNotEmpty(sortBy)){
            for (int i = 0; i < sortBy.size(); i++) {
                String sortByField = sortBy.get(i);
                String orderByField = orderBy.get(i < orderBy.size() ? i : orderBy.size() - 1);
                SortOrder sortOrder = (orderByField != null && orderByField.trim().equalsIgnoreCase("asc")) ? SortOrder.ASC : SortOrder.DESC;
                if(keywordFields.contains(sortByField)) {
                    sortByField = sortByField + ".keyword";
                } else if(rawFields.contains(sortByField)) {
                    sortByField = sortByField + ".raw";
                }
                searchSourceBuilder.sort(new FieldSortBuilder(sortByField).order(sortOrder));
            }
        }
        searchSourceBuilder.sort(new FieldSortBuilder("_id").order(SortOrder.ASC));

        if (includes != null) {
            String[] excludes = {""};
            searchSourceBuilder.fetchSource(includes, excludes);
        }

        if (CollectionUtils.isNotEmpty(aggregations)) {
            aggregations.forEach(searchSourceBuilder::aggregation);
        }

        searchRequest.scroll(scroll);
        searchRequest.source(searchSourceBuilder);

        SearchResponse resp = null;
        try {
            resp = client.search(searchRequest, RequestOptions.DEFAULT);
            String scrollId = resp.getScrollId();
            SearchHit[] searchHits = resp.getHits().getHits();

            // Pagination - will continue to call ES until there are no more pages
            while(searchHits != null && searchHits.length > 0){
                SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
                scrollRequest.scroll(scroll);
                resp = client.scroll(scrollRequest, RequestOptions.DEFAULT);
                scrollId = resp.getScrollId();
                searchHits = resp.getHits().getHits();
            }

            // Clear scroll request to release the search context
            ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
            clearScrollRequest.addScrollId(scrollId);
            client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);

        } catch (Exception e) {
            String msg = "Could not get search result. Exception=" + ExceptionUtilsEx.getExceptionInformation(e);
            throw new Exception(msg);
        }

I am implementing the solution from this link: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-search-scroll.html

Can anyone tell me what I am doing wrong and what I need to do to get past 10,000 with the scroll api?

Can you show the full error you get? - Val
This is the error: Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed] - babycoder
That's not the full error. Can you check in the ES server logs please? - Val
This is running on localhost. This is the exception I get from the breakpoints. It gave the reason that there was "no search context for id {id}" - babycoder
Is an iteration lasting more than 60 seconds? - Val

1 Answer

If an iteration takes longer than the scroll keep-alive (1 minute in your code), the search context expires before the next scroll request arrives, which matches the "no search context for id" error you mentioned. Increase the keep-alive so the scroll context doesn't disappear after 1 minute:

    final Scroll scroll = new Scroll(TimeValue.timeValueMinutes(10L));

And remove this one:

    searchSourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS)); // optional parameter that controls how long the search is allowed to take
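
To illustrate why the keep-alive matters, here is a rough, self-contained sketch. It does not use the Elasticsearch client at all: `FakeScrollServer`, `scrollAll`, and the 90-second processing time are hypothetical stand-ins made up for the example. It only models the behaviour that each scroll request has to arrive before the current keep-alive expires and then renews it.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a scroll keep-alive. This is NOT the Elasticsearch client. */
public class ScrollKeepAliveSketch {

    /** Hypothetical stand-in for the server side: tracks when each context expires. */
    static class FakeScrollServer {
        private final Map<String, Long> expiresAtMs = new HashMap<>();
        private long nowMs = 0; // simulated clock, so nothing actually sleeps

        /** Opens a context that stays alive for keepAliveMs from "now". */
        String open(long keepAliveMs) {
            String id = "scroll-" + expiresAtMs.size();
            expiresAtMs.put(id, nowMs + keepAliveMs);
            return id;
        }

        /** A scroll call must arrive before the deadline; it then renews the deadline. */
        void next(String id, long keepAliveMs) {
            Long deadline = expiresAtMs.get(id);
            if (deadline == null || nowMs > deadline) {
                expiresAtMs.remove(id);
                throw new IllegalStateException("no search context for id " + id);
            }
            expiresAtMs.put(id, nowMs + keepAliveMs);
        }

        /** Moves the simulated clock forward. */
        void advance(long ms) {
            nowMs += ms;
        }
    }

    /** Returns true if every page could be fetched with the given keep-alive. */
    static boolean scrollAll(long keepAliveMs, long processingMsPerPage, int pages) {
        FakeScrollServer server = new FakeScrollServer();
        String id = server.open(keepAliveMs); // the initial search returns page 1
        try {
            for (int page = 1; page < pages; page++) {
                server.advance(processingMsPerPage); // time spent handling the previous page
                server.next(id, keepAliveMs);        // request the next page
            }
            return true;
        } catch (IllegalStateException e) {
            return false; // the context expired between two requests
        }
    }

    public static void main(String[] args) {
        long processingMs = 90_000L; // assumed: 90 seconds of work per page
        System.out.println(scrollAll(60_000L, processingMs, 5));  // 1-minute keep-alive: prints false
        System.out.println(scrollAll(600_000L, processingMs, 5)); // 10-minute keep-alive: prints true
    }
}
```

In this model the keep-alive only has to cover the gap between two consecutive scroll requests, not the whole export, so 10 minutes is probably generous unless processing a single page is very slow.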