I need to call a paginated REST API using WebClient and Flux. I have tried it in a blocking way (one request at a time), but I want to make it parallel, say 10 parallel calls at a time. Every call fetches 1000 records.
I am already making a 0th request to get the total record count from a response header. After each request completes, I need to call a POST API to send that response (1000 records).

When any request completes, the 11th request should be sent, and so on. I have already seen examples using AsyncRestTemplate and ListenableFuture, but AsyncRestTemplate is deprecated and the suggested alternative is spring-webflux.

Also, RestTemplate is going to be deprecated.

What I have done:

  1. Divide total-count / 1000 -> gives the total number of pages (e.g. the 74,577 records in the logs below give 75 pages)
  2. Loop up to 5 (if I change it to the total page count, I get a 500 Internal Server Error)
  3. Call the service, which returns a Mono<List<Parts>>
  4. Subscribe to each request
// Jackson mapper used only to log the fetched page as JSON
ObjectMapper objmapper = new ObjectMapper();

// 0th request: read the total record count from the response headers
HttpHeaders headers = partsService.getHeaders();
long totalCount = Long.parseLong(headers.get("total-count").get(0));
log.info("Total count: {}", totalCount);
long totalPages = (long) Math.ceil((double) totalCount / 1000);
log.info("Total pages: {}", totalPages);
// List<Mono<List<Parts>>> parts = new ArrayList<>();
for (long i = 1; i <= 5; i++) {
    // fetch one page of 1000 records and subscribe to it
    partsService.fetchAllParts(1000L, i).log().subscribe(partList -> {
        try {
            // post each request's response to another API
            log.info(objmapper.writeValueAsString(partList));
        } catch (JsonProcessingException ex) {
            ex.printStackTrace();
        }
    });
    log.info("Page number: {}", i);
}

I want to execute the calls in parallel without getting an OutOfMemoryError and without putting too much load on the API being called. I have also tried to fetch all the pages at once, but I get a 500 Internal Server Error.

I am new to Flux (Project Reactor).

I implemented the solution below.

It is not running in parallel. A single request takes ~2 minutes, so with a concurrency level of 10, all 10 requests should be completing at roughly the same time.

try {
    fetchTotalCount().log()
            .flatMapMany(totalCount -> createPageRange(totalCount, 1000)).log()
            .flatMap(pageNumber -> fetch(1000, pageNumber), 10).log() // 10 = concurrency level
            .flatMap(response -> create(response))                    // POST each page's response
            .subscribe();
} catch (Exception e) {
    e.printStackTrace();
}

Logs

2019-07-29T09:00:14,477 INFO  [scheduling-1] r.u.Loggers$Slf4JLogger: request(10)
2019-07-29T09:00:14,478 INFO  [scheduling-1] r.u.Loggers$Slf4JLogger: request(10)
2019-07-29T09:00:14,479 INFO  [scheduling-1] r.u.Loggers$Slf4JLogger: request(unbounded)
2019-07-29T09:00:14,679 INFO  [scheduling-1] c.o.q.l.Logging: fetch() execution time: 546 ms
2019-07-29T09:00:17,028 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(74577)
2019-07-29T09:00:17,042 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(1)
2019-07-29T09:00:17,068 INFO  [reactor-http-nio-1] c.o.q.l.Logging: fetch(1000,1) execution time: 24 ms
2019-07-29T09:00:17,078 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(2)
2019-07-29T09:00:17,080 INFO  [reactor-http-nio-1] c.o.q.l.Logging: fetch(1000,2) execution time: 2 ms
2019-07-29T09:00:17,083 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(3)
2019-07-29T09:00:17,087 INFO  [reactor-http-nio-1] c.o.q.l.Logging: fetch(1000,3) execution time: 2 ms
2019-07-29T09:00:17,096 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(4)
2019-07-29T09:00:17,098 INFO  [reactor-http-nio-1] c.o.q.l.Logging: fetch(1000,4) execution time: 1 ms
2019-07-29T09:00:17,100 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(5)
2019-07-29T09:00:17,101 INFO  [reactor-http-nio-1] c.o.q.l.Logging: fetch(1000,5) execution time: 1 ms
2019-07-29T09:00:17,103 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(6)
2019-07-29T09:00:17,106 INFO  [reactor-http-nio-1] c.o.q.l.Logging: fetch(1000,6) execution time: 3 ms
2019-07-29T09:00:17,108 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(7)
2019-07-29T09:00:17,110 INFO  [reactor-http-nio-1] c.o.q.l.Logging: fetch(1000,7) execution time: 2 ms
2019-07-29T09:00:17,113 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(8)
2019-07-29T09:00:17,115 INFO  [reactor-http-nio-1] c.o.q.l.Logging: fetch(1000,8) execution time: 1 ms
2019-07-29T09:00:17,116 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(9)
2019-07-29T09:00:17,118 INFO  [reactor-http-nio-1] c.o.q.l.Logging: fetch(1000,9) execution time: 1 ms
2019-07-29T09:00:17,119 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(10)
2019-07-29T09:00:17,121 INFO  [reactor-http-nio-1] c.o.q.l.Logging: fetch(1000,10) execution time: 1 ms
2019-07-29T09:00:17,123 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onComplete()
2019-07-29T09:09:03,295 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: request(1)
2019-07-29T09:09:03,296 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(11)
2019-07-29T09:09:03,296 INFO  [reactor-http-nio-1] c.o.q.l.Logging: fetch(1000,11) execution time: 0 ms
2019-07-29T09:09:03,730 INFO  [reactor-http-nio-1] c.o.q.s.Scheduler: 200 OK
2019-07-29T09:09:03,730 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: request(1)
2019-07-29T09:09:05,106 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(// data print)
2019-07-29T09:09:05,196 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: request(1)
2019-07-29T09:09:05,196 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(12)
2019-07-29T09:09:05,198 INFO  [reactor-http-nio-1] c.o.q.l.Logging: fetch(1000,12) execution time: 1 ms
2019-07-29T09:09:05,466 INFO  [reactor-http-nio-1] c.o.q.s.Scheduler: 200 OK
2019-07-29T09:09:05,466 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: request(1)
2019-07-29T09:09:09,565 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(//  data print)
2019-07-29T09:09:09,730 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: request(1)
2019-07-29T09:09:09,730 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(13)
2019-07-29T09:09:09,731 INFO  [reactor-http-nio-1] c.o.q.l.Logging: fetch(1000,13) execution time: 0 ms
2019-07-29T09:09:10,049 INFO  [reactor-http-nio-1] c.o.q.s.Scheduler: 200 OK

Update

After correcting the API call, the records are coming back, but after the last page (75) is fetched I am getting a 404 Not Found error.

2019-07-30T14:07:50,071 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(75)
2019-07-30T14:07:50,075 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onComplete()
2019-07-30T14:07:50,322 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(200 OK)
2019-07-30T14:07:50,323 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: request(1)
2019-07-30T14:07:51,973 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(//data)
2019-07-30T14:07:52,440 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(200 OK)
2019-07-30T14:07:52,440 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: request(1)
2019-07-30T14:07:54,522 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(//data)
2019-07-30T14:07:54,699 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(//data)
2019-07-30T14:07:55,075 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(200 OK)
2019-07-30T14:07:55,076 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: request(1)
2019-07-30T14:07:55,371 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onNext(200 OK)
2019-07-30T14:07:55,371 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: request(1)
2019-07-30T14:07:55,471 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: cancel()
2019-07-30T14:07:55,472 INFO  [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: cancel()
2019-07-30T14:07:55,473 ERROR [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: onError(java.lang.Exception: 4XX received from API)
2019-07-30T14:07:55,473 ERROR [reactor-http-nio-1] r.u.Loggers$Slf4JLogger: 
java.lang.Exception: 4XX received from API
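
One thing I am thinking of trying (not tested, just a sketch using my method names from above) is guarding the fetch step with onErrorResume, so that a 4XX on a single page does not cancel the whole pipeline:

fetchTotalCount()
        .flatMapMany(totalCount -> createPageRange(totalCount, 1000))
        // resume with an empty Mono when one page request fails, instead of cancelling everything
        .flatMap(pageNumber -> fetch(1000, pageNumber).onErrorResume(ex -> Mono.empty()), 10)
        .flatMap(response -> create(response))
        .subscribe();

I am not sure whether swallowing the error like this is the right fix, or whether the extra page request should be avoided in the first place.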

2 Answers

1 vote

Flux.flatMap has a parameter to set the concurrency level, which lets you control the parallelization.

In the example below I used dummy URLs, some fragments from your code, and some additional simple code to demonstrate how you can achieve this:

// assumes a shared WebClient instance, e.g.:
private static final WebClient webClient = WebClient.create();

public static void main(String[] args)
{
    fetchTotalCount()
            .flatMapMany(totalCount -> createPageRange(totalCount))
            .flatMap(pageNumber -> fetch(pageNumber), 5) // 5 is the concurrency level = how many pages we query concurrently
            .flatMap(response -> process(response))
            .subscribe();
}

private static Mono<Integer> fetchTotalCount()
{
    return webClient.get()
                    .uri("http://www.example.com/get-total-count")
                    .exchange()
                    .map(ClientResponse::headers)
                    .map(headers -> headers.asHttpHeaders().get("total-count").get(0))
                    .map(Integer::valueOf);
}

private static Flux<Integer> createPageRange(int totalCount)
{
    int totalPages = (int) Math.ceil((double) totalCount / 1000);

    return Flux.range(1, totalPages);
}

private static Mono<Response> fetch(int pageNumber)
{
    return webClient.get()
                    .uri("http://www.example.com/fetch?page=" + pageNumber)
                    .retrieve()
                    .bodyToMono(Response.class);
}

private static Mono<Response> process(Response response)
{
    // todo send other http request for the post api here
    return Mono.just(response);
}

private static class Response
{
}
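
If the POST in process() is also done with WebClient, the todo could be filled in roughly like this (the URL and the response type are placeholders, not your real API):

private static Mono<Response> process(Response response)
{
    // post the fetched page to the other API and return its response
    return webClient.post()
                    .uri("http://www.example.com/post-parts")
                    .body(Mono.just(response), Response.class)
                    .retrieve()
                    .bodyToMono(Response.class);
}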
0 votes

Use the overloaded version of .flatMap() in which you can specify the concurrency level. Normally, .flatMap() subscribes eagerly to all the inner streams: it won't wait for any stream to complete before subscribing to the next one (unlike .concatMap()). But once you specify the concurrency level, it will eagerly subscribe to at most that many inner streams at once. In your case, the 11th inner stream will be subscribed to only when at least one of the initial 10 inner streams has finished. This solves exactly your problem.
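
For example, here is a minimal, self-contained sketch (simulated 1-second delays instead of real HTTP calls) that shows the effect: with a concurrency level of 10, pages 11 to 25 only start once one of the first 10 completes, and the whole run finishes in about 3 seconds instead of 25.

import java.time.Duration;

import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class FlatMapConcurrencyDemo
{
    public static void main(String[] args)
    {
        Flux.range(1, 25)                                    // 25 "pages"
            .flatMap(page -> Mono.just(page)
                    .delayElement(Duration.ofSeconds(1))     // simulate a 1s remote call
                    .doOnNext(p -> System.out.println("finished page " + p)),
                    10)                                      // at most 10 in flight at once
            .blockLast();                                    // block only for this demo
    }
}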