2
votes

I want to call independent requests simultaneously with WebClient. My previous approach with RestTemplate was blocking my threads while waiting for the response. So I figured that WebClient with ParallelFlux could use one thread more efficiently, because it is supposed to schedule multiple requests with one thread.

My endpoint expects a tuple of an id and a location.

The fooFlux method will be called a few thousand times in a loop with different parameters. The returned map will be asserted against stored reference values.

Previous attempts at this code resulted in duplicated API calls. But there is still a flaw: the size of the key set of mapping is often less than the size of Set<String> locations. In fact, the size of the resulting map changes from run to run, and every now and then it is correct. So there might be an issue with the subscription finishing after the method has returned the map.

public Map<String, ServiceDescription> fooFlux(String id, Set<String> locations) {
    Map<String, ServiceDescription> mapping = new HashMap<>();
    Flux.fromIterable(locations).parallel().runOn(Schedulers.boundedElastic()).flatMap(location -> {
        Mono<ServiceDescription> sdMono = getServiceDescription(id, location);
        Mono<Mono<ServiceDescription>> sdMonoMono = sdMono.flatMap(item -> {
            mapping.put(location, item);
            return Mono.just(sdMono);
        });
        return sdMonoMono;
    }).then().block();
    LOGGER.debug("Input Location size: {}", locations.size());
    LOGGER.debug("Output Location in map: {}", mapping.keySet().size());
    return mapping;
}

Handle Get-Request

private Mono<ServiceDescription> getServiceDescription(String id, String location) {
    String uri = URL_BASE.concat(location).concat("/detail?q=").concat(id);
    Mono<ServiceDescription> serviceDescription =
                    webClient.get().uri(uri).retrieve().onStatus(HttpStatus::isError, clientResponse -> {
                        LOGGER.error("Error while calling endpoint {} with status code {}", uri,
                                        clientResponse.statusCode());
                        throw new RuntimeException("Error while calling Endpoint");
                    }).bodyToMono(ServiceDescription.class).retryBackoff(5, Duration.ofSeconds(15));
    return serviceDescription;
}
2
Why do you use JsonNode.class instead of deserializing into a concrete object? And why use reactive programming for something that can be solved using @Async? Reactive programming is not async programming. They are two different things that complement each other. - Toerktumlare
I used JsonNode.class because the received JSON model is huge and I just need a tiny bit of it. I came up with reactive programming because of a Baeldung article (baeldung.com/spring-webclient-resttemplate). I want to achieve a gain in download speed. The RestTemplate approach within parallel streams got me between 10 Mbit/s and 200 Mbit/s download on my network card, depending on the number of locations per id, which varies from 1 to ~4000. - froehli
REST clients don't affect download speed; network bandwidth does. So which client you use will not affect download speed. And if you only need a little piece, who says you need to declare the entire object? Just create a class with the little piece you need. Yes, use WebClient, but you need to know the difference between reactive programming and concurrent programming. - Toerktumlare
Reactive programming is all about not blocking, and utilising threads as much as possible to do both serial and parallel tasks. Concurrent programming, by contrast, is about spawning threads and fetching things concurrently. Reactive programming can do things serially and concurrently to solve the task you give it. But it is not used predominantly to do asynchronous tasks. - Toerktumlare
Well, if you need to gather up all results and return the concrete values to the calling client in one big blob, rather than stream results to the calling client, then in an application that is not reactive, block needs to be used, and, as you have done, placed on its own scheduler. But I would suggest using a boundedElastic scheduler; the one you have chosen will use up threads without bound and in the worst case run into thread starvation and crash the application. - Toerktumlare

2 Answers

3
votes
public Map<String, ServiceDescription> fooFlux(String id, Set<String> locations) {
    return Flux.fromIterable(locations)
               .flatMap(location -> getServiceDescription(id, location).map(sd -> Tuples.of(location, sd)))
               .collectMap(Tuple2::getT1, Tuple2::getT2)
               .block();
}

Note: the flatMap operator combined with a WebClient call gives you concurrent execution, so there is no need to use ParallelFlux or any Scheduler.
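With up to a few thousand locations per id, it may also be worth capping how many requests are in flight at once. flatMap has an overload that takes a concurrency limit. The following is only a sketch, assuming reactor-core is on the classpath; `FlatMapConcurrencyDemo`, `fakeServiceDescription`, and `maxInFlight` are hypothetical names, and the delayed Mono merely stands in for the real WebClient call.

```java
import java.time.Duration;
import java.util.Map;
import java.util.Set;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

final class FlatMapConcurrencyDemo {

    // Hypothetical stand-in for getServiceDescription: completes asynchronously after a short delay.
    static Mono<String> fakeServiceDescription(String id, String location) {
        return Mono.delay(Duration.ofMillis(10)).map(tick -> id + "@" + location);
    }

    static Map<String, String> fetchAll(String id, Set<String> locations, int maxInFlight) {
        return Flux.fromIterable(locations)
                   // At most maxInFlight inner Monos are subscribed at the same time.
                   .flatMap(location -> fakeServiceDescription(id, location)
                                   .map(sd -> Map.entry(location, sd)),
                            maxInFlight)
                   // The map is assembled by the operator itself, not by a side effect in a callback.
                   .collectMap(Map.Entry::getKey, Map.Entry::getValue)
                   .block();
    }
}
```

Without the second argument, flatMap falls back to Reactor's default limit (256 unless reconfigured), which may be more parallel requests than the remote endpoint tolerates.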

1
votes

Reactive code gets executed when you subscribe to a producer. block() subscribes, and since you call it twice (once on the Mono itself, and then again on the ParallelFlux after returning that same Mono), the Mono gets executed twice.

    List<String> resultList = listMono.block();
    mapping.put(location, resultList);
    return listMono;
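The effect can be reproduced in isolation. This is a minimal sketch, assuming reactor-core is on the classpath; `ColdMonoDemo` and the counter that stands in for the remote call are hypothetical and not taken from the question.

```java
import java.util.concurrent.atomic.AtomicInteger;
import reactor.core.publisher.Mono;

final class ColdMonoDemo {

    /** Subscribes to one cold Mono the given number of times and returns how often its source ran. */
    static int countExecutions(int subscriptions) {
        AtomicInteger calls = new AtomicInteger();
        // Stand-in for the WebClient call: the callable runs once per subscription.
        Mono<String> remoteCall = Mono.fromCallable(() -> "response-" + calls.incrementAndGet());
        for (int i = 0; i < subscriptions; i++) {
            remoteCall.block(); // block() subscribes, so every call re-runs the source
        }
        return calls.get();
    }
}
```

Here `ColdMonoDemo.countExecutions(2)` reports two executions of the source, which mirrors the duplicated API calls. Calling `cache()` on the Mono would replay the first result to later subscribers, but removing the extra subscription is usually the simpler fix.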

Try something like the following instead (untested):

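    listMono.map(resultList -> {
       // Record the value when it arrives; no extra subscription is needed.
       // mapping must tolerate concurrent writes (e.g. ConcurrentHashMap) if callbacks run on several threads.
       mapping.put(location, resultList);
       return resultList;
    });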

That said, the Reactive Programming model is quite complex, so consider working with @Async and Future/AsyncResult instead, if this is only about making the remote calls in parallel, as others suggested. You can still use WebClient (RestTemplate seems to be on its way to deprecation), but just call block right after bodyToMono.
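The Future-based route can be sketched with the JDK alone. This is not the Spring @Async setup itself, just a minimal illustration of the same idea: `ParallelFetch`, `fetchAll`, and the `fetcher` function (standing in for a blocking call such as `bodyToMono(...).block()`) are hypothetical names, and the pool size is an arbitrary assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

final class ParallelFetch {

    static <V> Map<String, V> fetchAll(Set<String> locations, Function<String, V> fetcher, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            // Start every call first so they run concurrently on the pool.
            Map<String, CompletableFuture<V>> pending = new LinkedHashMap<>();
            for (String location : locations) {
                pending.put(location, CompletableFuture.supplyAsync(() -> fetcher.apply(location), pool));
            }
            // Only the calling thread writes to the result map, so a plain map is safe here.
            Map<String, V> results = new LinkedHashMap<>();
            pending.forEach((location, future) -> results.put(location, future.join()));
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the result map is filled by the calling thread after each `join()`, there are no concurrent writes, which sidesteps the kind of lost entries that concurrent `put` calls on a `HashMap` can cause.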