2
votes

Given is a SpringBoot application which hosts a graphQL server. It uses many DataLoaders and they seem to perform fine themselves, but I noticed a very big performance leak related to (or in-between) them I can't clearly narrow down - but measure it.

Problem

  1. Some (UI) client calls a query of graphQL API from the service, which will trigger a fetch of x elements (x > 10_000).
  2. GraphQLQueryResolver is called inside SpringBoot service, which fetches x elements.
  3. Functions of field getters are called, which return CompletionStage<T> from loaded DataLoader<K,T>
  4. Something that takes very long time
  5. DataLoader<K,T> implementation class is called with batched keys and returns results.

The example log looks like this:

2020-06-19T18:25:14.196Z [http-nio-80-exec-10] ~ Shopping ~ INFO  ~ It took |> PT0.095S <| for 'orders query with filter:[OrderFilter(col=createdAt, operator=BETWEEN, value=2020-01-05T00:00:00Z AND 2020-05-31T00:00:00Z)]'
2020-06-19T18:25:18.686Z [DefaultDispatcher-worker-6] ~ Shopping ~ INFO  ~ It took |> PT0.001S <| for 'orderKpiDataLoader' (PT0.000000095S on average for #10476 executions)
2020-06-19T18:25:23.229Z [DefaultDispatcher-worker-19] ~ Shopping ~ INFO  ~ Start 'priceForOrderReferences'
2020-06-19T18:25:24.840Z [DefaultDispatcher-worker-41] ~ Shopping ~ WARN  ~ It took |> PT1.613S <| for 'orderDepositDataLoader' (PT0.00015397S on average for #10476 executions)

Explanation to the case of the log:

  1. 18:25:14.196 : "orders query with filter" is called and returns in 95ms #10476 elements
  2. + 4.49 S : all its fields "orderKpi" are returned from their DataLoader. It took 1ms to create and return them.
  3. + 4.54 S : field "priceForOrderReferences" of "orderKpi" is loaded over its own DataLoader. It took 1.613S to create and return them.

Solving questions

  • What happens in the time between DataLoaders are being called? (those +4.49S & +4.54S)
  • How can I reduce it?

Stuff I did and my assumptions

  • I tried to make DataLoaders singletons, because I thought that object creation takes so much time - it wasn't the issue.
  • I tried to introduce manual batching by overriding the batch size - it reduced the times a little but increased the overall runtime (various batch sizes work different for various element sizes)
  • Since batching reduced the time (very little), I assume that the performance leak comes from "collecting" the element Keys to make the a List and pass them to the DataLoader. I also measured manually the average-performance-leak-time per item and it always seems to be ~1ms per element. I didn't find any code part how they are collected, or how to override it myself.

New info:

  • I experimented with different data-loader libraries, like :
<dependency>
    <groupId>com.graphql-java-kickstart</groupId>
    <artifactId>graphql-kickstart-spring-boot-starter-tools</artifactId>
    <version>7.0.1</version>
</dependency>

and

<dependency>
    <groupId>com.graphql-java-kickstart</groupId>
    <artifactId>graphql-java-tools</artifactId>
    <version>6.0.2</version>
</dependency>

and also disabled caching - still the same outcome. I put more logs (of course carefully, so they don't slow the performance themselves) and my new clue is, that the graphql library is the problem itself, not the DataLoader. I removed in one case the DataLoader for a field - and it took the same amount of time until it was called!

Stats

Spring

<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.1.7.RELEASE</version>

My graphQL dependencies

<dependency>
    <groupId>com.graphql-java-kickstart</groupId>
    <artifactId>graphql-spring-boot-starter</artifactId>
    <version>7.0.1</version>
</dependency>
<dependency>
    <groupId>com.apollographql.federation</groupId>
    <artifactId>federation-graphql-java-support</artifactId>
    <version>0.4.1</version>
</dependency>
<dependency>
    <groupId>com.graphql-java-kickstart</groupId>
    <artifactId>playground-spring-boot-starter</artifactId>
    <version>7.0.1</version>
</dependency>
<dependency>
    <groupId>com.graphql-java</groupId>
    <artifactId>graphql-java-extended-scalars</artifactId>
    <version>1.0.1</version>
</dependency>
<dependency>
    <groupId>com.graphql-java</groupId>
    <artifactId>java-dataloader</artifactId>
    <version>2.2.3</version>
</dependency>

DataLoader

All DataLoaders implement this abstract:

import com.smark.shopping.utils.FunctionPerformance
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.future.future
import org.dataloader.DataLoader

private val dataLoaderSingleScopeIO = CoroutineScope(Dispatchers.IO)

interface DataLoaderEntryInterface<T, R> {
    val key: String
    val dataLoader: DataLoader<T, R>
}

abstract class DataLoaderEntryAbstract<T, R>(
    override val key: String,
    private val loader: suspend (List<T>) -> List<R>
) : DataLoaderEntryInterface<T, R> {

    private val performance = FunctionPerformance()

    override val dataLoader: DataLoader<T, R>
        get() = DataLoader.newDataLoader { ids ->
            dataLoaderSingleScopeIO.future {
                performance.executeMeasuringSuspend(key, ids.size) { loader(ids) }
            }
        }
}

@Component
class DataLoaderOrderDeposit(private val orderTotalPriceService: OrderTotalPriceService) : DataLoaderEntryAbstract<Int, Int>(
    key = ORDER_DEPOSIT_DATA_LOADER,
    loader = { orderReferences -> orderTotalPriceService.priceForOrderReferences(orderReferences).map { it.deposit } }
)

Resolver

@Component
class OrderResolver : GraphQLResolver<ShopOrder> {

     fun kpi(shopOrder: ShopOrder, dfe: DataFetchingEnvironment): CompletionStage<OrderKpi> =
        DataLoaderFuture<OrderKpi>(dfe, ORDER_KPI_DATA_LOADER).loadBy(shopOrder)
}

@Component
class OrderKpiResolver : GraphQLResolver<OrderKpi> {

    fun deposit(orderKpi: OrderKpi, dfe: DataFetchingEnvironment): CompletionStage<Int> =
        dfe.getDataLoader<Int, Int>(ORDER_DEPOSIT_DATA_LOADER).load(orderKpi.orderReference)
}

Context Builder

@Component
class CustomGraphQLContextBuilder(
    private val dataLoadersSummelsarium: DataLoadersSummelsarium
) : GraphQLServletContextBuilder {

    override fun build(req: HttpServletRequest, response: HttpServletResponse): GraphQLContext =
        DefaultGraphQLServletContext.createServletContext(buildDataLoaderRegistry(), null)
                .with(req)
                .with(response)
                .build()

    override fun build(session: Session, request: HandshakeRequest): GraphQLContext =
        DefaultGraphQLWebSocketContext.createWebSocketContext(buildDataLoaderRegistry(), null)
                .with(session)
                .with(request)
                .build()

    override fun build(): GraphQLContext = DefaultGraphQLContext(buildDataLoaderRegistry(), null)

    private fun buildDataLoaderRegistry(): DataLoaderRegistry =
        DataLoaderRegistry().apply {
            dataLoadersSummelsarium.dataLoaders().forEach { register(it.key, it.dataLoader) }
        }
}
1
Performance leak? Slowdown may be a better word (leak is more commonly associated with resource, as in memory). - tucuxi
It would be much easier to answer if there were some kind of small, self-contained example reproducing the problem. Right now you have an awful lot of stack to look through, and the question only covers some of the parts. - tucuxi
@tucuxi Yes, you're right. I'll adapt the name when bounty is over. Considering the " awful lot of stack to look through" - this is already a simplified version. If I leave out more, it may not represent my case anymore - and I don't know what CAN be left out since I don't know whats wrong. - Neo
A profiler could also help you pinpoint where time is being spent, without doing a lot of manual instrumentation. - rud

1 Answers

0
votes

My own solution:


What happens in the time between DataLoaders are being called? (those +4.49S & +4.54S)

I'm still not sure what exactly slows it down, but it seems to be a problem of the graphql-java dependency. When executing similar query in graphql java-script implementation, the time delays were around ~60ms.

How can I reduce it?

  1. I found a good article concerning graphQL performance where I took the advice to load more data in advance, on top level. This way it is also possible to create custom Classes which hold more data and may reduce redundant repository paths on lower levels.
  2. To avoid waiting times of DataLoaders to "collect" keys before executing the function, you may choose certain paths to look-ahead and execute the loaders yourself - on upper levels.
  3. Play around batching sizes of different DataLoaders

If you would simplify the answer, it could be: "If you want to be fast, don't go deep".


I am still open for other solutions.