
I'm using Caffeine v2.8.5 and I want to create a cache with a variable expiry based on:

  • the creation/update of the value and
  • the last access (read) of this value.

Whatever comes first should trigger the removal of that entry.


The cache will be part of a three-layered resolution of values:

  1. The key is present in the Caffeine cache
    • use this value
    • refresh access/read expiry
  2. The key is present in the Redis database
    • use this value
    • store this value in the Caffeine cache with the remaining TTL (Time to live) of the Redis key
  3. The key was neither present in the internal cache nor Redis
    • request the value from an external REST API
    • store this value in the Redis database with a fixed expiration of 30 days
    • store this value in the Caffeine cache with a fixed expiration of 30 days
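To make the flow concrete, here is a simplified sketch of the three steps. It is not my actual CacheLoader: plain maps stand in for Caffeine and Redis, a Function stands in for the REST API, and the names LayeredLookup, Entry and resolve are made up for this illustration. It also leaves out the read-based expiry refresh from step 1.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustration only: maps replace Caffeine and Redis, a Function replaces the REST API.
final class LayeredLookup {

    // Fixed expiration applied when the value comes from the REST API.
    static final Duration FIXED_TTL = Duration.ofDays(30);

    // A value together with the instant at which it stops being valid.
    static final class Entry {
        final String value;
        final Instant validUntil;

        Entry(String value, Instant validUntil) {
            this.value = value;
            this.validUntil = validUntil;
        }
    }

    final Map<String, Entry> local = new ConcurrentHashMap<>();  // stand-in for Caffeine
    final Map<String, Entry> shared = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Function<String, String> restApi;              // stand-in for the REST API

    LayeredLookup(Function<String, String> restApi) {
        this.restApi = restApi;
    }

    String resolve(String key, Instant now) {
        // (1) Present and still valid in the local cache: use it.
        Entry cached = local.get(key);
        if (cached != null && cached.validUntil.isAfter(now)) {
            return cached.value;
        }
        // (2) Present in the shared cache: copy it locally. Keeping validUntil
        //     means the local copy only lives for the remaining TTL.
        Entry remote = shared.get(key);
        if (remote != null && remote.validUntil.isAfter(now)) {
            local.put(key, remote);
            return remote.value;
        }
        // (3) Not cached anywhere: ask the API and store in both layers for 30 days.
        Entry fresh = new Entry(restApi.apply(key), now.plus(FIXED_TTL));
        shared.put(key, fresh);
        local.put(key, fresh);
        return fresh.value;
    }
}
```

The point of carrying validUntil instead of a duration is that the local copy made in step 2 automatically inherits whatever lifetime is left in the shared layer.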

Redis is used as a global cache so that multiple applications/instances can share the cached data. However, this resolution happens so often that Redis cannot be queried for every request, so another caching layer is necessary.

The requested data has varying TTLs, based on the time of the request. So while the expiry time is fixed when we request the REST API and that expiry is set in Redis, the time will be dynamic in Caffeine, as the expiry there is based on the remaining TTL of the Redis key.

Cases (2) and (3) are already solved within my CacheLoader for the Caffeine cache (I use the cache in read-through mode). To control the expiration, I already found out that I'll have to make use of the advanced Expiry API, and I've also looked into similar questions like (Specify expiry for an Entry) and (Expire cached values after creation time). So I came up with a wrapper object for my values like this:

import lombok.Value;
import org.jetbrains.annotations.NotNull;
import org.jetbrains.annotations.Nullable;

import java.time.Instant;

@Value
public class ExpiringValue<ValueType> {

    @Nullable
    private final ValueType value;
    @NotNull
    private final Instant validUntil;
}

and an Expiry like this:

import com.github.benmanes.caffeine.cache.Expiry;
import org.jetbrains.annotations.NotNull;

import java.time.Duration;
import java.time.Instant;

public final class ValueBasedExpiry<KeyType, ValueType extends ExpiringValue<?>> implements Expiry<KeyType, ValueType> {

    @Override
    public long expireAfterCreate(
        @NotNull final KeyType key,
        @NotNull final ValueType value,
        final long currentTime
    ) {
        return Duration.between(Instant.now(), value.getValidUntil()).toNanos();
    }

    @Override
    public long expireAfterUpdate(
        @NotNull final KeyType key,
        @NotNull final ValueType value,
        final long currentTime,
        final long currentDuration
    ) {
        return currentDuration;
    }

    @Override
    public long expireAfterRead(
        @NotNull final KeyType key,
        @NotNull final ValueType value,
        final long currentTime,
        final long currentDuration
    ) {
        return currentDuration;
    }
}

What's different in my use case is that I'd like to have a second expiry criterion based on the last access of the value. So I'd like to remove the entry early if it has not been requested for an hour. And if it is frequently accessed, it will eventually be removed once the TTL reaches zero.

How would I implement this second criterion? I don't know how I would get the last time that an entry was accessed. The interface does not seem to provide such a value. I also looked into this question. Is it correct that the methods will be called/re-evaluated periodically, based on the scheduler bucket that the entry has been sorted into?

It sounds like you could implement expireAfterRead that stores the last access as another timestamp on ExpiringValue and returns the minimum duration between validUntil and lastAccessDuration. Then the entry has a 1 hr TTL by access time, up until it becomes invalid due to Redis TTL. Or am I missing something? - Ben Manes
@BenManes Hey, thanks for the quick response! I guess I'm being a bit blind, but how would I even get the last access time at all? The interface only seems to provide me with the consistent, but arbitrary nano time (from the Ticker) and the previously set duration. Also: How do the three durations work together? Are all of them required to reach zero in order to evict/invalidate the entry or is only one of them needed? - Scrayos
You can write to a volatile field in ExpiringValue within expireAfterRead to maintain your own state for calculations. The cache only stores one timestamp, which the 3 methods assist in calculating. Since you know the max valid time, maybe you can simply calculate the window as Math.min(TimeUnit.HOURS.toNanos(1), Duration.between(Instant.now(), value.getValidUntil()).toNanos())? - Ben Manes
@BenManes Ah, that code example made it much clearer! So as you've written TimeUnit.HOURS.toNanos(1), I do not have to return the remaining duration, but the initial duration? What I found confusing is that I thought I always had to return the remaining duration each time the callback was triggered. And I thought that it would never expire if I, for example, always returned one hour, as it would always be "refreshed" and therefore never reach zero. Could you tell me if I understood it correctly now? - Scrayos
You do return the remaining duration. Notice the Math.min, which would shrink it until expired when reaching your invalid time. - Ben Manes

1 Answer


My big misconception about how an Expiry works was that I thought its methods would be triggered and re-evaluated periodically. I'm answering my own question in case someone else gets the same impression from their research.

The methods of the Expiry are only called (and the expiration therefore only updated) when the action named by the method has been performed. For example, expireAfterRead(K, V, long, long) is only called when this key-value mapping is read from the cache.

So if nothing ever happens to a mapping after its creation (no reads or updates), only expireAfterCreate(K, V, long) is called, and only once. That is why every method should return the remaining duration, but none of them has to look up the last time an entry was read: that moment is the present (as in Instant.now()) whenever expireAfterRead(K, V, long, long) is called.

And as @BenManes pointed out in the comments, the correct solution for my initial question is returning

Math.min(TimeUnit.HOURS.toNanos(1), Duration.between(Instant.now(), value.getValidUntil()).toNanos())

in all three methods of the Expiry.
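As a rough sketch of that expression, pulled out into a helper so it can be tried without a cache: the class name RemainingTtl and the method nanos are my own and not part of Caffeine. Clamping to zero and the overflow guard are additions of mine, on the assumption that a negative or overflowing duration should never be handed back to the cache.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Caffeine.
final class RemainingTtl {

    // Idle window from the question: drop the entry if it is not read for an hour.
    static final long MAX_IDLE_NANOS = TimeUnit.HOURS.toNanos(1);

    private RemainingTtl() {
    }

    // Returns the smaller of the idle window and the time left until validUntil.
    static long nanos(Instant now, Instant validUntil) {
        if (!validUntil.isAfter(now)) {
            return 0L; // already invalid: no remaining lifetime (my own clamp)
        }
        long untilInvalid;
        try {
            untilInvalid = Duration.between(now, validUntil).toNanos();
        } catch (ArithmeticException overflow) {
            // A very distant validUntil does not fit into a long of nanoseconds.
            untilInvalid = Long.MAX_VALUE;
        }
        return Math.min(MAX_IDLE_NANOS, untilInvalid);
    }
}
```

Each of the three Expiry methods would then return RemainingTtl.nanos(Instant.now(), value.getValidUntil()). With three hours left until validUntil, a read yields the full hour; with only 30 minutes left, the same call yields 30 minutes, which is how the window shrinks towards the Redis TTL.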


And to answer my other two questions in the post:

How would I get the last time that an entry was accessed? Call (for example) Instant.now() in the expireAfterRead(K, V, long, long) method. If you also want to have that value externally or in the other expire-methods, there is always the option to store this value in the ExpiringValue with a volatile field.
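A minimal sketch of such a value class, written without Lombok so that it stands on its own. TrackedValue, markAccessed and getLastAccess are names I made up for this example; the idea is that expireAfterRead would call markAccessed(Instant.now()) before returning its duration.

```java
import java.time.Instant;

// Hypothetical variant of ExpiringValue that also remembers the last read.
final class TrackedValue<V> {

    private final V value;
    private final Instant validUntil;

    // volatile so that a timestamp written by the reading thread is visible to other threads
    private volatile Instant lastAccess;

    TrackedValue(V value, Instant validUntil, Instant createdAt) {
        this.value = value;
        this.validUntil = validUntil;
        this.lastAccess = createdAt; // until the first read, creation counts as the last access
    }

    V getValue() {
        return value;
    }

    Instant getValidUntil() {
        return validUntil;
    }

    Instant getLastAccess() {
        return lastAccess;
    }

    // Meant to be called from expireAfterRead with the current time.
    void markAccessed(Instant now) {
        this.lastAccess = now;
    }
}
```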

Is it correct that the methods will be called/re-evaluated periodically, based on the scheduler bucket that the entry has been sorted into? No. As explained above, the methods within Expiry will only be called once the corresponding action has been performed. The methods will not be triggered or re-evaluated periodically.