Django - vary_on_cookie or user cache results in misses for anonymous users

Question

I'm attempting to implement a file-based per-view cache on some specific report pages. I'll focus on one in particular as an example, called ScorecardSummary.

In urls.py:

url(r'^(?P<institution_slug>[^/]+)/report/(?P<submissionset>[^/]+)/$',
    ScorecardSummary.as_view(), name='scorecard-summary'),

In views.py:

class ScorecardSummary(ScorecardView):
    template_name = 'institutions/scorecards/summary.html'

To implement the caching, I altered the view like so:

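from django.utils.cache import patch_cache_control
from django.utils.decorators import method_decorator
from django.views.decorators.cache import cache_page
from django.views.decorators.vary import vary_on_cookie


class ScorecardSummary(ScorecardView):
    template_name = 'institutions/scorecards/summary.html'

    # Cache for 30 days in the "filecache" back end, keyed per Cookie header.
    @method_decorator(vary_on_cookie)
    @method_decorator(cache_page(86400 * 30, cache="filecache"))
    def dispatch(self, request, *args, **kwargs):
        response = super(ScorecardSummary, self).dispatch(request, *args, **kwargs)
        # Anonymous responses are marked public, authenticated ones private.
        if request.user.is_anonymous():
            patch_cache_control(response, public=True)
        else:
            patch_cache_control(response, private=True)

        return response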

I've tried a few different permutations of these decorators both before the method and within the logic using patch_cache_control() or patch_vary_headers().

Without vary_on_cookie, every user sees the original user's information at the top corner of the page (i.e. "email address | Log Out Link"). This is also true of anonymous users, and this is where this problem began.

I found the above solution. As written, this works in that it creates a new cached version of the page for each user or anon. This is fantastic for each

My problem is that EVERY time an anon accesses the page, regardless of an existing cache file, it will create a new cache rather than loading the existing one, which sort of defeats the purpose. If I fix it to cache for anonymous, then we're back to EVERY user being served the exact same version and I'm right back where I started.

I've also tried using vary_on_headers('User-Agent'), but it still differentiates every anonymous user rather than treating them as identical. Setting private or public to True or not seems to have no effect.
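One likely explanation, offered as a hedged sketch rather than a confirmed diagnosis: vary_on_cookie adds Cookie to the response's Vary header, and the page cache folds the request's value for every header named in Vary into the cache key. Anonymous visitors rarely send identical Cookie headers (a per-visitor CSRF or analytics cookie is enough to differ), so each one ends up with a distinct key. The function below is a simplified stand-in written for illustration, not Django's actual key-generation code; simplified_cache_key, the example path, and the cookie values are all hypothetical.

```python
import hashlib


def simplified_cache_key(path, vary_headers, request_headers):
    """Toy model of a Vary-aware page cache key (not Django's real code).

    The key combines the URL path with the request's values for every
    header named in Vary, so any difference in those values yields a new key.
    """
    digest = hashlib.md5()
    for name in vary_headers:
        value = request_headers.get(name)
        if value is not None:
            digest.update(value.encode("utf-8"))
    path_digest = hashlib.md5(path.encode("utf-8")).hexdigest()
    return "page.%s.%s" % (path_digest, digest.hexdigest())


path = "/some-institution/report/2015/"  # hypothetical URL

# Two anonymous visitors whose only cookie is a per-visitor CSRF token.
anon_a = {"Cookie": "csrftoken=aaa111"}
anon_b = {"Cookie": "csrftoken=bbb222"}

# Varying on Cookie: different cookie strings, so different keys (a miss each time).
key_a = simplified_cache_key(path, ["Cookie"], anon_a)
key_b = simplified_cache_key(path, ["Cookie"], anon_b)

# Not varying on anything: one shared key for everyone, logged in or not.
shared_a = simplified_cache_key(path, [], anon_a)
shared_b = simplified_cache_key(path, [], anon_b)

print(key_a == key_b)        # False
print(shared_a == shared_b)  # True
```

If this is what is happening, no ordering of the decorators would help, because the key depends on the raw Cookie string rather than on whether the user is logged in.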

So my question is this:

Is there some permutation of these decorators to control the caching such that BOTH authenticated and anonymous are cached, and if an anonymous user has already triggered the creation of a cache, that version is served to all anons rather than always missing.

Alternatively, is there a better way to implement this per-view caching that would bypass this issue and accomplish the same end? (i.e. caching regardless of authentication but without sharing user-specific data).

Thanks!

After some digging into the nuts and bolts of the user-agent idea, it looks like that's a header string that refers to either the authenticated user object or an AnonymousUser object instance created for that anon. So my problem is that it's varying on the whole unique instance. However, all of these instances will have the same "id" attribute set to "None". The answer is probably "not without a custom decorator", but is there any way to vary the cache based on an attribute of this object rather than the entire instance? — baronvonvaderham

baronvonvaderham · Accepted Answer · 2015-08-13T20:00:26

UPDATE: FURTHER INFORMATION AT BOTTOM OF THIS ANSWER

I did not find the solution to this specific problem (if anyone has one I'd love to hear it to learn more!), but I found an acceptable workaround.

I had initially considered a custom template tag, but it wasn't worth writing the compilation and render functions and stuff, especially when these per-view decorators seemed so much simpler.

I revisited this idea because obviously only caching MOST of the page, leaving out the top header block, is ideal. Then the same cache can be served to all users without risking including user-specific data.
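The idea of caching only the shared part of the page can be sketched without any framework. This is a minimal illustration under stated assumptions, not the template-tag implementation described in this answer: shared_cache is a plain dict standing in for the file cache, and render_report_body, render_page, and the report id are hypothetical names.

```python
# Stand-in for the file-based cache back end: a plain dict.
shared_cache = {}
render_count = {"body": 0}


def render_report_body(report_id):
    """Pretend this is the expensive, user-independent part of the page."""
    render_count["body"] += 1
    return "<main>report %s</main>" % report_id


def render_page(report_id, user_email=None):
    """Build the page: per-request header, cached body shared by all users."""
    # The header is rendered on every request and is never cached.
    header = "<header>%s</header>" % (user_email or "Log In")
    key = "report-body:%s" % report_id
    if key not in shared_cache:
        shared_cache[key] = render_report_body(report_id)
    return header + shared_cache[key]


first = render_page(42)
second = render_page(42)
third = render_page(42, user_email="user@example.com")

print(render_count["body"])  # 1
```

The body is rendered once and reused for both anonymous requests and the logged-in one, while the header still differs per user.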

In doing so I accidentally stumbled upon: https://github.com/twidi/django-adv-cache-tag

This allows me to create a custom template tag and specify a back-end other than default.

It's not ideal. I'd rather write something that simply adds a new {% adv_cache %} tag that takes options, instead of taking over the default {% cache %} tag. That would eliminate the need to extend it by writing custom tags that just call this thing's methods.

Anyway, that's my two cents, I'll probably contribute something a little more automated in the future to replace this, or perhaps look at modifying the native cache tag to accept arguments to specify a non-default back-end.

UPDATE:

My final solution did indeed require making use of django-adv-cache-tag and creating a custom template tag. Since this is probably a pretty common problem, and the docs for that package are SEVERELY lacking in easy-to-follow examples, I wrote my own package that extends it:

https://github.com/baronvonvaderham/django-file-cache-tag

I'll be registering that with pypi soon for easier installation.

That's written in such a way that you can very easily install it, modify a few settings.py lines, and go right into using the tag. I think it's a readable example that could serve as a model for extending this with your own custom tags for back ends other than file-based.

I also demonstrated how to interface with the django caching API with multiple back-ends for django<1.8, as I've found the official docs for those earlier versions are actually inaccurate and self-contradictory. Example: It says to load the specific cache by:

from django.core.cache import cache
my_cache = cache.get_cache('backend-name-from-settings')

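This throws an error that 'cache' has no attribute or method get_cache(). In those versions get_cache is a module-level function that has to be imported from django.core.cache itself, not called on the cache object. For 1.7+, they added "caches" to this library, so importing works more easily without any method calls: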

from django.core.cache import caches
my_cache = caches['backend-name-from-settings']

Easy. Just access the appropriate dict key.

Further, I wanted to demonstrate use of their key generation functions outside of the context of the cache tag class. I think a flaw of this package was making those class methods so that they can only be invoked if you spin up an instance of it...which can't be done as far as I can tell (or at least you can't make a bogus "test-cache = CacheTag()" call to make a dummy instance and access its methods).

I abstracted the way django-adv-cache-tag makes cache keys into a much more logical "generate_cache_key()" function that is accessible globally. This was critical to manually invalidating the cache on certain pages if a user submitted a correction that updated a report. The associated reports needed to be updated, and this sort of long-term caching was incompatible with that.

I simply tacked on an invalidation function. Then in whatever view you have that's altering data, you can just use generate_cache_key() to make any keys there might be for that page (I loop through all permutations of my vary_on parameters), and call the invalidate_filecache() function to nuke those pages.
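The loop over vary_on permutations can be sketched as follows. This is only an illustration under assumptions: the function names match the ones mentioned above, but their signatures and bodies here are hypothetical stand-ins rather than the package's actual code, fake_filecache is a dict standing in for the file cache, and invalidate_report and the example variant values are invented for the example.

```python
import hashlib
from itertools import product

# Stand-in for the file cache: a plain dict keyed by cache key.
fake_filecache = {}


def generate_cache_key(fragment_name, vary_on):
    """Hypothetical key builder: fragment name plus a hash of the vary_on values."""
    joined = ":".join(str(value) for value in vary_on)
    return "%s.%s" % (fragment_name, hashlib.md5(joined.encode("utf-8")).hexdigest())


def invalidate_filecache(key):
    """Hypothetical invalidation: drop the entry if present, ignore it otherwise."""
    fake_filecache.pop(key, None)


def invalidate_report(fragment_name, institution_slug, submissionset, variants):
    """Delete every cached variant of one report page."""
    for variant in product(*variants):
        key = generate_cache_key(
            fragment_name, (institution_slug, submissionset) + variant
        )
        invalidate_filecache(key)


# Populate the fake cache for two institutions across all variants.
variants = [("anonymous", "authenticated"), ("preview", "final")]
for variant in product(*variants):
    for slug in ("college-a", "college-b"):
        key = generate_cache_key("scorecard", (slug, "2015") + variant)
        fake_filecache[key] = "<html>cached page</html>"

# A correction was submitted for college-a: remove only its cached pages.
invalidate_report("scorecard", "college-a", "2015", variants)
print(len(fake_filecache))  # 4
```

Only the four college-a entries are removed; the four college-b entries stay cached.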

That said, it's WAY easier in 1.8+, as all of this is included in the native {% cache %} tag (where the back-end key is a default argument now), and the new django.core.cache.caches you can import to use multiple back ends in your views.

But if you're like me and stuck with legacy code written pre-1.8, I hope this can help you so you don't have to go through the whole process I did to figure out what should be a really simple caching variation in your templates (and obviously the django core devs agree, since this was added as default functionality).
