12
votes

I'm beginning to work on the caching infrastructure for my ASP.NET MVC site. The problem is, I can't seem to find a reasonable place for data caching (other than 'everywhere').

Right now my architecture looks like this:

Controller -> Service Layer -> Repository. The repository uses Linq to SQL for data access.

The repository exposes generic methods like Insert, GetById, and GetQueryable, which returns an IQueryable that the service layer can further refine.

I like the idea of putting caching in the repository layer, since the service layer shouldn't really care where the data comes from. The problem though is with cache invalidation. The service layer has more information about when data becomes stale than the repository. For instance:

Suppose we have a Users table and an Orders table (the canonical example). The service layer offers methods like GetOrder(int id), which would call the repository layer:

public Order GetOrder(int id)
{
    using(var repo = _repoFactory.Create<Order>())
    {
        return repo.GetById(id);
    }
}

or

repo.GetQueryable(order => order.Id == id && order.HasShipped == false).Single();

If we cache in the repository layer, it seems the repository would have very limited knowledge of when that order data has changed. Suppose the user was deleted, causing all their orders to be deleted with a CASCADE. The service layer could invalidate the Orders cache, since it knows the user was just removed. The repository, though (since it's a Unit of Work), wouldn't be aware. (Ignore the fact that we shouldn't be querying orders for a deleted user; it's just an example.)

There are other situations where I think this shows itself. Suppose we want to fetch all of a user's orders:

repo.GetQueryable(order => order.UserId == userId).ToList();

The repository can cache the results of this query, but if another order is added, the cached result is no longer valid. Only the service layer is aware of this, though.

It's also possible my understanding of the repository layer is wrong. I sort of view it as a facade around the data source (i.e. changing from L2SQL to EF to whatever, the service layer is unaware of the underlying source).

2
Looks like a correct assessment of caching and its perils. gt124

2 Answers

8
votes

Realistically, you will need another layer: the data caching layer. Your service layer will use it when requesting data. On each request, it decides whether it has the data in cache or needs to query the appropriate repository. Likewise, your service layer can tell this new data caching layer about an invalidation (the deletion of a particular user, etc.).

What this can mean for your architecture, though, is that your data caching layer will implement the same interface(s) your repositories do. A fairly simple implementation would cache the data by entity type and key. However, if you are using a more sophisticated ORM behind the scenes (NHibernate, EF 4, etc.), it should have caching as an option for you.
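As a rough illustration of the "same interface, cached by entity type and key" idea, here is a minimal sketch. The IRepository<T> contract shown is a hypothetical, cut-down stand-in (the repository in the question also exposes Insert and GetQueryable), and CachingRepository, Invalidate, and InvalidateAll are names made up for this example. It only covers lookups by key; cached query results would need their own invalidation strategy.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical, cut-down repository contract used only for this sketch.
public interface IRepository<T> where T : class
{
    T GetById(int id);
}

// Wraps a real repository and implements the same interface, so the service
// layer can use it wherever it would use the repository. One instance caches
// one entity type (T), keyed by id.
public class CachingRepository<T> : IRepository<T> where T : class
{
    private readonly IRepository<T> _inner;
    private readonly ConcurrentDictionary<int, T> _cache =
        new ConcurrentDictionary<int, T>();

    public CachingRepository(IRepository<T> inner)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        _inner = inner;
    }

    public T GetById(int id)
    {
        // Falls through to the wrapped repository on a cache miss. Under
        // contention the factory may run more than once for the same key.
        return _cache.GetOrAdd(id, key => _inner.GetById(key));
    }

    // Called by the service layer when it knows a single entity is stale.
    public void Invalidate(int id)
    {
        T ignored;
        _cache.TryRemove(id, out ignored);
    }

    // Called by the service layer after broad changes, e.g. a cascading delete.
    public void InvalidateAll()
    {
        _cache.Clear();
    }
}
```

With something like this, the service layer would call InvalidateAll on the Orders cache after deleting a user, which is exactly the knowledge the repository itself lacks.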

2
votes

You could put an event on the objects returned by your repositories, and have the repository subscribe a cache invalidation handler to that event.

For example,

  public class SomethingRepository{
        public Something GetById(int id){
            var something = _table.Single(x => x.Id == id);
            something.DataChanged += this.InvalidateCache;
            return something;
        }

        public void InvalidateCache(object sender, EventArgs e){ 
            // invalidate your cache 
        }
  }

Your Something object then needs a DataChanged event and a public method that your service layer can call to trigger it. For example:

  public class Something{
       private int _id;
       public int Id{
         get { return _id; }
         set {
            if( _id != value ) 
            {
              _id = value;
              OnDataChanged();
            }
         }
       }
       public event EventHandler DataChanged;
       public void OnDataChanged(){
            if(DataChanged!=null)
                 DataChanged(this, EventArgs.Empty);
       }
  } 

So, all your service layer needs to know is that the data is being changed, and the repository handles the cache invalidation.

I also suggest you take ventaur's advice and put the cache invalidation logic in a separate service. You don't need to go so far as to create a separate "data caching layer", but the logic would be cleaner if kept in a different class.
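To make the "different class" suggestion concrete, here is one possible sketch. SomethingCacheInvalidator and its Track method are hypothetical names, the cache is assumed to be a plain dictionary keyed by Id, and Something is a simplified stand-in for the class above so the example stays self-contained.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the Something class shown earlier.
public class Something
{
    public int Id { get; set; }
    public event EventHandler DataChanged;

    public void OnDataChanged()
    {
        var handler = DataChanged;
        if (handler != null)
            handler(this, EventArgs.Empty);
    }
}

// Hypothetical helper that owns the invalidation logic, so the repository
// only has to hand it each entity it returns.
public class SomethingCacheInvalidator
{
    private readonly IDictionary<int, Something> _cache;

    public SomethingCacheInvalidator(IDictionary<int, Something> cache)
    {
        if (cache == null) throw new ArgumentNullException("cache");
        _cache = cache;
    }

    public void Track(Something item)
    {
        // Unsubscribe first so that tracking the same object twice does not
        // attach the handler twice.
        item.DataChanged -= HandleDataChanged;
        item.DataChanged += HandleDataChanged;
    }

    private void HandleDataChanged(object sender, EventArgs e)
    {
        var item = sender as Something;
        if (item != null)
            _cache.Remove(item.Id); // evict the stale entry
    }
}
```

The repository's GetById would then call Track(something) before returning, instead of subscribing its own InvalidateCache method.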