DDD Repository EF Performance

Question

How do people who follow DDD get around the potential performance issues of using EF with the repository pattern when a repository returns an aggregate root together with its children?

e.g. Parent ----- Child A

Or even e.g. Parent ----- Child A ------- Child A2

1. If I bring back the aggregate root's data from the repository and then use a navigation property, EF fires off another query because it is using lazy loading. This is a problem because we see 100+ queries when we do this in a loop.
2. If I bring back the aggregate root's data together with the children's data by using 'Include' statements, the children come back from the repository with their parent. When I then use the navigation properties, no queries fire because the data is already in memory.
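To make the difference in query counts concrete, here is a minimal sketch. It does not use EF. It uses Python's built-in sqlite3 module, and the `parent`/`child` tables, the `CountingDb` helper and both loader functions are hypothetical names invented for illustration. The first loader mimics lazy loading (one query per parent), the second mimics an eager 'Include' (one joined query).

```python
import sqlite3


class CountingDb:
    """In-memory SQLite database that counts the queries sent to it (illustrative helper)."""

    def __init__(self, parents=100, children_per_parent=3):
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript("""
            CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER, amount INTEGER);
        """)
        self.conn.executemany(
            "INSERT INTO parent (id, name) VALUES (?, ?)",
            [(p, f"parent-{p}") for p in range(1, parents + 1)],
        )
        self.conn.executemany(
            "INSERT INTO child (parent_id, amount) VALUES (?, ?)",
            [(p, a) for p in range(1, parents + 1)
             for a in range(1, children_per_parent + 1)],
        )
        self.query_count = 0

    def query(self, sql, params=()):
        self.query_count += 1
        return self.conn.execute(sql, params).fetchall()


def load_lazily(db):
    """One query for the parents, then one more query per parent (the N+1 pattern)."""
    result = {}
    for (parent_id,) in db.query("SELECT id FROM parent"):
        rows = db.query("SELECT amount FROM child WHERE parent_id = ?", (parent_id,))
        result[parent_id] = [amount for (amount,) in rows]
    return result


def load_eagerly(db):
    """A single joined query, grouped in memory afterwards."""
    result = {}
    rows = db.query(
        "SELECT p.id, c.amount FROM parent p JOIN child c ON c.parent_id = p.id"
    )
    for parent_id, amount in rows:
        result.setdefault(parent_id, []).append(amount)
    return result


lazy_db, eager_db = CountingDb(), CountingDb()
lazy_result = load_lazily(lazy_db)
eager_result = load_eagerly(eager_db)
print(lazy_db.query_count, eager_db.query_count)  # prints "101 1"
```

Both loaders end up with the same data. Only the number of round trips differs, and the eager version pays for that by holding every child row in memory at once.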

The problem with the second approach is that the data for a child object can be quite big, e.g. 100,000+ records. Obviously I don't want to hold 100,000+ child records in memory. We decided to use paging to select 10 at a time to get around this, but that raises another issue: when we run calculations on the children, such as a sum or a total count, we can only do them in memory on the 10 records we have pulled back.

I know the DDD way is to pull back the object graph with all of its data in memory and then you traverse through the objects for the data you need to display.

There is a split in our team: some believe we should pull back the aggregate root and its children together, and some feel we should have a method on the aggregate root's repository that queries the children's data directly and returns the child objects.

I just wondered how other people have solved the performance issues that come with holding large amounts of parent/child data in memory.

Maybe you have to re-model or split up your entities in certain cases. Also, don't forget that in complex scenarios you might have to ditch LINQ queries and map a stored procedure to your entity data model. This can improve performance as well. - hoetz

Ladislav Mrnka · Accepted Answer · 2012-07-10T13:42:37

If you have to deal with performance, you must use the second approach, with a special method exposed on the repository. That is the point of a repository: to provide you with such methods. Otherwise you could just use the EF context / set directly.
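As a rough sketch of what such special repository methods could look like, the example below pushes paging and aggregation down to the database instead of loading every child. It is written with Python's built-in sqlite3 module rather than EF, and `ParentRepository`, `ChildTotals`, `build_demo_db` and the table layout are all hypothetical names made up for this illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class ChildTotals:
    count: int
    total_amount: int


class ParentRepository:
    """Repository for the aggregate root with purpose-built child queries (illustrative)."""

    def __init__(self, conn):
        self.conn = conn

    def get_root(self, parent_id):
        # Loads only the root row, never its children.
        return self.conn.execute(
            "SELECT id, name FROM parent WHERE id = ?", (parent_id,)
        ).fetchone()

    def get_child_page(self, parent_id, page, page_size=10):
        # Paging is done by the database, so only one page is held in memory.
        offset = (page - 1) * page_size
        return self.conn.execute(
            "SELECT id, amount FROM child WHERE parent_id = ? "
            "ORDER BY id LIMIT ? OFFSET ?",
            (parent_id, page_size, offset),
        ).fetchall()

    def get_child_totals(self, parent_id):
        # Aggregates run over ALL children in the database, not over one page.
        count, total = self.conn.execute(
            "SELECT COUNT(*), COALESCE(SUM(amount), 0) FROM child WHERE parent_id = ?",
            (parent_id,),
        ).fetchone()
        return ChildTotals(count, total)


def build_demo_db(child_count=1000):
    """Creates one parent with `child_count` children whose amounts are 1..child_count."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER, amount INTEGER);
    """)
    conn.execute("INSERT INTO parent (id, name) VALUES (1, 'root')")
    conn.executemany(
        "INSERT INTO child (parent_id, amount) VALUES (1, ?)",
        [(n,) for n in range(1, child_count + 1)],
    )
    return conn


repo = ParentRepository(build_demo_db())
first_page = repo.get_child_page(1, page=1)
totals = repo.get_child_totals(1)
print(len(first_page), totals.count, totals.total_amount)  # prints "10 1000 500500"
```

The page holds 10 rows while the totals cover all 1,000 children, which is the combination the question says cannot be had from in-memory calculations over a single page.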

Theory is nice if you work with theoretical data - once you have real data you must tweak theory to work in real world scenarios.

You can also check this article (there are three follow-up articles on the blog). It implements the second approach but makes it look like the first. It works for Count, but maybe you can use the idea for some other scenarios as well.
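The article's actual implementation is not reproduced in this thread, so the following is only a guess at the general idea: a collection object that looks like an ordinary in-memory child collection to the caller, but answers a count with a COUNT query until the rows are really needed. It uses Python's built-in sqlite3 module instead of EF, and `ChildCollection` and the `child` table are hypothetical names.

```python
import sqlite3


class ChildCollection:
    """Looks like an in-memory collection, but counts in the database until iterated."""

    def __init__(self, conn, parent_id):
        self._conn = conn
        self._parent_id = parent_id
        self._rows = None  # filled only when the caller actually iterates

    @property
    def is_loaded(self):
        return self._rows is not None

    def __len__(self):
        if self._rows is not None:
            return len(self._rows)
        # Not loaded yet: ask the database for the count instead of fetching rows.
        return self._conn.execute(
            "SELECT COUNT(*) FROM child WHERE parent_id = ?", (self._parent_id,)
        ).fetchone()[0]

    def __iter__(self):
        if self._rows is None:
            self._rows = self._conn.execute(
                "SELECT id, amount FROM child WHERE parent_id = ? ORDER BY id",
                (self._parent_id,),
            ).fetchall()
        return iter(self._rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO child (parent_id, amount) VALUES (1, ?)", [(n,) for n in range(1, 6)]
)

children = ChildCollection(conn, parent_id=1)
count_before_load = len(children)        # runs SELECT COUNT(*), loads no rows
loaded_after_count = children.is_loaded  # still False at this point
amounts = [amount for _, amount in children]  # first iteration loads the rows
```

Calling code reads the count as if it were traversing the object graph, while the database still does the work.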

