Repository pattern - how to correctly handle JOINs and complex queries?

Question

I have a problem with Repository pattern - how to perform JOIN operations between several repositories. In this project, we use MVC, EF, DDD. I'm aware that this kind of question was here several times, I reference these questions later in this one.

Between generic repository model (IRepository) and specific repository model, I chose specific option, since I consider ORM (in our case EF) as a generic repository pattern itself, so it doesn't make sense to add another generic repository and we'd rather tailor the repository to the domain needs.

The problem is that I have several (~ 10) tables, each with many rows (millions), and I need to perform JOINs, so using IList or IEnumerable isn't viable option.

My understanding (and my perspective) is that IQueryable shouldn't leave the repository ("What happens in DAL, should stay in DAL."). It would be way simpler to expose IQueryable and use it in LINQ in service, but it strongly violates separation of concerns and undermines role of repositories - in such case, service will be doing the same thing as repository does. To pick a few, these articles back up this perspective (or rather conviction):

To return IQueryable<T> or not return IQueryable<T>

Should I return IEnumerable<T> or IQueryable<T> from my DAL?

http://www.shawnmclean.com/blog/2011/06/iqueryable-vs-ienumerable-in-the-repository-pattern/

http://blog.ploeh.dk/2012/03/26/IQueryableTisTightCoupling/

There are also similar questions and solutions, e.g. How to join Multiple tables using Repository Pattern & Entity Framework? that suggest .Include(), but this is not an option for heavy loaded tables and joins across many tables - with each JOIN, we use subselects to limit what's actually joined).

This question (the answer and comments) - How can I query cross tables with Repository Pattern? - basically suggests task-based differentiation: creating one Repository for queries with JOINS, and "regular" Repositories for manipulation with each entity.

I see we have these options:

Exposing IQueryable and performing JOIN complex queries in services; I sincerely feel it's anti-pattern, I don't like that.
Don't use Repository for these ~ 10 tables and perform queries in services; some articles suggested that using EF is enough (e.g. Is it okay to bypass the repository pattern for complex queries?), I don't concur with that.
Use task-based differentiation, don't limit repositories 1:1 repo:entity (I'm in favor of this option)
Something completely different?

So - what would you suggest? Again and again, thank you.

The point of dealing of a repository is to ignore the db all together. Db engines, tables, sql, joins etc are implementation details of a repository which never pass the persistence boundary. SO the app never knows about EF or joins and never access the db directly. The app knows only about the repository and it tells it what to do/get not how to do it — MikeSW
Hi Robert, I'm struggling with the same issue. What I've found from so far took me to the same conclusion that you had with the options. Which approach are you going for in the end and why? — Adelia Benalius
Hi, for complex queries we use some specific repositories only for queries, and in our case, queries inside these repositories are usually performed by stored procedures that return multiple record sets (when join isn't enough) and the repository translates that into specific (usually complex) domain object (which isn't entity per se, but I believe it still isn't abuse of DDD). — Robert Goldwein
And for the non-complex queries? For example GetOrderLineByOrderId or GetOrderLineByCustomerId — Adelia Benalius
For standard queries we use normal repository around aggregate root and EF6, in mentioned case that would be something like OrderRepository that deals with order domain. — Robert Goldwein

Bartłomiej Szypelow Bartłomiej Szypelow · Accepted Answer · 2014-01-21T12:26:38

means leaking persistence into application - antipattern, spaghetti-code.
how it differs from (1) exactly? Still the same issue.
a bit closer...
Use Query Object pattern. Encapsulate your complex query in a task-based object that resides alongside repositories. It can return DTOs optimized for view rather than domain objects.
Relying heavily on QO will lead you to an architecture called CQRS - Command-Query Responsibility Segregation.

One more thing. There is not 1:1 match for entity:repo. Only Aggregates should have a repository, not every single entity.

Repository pattern - how to correctly handle JOINs and complex queries?

1 Answers