13
votes

I have Table1 with the following relationships (they are not enforced they only create the relationship for the navigation properties)

Table1 (*)->(1) Table2
Table1 (*)->(1) Table3
Table1 (*)->(1) Table4
Table1 (*)->(1) Table5

Using eager loading code looks like

IQueryable<Table1> query = context.Table1s;

query = query.Include(Table1 => Table1.Table2);
query = query.Include(Table1 => Table1.Table3);
query = query.Include(Table1 => Table1.Table4);
query = query.Include(Table1 => Table1.Table5);

query = query.Where(row => row.Table1Id == table1Id);

query.Single();

Every way I try to organize the Include() statements, the first table included has an Inner Join in its generated TSQL and the remaining are Left Outer Join (I expect Left Outer for all of them). I am not Entity Splitting, they are just plain tables with FKs.

If DefaultIfEmpty() is the only solution, can someone explain the reason why when all but the first table included provide the SQL expected?

My understanding is that default behavior for a Navigation Property is LEFT OUTER but I cannot get ALL properties to generate the default.

Any help would be MUCH appreciated.

Thank you in advance!

----- Created TSQL (modified for brevity but structure the same) -------

(@p__linq__0 int)SELECT 
[Limit1].[Table1Id] AS [Table1Id], 
[Limit1].[OtherData] AS [OtherData]
FROM ( SELECT TOP (2) 
    [Extent1].[Table1Id] AS [Table1Id], 
    [Extent1].[OtherData] As [OtherData]
    FROM       [dbo].[Table1] AS [Extent1]
    INNER JOIN [dbo].[Table2] AS [Extent2] ON [Extent1].[Table2Id] = [Extent2].[Table2Id]
    LEFT OUTER JOIN [dbo].[Table3] AS [Extent3] ON [Extent1].[Table3Id] = [Extent3].[Table3Id]
    LEFT OUTER JOIN [dbo].[Table4] AS [Extent4] ON [Extent1].[Table4Id] = [Extent4].[Table4Id]
    LEFT OUTER JOIN [dbo].[Table5] AS [Extent5] ON [Extent1].[Table5Id] = [Extent5].[Table5Id]
    WHERE [Extent1].[Table1Id] = @p__linq__0
)  AS [Limit1]
3
What is the problem? Doesn't the query return the expected result with all included entities? I get the same with two Includes: An Inner Join and an Outer Join (and a Union of both). I've never noticed that but I don't see a wrong result.Slauma
We are not enforcing the data to exist in Table2, thus the inner join is causing an empty result when searching for Table1Id with a valid Id. None of the tables are having the relationship enforced so inner join on ANY of the eager loading will cause a "Sequence not found" unless the matching data were to have been put in the query for the joined table.Jamie Altizer
Why does it do the the SELECT TOP 2? That seems odd.mattruma
@mattruma: SELECT TOP 2 is always created when you use Single (to check if there is more than 1 record). It's normal.Slauma
If you are not enforcing data on Table2 you are doing it wrong because your relation says that every record in Table1 must have related record in Table2.Ladislav Mrnka

3 Answers

3
votes

in EF when doing IQueryable.Include() if none of the navigation properties are based on an enforced relationship then EF will use the first table. It expects that at least one of the relationships is enforced in the schema and that one should be coded with the IQueryable.Include() first, then add the other tables with Include()

23
votes

EF seems to use INNER JOIN for including a required and LEFT OUTER JOIN for including an optional navigation property. Example:

public class Order
{
    public int Id { get; set; }
    public string Details { get; set; }
    public Customer Customer { get; set; }
}

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

If I define Customer as a required property on Order...

public class MyContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Order>()
            .HasRequired(o => o.Customer)
            .WithMany();
    }
}

...and issue this query...

using (var ctx = new MyContext())
{
    var result = ctx.Orders
        .Include(o => o.Customer)
        .Where(o => o.Details == "Peanuts")
        .FirstOrDefault();
}

...I get this SQL:

SELECT TOP (1) 
[Extent1].[Id] AS [Id], 
[Extent1].[Details] AS [Details], 
[Extent2].[Id] AS [Id1], 
[Extent2].[Name] AS [Name]
FROM  [dbo].[Orders] AS [Extent1]
INNER JOIN [dbo].[Customers] AS [Extent2] 
    ON [Extent1].[Customer_Id] = [Extent2].[Id]
WHERE N'Peanuts' = [Extent1].[Details]

If I change in the model configuration .HasRequired(o => o.Customer) to...

.HasOptional(o => o.Customer)

... I get exactly the same query except that INNER JOIN [dbo].[Customers] AS [Extent2] is replaced by:

LEFT OUTER JOIN [dbo].[Customers] AS [Extent2]

From model viewpoint it makes sense because you are saying that there can never be an Order without a Customer if you define the relationship as required. If you circumvent this requirement by removing the enforcement in the database and if you actually have then orders without a customer you violate your own model definition.

Only solution is likely to make the relationship optional if you have that situation. I don't think it is possible to control the SQL that is created when you use Include.

1
votes

How to force Entity Framework to do Inner Joins if you have a table structure such that:

  • Students can have Schedules, but don't have to
  • Schedules can have classes, but don't have to
  • Classes MUST have Curriculums
  • Curriculums MUST have Tests

When you want to look up Students who have passed particular Tests, you would logically do something like:

var studentsWhoPassed = context.Set<StudentEntity>()
    .Where(x => x.Something)
    .Include(x => x.Schedules.Select(y => y.Classes.Select(z => z.Tests)))
    .Etc().Etc()

The point being that you start with a StudentEntity and put in some conditions based on the joins down the chain. But because a Student to Schedule is optional, E.F. generates LEFT OUTER Joins.

Instead, you should start lower down the chain and build up. For example:

var studentsWhoPassed = context.Set<ClassEntity>()
    .Where(class => class.Tests.Any(test => test.Status == Status.Passed)
        && class.Schedule.Student.Something == studentSomething)
    .Include(class => class.Schedule.Student)

It is weird to start with a Class when you are trying to query for Students with Test Criteria. But it actually makes the LINQ simpler.

Because of the Student does not have to have a Schedule, but... a Class has to have Test(s), and a Class must have a ScheduleID, and Schedules must have a StudentID, you get Inner Joins all the way around.

Granted, this school example is abstract, but the idea holds true to other examples I have worked with the same types of relationships.