I've found the variety of questions related to loading using .Include, etc, but it seems like SQL shapes that query into a huge join which means if I'm getting Customer information, and that customer owns 1000 items and I do
context.Customers.Include(c=> c.Inventory).Where(c=>c.ID == id);
//side note, WHY can't I use .Find() after Include?
I'll get 1000 rows of the same customer information duplicated along with the item information rather than 1 customer record and 1000 items in a multi-table set. This seems really inefficient and leads to some really slow queries.
So if I have a set of customers:
//get all customers in TX
var texasCustomers = context.Customers.Where(c=> c.State == "TX");
And I want to loop over them in an export to XLSX:
foreach (var c in texasCustomers) {
//compile row per item SKU
foreach(var sku in c.Inventory.GroupBy(i=>i.SKU)) {
xlsx.SetCellValue(row, col, c.Name);
//output more customer info here
xlsx.SetCellValue(row, col, sku.Key);
xlsx.SetCellValue(row, col, sku.Sum(i=>i.Quantity));
}
}
This generates a query to the 'Inventory' table PER customer. Which is a fast query, but when you do the SAME query 1000 times, it gets annoyingly slow.
So I've done things like this:
//get customer key
var customerids = texasCustomers.Select(c=> c.ID).ToArray();
var inventories = context.Inventories.Where(i=> customerids.Contains(i.CustomerID)).ToList();
...and now my export loop looks more like this... the inner loop operating on the nav property from the first example becomes an in-memory linq filter against the prebuilt list of inventory objects:
foreach (var c in texasCustomers) {
//compile row per item SKU
foreach(var sku in inventories.Where(i=> i.CustomerID == c.ID)) {
xlsx.SetCellValue(row, col, c.Name);
//output more customer info and then the sku info
xlsx.SetCellValue(row, col, sku.Key);
xlsx.SetCellValue(row, col, sku.Sum(i=>i.Quantity));
}
}
This successfully gets around the 'query per loop' issue, but has obvious downsides... and just feels wrong.
So, what am I missing? Where's the secret EF features that let me do something like:
texasCustomers.LoadAll(c=> c.Inventories);
to "populate" all of the collection member's navigational properties in one go? Or am I approaching the problem from the wrong angle?
Is there a way to struture the query to get EF to generate SQL that doesn't turn into a single giant denormalized table?