5
votes

I have LINQ query that returns me the following error: "The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Too many parameters were provided in this RPC request. The maximum is 2100".

All I need is to count all clients that have BirthDate that I have their ID's in list. My list of client ID's could be huge (millions of records).

Here is the query:

List<int> allClients = GetClientIDs();

int total = context.Clients.Where(x => allClients.Contains(x.ClientID) && x.BirthDate != null).Count();

When the query is rewritten this way,

int total = context
    .Clients
    .Count(x => allClients.Contains(x.ClientID) && x.BirthDate != null);

it causes the same error.

Also tried to make it in different way and it eats all memory:

List<int> allClients = GetClientIDs();

total = (from x in allClients.AsQueryable()
         join y in context.Clients
         on x equals y.ClientID
         where y.BirthDate != null
         select x).Count();
5
Where does GetClientIDs get its data from? If it's pulling it from the db you might want to combine that into your query instead of pulling all the ids to use in a separate query. - juharr
I think it is impossible to do in this form. allClients.Contains(x.ClientID) forces query to work locally. - Hamlet Hakobyan
@HamletHakobyan No, allClients can be an IQueryable that contains a linq-to-sql query. If so, it would be incorporated as SQL in the total query. - Gert Arnold
@GertArnold First casting allClients.AsQueryable() not guaranty that allClients is IQueryable. Anyway, even allClients is IQueryable it not guaranty possibly to translate whole LINQ query to SQL query. - Hamlet Hakobyan
@HamletHakobyan Maybe. But juharr's suggestion is worth the try. Konstantin, if that's impossible I think your only option is to do the query in chunks. - Gert Arnold

5 Answers

1
votes

We ran into this same issue at work. The problem is that list.Contains() creates a WHERE column IN (val1, val2, ... valN) statement, so you're limited to how many values you can put in there. What we ended up doing was in fact do it in batches much like you did.

However, I think I can offer you a cleaner and more elegant piece of code to do this with. Here is an extension method that will be added to the other Linq methods you normally use:

public static IEnumerable<IEnumerable<T>> BulkForEach<T>(this IEnumerable<T> list, int size = 1000)
{
    for (int index = 0; index < list.Count() / size + 1; index++)
    {
        IEnumerable<T> returnVal = list.Skip(index * size).Take(size).ToList();
        yield return returnVal;
    }
}

Then you use it like this:

foreach (var item in list.BulkForEach())
{
    // Do logic here. item is an IEnumerable<T> (in your case, int)
}  

EDIT
Or, if you prefer, you can make it act like the normal List.ForEach() like this:

public static void BulkForEach<T>(this IEnumerable<T> list, Action<IEnumerable<T>> action, int size = 1000)
{
    for (int index = 0; index < list.Count() / size + 1; index++)
    {
        IEnumerable<T> returnVal = list.Skip(index * size).Take(size).ToList();
        action.Invoke(returnVal);
    }
}

Used like this:

list.BulkForEach(p => { /* Do logic */ });
0
votes

Well as Gert Arnold mentioned before, making query in chunks solves the problem, but it looks nasty:

List<int> allClients = GetClientIDs();

int total = 0;

const int sqlLimit = 2000;

int iterations = allClients.Count() / sqlLimit;

for (int i = 0; i <= iterations; i++)
{
    List<int> tempList = allClients.Skip(i * sqlLimit).Take(sqlLimit).ToList();

    int thisTotal = context.Clients.Count(x => tempList.Contains(x.ClientID) && x.BirthDate != null);

    total = total + thisTotal;
}
0
votes

As has been said above, your query is probably being translated to:

select count(1)
from Clients
where ClientID = @id1 or ClientID = @id2 -- and so on up to the number of ids returned by GetClientIDs.

You will need to change your query such that you aren't passing so many parameters to it.

To see the generated SQL you can set the Clients.Log = Console.Out which will cause it to be written to the debug window when it is executed.

EDIT:

A possible alternative to chunking would be to send the IDs to the server as a delimited string, and create a UDF in your database which can covert that string back to a list.

var clientIds = string.Jon(",", allClients);

var total = (from client in context.Clients
            join clientIds in context.udf_SplitString(clientIds)
                on client.ClientId equals clientIds.Id
            select client).Count();

There are lots of examples on Google for UDFs that split strings.

0
votes

Another alternative and probably the fastest at query time is to add your numbers from the CSV file into a temporary table in your database and then do a join query.

Doing a query in chunks means a lot of round-trips between your client and database. If the list of IDs you are interested in is static or changes rarely, I recommend the approach of a temporary table.

0
votes

If you don't mind moving the work from the database to the application server and have the memory, try this.

int total = context.Clients.AsEnumerable().Where(x => allClients.Contains(x.ClientID) && x.BirthDate != null).Count();