4
votes

in this article, microsoft explains why there is a 5000 item limit. But when do I access more than 5000 items?

Example: I have a list with 50 000 items and I execute a caml query with a row limit of 50. Does sharepoint lock the database or just 50 rows? How does sharepoint know if just some rows, or the whole database should be locked? Or depends it on the caml itself?

Does a lock affect the whole farm or just the current list because sharepoint has not for each list a own table?

5

5 Answers

1
votes
  1. Reading all items in Large List

In SharePoint 2010, when you execute SPQuery on Large List, you get exception "The attempted operation is prohibited because it exceeds the list view threshold enforced by the administrator". To avoid this exception and read list items by batch we can use Content Iterator.

There are lot of methods available with ContentIterator, here we discuss about http://msdn.microsoft.com/en-us/library/ee560760%28v=office.14%29.aspx

To use ContentIterator include Microsoft.Office.Server.dll available in 14/ISAPI/ and include namespace Microsoft.Office.Server.Utilities.

Advantage: Fetches list items as a batch, load is reduced. If indexed column condition return more value than List View Threshold, it handles by batch. Normal SPQuery fails in this condition. We can stop batch processing any time. Disadvantage: You cannot include Non-Indexed column in SPQuery condition.

//Run as console application
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.SharePoint;
using Microsoft.Office.Server.Utilities;

namespace ContentIteratorListItemCollBatch
{
    class Sample
    {
        static int NumberOfBatch = 0, NumberOfItemsRead = 0, NumberOfException = 0;

    static void Main(string[] args)
    {
        using (SPSite site = new SPSite("your site url"))
        {
            using (SPWeb web = site.OpenWeb())
            {
                SPList list = web.GetList("Lists/LargeList/AllItems.aspx"); //your list url
                ContentIterator ci = new ContentIterator("Reading All Items");

                SPQuery qry = new SPQuery();
                qry.QueryThrottleMode = SPQueryThrottleOption.Strict; //Ensuring that all users come under List View Threshold. (Including farm admins / box administrators).

                qry.RowLimit = 2000; //Number of Items read in a batch. But it should be less than List View Threshold.

                qry.Query = qry.Query + ContentIterator.ItemEnumerationOrderByID; //Not Required, Include for faster output.  
                //Don't use ContentIterator.ItemEnumerationOrderByNVPField, it gets into infinite loop.

                ci.ProcessListItems(list, qry, ProcessItemColl, ProcessErrorColl);
                Console.WriteLine("\nBatch count: " + NumberOfBatch + "\n\nTotal number of items read: " + NumberOfItemsRead);
                Console.ReadLine();
            }
        }
    }

    static public bool ProcessErrorColl(SPListItemCollection itemColl, Exception e)
    {
        // process the error
        NumberOfException++;
        return true;
    }
    static public void ProcessItemColl(SPListItemCollection itemColl)
    {
        //Work on the ListItem Collection object with your own condition
        //foreach (SPListItem item in itemColl)
        //{

        //}
        Console.WriteLine("Number of Items Read: " + itemColl.Count);
        NumberOfBatch++;
        NumberOfItemsRead += itemColl.Count;
    }
}

}

My large list contains 25,000 items. you can see from output, that it read 25,000 items by batch of 2000 items.

Output

Number of Items Read: 2000
Number of Items Read: 2000
Number of Items Read: 2000
Number of Items Read: 2000
Number of Items Read: 2000
Number of Items Read: 2000
Number of Items Read: 2000
Number of Items Read: 2000
Number of Items Read: 2000
Number of Items Read: 2000
Number of Items Read: 2000
Number of Items Read: 2000
Number of Items Read: 1000

Batch count: 13

Total number of items read: 25000
  1. Reading items in a Large List with condition

Ensure following conditions have been met. Only Indexed column is allowed in where condition. You should include ContentIterator.ItemEnumerationOrderByNVPField. In below code, Title is a indexed column. As soon as custom list is created, title is made as index column.

//Run as Console Application

using System; using System.Collections.Generic; using System.Linq; using System.Text; using Microsoft.SharePoint; using Microsoft.Office.Server.Utilities;

namespace ContentIteratorListItemCollBatch
{
    class Sample
    {
    static int NumberOfBatch = 0, NumberOfItemsRead = 0, NumberOfException = 0;

    static void Main(string[] args)
    {
        using (SPSite site = new SPSite("your site url"))
        {
            using (SPWeb web = site.OpenWeb())
            {
                SPList list = web.GetList("Lists/LargeList/AllItems.aspx"); //your list url
                ContentIterator ci = new ContentIterator("Reading All Items");

                SPQuery qry = new SPQuery();
                qry.QueryThrottleMode = SPQueryThrottleOption.Strict; //Ensuring that all users come under List View Threshold. (Including farm admins / box administrators).

                qry.RowLimit = 2000; //Number of Items read in a batch. But it should be less than List View Threshold.

                qry.Query = @"<Where><Contains><FieldRef Name='Title' /><Value Type='Text'>9</Value></Contains></Where>";

                qry.Query = qry.Query + ContentIterator.ItemEnumerationOrderByNVPField;
                //Have to include this line.

                ci.ProcessListItems(list, qry, ProcessItemColl, ProcessErrorColl);
                Console.WriteLine("\nBatch count: " + NumberOfBatch + "\n\nTotal number of items read: " + NumberOfItemsRead);
                Console.ReadLine();
            }
        }
    }

    static public bool ProcessErrorColl(SPListItemCollection itemColl, Exception e)
    {
        // process the error
        NumberOfException++;
        return true;
    }
    static public void ProcessItemColl(SPListItemCollection itemColl)
    {
        //Work on the ListItem Collection object with your own condition
        //foreach (SPListItem item in itemColl)
        //{

        //}
        Console.WriteLine("Number of Items Read: " + itemColl.Count);
        NumberOfBatch++;
        NumberOfItemsRead += itemColl.Count;
    }
}

}

In SPQuery, If indexed field fetches more than List View Threshold limit, it will fail. The ContentIterator handles it by batch processing.

Output
Number of Items Read: 2000
Number of Items Read: 2000
Number of Items Read: 2000
Number of Items Read: 2000
Number of Items Read: 233

Batch count: 5

Total number of items read: 8233
0
votes

First, the question about "whole farm" - you can have multiple databases in your farm. When you create a new site collection (the one represented by an SPSite object), you can choose a database for its storage. Each database has its own AllUserData table. So, the maximum scope you can lock by a query, is this table. Of course, if all your sites are located in a single database, you could potentially lock everything.

Answering to the second question - it's complicated. There are cases when you don't intend to load more than 5000 rows, but the computation requires it. And SharePoint kills your query. For example, you have a large list of items. You apply a filter which returns small number of items. It works. Now, you add a "group by" on some of the fields - boom, it does not work. Also, for instance, the default "All items" view becomes problematic. You can use it, as it returns items in blocks of 30 items. BUT if you want to filter by a particular column - the filter does not work, as it exceeds the 5000 limit.

0
votes

When you are querying items across lists in a site collection you have a couple of options.

The fastest for retrieving data would be to index the lists and return the items using search. You can customize the ranking model and return your top 50 items. The downside is the data would only be a fresh as your last crawl.

The other option using custom code is to use a cross list query with the SPSiteDataQuery object.

This would allow you to return your top 50 items across different lists.

In terms of querying a large lists above the 5000 limit, it can be done either by increasing the threshold or by executing the query as admin but these are not really recommended. You can put indexes on columns so large list access is faster. When using the object model you can do what you like but you might want to measure the performance impact. Databases in SharePoint are really an abstract concept from a data point of view, you should concentrate more on how to do it the "SharePoint" way rather than how SharePoint itself is accessing the underlying data.

0
votes

I have a list with 20k records and threshold is set to be 200k(MS says 5k). I tested simultaneously with 4 users hitting listview page that returns 20 k records but with paging as 30 items per page. Nothing breaks. Is this 5k limit actually locks this entire list when read is going on? If that the case how come 3 users are able to visit same listview page at same time?

0
votes

Since you are returning only 30 rows, so list view threshold will never be hit. But if you try to apply order by, then for doing that orderby, it has to scan all 20 k records, although only 30 are displayed. So in this case, the threshold will be hit.