3
votes

I'm working on a new storage system for a business solution package that consists of around 40 applications. Some of those applications generate documents (mostly docx, some pdf) that are currently saved and organized in a network share folder.

The applications generate about 150,000-200,000 documents a year on average, and those documents should be persisted in a more consistent and reliable form (i.e. a separate SQL database).

SharePoint is a leading candidate, since we plan on eventually using its other features, other than the DMS capabilities. I've read about document library limitations, i.e. 2,000 files per folder with up to 1,000,000 files in all the folders of a document library. I've also read that the 2,000 limit can be bypassed, BUT it affects performance. What I haven't found is real-world experience with such a large number of files in one library. And what will happen if I increase the folder limit to 50,000, for example? What impact would that have on performance (slower requests for reading/editing/writing documents through web services, especially writing if it checks for duplicate file names, indexing, searching, etc.)?

One important note: we will not use the SharePoint web portal at all unless we have to. Instead, we will do everything from our applications via web services, so slower rendering of data views is not an issue.

2 Answers

6
votes

You can have as many items in a document library as you want, as long as your last paragraph holds true (you won't access the information through the portal itself).

We have a working test of our DMS system with 7 million files in the same document library, and in the same folder too. But we never go through the portal to see that content. We consume those files using the SPWeb.GetFile(guid) method, and we keep all the information related to them in another SQL database (which stores the GUID of each file).
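To make the "separate SQL database that stores the GUID" idea concrete, here is a minimal sketch of such an index. It is not our actual schema: the table name `document_index`, the column names and the `business_key` values are all made up for illustration, and SQLite stands in for whatever SQL server you use. The actual file retrieval (SPWeb.GetFile on the SharePoint side) is not shown.

```python
import sqlite3
import uuid


def open_index(path=":memory:"):
    """Open (or create) the side database that maps our own keys to file GUIDs."""
    conn = sqlite3.connect(path)
    # Hypothetical schema: one row per stored document.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS document_index ("
        " business_key TEXT PRIMARY KEY,"
        " file_guid TEXT NOT NULL UNIQUE)"
    )
    return conn


def register_document(conn, business_key, file_guid):
    """Record the GUID the document store returned for a newly saved file."""
    conn.execute(
        "INSERT INTO document_index (business_key, file_guid) VALUES (?, ?)",
        (business_key, str(file_guid)),
    )
    conn.commit()


def lookup_guid(conn, business_key):
    """Return the stored GUID for a key, or None if the key is unknown."""
    row = conn.execute(
        "SELECT file_guid FROM document_index WHERE business_key = ?",
        (business_key,),
    ).fetchone()
    return uuid.UUID(row[0]) if row else None
```

The point of the design is that the applications never browse or enumerate the library. They look up the GUID in their own database and then ask for exactly one file.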

3
votes

The 2,000 limit is not a hard limit; it is the maximum number of files that should appear in a single view on the list.

If a list view includes more than 2,000 items, performance will start to go down. If you add indexed columns and create additional filtered views on the list that each stay under that limit of 2,000 (give or take), using the portal itself is still OK.
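As a rough back-of-the-envelope check, you can estimate how fine-grained a filter (or folder scheme) has to be to stay under about 2,000 items. The sketch below uses the 200,000 documents a year from the question. Bucketing by creation date is only an assumption, and the function names are made up for illustration.

```python
from datetime import date

DOCS_PER_YEAR = 200_000   # upper estimate from the question
VIEW_GUIDELINE = 2_000    # recommended maximum number of items per view


def average_per_bucket(docs_per_year, buckets_per_year):
    """Average number of documents landing in each bucket over a year."""
    return docs_per_year / buckets_per_year


def folder_path(created: date) -> str:
    """Hypothetical per-day bucket key, usable as a folder path or view filter."""
    return f"{created.year:04d}/{created.month:02d}/{created.day:02d}"


per_month = average_per_bucket(DOCS_PER_YEAR, 12)    # about 16,667 per month
per_day = average_per_bucket(DOCS_PER_YEAR, 365)     # about 548 per day

print(per_month > VIEW_GUIDELINE)   # monthly buckets are too coarse
print(per_day < VIEW_GUIDELINE)     # daily buckets stay under the guideline
print(folder_path(date(2024, 3, 7)))
```

These are averages only. If your applications produce documents in bursts, a single day could still go over the guideline, so check against your real distribution.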

Also, be careful with setting permissions on the files. Giving each file its own permission set will also degrade performance, since internally SharePoint will start to perform massive joins (in SQL Server) to determine who is allowed to see what.

One note: plan your infrastructure very, very well, especially the SQL Server (cluster) that the content database runs on.