0
votes

We have a bit of a problem. We've built a GWT application on top of our two Alfresco instances. The application should work like this:

  • The user searches for a document.
  • Our web app sends the same query to both repositories, waits for both results and displays a merged result set.

This works when the search is for a specific document (by ID number, for example) or returns 10, 20 or 50 documents (we don't know at what size it starts to act strangely).
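For reference, the fan-out and merge step described above can be sketched roughly as follows. This is a minimal illustration only, not our actual code: the class name `FederatedSearch`, the method `searchBoth` and the two repository functions are hypothetical stand-ins for the real CMIS calls, and it assumes Java 8+ on the server side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical sketch: names are illustrative, not part of Chemistry or Alfresco.
final class FederatedSearch {

    /** Runs the same query against both repositories in parallel and concatenates the results. */
    static <T> List<T> searchBoth(String query,
                                  Function<String, List<T>> repoA,
                                  Function<String, List<T>> repoB) {
        // Start both queries before waiting on either one.
        CompletableFuture<List<T>> a = CompletableFuture.supplyAsync(() -> repoA.apply(query));
        CompletableFuture<List<T>> b = CompletableFuture.supplyAsync(() -> repoB.apply(query));

        // join() blocks until each result is available; results from A come first.
        List<T> merged = new ArrayList<>(a.join());
        merged.addAll(b.join());
        return merged;
    }

    public static void main(String[] args) {
        // The lambdas stand in for the real per-repository query calls.
        List<String> hits = searchBoth("SELECT * FROM cmis:document",
                q -> Arrays.asList("a1", "a2"),
                q -> Arrays.asList("b1"));
        System.out.println(hits); // prints [a1, a2, b1]
    }
}
```

Note that the merged list is only as complete as the smaller of the two answers allows, so a truncated result from either repository shows up directly in what the user sees.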

If the query matches a lot of documents (for example, all documents from last month, which should be about 30-60k per month), the CMIS query limit (500) obviously cuts the results off first. BUT if the user hits "search" for the first time, then after a while the result set contains only 2 documents. If the user hits "search" again right after that, with the same query, the result set comes back almost immediately and lists 500 documents.

What the heck is wrong? Does CMIS cache results in some way? How do big CMIS queries work? Thanks A.

3
Did you try adding an ORDER BY clause? - alfrescian
Yes, it happens even in that case. - Teqnology
Are you using Apache Chemistry in your GWT app? Which Alfresco version? - alfrescian
Yes, we're using Apache Chemistry and Alfresco 3.4.7 Enterprise on both instances. We can't upgrade because we don't have an Alfresco 4 license. - Teqnology

3 Answers

2
votes

As you mentioned, you're using Apache Chemistry. Chemistry has a client-side caching mechanism: http://chemistry.apache.org/java/how-to/how-to-tune-perfomance.html

2
votes

I suspect this is not CMIS related at all but is instead due to the Alfresco Lucene "max permission check" problem. At a high level, there is a config setting for the maximum number of permission checks that Alfresco will do against a search result set. There is also a limit on the total amount of time it will spend performing such checks. These limits are configured in the repository properties file as:

# The maximum time spent pruning results
system.acl.maxPermissionCheckTimeMillis=10000

# The maximum number of results to perform permission checks against
system.acl.maxPermissionChecks=1000

The first time you run a search the server begins performing these checks and hits the limit. It then returns the search results it was able to filter. Now the permission cache is populated so the next time you run the search the results come back much faster and the result set is larger.
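To make that reasoning concrete, here is a toy model of the effect. It is emphatically not Alfresco code: the class `PermissionBudgetModel`, its fields and the cost numbers are all hypothetical, and it simply assumes that an uncached permission check is expensive while a cached one is free. Under that assumption, a run with a cold cache exhausts its time budget after a few results, and a repeat of the same query gets further.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: NOT Alfresco code. All names and costs are hypothetical.
final class PermissionBudgetModel {
    private final Set<Integer> warmCache = new HashSet<>(); // stands in for the permission cache
    private final int timeBudget;    // plays the role of system.acl.maxPermissionCheckTimeMillis
    private final int maxChecks;     // plays the role of system.acl.maxPermissionChecks
    private final int coldCheckCost; // assumed cost of an uncached check; cached checks cost 0

    PermissionBudgetModel(int timeBudget, int maxChecks, int coldCheckCost) {
        this.timeBudget = timeBudget;
        this.maxChecks = maxChecks;
        this.coldCheckCost = coldCheckCost;
    }

    /** Returns the candidates that were checked before either limit was hit. */
    List<Integer> search(List<Integer> candidates) {
        List<Integer> visible = new ArrayList<>();
        int spent = 0;
        int checks = 0;
        for (int id : candidates) {
            if (checks >= maxChecks) break;           // check-count limit reached
            int cost = warmCache.contains(id) ? 0 : coldCheckCost;
            if (spent + cost > timeBudget) break;     // time limit reached: return what we have
            spent += cost;
            checks++;
            warmCache.add(id);                        // later runs find this entry already cached
            visible.add(id);                          // every candidate is readable in this model
        }
        return visible;
    }

    public static void main(String[] args) {
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < 100; i++) candidates.add(i);

        PermissionBudgetModel model = new PermissionBudgetModel(20, 1000, 10);
        System.out.println(model.search(candidates).size()); // prints 2 (cold cache)
        System.out.println(model.search(candidates).size()); // prints 4 (partly warm cache)
    }
}
```

The real numbers depend on your data and hardware, but the shape is the same: identical query, different result set size, depending only on what was already cached.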

Searches in Alfresco are non-deterministic: you cannot guarantee that, for large result sets, you will get back the exact same result set every time, regardless of how big you make those settings.

If you are able to upgrade at some point you may find that configuring Alfresco to use Solr rather than Lucene could help alleviate this, but I'm not 100% sure it will.

0
votes

To disable security checks, use the searchService bean instead of the public SearchService bean. Public services enforce security, so with searchService you can avoid the permission checks.