0
votes

We're experiencing some performance issues with Alfresco 4.0.e and a mysql 5.5 backend.

Through a performance monitor I can see a lot of requests (not sure what is triggering these) getting hung up on DB with a response time of nearly 100-200s. In the extreme case its 3000secs.

All these requests originate locally from the alfresco application (not share gui which we have on a separate server). The client ip shows 127.0.0.1. URL = /alfresco/service/api/solr/metadata On drilling down further there seem to be 2 queries that take more than 100s each that look like follows-

select
        assoc.id                    as id,
        parentNode.id               as parentNodeId,
        parentNode.version          as parentNodeVersion,
        parentStore.protocol        as parentNodeProtocol,
        parentStore.identifier      as parentNodeIdentifier,
        parentNode.uuid             as parentNodeUuid,
        childNode.id                as childNodeId,
        childNode.version           as childNodeVersion,
        childStore.protocol         as childNodeProtocol,
        childStore.identifier       as childNodeIdentifier,
        childNode.uuid              as childNodeUuid,
        assoc.type_qname_id         as type_qname_id,
        assoc.child_node_name_crc   as child_node_name_crc,
        assoc.child_node_name       as child_node_name,
        assoc.qname_ns_id           as qname_ns_id,
        assoc.qname_localname       as qname_localname,
        assoc.is_primary            as is_primary,
        assoc.assoc_index           as assoc_index
    from
        alf_child_assoc assoc
        join alf_node parentNode on (parentNode.id = assoc.parent_node_id)
        join alf_store parentStore on (parentStore.id = parentNode.store_id)
        join alf_node childNode on (childNode.id = assoc.child_node_id)
        join alf_store childStore on (childStore.id = childNode.store_id)

    where
        parentNode.id = ?

Running an explain plan on this query with a parameter value that seemed to be causing the most havoc (~3000secs) here's what I saw-

    select_type table       type    possible_keys                       key               key_len   ref                       rows  Extra
    SIMPLE      parentNode  const   PRIMARY,store_id,fk_alf_node_store  PRIMARY           8         const                     1 
    SIMPLE      parentStore const   PRIMARY                             PRIMARY           8         const                     1 
    SIMPLE      childStore  index   PRIMARY                             protocol          454       NULL                      6     Using index
    SIMPLE      childNode   ref     PRIMARY,store_id,fk_alf_node_store  store_id          8         alfrescomgr.childStore.id 162579    
    SIMPLE      assoc       ref     parent_node_id,fk_alf_cass_pnode,   fk_alf_cass_cnode 8         alfrescomgr.childNode.id  1     Using where
                                    fk_alf_cass_cnode  

Note the rows = 162k. For a couple of such queries, the parent node seemed to point to a folder that stores thousands of small sized quote documents.

Our application pushes documents and queries them by some metadata attributes like customer id. We use the apache chemistry cmis api for the interaction.

  1. What do you think is triggering the solr queries. Its trying to load information on all children nodes.
  2. How can we optimize it if we can't control the solr piece.

Would really appreciate your help.

2
It's not generally recommended to put many thousands of files into one folder in Alfresco. What happens if you shard up your files into subdirectories, so there's only a few hundred in each shard folder? - Gagravarr
Also, bit late now, but PostGreSQL normally works quite a bit better than MySQL! - Gagravarr
Thanks for your input. The sub directory thing is something we're thinking about, but need to figure out a way to be able to move those files into appropriate folder w/o creating a performance nightmare in production. I wonder how Aurora would work instead esp looking at the Alfresco perf. benchmark they released in an aws paper. I wonder if community version would work ok with Aurora. - Amit Kapoor
If you upgraded from 4.0 to 5.0 (or possibly even 5.1 preview), you'd get quite a bit of performance improvements without needing to move things around. There's been lots of work done since 4.0 came out something like 4 years ago! - Gagravarr

2 Answers

2
votes

I have experienced similar performance issues on a 4.0.d instance (with MySQL) on Solaris a while back, and it turned out the default engine for the MySQL version I had had poor performance!

The issue went away once I indexed each and every column referenced in a join / where clause, and I got 98+ % performance gain !

0
votes

We found out 2 reasons behind the slowness

  • The disk i/o on the data volume of the mysql database was much slower compared to other systems. We used sysbench random i/o test to validate that.
  • The folder in question above has a few million documents which seem to cause this query to get slower.
    • Nor can we load the folder contents page from share to display anything (results in a timeout related error).
    • We're planning to restructure this folder to have further sub folders so as to limit docs in each folder. There are indications that this would help, but does not sound right to be doing something like this. We may not really have this control for other similar cases that might come up in the future.
    • I'm wondering if this would still be a limitation in newer versions of Alfresco say 5.x