2
votes

Whenever I use LDAP in a web application it causes a classloader leak, and the strange thing is profilers don’t find any GC roots.

I’ve created a simple web application that demonstrates the leak. It only includes this class:

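package org.example;

import java.util.Hashtable;

import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import javax.servlet.annotation.WebListener;

@WebListener
public class LDAPLeakDemo implements ServletContextListener {
    @Override
    public void contextInitialized(ServletContextEvent sce) {
        useLDAP();
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {}

    // Opens a single LDAP connection with simple authentication and closes it again.
    private void useLDAP() {
        Hashtable<String, Object> env = new Hashtable<String, Object>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://ldap.forumsys.com:389");
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, "cn=read-only-admin,dc=example,dc=com");
        env.put(Context.SECURITY_CREDENTIALS, "password");
        try {
            DirContext ctx = null;
            try {
                ctx = new InitialDirContext(env);
                System.out.println("Created the initial context");
            } finally {
                // Always close the context, even if later code throws.
                if (ctx != null) {
                    ctx.close();
                    System.out.println("Closed the context");
                }
            }
        } catch (NamingException e) {
            e.printStackTrace();
        }
    }
}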

The source code is available here. I’m using a public LDAP test server for this example, so it should work for everyone if you want to try it. I tried it with the latest JDK 7 and 8 and Tomcat 7 and 8 with the same result – when I click on Reload in Tomcat Web Application Manager and then on Find leaks, Tomcat reports that there’s a leak and profilers confirm it.

The leak is barely noticeable in this example, but it causes an OutOfMemoryError in a big web application. I didn’t find any open JDK bugs about it.

UPDATE 1

I've tried to use Jetty 9.2 instead of Tomcat and I still see the leak, so it's not Tomcat's fault. Either it's a JDK bug or I'm doing something wrong.

UPDATE 2

Even though my example demonstrates the leak, it doesn’t demonstrate the out of memory error, because it has a very small PermGen footprint. I’ve created another branch that should be able to reproduce the OutOfMemoryError. I just added Spring, Hibernate and Logback dependencies to the project to increase PermGen consumption. These dependencies have nothing to do with the leak and I could have used any others instead. Their only purpose is to make PermGen consumption big enough to trigger the OutOfMemoryError.

Steps to reproduce OutOfMemoryError:

  1. Download or clone the outofmemory-demo branch.

  2. Make sure you have JDK 7 and any version of Tomcat and Maven (I used the latest versions - JDK 1.7.0_79 and Tomcat 8.0.26).

  3. Decrease PermGen size to be able to see the error after the first reload. Create setenv.bat (Windows) or setenv.sh (Linux) in Tomcat’s bin directory and add set "JAVA_OPTS=-XX:PermSize=24m -XX:MaxPermSize=24m" (Windows) or export "JAVA_OPTS=-XX:PermSize=24m -XX:MaxPermSize=24m" (Linux).

  4. Go to Tomcat’s conf directory, open tomcat-users.xml and add <role rolename="manager-gui"/><user username="admin" password="1" roles="manager-gui"/> inside <tomcat-users></tomcat-users> to be able to use Tomcat Web Application Manager.

  5. Go to project’s directory and use mvn package to build a .war.

  6. Go to Tomcat’s webapps directory, delete everything except the manager directory and copy the .war here.

  7. Run Tomcat’s start script (bin\startup.bat or bin/startup.sh) and open http://localhost:8080/manager/, use username admin and password 1.

  8. Click on Reload and you should see java.lang.OutOfMemoryError: PermGen space in Tomcat's console.

  9. Stop Tomcat, open project’s source file src\main\java\org\example\LDAPLeakDemo.java, remove the useLDAP(); call and save it.

  10. Repeat steps 5-8, only this time there’s no OutOfMemoryError, because the LDAP code is never called.

Tomcat reports exactly what? - user207421
@EJP It shows its standard leak message - "The following web applications were stopped (reloaded, undeployed), but their classes from previous runs are still loaded in memory, thus causing a memory leak (use a profiler to confirm)". - John29
That's because of a prior condition. You need to show us that. Retaining classes in memory isn't any kind of a memory leak over a long run. It is both normal and essential. Your growing memory problem lies elsewhere. - user207421
@EJP What prior condition are you talking about? This example project contains only the class from my question. You can download the project from the link I gave and reproduce it. I know that the garbage collector doesn't always collect old classloaders, but if there's not enough space in PermGen, it must collect it to avoid out of memory error. Otherwise it just can't collect it due to a leak. - John29

2 Answers

1
votes

First of all: Yes, the LDAP API provided by Sun/Oracle can trigger ClassLoader leaks. It is on my list of known offenders, because if the system property com.sun.jndi.ldap.connect.pool.timeout is > 0, com.sun.jndi.ldap.LdapPoolManager will spawn a new thread running in the web app that first invoked LDAP.
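
To make the mechanism concrete, here is a minimal sketch of one way to keep that thread away from the web app: touch LdapPoolManager once while the thread's context class loader is the system class loader, so that a thread spawned during class initialization inherits that loader instead of the web app's. The class and method names (LdapPoolWarmup, poolTimeoutEnabled, runWithSystemClassLoader, warmUpLdapPool) are made up for this illustration, and whether the warm-up is sufficient depends on the JDK version, so treat it as an assumption to verify rather than a confirmed fix.

```java
final class LdapPoolWarmup {
    // System property named in the paragraph above.
    static final String POOL_TIMEOUT = "com.sun.jndi.ldap.connect.pool.timeout";

    /** Returns true only when the property is set to a positive number. */
    static boolean poolTimeoutEnabled() {
        String value = System.getProperty(POOL_TIMEOUT);
        if (value == null) {
            return false;
        }
        try {
            return Long.parseLong(value.trim()) > 0;
        } catch (NumberFormatException e) {
            return false; // unparsable value: treat as "not enabled" in this sketch
        }
    }

    /** Runs the action with the system class loader as context class loader, then restores the original. */
    static void runWithSystemClassLoader(Runnable action) {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        current.setContextClassLoader(ClassLoader.getSystemClassLoader());
        try {
            action.run();
        } finally {
            current.setContextClassLoader(original);
        }
    }

    /** Hypothetical warm-up: initializes LdapPoolManager before any web app code uses LDAP. */
    static void warmUpLdapPool() {
        if (!poolTimeoutEnabled()) {
            return; // per the answer, no thread is spawned in this case
        }
        runWithSystemClassLoader(() -> {
            try {
                // Class.forName(String) also runs the class's static initializer.
                Class.forName("com.sun.jndi.ldap.LdapPoolManager");
            } catch (ClassNotFoundException e) {
                // Class not present in this JDK: nothing to warm up.
            }
        });
    }
}
```

Such a warm-up would have to run before the first LDAP call, for example from a listener in the container itself or at the very start of contextInitialized.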

That being said, I added your example code as a test case in my ClassLoader Leak Prevention library, so that I'd get an automatic heap dump of the leak. According to my analysis, there is in fact no leak in your code. However, it does seem to take more than one garbage collector cycle before the ClassLoader in question is garbage collected (probably due to transient references; I haven't dug into it that much). This probably tricks Tomcat into believing there is a leak, even if there is none.
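
As a rough illustration of what "more than one cycle" means in practice, the sketch below requests garbage collection repeatedly and reports how many requests it took before a weakly referenced object was cleared. CollectionProbe and cyclesUntilCollected are hypothetical names, not part of the library mentioned above, and System.gc() is only a hint to the JVM, so the numbers it reports are indicative at best.

```java
import java.lang.ref.WeakReference;

final class CollectionProbe {
    /**
     * Requests up to maxCycles garbage collections and returns how many
     * requests were made before the referent was cleared, or -1 if it was
     * still reachable after the last request (or the thread was interrupted).
     */
    static int cyclesUntilCollected(WeakReference<?> ref, int maxCycles) {
        for (int cycle = 1; cycle <= maxCycles; cycle++) {
            System.gc(); // a request only; the JVM may ignore it
            try {
                Thread.sleep(50); // give reference processing a moment to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
            if (ref.get() == null) {
                return cycle;
            }
        }
        return -1;
    }
}
```

In a redeploy test one could keep a WeakReference to the old web app class loader, drop every strong reference, and call cyclesUntilCollected(ref, 10). A result above 1 would match the behaviour described here, while -1 suggests something still holds the loader.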

However, since you say you eventually get an OutOfMemoryError, either I'm wrong or there is something else in your app causing these leaks. If you add my ClassLoader Leak Prevention library to your app, does it still leak/cause OOMEs? Does the Preventor log any warnings?

If you set up your application server to create a heap dump whenever there is an OOME, you can look for the leak using Eclipse Memory Analyzer. I've explained the process in detail here.

1
votes

It's been a while since I posted this question. I finally found out what really happened, so I thought I'd post it as the answer in case @MattiasJiderhamn or others are interested.

Profilers didn’t find any GC roots because the JVM was hiding the java.lang.Throwable.backtrace field, as described in https://bugs.openjdk.java.net/browse/JDK-8158237. Now that this limitation is gone, I was able to get the GC root:

this     - value: org.apache.catalina.loader.WebappClassLoader #2
 <- <classLoader>     - class: org.example.LDAPLeakDemo, value: org.apache.catalina.loader.WebappClassLoader #2
  <- [10]     - class: java.lang.Object[], value: org.example.LDAPLeakDemo class LDAPLeakDemo
   <- [2]     - class: java.lang.Object[], value: java.lang.Object[] #3394
    <- backtrace     - class: javax.naming.directory.SchemaViolationException, value: java.lang.Object[] #3386
     <- readOnlyEx     - class: com.sun.jndi.toolkit.dir.HierMemDirCtx, value: javax.naming.directory.SchemaViolationException #1
      <- EMPTY_SCHEMA (sticky class)     - class: com.sun.jndi.ldap.LdapCtx, value: com.sun.jndi.toolkit.dir.HierMemDirCtx #1

The cause of this leak is the LDAP implementation in the JDK. The com.sun.jndi.ldap.LdapCtx class has a static field:

private static final HierMemDirCtx EMPTY_SCHEMA = new HierMemDirCtx();

com.sun.jndi.toolkit.dir.HierMemDirCtx contains the readOnlyEx field, which is assigned an instance of javax.naming.directory.SchemaViolationException during the LDAP initialization that follows the new InitialDirContext(env) call in the code from my question. The problem is that java.lang.Throwable, the superclass of all exceptions including javax.naming.directory.SchemaViolationException, has the backtrace field. This field holds references to all classes in the stack trace at the time the constructor was called, including my own org.example.LDAPLeakDemo class, which in turn holds a reference to the web application classloader.
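
The visible side of this behaviour is easy to reproduce. The sketch below (BacktraceDemo and StacklessException are names invented for the example) compares an ordinary exception, whose stack trace names the class that created it, with one built through the protected Throwable constructor with writableStackTrace set to false, which skips stack capture entirely. It only inspects the public getStackTrace() view, not the hidden backtrace field, and it is an illustration of the mechanism, not a description of how the JDK fixed anything.

```java
final class BacktraceDemo {
    /** Exception that opts out of stack capture (constructor available since Java 7). */
    static final class StacklessException extends Exception {
        StacklessException(String message) {
            // cause = null, enableSuppression = false, writableStackTrace = false
            super(message, null, false, false);
        }
    }

    /** Created here, so the captured stack includes a BacktraceDemo frame. */
    static Exception createOrdinary() {
        return new Exception("ordinary");
    }

    /** Created here as well, but no stack is captured at construction time. */
    static Exception createStackless() {
        return new StacklessException("stackless");
    }

    /** Returns true if any frame of the stack trace belongs to the given class. */
    static boolean mentions(Throwable t, Class<?> type) {
        for (StackTraceElement frame : t.getStackTrace()) {
            if (frame.getClassName().equals(type.getName())) {
                return true;
            }
        }
        return false;
    }
}
```

Storing the result of createOrdinary() in a static field of a long-lived class is the same pattern as readOnlyEx above: whatever code happened to be on the stack at construction time stays attached to the exception for as long as the field holds it.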

Here's a similar leak that was fixed in Java 9: https://bugs.openjdk.java.net/browse/JDK-8146961