Manage Connection Pooling in multi-tenant web app with Spring, Hibernate and C3P0

Question

I'm trying to setup a multi-tenant web application, with (ideally) possibility for both Database-separated and Schema-separated approach at the same time. Although I'm going to start with Schema separation. We're currently using:

Spring 4.0.0
Hibernate 4.2.8
Hibernate-c3p0 4.2.8 (which uses c3p0-0.9.2.1)
and PostgreSQL 9.3 (which I doubt it really matters for the overall architecture)

Mostly I followed this thread (because of the solution for @Transactional). But I'm kinda lost in implementing MultiTenantContextConnectionProvider. There is also this similar question asked here on SO, but there are some aspects that I can't figure out:

1) What happens to Connection Pooling? I mean, is it managed by Spring or Hibernate? I guess with ConnectionProviderBuilder - or as suggested - any of its implementation, Hibernate is the guy who manages it.
2) Is it a good approach that Spring does not manage Connection Pooling? or Is it even possible that Spring does manage it?
3) Is this the right path for future implementing of both Database and Schema separation?

Any comments or descriptions are totally appreciated.

application-context.xml

<beans>
    ...
    <bean id="dataSource" class="org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy">
        <property name="targetDataSource" ref="c3p0DataSource" />
    </bean>

    <bean id="c3p0DataSource" class="com.mchange.v2.c3p0.ComboPooledDataSource" destroy-method="close">
        <property name="driverClass" value="org.postgresql.Driver" />
        ... other C3P0 related config
    </bean>

    <bean id="sessionFactory" class="org.springframework.orm.hibernate4.LocalSessionFactoryBean">
        <property name="packagesToScan" value="com.webapp.domain.model" />

        <property name="hibernateProperties">
            <props>
                <prop key="hibernate.dialect">org.hibernate.dialect.PostgreSQLDialect</prop>
                <prop key="hibernate.default_schema">public</prop>

                <prop key="hibernate.multiTenancy">SCHEMA</prop>
                <prop key="hibernate.tenant_identifier_resolver">com.webapp.persistence.utility.CurrentTenantContextIdentifierResolver</prop>
                <prop key="hibernate.multi_tenant_connection_provider">com.webapp.persistence.utility.MultiTenantContextConnectionProvider</prop>
            </props>
        </property>
    </bean>

    <bean id="transactionManager" class="org.springframework.orm.hibernate4.HibernateTransactionManager">
        <property name="autodetectDataSource" value="false" />
        <property name="sessionFactory" ref="sessionFactory" />
    </bean>

   ...
</beans>

CurrentTenantContextIdentifierResolver.java

public class CurrentTenantContextIdentifierResolver implements CurrentTenantIdentifierResolver {
    @Override
    public String resolveCurrentTenantIdentifier() {
        return CurrentTenantIdentifier;  // e.g.: public, tid130, tid456, ...
    }

    @Override
    public boolean validateExistingCurrentSessions() {
        return true;
    }
}

MultiTenantContextConnectionProvider.java

public class MultiTenantContextConnectionProvider extends AbstractMultiTenantConnectionProvider {
    // Do I need this and its configuratrion?
    //private C3P0ConnectionProvider connectionProvider = null;

    @Override
    public ConnectionProvider getAnyConnectionProvider() {
        // the main question is here.
    }

    @Override
    public ConnectionProvider selectConnectionProvider(String tenantIdentifier) {
        // and of course here.
    }
}

Edit

Regarding the answer of @ben75:

This is a new implementation of MultiTenantContextConnectionProvider. It no longer extends AbstractMultiTenantConnectionProvider. It rather implements MultiTenantConnectionProvider, to be able to return [Connection][4] instead of [ConnectionProvider][5]

public class MultiTenantContextConnectionProvider implements MultiTenantConnectionProvider, ServiceRegistryAwareService {
    private DataSource lazyDatasource;;

    @Override
    public void injectServices(ServiceRegistryImplementor serviceRegistry) {
        Map lSettings = serviceRegistry.getService(ConfigurationService.class).getSettings();

        lazyDatasource = (DataSource) lSettings.get( Environment.DATASOURCE );
    }

    @Override
    public Connection getAnyConnection() throws SQLException {
        return lazyDatasource.getConnection();
    }

    @Override
    public Connection getConnection(String tenantIdentifier) throws SQLException {
        final Connection connection = getAnyConnection();

        try {
            connection.createStatement().execute("SET SCHEMA '" + tenantIdentifier + "'");
        }
        catch (SQLException e) {
            throw new HibernateException("Could not alter JDBC connection to specified schema [" + tenantIdentifier + "]", e);
        }

        return connection;
    }
}

It would be useful to know why you want to do this and/or what are your requirements/concerns (ie compliance, performance, selling point) . With the "cloud" these days many people just solve multi-tenancy by booting up separate images and thus have true separation which is often required for compliance reason and thus you do not need multitenancy at the DataSource level. The other option is just to have one massive schema that is multi-tenant aware and partition/shard your application based on performance reasons. — Adam Gent
@AdamGent The main reason I'm not going to use separate instances for each customer is that we're targeting for 10K-50K+ customer. To best of my knowledge having that much "separate instances" would cost way more than split them via load-balanced, multi-tenant cluster of single-instance application. For the second part of your comment, for the sake of flexibility, we're going to provide alternatives approach for data layer. If a customer wants a separate database, and they are willing to pay extra charge/month for it, so be it. — Khosrow

ben75 ben75 · Accepted Answer · 2014-01-26T13:53:26

You can choose between 3 different strategies that will impact connection polling. In any case you have to provide an implementation of MultiTenantConnectionProvider. The strategy you choose will of course impact your implementation.

General remark about MultiTenantConnectionProvider.getAnyConnection()

getAnyConnection() is required by hibernate to collect metadata and setup the SessionFactory. Usually in a multi-tenant architecture you have a special/master database (or schema) not used by any tenant. It's a kind of template database (or schema). It's ok if this method returns a connection to this database (or schema).

Strategy 1 : each tenant have it's own database. (and so it's own connection pool)

In this case, each tenant have it's own connection pool managed by C3PO and you can provide an implementation of MultiTenantConnectionProvider based on AbstractMultiTenantConnectionProvider

Every tenant have it's own C3P0ConnectionProvider, so all you have to do in selectConnectionProvider(tenantIdentifier) is to return the correct one. You can keep a Map to cache them and you can lazy-initialize a C3POConnectionProvider with something like :

private ConnectionProvider lazyInit(String tenantIdentifier){
    C3P0ConnectionProvider connectionProvider = new C3P0ConnectionProvider();
    connectionProvider.configure(getC3POProperties(tenantIdentifier));
    return connectionProvider;
}

private Map getC3POProperties(String tenantIdentifier){
    // here you have to get the default hibernate and c3po config properties 
    // from a file or from Spring application context (there are good chances
    // that those default  properties point to the special/master database) 
    // and alter them so that the datasource point to the tenant database
    // i.e. : change the property hibernate.connection.url 
    // (and any other tenant specific property in your architecture like :
    //     hibernate.connection.username=tenantIdentifier
    //     hibernate.connection.password=...
    //     ...) 
}

Strategy 2 : each tenant have it's own schema and it's own connection pool in a single database

This case is very similar to the first strategy regarding ConnectionProvider implementation since you can also use AbstractMultiTenantConnectionProvider as base class to implement your MultiTenantConnectionProvider

The implementation is very similar to the suggested implementation for Strategy 1 except that you must alter the schema instead of the database in the c3po configuration

Strategy 3 : each tenant have it's own schema in a single database but use a shared connection pool

This case is slightly different since every tenant will use the same connection provider (and so the connection pool will be shared). In the case : the connection provider must set the schema to use prior to any usage of the connection. i.e. You must implement MultiTenantConnectionProvider.getConnection(String tenantIdentifier) (i.e. the default implementation provided by AbstractMultiTenantConnectionProvider won't work).

With postgresql you can do it with :

 SET search_path to <schema_name_for_tenant>;

or using the alias

 SET schema <schema_name_for_tenant>;

So here is what your getConnection(tenant_identifier); will look like:

@Override
public Connection getConnection(String tenantIdentifier) throws SQLException {
    final Connection connection = getAnyConnection();
    try {
        connection.createStatement().execute( "SET search_path TO " + tenanantIdentifier );
    }
    catch ( SQLException e ) {
        throw new HibernateException(
                "Could not alter JDBC connection to specified schema [" +
                        tenantIdentifier + "]",
                e
        );
    }
    return connection;
}

Useful reference is here (official doc)

Other useful link C3POConnectionProvider.java

You can combine strategy 1 and strategy 2 in your implementation. You just need a way to find the correct connection properties/connection url for the current tenant.

EDIT

I think that the choice between strategy 2 or 3 depends on the traffic and the number of tenants on your app. With separate connection pools : the amount of connections available for one tenant will be much lower and so: if for some legitime reason one tenant need suddenly many connections the performance seen by this particular tenant will drastically decrease (while the other tenant won't be impacted).

On the other hand, with strategy 3, if for some legitime reason one tenant need suddenly many connections: the performance seen by every tenant will decrease.

In general , I think that strategy 2 is more flexible and safe : every tenant cannot consume more than a given amount of connection (and this amount can be configured per tenant if you need it)

Manage Connection Pooling in multi-tenant web app with Spring, Hibernate and C3P0

2 Answers