18
votes

I need some help configuring Puma (a multi-threaded, multi-core server) on my Rails 4 Heroku app. The Heroku docs on this are not quite up to date. I followed "Concurrency and Database Connections" for the configuration, but it does not cover the configuration for a cluster, so I had to combine both types (threaded and multi-core).

My current configuration:

./Procfile

web: bundle exec puma -p $PORT -C config/puma.rb

./config/puma.rb

environment 'production'
threads 0,16

workers 4
preload_app!

on_worker_boot do
  # Drop connections inherited from the preloaded master process
  ActiveRecord::Base.connection_pool.disconnect!

  ActiveSupport.on_load(:active_record) do
    config = Rails.application.config.database_configuration[Rails.env]
    config['reaping_frequency'] = ENV['DB_REAP_FREQ'] || 10 # seconds
    config['pool']              = ENV['DB_POOL'] || 5
    # Pass the modified config so the pool and reaper settings are applied
    ActiveRecord::Base.establish_connection(config)
  end
end

Questions:

a) Do I need the before_fork / after_fork configuration like in Unicorn, since the Cluster workers are forked?
b) How do I tune my thread count depending on my application - what would be the reason to drop it down? / In what cases would it make a difference? Isn't 0:16 already optimized?
c) The Heroku database allows 500 connections. What would be a good value for DB_POOL depending on thread, worker and dyno count? - Does every thread per worker per dyno require its own DB connection when working in parallel?

In general: What should my configuration look like for concurrency and performance?

2
When it comes to tuning thread count: I read a tutorial on Unicorn worker tuning which suggested running ab and increasing the worker count (threads in your case) until there's a performance drop (requests take more time to finish). It's good to take a fairly dynamic page and first see how different request/concurrency ratios behave (also keep in mind that if you send many requests, Heroku might cut you off, suspecting a DoS). - Mike Szyndel
@MichaelSzyndel So I basically have to first go through every worker count, check the performance, and then go through the thread counts and check again? Doesn't it depend on what exactly is requested? - Miiller
From what I read somewhere, Heroku has two cores (4 virtual) per dyno. It's optimal to have one process per dyno, and then it's up to you how many threads to run per process. That I would test with ab. Also keep in mind that if you pass 512MB of RAM Heroku will send alerts, and it swaps at >1GB (confirm with the Heroku docs). - Mike Szyndel
Which dyno type do you use? You mentioned "Multi-Thread+Multi-Core Server"; does that mean a PX dyno ($500 per month)? - nothing-special-here

2 Answers

27
votes

a) Do I need the before_fork / after_fork configuration like in Unicorn, since the Cluster workers are forked?

Normally no, but since you're using preload_app, yes. Preloading the app gets an instance up and running and then forks the memory space for the workers; the result is that your initializers are only run once (possibly allocating DB connections and such). In this case, your on_worker_boot code is appropriate. If you're not using preload_app, then each worker boots itself, in which case an initializer would be the ideal place to set up the custom connection like you're doing. In fact, without preload_app, your on_worker_boot block would error out because at that point ActiveRecord and friends aren't even loaded.

b) How do I tune my thread count depending on my application - what would be the reason to drop it down? / In what cases would it make a difference? Isn't 0:16 already optimized?

On Heroku (and in my testing) you're best off matching your min/max threads, with max <= your DB_POOL setting. A lower min threads value lets your application spin down resources when not under load, which is normally great for freeing up resources on the server, but is likely less needed on Heroku; that dyno is already dedicated to serving web requests, so you may as well have the threads up and ready. Setting your max threads <= your DB_POOL environment variable isn't required, but otherwise you run the risk of consuming all the database connections in the pool; then a thread that wants a connection can't get one, and you get the old "ActiveRecord::ConnectionTimeoutError - could not obtain a database connection within 5 seconds." error. This depends on your application, though; you could very well have max > DB_POOL and be fine. I would say your DB_POOL should be at least the same as your min threads value, even though connections are not opened eagerly (5:5 threads won't open 5 connections if your app never hits the database).
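As a rough illustration of the "min == max, max <= DB_POOL" advice, here is a minimal Ruby sketch. The helper name thread_and_pool_settings and the MAX_THREADS variable are my own assumptions for the example (only DB_POOL comes from the question); it just derives the numbers and flags whether the pool covers every thread.

```ruby
# Hypothetical helper: derive Puma thread bounds and the ActiveRecord pool
# size from environment-style settings. MAX_THREADS is an assumed variable
# name; DB_POOL is the one used in the question.
def thread_and_pool_settings(env)
  max_threads = Integer(env.fetch('MAX_THREADS', 5))
  pool        = Integer(env.fetch('DB_POOL', max_threads))

  {
    min_threads: max_threads,                 # min == max keeps threads warm
    max_threads: max_threads,
    pool: pool,
    pool_covers_threads: pool >= max_threads  # false means threads may wait on the pool
  }
end

settings = thread_and_pool_settings('MAX_THREADS' => '16', 'DB_POOL' => '5')
puts settings[:pool_covers_threads] # false: 16 threads share a pool of 5
```

In config/puma.rb the result would feed the existing DSL call, e.g. threads settings[:min_threads], settings[:max_threads], with ENV passed in place of the literal hash.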

c) The Heroku database allows 500 connections. What would be a good value for DB_POOL depending on thread, worker and dyno count? - Does every thread per worker per dyno require its own DB connection when working in parallel?

The Production Tier allows 500, to be clear :)

Every thread per worker per dyno could consume a connection, depending on whether they're all trying to access the database at the same time. Usually the connections are reused once a thread is done with them, but as I mentioned in b), if you have more threads than pool connections you can have a bad time. Connection reuse is all handled by ActiveRecord, but sometimes not ideally. Sometimes connections go idle, or die, and that's why turning on the Reaper is suggested, to detect and reclaim dead connections.
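To put numbers on the worst case, here is a small sketch. It assumes each thread holds at most one connection and that every worker process has its own pool; the method name is made up for illustration.

```ruby
# Worst-case connections a web tier can hold open. Each worker process has
# its own pool, and a process never holds more connections than its pool
# size or (assuming one connection per thread) its thread count.
def worst_case_connections(dynos:, workers:, max_threads:, pool:)
  dynos * workers * [max_threads, pool].min
end

# The question's config: 4 workers, up to 16 threads, DB_POOL default of 5.
per_dyno = worst_case_connections(dynos: 1, workers: 4, max_threads: 16, pool: 5)
puts per_dyno       # 20
puts 500 / per_dyno # 25 dynos before reaching a 500-connection limit
```

Raising DB_POOL to 16 to match the threads would bring this to 64 connections per dyno, so the pool size and the dyno count trade off against each other.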

3
votes

You don't want fewer DB connections than threads. Remember that each separate process has its own connection pool, so if your DB supports 20 connections and you want to run 2 processes, the most you can run without risking timeouts is 10 threads per process, each process with a pool of 10 connections.

You want to leave a few connections for rails console sessions. Also be aware of background workers, and whether they are threaded.

If your workers are in a separate process (sidekiq), they will have their own pool. If your workers' threads are spawned from the web process (girl_friday or sucker_punch), you will want the DB_POOL to be larger than the max number of web threads, since they will be sharing a connection pool.
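The arithmetic above can be sketched like this. The method names and the reserved/background counts are illustrative assumptions, not anything Puma or ActiveRecord provides.

```ruby
# Largest per-process pool that fits within a connection budget, after
# setting aside a few connections (e.g. for rails console sessions).
def pool_per_process(db_limit:, processes:, reserved: 0)
  (db_limit - reserved) / processes # integer division rounds down
end

# Web threads a process can run when in-process background threads
# (girl_friday / sucker_punch style) share the same pool.
def max_web_threads(pool:, background_threads:)
  pool - background_threads
end

puts pool_per_process(db_limit: 20, processes: 2)              # 10
puts pool_per_process(db_limit: 20, processes: 2, reserved: 2) # 9
puts max_web_threads(pool: 9, background_threads: 2)           # 7
```

Background workers in a separate process (Sidekiq) count as an extra process here, with their own pool taken out of the same database limit.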