Archive for October, 2014

Elasticsearch, Ruby and Unicorn


We have been using Ruby on Rails as well as Elasticsearch for a while. To avoid downtime during deployment, we have been using Unicorn, configured more or less like this blog post describes. While running a single instance of Elasticsearch was pretty trivial with Karel’s new Elasticsearch Ruby Gem, moving to a clustered setup forced us to understand the Gem’s configuration a bit better. I thought I’d sum up a few lessons learned here, in case they are useful to someone:

There are quite a few options you can pass to the Elasticsearch client to take advantage of a clustered setup; my configuration ended up looking like this:

elasticsearch_hosts = ENV['ELASTICSEARCH_HOSTS'].split(/,\s*/)
require "#{Rails.root}/lib/wrappers/elastic_client_wrapper.rb"

ELASTIC_CLIENT = Elasticsearch::Client.new(
  url: elasticsearch_hosts,
  log: Rails.env == 'development',
  transport_class: MyApp::ElasticClientWrapper,
  randomize_hosts: true,
  retry_on_failure: true,
  reload_connections: true,
  reload_on_failure: true,
  transport_options: {
    request: { open_timeout: 1, timeout: 45 }
  }
)

The :url option (the :hosts parameter works as well) is an array of hostnames for the Elasticsearch cluster, which I load from the environment.
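Since the hosts come in as a single comma-separated string, the split regex matters more than it looks. A small illustration (the host names here are made-up examples, not from the original setup):

```ruby
# Hypothetical value for ELASTICSEARCH_HOSTS:
raw = 'http://es1.example.com:9200, http://es2.example.com:9200'

# A greedy \s* consumes the whitespace after each comma; a lazy \s*?
# would match zero characters and leave a leading space on every
# host after the first.
hosts = raw.split(/,\s*/)
# => ["http://es1.example.com:9200", "http://es2.example.com:9200"]
```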

I’ve also specified :transport_class, which refers to a custom wrapper client I use to handle errors, just to make sure the entire app doesn’t crash if the search engine cluster becomes unavailable. This might be a bit overkill, but previous experience has taught me to wrap external services like this wherever possible.
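The wrapper itself isn’t shown in the post. Purely to illustrate the pattern, here is a self-contained sketch: the class names and the stand-in client are invented for the example, not the real gem API (in the real setup the wrapper would subclass the gem’s Faraday transport and rescue its connection errors):

```ruby
# Stand-in for a search client whose backend may be down.
class FlakyClient
  def initialize(fail_times)
    @fail_times = fail_times
  end

  def search(query)
    if @fail_times > 0
      @fail_times -= 1
      raise IOError, 'connection refused'
    end
    { 'hits' => [] }
  end
end

# The wrapper pattern: rescue connection-level failures so a dead
# cluster degrades to "no results" instead of crashing the app.
class SafeSearchWrapper
  def initialize(client)
    @client = client
  end

  # Returns nil instead of raising when the backend is unreachable;
  # callers must be prepared to handle a nil response.
  def search(query)
    @client.search(query)
  rescue IOError, SystemCallError
    nil
  end
end
```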

The :randomize_hosts, :retry_on_failure, :reload_connections and :reload_on_failure options are all better described here, but you should at least understand them, and either set them to true or specify a custom numeric value where appropriate.
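For instance, :retry_on_failure and :reload_connections also accept numbers instead of plain true, per the elasticsearch-transport gem’s documentation. A sketch of an alternative setup (not the configuration from this post):

```ruby
client = Elasticsearch::Client.new(
  hosts: elasticsearch_hosts,
  retry_on_failure: 3,        # retry a failed request up to 3 times
  reload_connections: 1_000   # re-fetch the host list every 1,000 requests
)
```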

Finally, :transport_options are important. If you do not specify an open_timeout, the default Net::HTTP implementation used by Faraday will hang forever when it cannot open a connection to one of the servers. You should really test shutting down one of the nodes in the cluster and make sure your clients do not hang.

While the :reload_connections option lets the client reload host information from the cluster itself, if you are replacing nodes for some reason or another you will probably also end up changing the value of the environment variable or .yml file your Rails app uses for its initial configuration. If you are using the no-downtime deployment setup for Unicorn, you need to make sure you actually reload the connection settings when Unicorn reloads. Similar to how you might re-establish ActiveRecord connections after forking Unicorn workers, I used this setup in my unicorn.rb:

after_fork do |server, worker|
  # Re-establish per-worker connections after the fork, as with ActiveRecord:
  defined?(ActiveRecord::Base) and ActiveRecord::Base.establish_connection
  # Refresh the Elasticsearch host list so workers pick up the new settings
  ELASTIC_CLIENT.transport.reload_connections! if defined?(ELASTIC_CLIENT)
end