Rails and PostgreSQL – Eliminate Hundreds of Thousands of Queries a Day

Update – It turns out that Rails does cache column data dictionary queries (which is what you would expect), but not for :has_and_belongs_to_many (HABTM) associations. I know those are “old-fashioned,” but they fit our data model perfectly in a couple of places. So be warned – using just a couple of HABTM associations will generate a huge number of data dictionary queries.

As part of monitoring the performance of MapBuzz, we run a nifty little program called pgFouine to analyze the PostgreSQL log files every night. pgFouine summarizes the most frequent queries and the slowest queries. Here is our data from yesterday:

Most frequent queries (N)

Rank: 1
Times executed: 140,115
Total duration: 2m42s
Avg. duration (s): 0.00
Query:

SELECT a.attname, format_type(a.atttypid, a.atttypmod), d.adsrc, a.attnotnull
FROM pg_attribute a
LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum
WHERE a.attrelid = ''::regclass AND a.attnum > 0 AND NOT a.attisdropped
ORDER BY a.attnum;

Besides being the most frequently run query, this was also the seventh most time-consuming query.
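The empty ''::regclass literal in the query above is presumably the log analyzer replacing literal values so that queries of the same shape are counted together. Here is a minimal Ruby sketch of that grouping idea. It is not pgFouine's actual implementation, and the normalize and most_frequent helpers are hypothetical names made up for illustration:

```ruby
# Hypothetical helper: replace literal values so that queries differing
# only in their values collapse into one "shape".
def normalize(sql)
  sql.gsub(/'(?:[^']|'')*'/, "''") # string literals become ''
     .gsub(/\b\d+\b/, "0")         # standalone numbers become 0
     .gsub(/\s+/, " ")             # collapse runs of whitespace
     .strip
end

# Hypothetical helper: count normalized queries, most frequent first.
def most_frequent(queries, limit = 5)
  counts = queries.each_with_object(Hash.new(0)) do |sql, acc|
    acc[normalize(sql)] += 1
  end
  counts.sort_by { |_sql, count| -count }.first(limit)
end

# Made-up sample log lines, for illustration only.
sample_log = [
  "SELECT * FROM pg_attribute WHERE attrelid = 'users'::regclass",
  "SELECT * FROM pg_attribute WHERE attrelid = 'maps'::regclass",
  "SELECT name FROM users WHERE id = 42"
]

most_frequent(sample_log).each { |sql, count| puts "#{count}  #{sql}" }
```

The two pg_attribute lookups target different tables but end up in the same bucket, which is how a single query shape can reach six figures in a day.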

I was shocked the first time I saw this many months ago – my first guess was that we were not caching classes in production mode. But it turned out that wasn’t the problem. It’s just Rails being silly – every time it loops over a model’s columns, it strikes up a conversation with the database. That happens a fair bit – when you use :include to add additional tables to a query, when you use dynamic finders (find_by_x), when you use relationships set up by :has_many, :has_and_belongs_to_many, etc. And this isn’t the only place Rails is wasteful – it also constantly queries the database for table names and indexes – it just happens that those queries don’t run nearly as often.

Rails Plugin

Clearly there is no reason to do this in a production environment, and in truth, I don’t see much reason to do it in a development environment either. So yesterday I finally got around to patching Rails and submitting a bug report. The patch caches data dictionary queries for the PostgreSQL adapter. With the patch loaded, Rails still supports adding tables to your database at runtime, but no longer supports adding or removing columns from a table at runtime, or recycling table names. If these things are important to you, the patch also provides a flush_dd_cache method that flushes the cache. An obvious alternative solution is to add a cache_dd_info class variable to ActiveRecord::Base, which would be off by default but on in production. However, I’m skeptical there is a need for such a flag.
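To illustrate the caching idea, here is a minimal, self-contained Ruby sketch. It is not the actual patch: FakeAdapter, DataDictionaryCache, and CachingAdapter are hypothetical names invented for this example, and the fake adapter only counts lookups instead of talking to a database. Only the flush_dd_cache name comes from the patch described above.

```ruby
# Hypothetical stand-in for a database adapter. It counts how many times
# the data dictionary would have been queried.
class FakeAdapter
  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  # Pretends to run the pg_attribute query for one table.
  def columns(table_name)
    @query_count += 1
    %w[id name].map { |column| "#{table_name}.#{column}" }
  end
end

# Memoizes column lookups per table name (a sketch of the idea only).
module DataDictionaryCache
  def columns(table_name)
    @dd_cache ||= {}
    @dd_cache[table_name] ||= super
  end

  # Drops everything cached so far; needed after a table's columns change.
  def flush_dd_cache
    @dd_cache = {}
  end
end

# Including the module in a subclass places it ahead of FakeAdapter in the
# method lookup chain, so super reaches the uncached implementation.
class CachingAdapter < FakeAdapter
  include DataDictionaryCache
end

adapter = CachingAdapter.new
3.times { adapter.columns("users") }
puts adapter.query_count # 1, because only the first call hit the "database"

adapter.flush_dd_cache
adapter.columns("users")
puts adapter.query_count # 2, because the flush forced a fresh lookup
```

The trade-off is the one described above: a table seen for the first time is still looked up, but a column added to an already-cached table stays invisible until the cache is flushed.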

On a per-request basis, you won’t see much performance gain from the patch in your Rails application servers, but it will remove needless load from your database. And while you are waiting for Rails to be patched (if it is patched), feel free to download the Rails plugin we are using to solve the problem. Note that the plugin also fixes two other ActiveRecord bugs: it handles PostgreSQL schemas incorrectly, and it ignores views.
