How to Profile Your Rails Application

Yesterday I mentioned that ruby-prof now supports profiling rails applications. Lets see how that works.

But a word of caution before beginning. Like all web applications, Rails apps are complex. They encompass many pieces – clients on remote machines, network connections, any number of application servers and database servers. Getting the whole ensemble to work can be tricky – let alone making it perform well.

To avoid drowning in complexity (and data), its crucial to use a rigorous approach to analyzing your application’s performance. For a good example of how to approach the problem, take a look at Zed Shaw’s analysis of Ruby/Odeum’s performance.

Ok – enough lecturing – onto the good stuff.

Installation and Setup

Assume that after much hard work, you’ve identified a request that takes too long (where of course you have to define what too long is). In addition, you know that the culprit is your Rails code – not the client, not the network and not the database.The trick is to find out what is taking so long – and there is where ruby-prof helps.

First install ruby-prof by opening a command line and typing:

gem install ruby-prof

Next, copy ruby-prof’s Rails plugin to your Rails plugin directory:

cp rails_plugin rails_app/vendor/plugins

Finally create a new Rails configuration with the following setting:

config.cache_classes = true

If you don’t cache classes, then the time Rails spends reloading and compiling your models and controllers will overwhelm anything else your code does. And obviously try to make your test setup as similar as possible to your production environment – use a copy of the same database, use the same web server, etc.

Profiling a Request

Once you’ve completed your setup, and started your Rails server, make a single request. Obviously for real testing you’ll need to make many requests, but let’s start with one to see how ruby-prof works.

To make this as realistic as possible, I’m going to show some old profiling data for MapBuzz, which is a social mapping site I started that let’s you see what’s going on in your neighborhood.

One of the things that is important to us is how long it takes to render the home page – so let’s make a request to the home page. Once the request is complete, the results are output in the Rails log file.

Completed in 0.71900 (1 reqs/sec)
Rendering: 0.37600 (52%) | DB: 0.18700 (26%)
200 OK [http://localhost/]

Thread ID: 175635930
Total: 0.719

 %self     total     self     wait    child    calls  name
 26.15      0.19     0.19     0.00     0.00       23  PGconn#exec
 26.01      0.19     0.19     0.00     0.00       27  PGresult#each
  4.45      0.03     0.03     0.00     0.00     1403  String#match
  4.45      0.03     0.03     0.00     0.00      171  IO#write
  2.23      0.02     0.02     0.00     0.00      872  String#to_s
  2.23      0.02     0.02     0.00     0.00      467  String#gsub
  2.23      0.03     0.02     0.00     0.02       76  ManyParser#_parse
  2.23      0.05     0.02     0.00     0.03       60  MonitorMixin
                                                      #synchronize
  2.23      0.02     0.02     0.00     0.00       60  Parsers#sequence
  2.23      0.02     0.02     0.00     0.00     1640  ParseContext#eof
  2.23      0.02     0.02     0.00     0.00       31  ActionView::
                                                      Partials
                                                      #add_object_to_
                                                      local_assigns!
  2.23      0.02     0.02     0.00     0.00       11  ActionView::
                                                      Helpers::
                                                      UrlHelper#link_to
  2.23      0.02     0.02     0.00     0.00      243  Object#add_error
  2.23      0.02     0.02     0.00     0.00      175  Kernel#clone

The first line shows the overall time as measured by the benchmark class used by Rails. Below that, you’ll see a flat profile generated by ruby-prof that shows the slowest methods. By default methods that take more than 1% of the time will be printed – for brevity’s sake I’ve culled it to 2%.

A quick scan of the results shows some interesting things. First, the most time is spent in PGconn#exec, which is used to query the database via a C extension. Most likely there isn’t much to optimize there on the Rails side, although there certainly may be room for optimization on the database side.

Next, PGconn#each is likely looping over the query results. Its time seems excessive, but we’ll worry about that later since we notice there are 1,403 calls to String#match. That seems overly excessive – what is causing that?

Digging Deeper

Flat profiles are great for providing an overview of where time is spent in your program, but they leave out a lot of useful information. In this case, we’d like to know what code is calling String#match. To do that we need information provided by graph profiles.

To create a graph profile, open up vendor/ruby-prof/lib/profiling.rb, and edit it like this:

module ActionController #:nodoc:
  module Profiling #:nodoc:
    def perform_action_with_profiling
      if RubyProf.running? or
        ...
      else
        ...
       # Example for Graph html printer
        printer = RubyProf::GraphHtmlPrinter.new(result)
        File.open('request.html', 'w') do |file|
          printer.print(file, {:min_percent => 1,
                               :print_file => true})
        end
      end
    end
  end
end

This will create a new file, request.html, that shows a graph profile of the request. Rerun the request, and then open up request.html.

In our case we are interested in the callers of String#match, so let’s take a look:

%Total
%Self
Total
Self
Wait
Child
Calls
Name
Line

0.00
0.00
0.00
0.00

1/1403
ActionController::AbstractRequest#accepts
65

0.00
0.00
0.00
0.00

195/1403
Array#each-1
103

0.00
0.00
0.00
0.00

1/1403
ActionController::Rescue#local_machine?
10

0.00
0.00
0.00
0.00

232/1403
Array#each-2
103

0.00
0.00
0.00
0.00

1/1403
ApplicationController#sanitize_status_codes
118

0.00
0.00
0.00
0.00

8/1403

ActiveRecord::ConnectionAdapters::

PostgreSQLAdapter#translate_field_type

55

0.00
0.00
0.00
0.00

232/1403
Array#each-3
103

0.00
0.00
0.00
0.00

11/1403
Geos::GeometryFactory#create_geom_from_hex
39

0.02
0.02
0.00
0.00

380/1403
Array#each-4
103

0.02
0.02
0.00
0.00

342/1403
Array#each-5
103

4.45%
4.45%
0.03
0.03
0.00
0.00
1403
String#match
39

0.00
0.00
0.00
0.00

1403/1429
Regexp#match
118

If you haven’t seen a graph profile before, you’re probably feeling a bit overwhelmed. To help out a bit, I’ve written some documentation here. However, the quick summary is that the method of interest is shown in bold. Methods above it are the method’s callers, while the methods below are the methods callees.

Notice the calls Array#each, Array#each-1, Array#each-2, etc? The dash and number indicate that a method has been called recursively. Thus Array#each-1 means that Array#each was called and then it in turn called Array#each either directly or indirectly.

Going back to our investigation, we see that Array#each-4 called String#match the most times. So the next step would be to click the Array#each-4 hyperlink and see where it is called from.

Visualizing a Profile

Once you create your own graph profile, you’ll quickly realize how valuable the html hyperlinks are for navigating through the mass of information. But wouldn’t it be nice to visualize the profile? Glad you asked!

ruby-prof 0.5.0 can now output call tree information thanks to a patch from Carl Shimer. To create a call out put format, once again you’ll have to modify profiling.rb:

module ActionController #:nodoc:
  module Profiling #:nodoc:
    def perform_action_with_profiling
      if RubyProf.running? or
        ...
      else
        ...
       # Example for Graph html printer
        printer = RubyProf::CallTreePrinter.new(result)
        File.open('callgrind.out', 'w') do |file|
          printer.print(file, {:min_percent => 1,
                               :print_file => true})
        end
      end
    end
  end
end

Once you’ve created the output file, open it with KCachegrind (sorry, this is Linux only, although there is a port to Windows which I have not tried using). KCachegrind take a bit of work to get used to, but once you do you can create visualizations that show you what your program is doing. In this case, here is a picture that shows String#match’s callers.

Call Graph

Wrapping Up

Remember – most of the time you won’t need ruby-prof. It provides much too much information to be used as a general way of finding performance bottlenecks – instead analyze your web log files, Rails log files and database log files.

But once you know the problem is in your Rails code, ruby-prof is invaluable in pointing out where it is.

Leave a Reply

Your email address will not be published. Required fields are marked *

Top