Tick, tick, tick

While analyzing Rails performance, I stumbled upon the fact that parsing timestamps in ActiveRecord is very slow (I’ll post more about that in an upcoming blog).

A bit of googling revealed that Charlie Bowman discovered this a year an a half ago. He has a nice post comparing the performance of Ruby’s Time and DateTime classes. Sadly he didn’t use ruby-prof to come up with his results, but we’ll forgive him for that 🙂

Now you might reasonably assume that the Time class deals with, er, time and the DateTime class deals with date times, which we’ll refer to as timestamps. Well, like me, you’d be wrong. Totally wrong.

It turns out that Time and DateTime both deal with timestamps. But they are implemented completely differently.

Time is a c extension that wraps your operating system’s time functionality. It stores times as seconds from January 1st, 1970. Its biggest upside is performance. Its biggest downside is its limited range – times before 1970 are “out of range.” I have to admit that doesn’t make any sense to me since no other language I’ve used suffers from this problem and I’m fairly sure Windows and Linux don’t either. But no matter.

DateTime is a pure ruby implementation of time that let’s you store a much wider range of dates. It also supports both the Julian and Gregorian calendars – on the off hand chance you need such a thing.

Rails takes the safe route and converts timestamps to DateTimes. But there is a performance penalty to pay. Below is simple little script showing how much time it takes to parse a timestamp using 3 approaches – DateTime.parse, Time.parse and Time.xmlschema.

require 'date'
require 'ruby-prof'

# Use xml schema format
now = '2007-07-12T23:36:25.609375-06:00'

result = RubyProf.profile do 
  1000.times do
    time = DateTime.parse(now)
  end
end

printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT, {:min_percent => 1})

result = RubyProf.profile do 
  1000.times do
    time = Time.parse(now)
  end
end

printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT, {:min_percent => 1})

result = RubyProf.profile do 
  1000.times do
    time = Time.xmlschema(now)
  end
end

printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT, {:min_percent => 1})

And the results:

Total: 3.86

 %self     total     self     wait    child    calls  name
 14.15      0.83     0.55     0.00     0.28    19000  Integer#gcd
 10.91      2.16     0.42     0.00     1.74    19000  <Class::Rational>#reduce
  8.06      0.37     0.31     0.00     0.06    27000  Rational#initialize
  7.33      0.28     0.28     0.00     0.00     6000  String#sub!
  5.70      0.74     0.22     0.00     0.52    27000  Class#new
  4.87      0.94     0.19     0.00     0.75     1000  <Class::Date>#_parse
  3.68      0.14     0.14     0.00     0.00    29000  <Class::Object>#allocate
  3.65      2.30     0.14     0.00     2.16    19000  Object#Rational
  3.26      0.86     0.13     0.00     0.74    27000  <Class::Rational>#new!
  3.24      0.79     0.13     0.00     0.67     4000  Rational#+

---------------
Total: 1.359

 %self     total     self     wait    child    calls  name
 32.16      0.44     0.44     0.00     0.00     6000  String#sub!
  6.92      0.95     0.09     0.00     0.86     1000  <Class::Date>#_parse
  6.84      0.36     0.09     0.00     0.27     1000  <Class::Time>#make_time
  5.81      0.23     0.08     0.00     0.15     1000  Rational#/
  5.74      0.08     0.08     0.00     0.00     5000  Rational#initialize
  4.71      0.10     0.06     0.00     0.03     1000  <Class::Time>#apply_offset
  4.49      0.09     0.06     0.00     0.03     3000  Integer#gcd
  3.46      0.23     0.05     0.00     0.19     3000  <Class::Rational>#reduce
  3.46      1.36     0.05     0.00     1.31     1000  <Class::Time>#parse

---------------
Thread ID: 21259270
Total: 0.297

 %self     total     self     wait    child    calls  name
 37.04      0.14     0.11     0.00     0.03     1000  <Class::Time>#zone_offset
 15.82      0.30     0.05     0.00     0.25     1000  <Class::Time>#xmlschema
 10.44      0.08     0.03     0.00     0.05     1000  <Class::Time>#apply_offset
  5.39      0.02     0.02     0.00     0.00     1000  <Class::Time>#utc
  5.39      0.03     0.02     0.00     0.02     1000  <Class::Time>#month_days
  5.39      0.02     0.02     0.00     0.00     3000  Fixnum#*
  5.39      0.02     0.02     0.00     0.00     1000  Fixnum#-
  5.05      0.01     0.01     0.00     0.00     4000  Fixnum#divmod
  5.05      0.01     0.01     0.00     0.00     1000  String#upcase
  5.05      0.01     0.01     0.00     0.00     8000  String#to_i

DateTime.parse is slow because its a complicated piece of code that relies heavily on rational numbers. Time.parse is roughly 3 times faster, although if you look under the hood, it actually uses part of DateTime.parse. That seems a bit strange to me, but I didn’t spend the time to investigate further.

The real winner here is Time.xmlschema which is a over 10 times faster than DateTime.parse. Why is it so much faster? Instead of a mashing a bunch of strings, it uses a regular expression to parse timestamps. Thus it trades flexibility for performance.

When talking to a database there is no need for flexibility – your database most likely outputs time in the exact same way all the time. So if you heavily use dates in your Rails application, then roll your own timestamp parsing.

Leave a Reply

Your email address will not be published. Required fields are marked *

Top