While analyzing Rails performance, I stumbled upon the fact that parsing timestamps in ActiveRecord is very slow (I’ll post more about that in an upcoming blog).
A bit of googling revealed that Charlie Bowman discovered this a year an a half ago. He has a nice post comparing the performance of Ruby’s Time and DateTime classes. Sadly he didn’t use ruby-prof to come up with his results, but we’ll forgive him for that 🙂
Now you might reasonably assume that the Time class deals with, er, time and the DateTime class deals with date times, which we’ll refer to as timestamps. Well, like me, you’d be wrong. Totally wrong.
It turns out that Time and DateTime both deal with timestamps. But they are implemented completely differently.
Time is a c extension that wraps your operating system’s time functionality. It stores times as seconds from January 1st, 1970. Its biggest upside is performance. Its biggest downside is its limited range – times before 1970 are “out of range.” I have to admit that doesn’t make any sense to me since no other language I’ve used suffers from this problem and I’m fairly sure Windows and Linux don’t either. But no matter.
DateTime is a pure ruby implementation of time that let’s you store a much wider range of dates. It also supports both the Julian and Gregorian calendars – on the off hand chance you need such a thing.
Rails takes the safe route and converts timestamps to DateTimes. But there is a performance penalty to pay. Below is simple little script showing how much time it takes to parse a timestamp using 3 approaches – DateTime.parse, Time.parse and Time.xmlschema.
require 'date'
require 'ruby-prof'
# Use xml schema format
now = '2007-07-12T23:36:25.609375-06:00'
result = RubyProf.profile do
1000.times do
time = DateTime.parse(now)
end
end
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT, {:min_percent => 1})
result = RubyProf.profile do
1000.times do
time = Time.parse(now)
end
end
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT, {:min_percent => 1})
result = RubyProf.profile do
1000.times do
time = Time.xmlschema(now)
end
end
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT, {:min_percent => 1})
And the results:
Total: 3.86 %self total self wait child calls name 14.15 0.83 0.55 0.00 0.28 19000 Integer#gcd 10.91 2.16 0.42 0.00 1.74 19000 <Class::Rational>#reduce 8.06 0.37 0.31 0.00 0.06 27000 Rational#initialize 7.33 0.28 0.28 0.00 0.00 6000 String#sub! 5.70 0.74 0.22 0.00 0.52 27000 Class#new 4.87 0.94 0.19 0.00 0.75 1000 <Class::Date>#_parse 3.68 0.14 0.14 0.00 0.00 29000 <Class::Object>#allocate 3.65 2.30 0.14 0.00 2.16 19000 Object#Rational 3.26 0.86 0.13 0.00 0.74 27000 <Class::Rational>#new! 3.24 0.79 0.13 0.00 0.67 4000 Rational#+ --------------- Total: 1.359 %self total self wait child calls name 32.16 0.44 0.44 0.00 0.00 6000 String#sub! 6.92 0.95 0.09 0.00 0.86 1000 <Class::Date>#_parse 6.84 0.36 0.09 0.00 0.27 1000 <Class::Time>#make_time 5.81 0.23 0.08 0.00 0.15 1000 Rational#/ 5.74 0.08 0.08 0.00 0.00 5000 Rational#initialize 4.71 0.10 0.06 0.00 0.03 1000 <Class::Time>#apply_offset 4.49 0.09 0.06 0.00 0.03 3000 Integer#gcd 3.46 0.23 0.05 0.00 0.19 3000 <Class::Rational>#reduce 3.46 1.36 0.05 0.00 1.31 1000 <Class::Time>#parse --------------- Thread ID: 21259270 Total: 0.297 %self total self wait child calls name 37.04 0.14 0.11 0.00 0.03 1000 <Class::Time>#zone_offset 15.82 0.30 0.05 0.00 0.25 1000 <Class::Time>#xmlschema 10.44 0.08 0.03 0.00 0.05 1000 <Class::Time>#apply_offset 5.39 0.02 0.02 0.00 0.00 1000 <Class::Time>#utc 5.39 0.03 0.02 0.00 0.02 1000 <Class::Time>#month_days 5.39 0.02 0.02 0.00 0.00 3000 Fixnum#* 5.39 0.02 0.02 0.00 0.00 1000 Fixnum#- 5.05 0.01 0.01 0.00 0.00 4000 Fixnum#divmod 5.05 0.01 0.01 0.00 0.00 1000 String#upcase 5.05 0.01 0.01 0.00 0.00 8000 String#to_i
DateTime.parse is slow because its a complicated piece of code that relies heavily on rational numbers. Time.parse is roughly 3 times faster, although if you look under the hood, it actually uses part of DateTime.parse. That seems a bit strange to me, but I didn’t spend the time to investigate further.
The real winner here is Time.xmlschema which is a over 10 times faster than DateTime.parse. Why is it so much faster? Instead of a mashing a bunch of strings, it uses a regular expression to parse timestamps. Thus it trades flexibility for performance.
When talking to a database there is no need for flexibility – your database most likely outputs time in the exact same way all the time. So if you heavily use dates in your Rails application, then roll your own timestamp parsing.