I’m happy to announce the release of libxml-ruby 1.1.3. Besides including the usual assortment of new features and bug fixes, this release also includes a speed boost of roughly 10% to 20%.
The speedup came out of RubyInside’s recent post summarizing the performance of Ruby parsers. As expected, libxml-ruby blew away Hpricot and REXML in pure parsing speed (a simplistic view of what matters in an XML processor, of course, but still important). But it consistently finished a bit behind Nokogiri.
I was a bit surprised by that, since libxml-ruby and Nokogiri both use the libxml2 library as their parsing engine. Because the specific test cases almost exclusively tested parsing, the two extensions should have had identical run times.
Since the times were different, the obvious conclusion was that the two extensions were either calling different libxml2 APIs or using different settings. I suspected the latter, but when investigating performance you never know beforehand.
Not to bore everyone with the nitty-gritty details of using libxml2, but when looking into the first test, parsing an in-memory string, it didn’t look like there was much difference in API calls.
For libxml-ruby:
xmlCreateMemoryParserCtxt -> xmlParseDocument
For Nokogiri:
xmlReadMemory -> xmlCreateMemoryParserCtxt -> xmlDoRead -> xmlParseDocument
So that didn’t solve the mystery.
The next possibility was that xmlDoRead was modifying the libxml2 parser context. Now a libxml2 parser context is a beast of a thing; for those brave souls who want to take a peek, it’s defined in libxml2’s online documentation.
Working through the options one-by-one, I finally found the culprit, an obscure field in the structure:
int dictNames : Use dictionary names for the tree
This setting controls whether libxml2 uses a dictionary to cache strings it has previously parsed. Caching strings makes a big difference, so it should be enabled by default. That is now the case with libxml-ruby 1.1.3 and higher.
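To make the idea of a string dictionary concrete, here is a minimal Ruby sketch of the general technique. It is only an illustration: NameDictionary is a hypothetical class invented for this post, not part of libxml2 or libxml-ruby, and libxml2's real dictionary is implemented in C and works differently in detail.

```ruby
# Illustration only: a tiny name cache in the spirit of what dictNames enables.
# NameDictionary is a hypothetical class, not libxml2's actual implementation.
class NameDictionary
  attr_reader :hits, :misses

  def initialize
    @names = {}
    @hits = 0
    @misses = 0
  end

  # Return the single shared copy of +name+, storing it the first time it is seen.
  def lookup(name)
    if (cached = @names[name])
      @hits += 1
      cached
    else
      @misses += 1
      @names[name] = name.dup.freeze
    end
  end
end

dict = NameDictionary.new
# Element names as a parser might meet them in a repetitive document.
seen = %w[item title item title item].map { |n| dict.lookup(n) }

puts "distinct strings stored: #{dict.misses}"
puts "lookups served from cache: #{dict.hits}"
puts "same object reused: #{seen[0].equal?(seen[2])}"
```

In a document with thousands of repeated element and attribute names, almost every lookup is served from the cache, so far fewer strings need to be allocated.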
Rerunning the published benchmarks now shows libxml-ruby and Nokogiri to have equivalent performance. If you run the tests yourself, beware, though. The order in which the extensions are tested changes the results. Whichever extension is tested first will always be faster, at least on my Fedora 10 box. I assume that’s because the first parser has more memory available to it when the test begins and therefore invokes Ruby’s garbage collector a few fewer times.
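One way to reduce the ordering effect is to rotate which candidate runs first and to force a garbage collection before each measurement. The sketch below is an assumption-laden example, not the harness used for the published benchmarks: fair_benchmark is a hypothetical helper, and the two lambdas are stand-in workloads that you would replace with real parser calls.

```ruby
require 'benchmark'

# Hypothetical helper: time every candidate once per round, rotating the
# starting position so that no candidate always runs first.
def fair_benchmark(candidates, rounds: candidates.size)
  totals = Hash.new(0.0)
  names = candidates.keys
  rounds.times do |round|
    names.rotate(round).each do |name|
      GC.start # begin each measurement from a freshly collected heap
      totals[name] += Benchmark.realtime { candidates[name].call }
    end
  end
  totals
end

# Stand-in workloads (assumptions); swap in the real parser calls here.
xml = "<root>#{'<item>text</item>' * 1_000}</root>"
candidates = {
  'parser_a' => -> { xml.scan(/<item>/).size },
  'parser_b' => -> { xml.count('<') }
}

fair_benchmark(candidates, rounds: 4).each do |name, seconds|
  puts format('%-10s %.6fs', name, seconds)
end
```

Calling GC.start does not remove garbage collection from the timings, but it gives each candidate a similar starting point, and the rotation spreads any remaining first-run advantage evenly.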