If Only I Could Read Japanese

Posted by Charlie Sun, 26 Aug 2007 06:13:00 GMT

How neat, Matz linked to my article about fixing architecture flaws in ActiveRecord.

Of course I was curious to see what he said, so I ran his comments through online translators from Babelfish, Google and Dictionary.com. Based on past experiences, I wasn't expecting much.

Updated on Sept. 30, 2007. Sam was kind enough to send me his real translation - so take a look at this first:

It says that ActiveRecord's faults are with the way it handles text-based column types and the fact that serialized objects get mixed with text from the database in a hash table.

I'm not in a position to make decisions but pointing out such flaws and making suggestions for improvement is very constructive.

Personally, I think the biggest flaw with Rails is its "self-centeredness". Rails is changing Ruby itself at a rapid pace so when you try to do similar things with other libraries and frameworks you quickly run into a brick wall. When everyone tries to do the same thing you usually get tragic results so I am understanding of how all the "black magic" of redefining core classes is delegated to ActiveSupport.

Well, flaws in Rails are also related to flaws within Ruby itself so I'd like to address that in some way with Ruby 2.0.

And now the automated translations, which were awful. And remarkably similar - they all must be using the same translation engine. Take a look:

Babelfish:

As for the fault of ActiveRecord the text and the object which the Syrian rise from the database is done blending, is in treatment and attribute hash of text based column type when. It cannot judge in me, but as for indication of fault and proposition of improvement you think that it is very healthy. Privately as for the largest fault of Rails, you think that there is a "self central characteristic". In other words, because Rails changes the Ruby very thing with rather enormous spirit, it tries to do kind of that the other library either the framework or are similar with directly it collides. Well, when everyone does similar thing, because it means mostly miserable thing, it seems that rewrites fundamental class as for the "black magic" leaving to ActiveSupport, the bucket, with is the attitude which can understand also the fact that you say, but. As for fault of ま and this Rails being the case that that way it is consequence of the weak point of Ruby, with Ruby 2.0, somehow we would like to cope with here.

Google:

As for the fault of ActiveRecord the text and the object which the Syrian rise from the database is done blending, is in treatment and attribute hash of text based column type when.

It cannot judge in me, but as for indication of fault and proposition of improvement you think that it is very healthy.

Privately as for the largest fault of Rails, you think that there is a “self central characteristic”. In other words, because Rails changes the Ruby very thing with rather enormous spirit, it tries to do kind of that the other library either the framework or are similar with directly it collides. Well, when everyone does similar thing, because it means mostly miserable thing, it seems that rewrites fundamental class as for the “black magic” leaving to ActiveSupport, the bucket, with is the attitude which can understand also the fact that you say, but.

As for fault of [ma] and this Rails being the case that that way it is consequence of the weak point of Ruby, with Ruby 2.0, somehow we would like to cope with here.

Dictionary.com:

As for the fault of ActiveRecord the text and the object which the Syrian rise from the database is done blending, is in treatment and attribute hash of text based column type when.

It cannot judge in me, but as for indication of fault and proposition of improvement you think that it is very healthy.

Privately as for the largest fault of Rails, you think that there is a “self central characteristic”. In other words, because Rails changes the Ruby very thing with rather enormous spirit, it tries to do kind of that the other library either the framework or are similar with directly it collides. Well, when everyone does similar thing, because it means mostly miserable thing, it seems that rewrites fundamental class as for the “black magic” leaving to ActiveSupport, the bucket, with is the attitude which can understand also the fact that you say, but.

As for fault of [ma] and this Rails being the case that way it is consequence of the weak point of Ruby, with Ruby 2.0, somehow we would like to cope with here.

So much for enlightenment.

Making Rails Better - Error Handling

Posted by Charlie Thu, 16 Aug 2007 19:59:00 GMT

Error handling in web applications is devilishly complex - we ended up rewriting SIAS's error handling code in almost every release.

There are two main sources of complexity. First, how do you handle errors that occur after you've already started returning a response to the client? The easiest solution, and the one Rails adopts, is to cache the generated result until it's complete. If an exception occurs you dump the result and then render an error message - making damn sure your error message generating code is beyond flawless. This works if your results tend to be small, but breaks down if you need to return a lot of data to the client. For this article, let's stick with Rails' assumption (we couldn't use it for SIAS).
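
To make the buffering idea concrete, here is a minimal sketch in plain Ruby. It is not how Rails implements it - render_buffered is a made-up name and the status/body pair is just an assumption for illustration. The point is only that output goes into a buffer first, and the buffer is thrown away if anything goes wrong.

```ruby
require "stringio"

# Hypothetical helper (not a Rails API): run the block against an
# in-memory buffer and hand back its contents only if nothing was raised.
def render_buffered
  buffer = StringIO.new
  yield buffer
  [200, buffer.string]
rescue StandardError => e
  # Whatever was already written to the buffer is simply dropped.
  [500, "Error: #{e.message}"]
end
```

Since nothing reaches the client until the block finishes, a failure halfway through can still produce a clean error response. The cost is that the whole result sits in memory.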

The second issue is how do you return errors to the client - what format and HTTP status codes should you return? Unfortunately, most standards are mum on this point. For example, nowhere in the Atom Publishing Format does it tell you what to do when an error occurs. On the other extreme, the Web Map Server (WMS) standard, which we supported in SIAS, goes overboard. Since a Web Map Server is supposed to return a map, usually in a raster format such as PNG or JPEG, the standard specifies that errors should be returned as bitmaps that contain the error message (bet you never thought of that!). It also requires that you support returning errors as "blank" images (so as not to block out any other maps) and then, for a fun twist, as XML documents.

It's on this second point that Rails is lacking - error messages are always returned in HTML and are mapped to two HTTP status codes - 404 (not found) and 500 (internal server error).

An Error Handling Plugin

We can do better. Building off the content negotiation plugin I talked about last week, we have created an error handling plugin for MapBuzz called render_exceptions.

The plugin does four things:

  • Adds an :exception parameter to render, so you can write this code:
    rescue => e
      render(:exception => e)
    end
  • Adds a method, http_status_code, to the various Rails exceptions. For example:
    module ActiveRecord
      class RecordNotFound
        def http_status_code
          404
        end
      end
    end
    Note that ActiveRecord::RecordInvalid is mapped to 418, as described earlier on my blog and rest-discuss.
  • Overrides rescue_action_locally and rescue_action_in_public to use render(:exception)
  • Defines a number of error handling templates - such as error_404.rhtml, error_404.rjson, error_404.ratom, error_406.rhtml, etc.

When an exception is raised, the plugin checks to see if the exception object responds to http_status_code. If it does, and if the value is not 500, then the plugin picks one of the error templates using the content negotiation plugin I blogged about last week. Otherwise, the plugin reverts to the default Rails error handling.
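
The decision described above can be sketched in a few lines of plain Ruby. This is not the plugin's source: RecordNotFound here is a stand-in class, error_rendering_for is a hypothetical name, and the template extension is passed in directly rather than worked out by content negotiation.

```ruby
# Stand-in for a Rails exception that knows its own status code.
class RecordNotFound < StandardError
  def http_status_code
    404
  end
end

# Hypothetical helper: decide how an exception should be rendered.
# Returns [:template, path, status] or [:rails_default, nil, 500].
def error_rendering_for(exception, template_extension)
  status = exception.respond_to?(:http_status_code) ? exception.http_status_code : 500
  # Anything without a status code, or with a 500, falls through to Rails.
  return [:rails_default, nil, 500] if status == 500

  [:template, "errors/error_#{status}.#{template_extension}", status]
end
```

So a RecordNotFound raised while serving an Atom request would map to errors/error_404.ratom with a 404, while a plain RuntimeError is left to the default Rails handling.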

The end result is that errors are returned using the appropriate mime type and HTTP status code. So if a client asks for an Atom feed of a non-existent record, the server returns 404 and an error message as an Atom entry. Similarly, if the client asks for an HTML document for a non-existent record, the server returns 404 and an HTML error message.

Using the Plugin

To install the plugin:

  1. Install the content negotiation plugin.
  2. Install the render_exceptions plugin.
  3. Copy the error templates directory to your views directory (edge Rails introduces a view search path, but I'm assuming you are using Rails 1.2.x). Thus you'll end up with a directory app/views/errors that contains a number of error handling templates.
  4. Modify the error templates to suit your needs. Most are fairly generic, so can be used as-is, but others will require some changes.
  5. Don't forget that the plugin overrides rescue_action_in_public, so make sure that it fits into your global error handling scheme.

And that should do the trick.

Resources and Representations Redux

Posted by Charlie Tue, 14 Aug 2007 19:28:00 GMT

In response to my post about why it's so hard for developers to understand REST, Alex Bunardzic talks about the importance of distinguishing resources and representations. I didn't touch on the subject in my post, but Alex is right on the money that it's a key distinction in REST that you must understand.

Dreaming of Chairs

If you read through the rest-discuss archives you'll likely come to the same conclusion as Alex - many developers don't understand the difference between resources and representations.

But what's weird is the resource/representation distinction is a fundamental part of our spoken language and our understanding of the world. So we all understand the difference.

Think of a noun, let's say a chair. We all have an idealized version of a chair in our minds - it's something you sit on that usually has four legs, but not always. This idealized version of a chair is something that exists only in our minds. In REST terminology, it's a resource.

Now if you look around you'll see there's an almost infinite number of ways that the idea of a chair can be translated into real, actual chairs. Here are just a few examples:

[Images: Side Chair, Crater Chair, Chair Small]

Each of these real chairs is a representation of the chair resource.

We all understand the difference between the idea of a chair and real chairs, and thus between resources and representations. Therefore the only reason I can see why developers don't see the distinction in REST is simply because they are stuck in the wrong frame of reference like I discussed yesterday.

Dreaming of Web Sites

Now let's apply this distinction to the web. The W3C's Architecture of the World Wide Web document has a great illustration showing the differences between identifiers, resources and representations.

[Image: WebArch]

The resource is a weather report for Oaxaca, Mexico. The resource exists only as a concept: you can't touch it, you can't feel it, it's only in the mind of whoever created the Oaxaca weather report website. The website returns a representation of the weather report as an XHTML document, but it could also return an Atom feed, an image, etc.

[Image: Forecast]

Is This Important?

Is this really important - are resources really key to understanding REST? Let me quote an exchange on rest-discuss between Walden Mathews and Roy Fielding.

Walden:

In modeling REST formally, you can remove "resource" from the picture without harming the purely technical aspects of the system. What you then end up with is a many-to-many relation from URI to representations. Perhaps that's all the Web *really* is, but it's hard to find any guidance in the building of useful systems in that. Put another way, it's hard to decide what value such a system might be providing.

Roy:

And that is the key, really, since the main reason we know that a conceptual resource must exist is because of the number of ways that a link can fail. It is only when people complain bitterly about the lack of an expected sameness that you realize the goal of the link has nothing to do with the listener at the end of the HTTP address, but rather the expectation that it identifies (indirectly, as there is no such thing as direct HTTP identification) something other than the address.

People don't create links to server software; they create links to information.

Let me repeat. People don't create links to server software; they create links to information. Service Oriented Architectures are about creating links to software. Resource Oriented Architectures are about creating links to information. Change your frame of reference.

Why Is REST So Hard to Understand?

Posted by Charlie Mon, 13 Aug 2007 15:48:00 GMT

I know, I know. Not another article about REST - hasn't this horse been flogged enough? Perhaps - but recently I've been watching the GIS industry finally discover REST and unsurprisingly there is a bit of misunderstanding and a bit of resistance. Kudos to Sean, and others, for their efforts in spreading the word.

But it raises an interesting question - why is it so difficult for most people, including myself, to understand REST at first? It's not technical complexity - REST is significantly simpler than CORBA, or COM, or just about any programming language. And it's not for lack of an example - the web is one of the most successful distributed computer systems ever built.

Instead, I think it's because we are looking at the problem the wrong way. To borrow a metaphor from physics, we are using the wrong frame of reference. I finally figured out how important using the correct frame of reference is during a junior year thermodynamics class (my favorite engineering subject, I have to say). Our task was to calculate various properties (temperature, wind speed, etc.) in front of and behind a shock wave. If you try to do it using a stationary frame of reference, the math is well nigh impossible. But if you change your frame of reference to the shock wave itself, then the math falls out nicely and you can easily solve the problem.

So instead of focusing on the nitty-gritty of REST, including URIs, resources, HTTP verbs, etc., let's take a look at the big picture.

The Service Oriented Architecture (SOA) Frame of Reference

Let's start by boiling a program down to its most fundamental parts:

[Diagram: Client Server]

A client uses a service to access or process some data. Note I am using a very broad definition of client and service - this applies to an object calling another object in a program, a program calling a function in a DLL, a browser accessing a web site, etc.

When learning to program, we are taught that the most important part of a system is the services and APIs they provide.

[Diagram: Client Server]

In fact, a large part of object oriented design is teasing out the right objects (which are services under my broad definition) and designing their APIs. Design patterns are no more than precanned services that have been discovered via trial and error.

Notice three fundamental characteristics of this approach:

  • It's a design goal to hide data from clients via encapsulation
  • Every service has its own API
  • Clients are tightly bound to services - if the service API changes the client also has to change

How well does the SOA approach work in the real world? Object oriented design has become the accepted approach for building large scale applications. It works to a degree - although the failure rate for creating large applications makes one wonder if there is a better way.

It also works, albeit barely, for creating in-process libraries (shared libraries or DLLs). The barely bit has to do with the difficulties of versioning service APIs, as witnessed by DLL hell on the Windows platform.

However, this approach fails when building distributed systems. It fails because clients are too tightly bound to services, it ignores the unreliability of networks and it's not scalable because of the inherent state held by services. Take a look around - when was the last time you saw a large distributed computer system working across multiple organizational boundaries that is based on COM, CORBA, RMI, SOAP web services, etc?

The Resource Oriented Architecture (ROA) Frame of Reference

Now let's change our frame of reference to a ROA. In ROA, the most important part of the system is the data.

[Diagram: Client Server]

Anything of interest gets a globally unique identifier, which is done via URIs.

Next, the service becomes uninteresting because every service has the exact same API. On the Web, the service API is defined by the HTTP verbs GET, POST, PUT, DELETE, etc.

So what are the benefits of these changes?

  • Since all interesting data has a globally unique identifier, it becomes possible to create a web of links
  • Since there is only one service API, it becomes possible to create a single client (think of a browser) that can access an infinite number of services (think of websites)
  • Versioning issues between clients and services are greatly reduced (although they still happen when HTTP changes)
  • State can be taken out of the system and made into its own resource, thereby making the system much more scalable
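
To make "every service has the exact same API" a bit more tangible, here is a toy sketch in plain Ruby. ResourceStore is entirely hypothetical and runs in memory - it has nothing to do with a real HTTP server - but it shows the shape of the idea: the only things a client needs to know are a URI and a handful of verbs.

```ruby
# Hypothetical in-memory "server": every resource answers the same verbs,
# and each call returns an [HTTP-style status, representation] pair.
class ResourceStore
  def initialize
    @resources = {}
  end

  def get(uri)
    @resources.key?(uri) ? [200, @resources[uri]] : [404, nil]
  end

  def put(uri, representation)
    # 201 when the resource is created, 200 when it is replaced.
    status = @resources.key?(uri) ? 200 : 201
    @resources[uri] = representation
    [status, representation]
  end

  def delete(uri)
    @resources.key?(uri) ? (@resources.delete(uri); [204, nil]) : [404, nil]
  end
end
```

Whether the URI names an article, a map or a weather report, the client code is identical. All the variety lives in the representations, not in the API.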

How does interoperability happen with ROA given that the service API is always the same? It's done through globally agreed upon data formats, known as representations. This might seem to be an impossible task, but in practice it has worked out quite well:

  • xhtml/html/css are used for content, layout and presentation
  • png/jpeg/gif are used for raster images
  • atom/rss are used to model collections of items
  • pdf is used to preserve print versus online fidelity
  • mp3 and others are used for audio data

How well does the ROA approach work in the real world? Well, the web is one of the most successful distributed computer systems ever built, so I would say quite well.

Next Steps

I'm curious to see if understanding the different frames of reference used by SOA and ROA helps clear up some of the mystery surrounding REST - so I'd love to hear your feedback.

For a similar take on the differences between SOA and ROA, there are a number of excellent articles including ones by Stuart Charlton, Alex Bunardzic, Stefan Tilkov and Sanjiva Weerawarana.

And for a more in-depth look at ROA, read the W3C's Architecture of the Web document and O'Reilly's new book RESTful Web Services.

Update: In response to a post by Alex Bunardzic, I've written a follow up article that talks more about the data part of the system.

Making Rails Better - Fixing Architecture Flaws in Active Record

Posted by Charlie Sun, 12 Aug 2007 02:19:00 GMT

ActiveRecord is a funny thing. On the outside it looks great - it neatly maps relational data to Ruby objects and provides an easy to use API via its domain specific language. But on the inside, it contains two surprising architecture flaws that make it difficult to extend and negatively impact performance.

The Vietnam of Computer Science

Mapping objects to relational data turns out to be quite tricky. There are so many failed object-relational mapping systems that the whole field has been called the Vietnam of Computer Science. The problem is that objects and tables don't map cleanly to each other, and the more you try to automate the process the more complex your code becomes, and sooner or later your system becomes too hard and too slow to use.

The approach that I think works best, and that ActiveRecord follows, is to keep things relatively simple. A record in a table is mapped to an object, and any related tables are mapped to associations that contain one or more objects (depending on whether the relation is one to one or one to many). And that's it; anything past that risks descending into the morass of failed object-relational mappings.

ActiveRecord gets bonus points because it makes it easy to define such mappings via its Domain Specific Language (DSL) - the familiar methods has_one, has_many, etc.

Botching Columns

But underneath its exterior, ActiveRecord has a couple of architecture flaws in the way that it handles columns (attributes).

The first issue is that ActiveRecord botches its implementation of columns. Reading and writing data from a database requires converting the data from its textual representation (provided by the database's client APIs) to and from Ruby objects. A quick glance at how Rails does it for PostgreSQL shows the problem:

def translate_field_type(field_type)
  case field_type
    when /\[\]$/i  then 'string'
    when /^timestamp/i    then 'datetime'
    when /^real|^money/i  then 'float'
    when /^interval/i     then 'string'
    when /^(?:point|lseg|box|"?path"?|polygon|circle)/i  then 'string'
    when /^bytea/i        then 'binary'
    else field_type       # Pass through standard types.
  end
end

def default_value(value)
  # Boolean types
  return "t" if value =~ /true/i
  return "f" if value =~ /false/i

  # Char/String/Bytea type values
  return $1 if value =~ /^'(.*)'::(bpchar|text|character varying|bytea)$/

  # Numeric values
  return value if value =~ /^-?[0-9]+(\.[0-9]*)?/

  # Fixed dates / times
  return $1 if value =~ /^'(.+)'::(date|timestamp)/

  # Anything else is blank, some user type, or some function
  # and we can't know the value of that, so return nil.
  return nil
end

Having large case statements in an object-oriented language is a sure sign your design is flawed. The fundamental problem is that the implementation above is not extensible - you can't easily add your own field types.

You could argue that extensibility was not a design goal of ActiveRecord, but that would be silly. Even if you agreed that ActiveRecord should only support a few limited data types (which I don't) there are still enough differences between databases that having an extensible system would clean up the internals of ActiveRecord and get rid of the grungy code above.

And more importantly, it would let users add their own data types. And that is important. For example, with MapBuzz we need to support Postgres's geometry types and we would also like to support its full text search types. Overriding Rails to support them is an exercise in annoyance since it requires overriding various core methods in the Postgresql adapter.

The way this should have been implemented is by introducing a Column object. The column object's API would be simple - it would have a serialize and a deserialize method. Note that ActiveRecord does indeed have a column object, but its implementation is very weird. For example:

def klass
  case type
    when :integer       then Fixnum
    when :float         then Float
    when :decimal       then BigDecimal
    when :datetime      then Time
    when :date          then Date
    when :timestamp     then Time
    when :time          then Time
    when :text, :string then String
    when :binary        then String
    when :boolean       then Object
  end
end

This code is clearly trying to be much too clever. Keep it simple, stupid! There should be a TimeColumn, FloatColumn, etc. That way, a developer can add their own column types - so for us a GeomColumn.
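
Here is roughly what I have in mind, as a plain Ruby sketch rather than a patch against ActiveRecord. Every class below is hypothetical. Which direction counts as "serialize" is my assumption for this example: serialize turns a Ruby object into database text and deserialize goes the other way. The GeomColumn only handles a point written as "(x,y)" to keep things short.

```ruby
require "time"

# Hypothetical base class: one subclass per database type.
class Column
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Ruby object -> text sent to the database (assumed direction).
  def serialize(value)
    value.to_s
  end

  # Text from the database -> Ruby object.
  def deserialize(text)
    text
  end
end

class FloatColumn < Column
  def deserialize(text)
    Float(text)
  end
end

class TimeColumn < Column
  def serialize(value)
    value.getutc.iso8601
  end

  def deserialize(text)
    Time.iso8601(text)
  end
end

# A user-defined type, added without touching any case statement.
Point = Struct.new(:x, :y)

class GeomColumn < Column
  def serialize(point)
    "(#{point.x},#{point.y})"
  end

  def deserialize(text)
    x, y = text.delete("()").split(",").map { |n| Float(n) }
    Point.new(x, y)
  end
end
```

Adding a new type is then a matter of writing one small class, and database adapters could supply their own subclasses where their text formats differ.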

Attributes

The second issue, which is related, is the way that column values are handled. ActiveRecord stores data read from a database in a hash table called attributes. But surprisingly, the attributes hash table is also used to store Ruby objects. Thus the data stored in the attributes hash table may either be a Ruby object (in serialized format) or the text returned from the database (unserialized).

This is a horrible design for two main reasons.

First, it means that every time an attribute is accessed there has to be code to check whether it's in string format or not. If it is, the data must be converted to Ruby, which causes a performance hit.

Second, it means that ActiveRecord cannot keep track of which attributes have changed and which have not. That's important, because it means that ActiveRecord updates every column even when just one column changes. Besides being a performance hit, it means that ActiveRecord will corrupt your database if you are not careful. That happens when a table contains a column type that ActiveRecord is not familiar with - a good example being a tsvector field in PostgreSQL. ActiveRecord will attempt to update it using the wrong value although the column hasn't changed at all.

So what's a better solution? A pure object-oriented solution would introduce a Field object, which has four fields - the raw value (from the database), the serialized value (the Ruby object), a reference to the column object which knows how to serialize/deserialize the field, and a changed flag.

But that's pretty heavy-weight since you're introducing an extra object per field per record. An alternate solution would be to introduce three hash tables per record - one to hold the raw values, one to hold the serialized values and one to hold the changed flags. You would also want to store references to the record's columns, most likely on the class itself (so if your table is called parents, then store the column information on the Parent class).
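
A minimal sketch of the three-hash approach, in plain Ruby. None of this is ActiveRecord code: the Record class, its method names and the idea of passing converters in as lambdas are all assumptions made to keep the example self-contained (in a real design the converters would come from column objects stored on the model class).

```ruby
# Hypothetical record that keeps raw text, converted objects and changed
# flags in three separate hashes.
class Record
  # raw:        column name => text as returned by the database
  # converters: column name => callable turning that text into a Ruby object
  def initialize(raw, converters = {})
    @raw = raw
    @converters = converters
    @values = {}   # converted Ruby objects, filled lazily
    @changed = {}  # column name => true once written
  end

  def read(name)
    return @values[name] if @values.key?(name)

    text = @raw[name]
    converter = @converters.fetch(name, ->(t) { t })
    # Convert once and cache, so later reads skip the conversion.
    @values[name] = text.nil? ? nil : converter.call(text)
  end

  def write(name, value)
    @values[name] = value
    @changed[name] = true
  end

  # Only these columns would need to appear in an UPDATE statement.
  def changed_columns
    @changed.keys
  end
end
```

With this layout a read never has to guess whether a value is still text, and a save can leave untouched columns (including types the library doesn't understand) out of the update entirely.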

Fixing Active Record

The good news is that the Rails team is looking at these issues. In particular, Michael Koziarski has recently posted a patch that introduces the concept of a separate hash table to store serialized values. So check out the patch, and be sure to offer Michael your comments!

Making Rails Better - Content Negotiation

Posted by Charlie Mon, 06 Aug 2007 21:06:00 GMT

Over a year ago, I released a Rails 1.0 plugin that added support for content negotiation. Conceptually it was simple - map the HTTP Accept header to template extensions.

For example, assume you have an article controller. A client may wish to GET the articles in various formats, including HTML, XHTML, RSS or ATOM. Thus your views would be:

  • article.rhtml
  • article.ratom
  • article.rjson

The extension is based on the mime type - not the template type. So article.ratom may be an ERB template, a HAML template, a builder template, etc. When the plugin compiles the template it is smart enough to tell what type of template it is, and acts accordingly.

We've found this solution works extremely well for MapBuzz, so we're releasing it as a Rails 1.2 plugin to encourage discussion in the community and hopefully influence its direction. And if you need to support XHTML with your Rails application you're in luck - the plugin has XHTML support baked in as explained below.

Rails Offers an Unsatisfying Solution

Now you might be thinking to yourself that Rails 1.1 solved this issue. Rails 1.1 did indeed add support for content negotiation by honoring the HTTP Accept header, adding a new format parameter and implementing a new controller method respond_to.

However, I think the implementation leaves much to be desired. Let's take a look at an example:

class ArticleController < ApplicationController
  def get
    @articles = Article.find(:all)
    respond_to do |wants|
      wants.html
      wants.xml { render :xml => @articles.to_xml }
    end
  end  
end

The first problem is the implementation's verbosity. You have to add the same boilerplate code for each method in each controller that supports multiple mime types (for MapBuzz that is almost all of them). For opinionated software, this seems like a strange oversight and I've always found it jarring.

The second problem is that the implementation mixes view logic into a controller. Why should a controller have knowledge, or care, about how its results are rendered? I can't see any good reason for it.

On the bright side, it looks like Rails 2 will change this implementation a bit. I've recently noticed some blog posts that mention the preferred template naming convention has changed to include both a mime type and template type, thus something like this - article.html.erb. Hopefully that means article.atom.erb, article.rss.xml, etc. will also work but I haven't checked.

Will these changes make our plugin obsolete? I certainly hope so, but I haven't had the time to dig into Rails edge to see for sure.

Why Content Negotiation?

Before diving into the plugin, you may be wondering why bother - isn't it generally accepted that content negotiation is a failure? In the "old" Web I'd agree - and much of the blame has to fall on IE 6 for its use of this HTTP Accept header:

Accept: */*

Hmm, thanks Microsoft, very helpful.

But in the world of Ajax, things have changed. XMLHttpRequest lets you set HTTP headers, so a client can specify exactly what type of response it wants. Sometimes Atom is the best choice, sometimes JSON is and other times good old HTML fits the bill. Whichever you choose, when you create an Ajax-based website you control how the HTTP Accept header will be set by the client, and therefore content negotiation all of a sudden becomes interesting again.

Using The Plugin

The plugin makes the simple case easy - rename your views based on their mime type:

  • article.rhtml
  • article.ratom
  • article.rjson

Partials also work the same way:

  • _article.rhtml
  • _article.ratom
  • _article.rjson

And if you are using a layout, then the same drill applies:

  • layout.rhtml
  • layout.ratom
  • layout.rjson

The plugin also supports mixing mime types. For example, you may wish to return an XHTML document that includes embedded SVG. To do that, in your enclosing .rhtml template you would include this line:

<%= render(:file => 'article/get.rsvg') %>

By specifying the extension, .rsvg, you've alerted the plugin that you want to change the current mime type to SVG. Any templates or partials that get.rsvg in turn calls will be assumed to have an .rsvg extension unless you specifically override it again. Once get.rsvg is finished rendering, the current mime type will revert back to XHTML.
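
The switch-and-revert behaviour is essentially a stack. Here is a small sketch of that idea in plain Ruby; MimeTypeStack and its methods are hypothetical names, not the plugin's actual internals.

```ruby
# Hypothetical tracker for the "current" mime type during rendering.
class MimeTypeStack
  def initialize(initial)
    @stack = [initial]
  end

  def current
    @stack.last
  end

  # Switch to another mime type for the duration of the block, then
  # revert, even if rendering inside the block raises.
  def with(mime_type)
    @stack.push(mime_type)
    yield
  ensure
    @stack.pop
  end
end
```

Rendering get.rsvg would wrap its work in something like stack.with(:svg) { ... }, so any nested partials see SVG as the current type and the enclosing template gets XHTML back afterwards.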

The Plugin's Inner Workings

Now let's look at how the plugin works - it uses this algorithm to render the first template in a given request:

  1. Get a list of potential mime types:
    • If the request includes a format parameter use it.
    • Otherwise, create an array of mime types based on the HTTP accept header. Then modify the array by:
      • Prefer XHTML over HTML (see below).
      • If the client supports Atom, then make sure both the Atom Feed format and Atom Entry format are included in the list.
      • If the Accept header includes */* (or various derivatives seen out in the wild such as *.*, *, etc.), replace it with HTML, ATOM and JSON.
  2. Loop over the list of mime types and search for a template with the correct extension. For example, if the list of mime types is HTML, ATOM and JSON, then the plugin will look for a template with an extension of .rhtml, .ratom or .rjson in that order.
  3. If a template is found, save the current mime type onto a stack. If a template is not found, raise an exception.
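
The steps above can be sketched in plain Ruby. This is a simplified illustration, not the plugin's code: the method names and the extension table are assumptions, q-values in the Accept header are ignored, and the format parameter and the Atom Feed/Entry handling are left out.

```ruby
XHTML = "application/xhtml+xml"
HTML  = "text/html"

# Hypothetical mapping from mime type to template extensions, in lookup order.
EXTENSIONS = {
  XHTML                  => %w[rxhtml rhtml],
  HTML                   => %w[rhtml],
  "application/atom+xml" => %w[ratom],
  "application/json"     => %w[rjson]
}.freeze

WILDCARDS = ["*/*", "*.*", "*"].freeze
WILDCARD_REPLACEMENT = [HTML, "application/atom+xml", "application/json"].freeze

# Turn an Accept header into an ordered list of candidate mime types.
def candidate_mime_types(accept_header)
  types = accept_header.split(",").map { |part| part.split(";").first.strip }
  # Prefer XHTML over HTML: move XHTML to sit just before HTML.
  if types.include?(XHTML) && types.include?(HTML)
    types.delete(XHTML)
    types.insert(types.index(HTML), XHTML)
  end
  types.flat_map { |t| WILDCARDS.include?(t) ? WILDCARD_REPLACEMENT : [t] }.uniq
end

# Return [mime_type, file_name] for the first template that exists,
# or raise if none of the candidates has a matching template.
def pick_template(action, accept_header, available_files)
  candidate_mime_types(accept_header).each do |mime|
    EXTENSIONS.fetch(mime, []).each do |extension|
      file = "#{action}.#{extension}"
      return [mime, file] if available_files.include?(file)
    end
  end
  raise ArgumentError, "no template for #{action} matching #{accept_header}"
end
```

For example, with article.rhtml and article.ratom on disk, an Accept header of */* would resolve to article.rhtml, while application/atom+xml would resolve to article.ratom.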

Once a mime type is chosen, the plugin will continue to use it for all other templates including partials and layouts. Thus if the current mime type is ATOM and the current template calls a partial called author, the plugin will look for a template named _author.ratom. If it can't find it, it will raise an exception.

There are two special cases. The first one was explained above, you can switch mime types in mid-stream if needed. The second one is when an exception is raised while rendering a template. In that case, the plugin will "forget" the current format and then look for an appropriate template (thus going back to the long algorithm above).

XHTML Support

Another benefit of the plugin is that it adds full XHTML support to Rails. I've previously blogged about Rails' utter disregard for XHTML, but for MapBuzz it's crucial because we have to embed SVG in XHTML files.

So if a browser says it supports XHTML via the Accept header (pretty much any browser other than IE), then the plugin will automatically select XHTML over HTML and set the content type to application/xhtml+xml.

However, there are a couple of twists. First, XHTML is mapped to templates with an extension of either .rxhtml or .rhtml. The reason for reusing .rhtml templates is to avoid duplicating them and violating the DRY principle. So make sure that your .rhtml files are valid XHTML. If they are not, you'll see parse errors in Firefox/Opera/Safari, since they will be applying XHTML's strict syntax rules.

Second, you have to make sure to specify the correct doctype. The way we solve this is by having our layout.rhtml file call a partial called doctype. There are two versions of this partial - _doctype.rhtml and _doctype.rxhtml.

_doctype.rhtml looks like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>

While _doctype.rxhtml looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:xhtml="http://www.w3.org/1999/xhtml" xml:lang="en">

The plugin will pick the correct one based on the current mime type.
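To make the selection concrete, here is a minimal sketch of how XHTML might be chosen over HTML and how that choice could drive the template lookup. It is not the plugin's code: html_content_type and template_extensions_for are hypothetical names, q-values in the Accept header are simply ignored, and trying .rxhtml before .rhtml is my assumption based on the doctype example above.

```ruby
XHTML_MIME = 'application/xhtml+xml'
HTML_MIME  = 'text/html'

# Returns the content type to send: XHTML when the browser advertises it in
# its Accept header, otherwise plain HTML. Parameters such as ";q=0.9" are
# stripped and otherwise ignored in this sketch.
def html_content_type(accept_header)
  accepted = accept_header.to_s.split(',').map do |entry|
    entry.split(';').first.to_s.strip.downcase
  end
  accepted.include?(XHTML_MIME) ? XHTML_MIME : HTML_MIME
end

# Candidate template extensions, in lookup order, for a content type.
# For XHTML, .rxhtml is tried first and .rhtml is the shared fallback.
def template_extensions_for(content_type)
  content_type == XHTML_MIME ? %w[rxhtml rhtml] : %w[rhtml]
end

firefox = 'text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,*/*;q=0.5'
content_type = html_content_type(firefox)
extensions = template_extensions_for(content_type)
# content_type is "application/xhtml+xml", extensions is ["rxhtml", "rhtml"]
```

With this ordering, a request from an XHTML-capable browser picks up _doctype.rxhtml for the doctype partial but still falls back to the ordinary .rhtml files for everything else.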

Finally, the plugin overrides ActionView::Helpers::TagHelper so that, if the current content type is XHTML, it correctly ends empty tags with />, as required by XHTML/XML syntax rules.
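The idea behind the empty tag fix is simple enough to show in a few lines. This is a standalone sketch, not the actual override: empty_tag is a hypothetical helper, and the xhtml flag stands in for checking the current content type.

```ruby
require 'erb'

# Renders an empty element. When `xhtml` is true the tag is self-closed
# with " />", as XML requires; otherwise it is left open, HTML 4 style.
def empty_tag(name, attributes = {}, xhtml = false)
  attrs = attributes.map do |key, value|
    " #{key}=\"#{ERB::Util.html_escape(value.to_s)}\""
  end.join
  "<#{name}#{attrs}#{xhtml ? ' />' : '>'}"
end

empty_tag('br', {}, true)                # => "<br />"
empty_tag('img', { 'src' => 'map.png' }) # => "<img src=\"map.png\">"
```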

Wrapping Up

As the Rails core team is fond of saying, Rails is opinionated software. In my opinion, mapping mime types to template extensions is a big win:

  • It eliminates boring, boilerplate code
  • It more cleanly separates controllers from views
  • It makes it easy to add a new format to a controller action (just drop in a new template)

But of course the real test is how well the plugin works on a production website. In our experience it works great - so give it a try and let us know what you think!


On Writing and Programming

Posted by Charlie Thu, 02 Aug 2007 17:29:00 GMT

I was planning to write about the blub paradox, a term coined by Paul Graham in an essay that argued that programmers can't conceptualize programming languages more powerful than the ones they already know. Or paraphrased, you get stuck in a certain way of thinking and can't see past it.

Unfortunately, I made the mistake of rereading his essay to verify that I remembered it correctly. Bad move. Several hours and several essays later, I ended up being impressed once again by how well Paul (or should that be Mr. Graham?) writes and thinks.

I hate to admit it, but I also had a more primal reaction - to shutter up this blog, purge Google's caches and hope that no one who has ever visited has heard of Paul Graham, let alone read anything he's written. But sadly, that cat is already out of the bag.

Working at a Snail's Pace

But it did make me focus on an issue that's been percolating in the back of my mind - how long does it take Paul Graham, or any grade A blogger, to write their posts?

Why do I care? Because it takes me a significant amount of time to write anything and I'm curious how that compares to other people (I think the time is well spent, but that's a topic for another post).

If you're feeling obstreperous, you might point out that it's an impossible question to answer. To really know, you'd have to compare the time it takes for two people to write equivalent posts. But of course there is no obvious way to define equivalency. What is your yardstick? Number of words? Please. Relevant content? Popularity? Who knows.

The Realm of Masters

I think we can look to the programming world for guidance. It is generally accepted that the best programmers are an order of magnitude more productive than average ones. Put another way - one great programmer replaces ten average ones. In the abstract it sounds impressive. But when you see it in real life, it's truly amazing.

I have an alarming suspicion that writing is the same. I don't have any proof to back it up, and a couple of minutes of googling failed to turn up any studies that compared the productivity of different writers (I assume such studies exist, does anyone have some links?).

But I think writing and programming are very similar. Both are creative endeavors that force you to carefully formulate your thoughts so that you can translate them from your mind into a language (English or Java) that is then read by a person or computer.

The tiniest details matter - every word, every line, every paragraph, every block. I used to argue that point with my high school English teacher, Mr. Kelleher - I didn't believe it. In particular, I remember arguing over Faulkner's As I Lay Dying, triggered by the famous chapter "My mother is a fish."

At the same time, the biggest details matter - without a clear organization your story or program soon dissolves into a mass of unintelligible gibberish. Thus writing, and programming, work from the bottom up and from the top down.

Both require raw talent. Without it, no amount of practice will help. You might want to be the next Tiger Woods, but even if you practice golf every day for the rest of your life, you likely aren't going to come close.

That's not to say practice isn't vital. It is. It helps you improve your skills and gives you an edge over people with similar talent. Without it, you'll squander your raw talent and achieve nothing. But sadly, practice alone won't let you beat those with much greater talent.

Trying to Be a Good Writer

You may have noticed my choice of words above - an "alarming" suspicion. Why alarming? Because if my analogy is correct, then I fall squarely into the average category. To be a master writer, not only would I have to write interesting things (à la Paul Graham) but I'd also have to produce them with much greater frequency than the average writer.

I actually do consider myself a good writer (then again, what else am I going to say on my blog?) although of course by now you've come to your own conclusion. Back in my younger days, I put writing down as one of my skills when preparing for my first job performance review at Environ, an environmental consulting firm I worked at. I figured that might be a bit boastful - but little did I know that I had doomed myself to a 30-minute dressing-down as to why I was wrong. And looking back, rightly so. Environ's business was, and still is, creating impartial documents that evaluate the health and environmental risks posed by chemicals, food packaging, Superfund sites, industrial sites, etc. These documents are required by local, state and federal regulatory agencies and can make or break business deals. Clients pay tens of thousands of dollars for them - so they have to be clear, fair and defensible in court.

Since then, my writing has gotten better (practice, practice, practice) and I've gained a more nuanced view of my abilities. My real strength is editing - I can almost always improve drafts that people send me. And I'm good at creating/editing technical and scientific documents. But after that my skills start to fall off. In particular, and as I mentioned above, my Achilles' heel is my writing speed.

Program Like You Write

I program like I write. Which means I figure out roughly what I want to achieve in my head or on a whiteboard, and then I create a first cut of it. That inevitably exposes parts of the problem that I didn't see, which in turn leads to major revisions in my proposed solution. And then I repeat the process over and over until I'm happy with my final solution.

It's a ghastly, inefficient way of doing anything, but that's how my mind works and I've come to accept it. But it doesn't work as well for writing as for programming, because I have a harder time figuring out what I want to write. In fact, when I'm suffering writer's block (I rarely suffer programmer's block, although it happens), the only solution I've found is to just start writing something - anything. After a few hours, or sometimes days, something will emerge out of my muddled thinking. The end result is that writing requires more iterations because I'm less able to see the solution before starting.

I wonder if other people program like they write? Is the same part of the brain involved in these two creative processes? Or do I have two parts of my brain that like to approach problems in the same way? Or is it some combination of the above? I'd love to hear other people's experiences.

The Final Score

To finish, let's tally up the final score. This was a particularly painful post to write - it took roughly 7 hours over the course of a week. Why did it take so long? Because it took 3 major revisions to figure out what I wanted to say. And even now, I'm not particularly happy with it. But I think it's good enough to post, and there are other more pressing matters at hand.


Ex-Smallworld Bloggers - Brian and Robert Join the Fun

Posted by Charlie Thu, 26 Jul 2007 05:04:00 GMT

It's fun watching your friends start blogging.

I've written a few times about the work I did at GE/Smallworld developing the Smallworld Internet Application Server (SIAS). We started off with three people, Peter, Robert and myself, in early 2000, and released the first version in September. Three major releases quickly followed in succession, roughly once every 10 months. At the same time, the team rapidly grew to over 10 people. It was quite a couple of years.

Peter was the first to start blogging, and he finally goaded me into it a bit over a year ago. More recently Robert joined the fun, and last week Brian (the fourth member on the team) did too. Which makes four ex-SIAS bloggers. Which, as far as I know, is more than the combined number of other ex-Smallworld/GE bloggers (at least in the US). Looks like we were on the right team!

Of course teams only function as well as the team members work together. We had a good mix. Peter was the purist (we must do xyz this way - anything else is wrong!), Robert was the doer (stop screwing around on the whiteboard and let's get this thing built) and I was the one who had the joy of making sure we delivered something that worked on time.

When Brian joined, he assumed responsibility for our HTML client - including the design and implementation. Looking back, the client was way ahead of its time. You could drag maps around (although not nearly as nicely as in Google Maps), select features, perform queries, etc., all the while supporting Navigator 4.5 and IE 5 and 5.5. The JavaScript was advanced even for today. To give you a feel, we had a framework for communicating back to the server using the strategy pattern: one strategy for a regular HTTP form request, one for using an IFrame and one for using XMLHttpRequest.

It's amazing how much has changed since then. IE 6 was gold back then - it was superior in every way to the Netscape products. And Mozilla was deep in the throes of its winter. However, we picked up Mozilla as soon as we could, I think at the 0.7 release (long before Firefox existed), although everyone thought we were nuts.

So welcome to Robert and Brian!

FYI, the other Smallworld bloggers I know about are:

Peter Batty - as I mentioned earlier

Carl Myhill - by far the most prolific - and clearly having the most fun - check out his Pacific Coast Trail pictures

Alfred Sawatzky - A great source of Magik/Smallworld programming tips

Brad Sileo - Brad and Chrissy have a great family blog - I particularly like the excellent youth hockey pictures

Derek Knight - Long ago and far away, Derek was part of the Core Smallworld development team and implemented COM support amongst other things. Derek just started blogging a few months ago, and it looks like he's focusing on Vista and local New Zealand goings-on.

I'm sure there are many others that I don't know about. Drop me a line and I'll add you to the list.


End the Endpoint Madness

Posted by Charlie Fri, 20 Jul 2007 16:20:00 GMT

Sean's been writing about why the Open Geospatial Consortium (OGC) standards aren't RESTful. Since the OGC seems to be my favorite punching bag these days, I thought I'd jump right into the fun.

Everything that's wrong with the WxS suite (that's a fancy acronym for Web Map Server, Web Feature Server, Web Context Server, etc.) boils down to one thing - they are based on the fundamentally flawed concept of service endpoints. A service endpoint is a program sitting on the network that defines its own API. To make this more concrete, let's follow Sean's lead and pick on the Web Feature Service (WFS) standard. As Sean showed, a WFS service understands URIs like this:

/wfs?REQUEST=GetCapabilities
/wfs?REQUEST=DescribeFeatureType&typename=places
/wfs?REQUEST=GetFeature&typename=places

Do you see the API? It's hidden in the REQUEST parameter. A WFS service endpoint understands requests such as GetCapabilities, DescribeFeatureType, GetFeature, etc.

Is this important? You bet. It violates every tenet of REST as implemented on the Web.

The Tenets of REST

So what are these REST tenets? Glad you asked:

  • Everything of interest is a resource, and every resource is identified with a URI (in English - you can get to everything important on the web with a link)
  • Resources are manipulated, via their URIs, using a limited, uniform interface (in English - use HTTP's verbs dummy!)
  • Clients interact with resources via representations, which summarize the state of a resource at any given time and often provide links to other states (in English - when you go to a website you get back a web page that contains links to other web pages).

So how do OGC service endpoints get this wrong?

First, they lock away their resources behind big, thick stone walls, pretending they don't even exist. Any access to the resources must go through the service endpoint's API. Heaven forbid the resources escape out onto the web, letting anyone link to them, bookmark them, query their state, etc.

Second, they haughtily impose their custom APIs over HTTP's standardized verbs. Clearly what works every day for almost a billion web users couldn't possibly work for viewing a few features.

Third, their representations are barren of links. Instead of a thriving neighborhood of interconnected plazas and quaint streets, they are abandoned, dark, scary cul-de-sacs.

Time for a Rethink

Fixing OGC web services is easy - once you have the right mind set. Just a few changes will lead to vast improvements:

  1. Service endpoints are anti-patterns. Abolish them.
  2. Everything gets its own URI - feature types, feature collections, features, queries, filters, every attribute on every feature, etc.
  3. Abolish the REQUEST parameter. Anything with a URI (feature types, features, queries, filters, etc.) can only be created, updated and deleted using HTTP POST, PUT and DELETE.
  4. Representations, besides images, should overflow with links. GML includes XLink for a reason - use it.
  5. Never, ever, use POST to do a query. I know the argument - a complicated query is well nigh impossible to squeeze into a URI. Fine, I agree. Instead of grossly violating REST, turn the query into its own resource complete with a URI. Then your query becomes something like http://www.mywf.com?search=http://www.saved.queries/nearby_ice_cream_stores.
  6. Errors must be reported using the appropriate HTTP status codes. And absolutely do not invent new, random mime types to report errors like in WMS.
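To give a feel for points 2 and 3, here is a rough sketch that translates the three WFS endpoint URIs shown earlier into an HTTP verb plus a resource URI. The resource paths (/feature_types and so on) and the names RESTFUL_EQUIVALENTS and restful_equivalent are purely hypothetical; nothing like them is defined by the OGC or by this post, they are just one plausible layout.

```ruby
require 'uri'

# Hypothetical resource-oriented equivalents of the WFS REQUEST operations.
# Each entry takes the parsed query parameters and returns [verb, path].
RESTFUL_EQUIVALENTS = {
  'GetCapabilities'     => ->(_params) { [:get, '/feature_types'] },
  'DescribeFeatureType' => ->(params) { [:get, "/feature_types/#{params['typename']}"] },
  'GetFeature'          => ->(params) { [:get, "/feature_types/#{params['typename']}/features"] }
}.freeze

# Translates an endpoint-style URI into an HTTP verb and a resource URI.
# Raises KeyError for REQUEST values this sketch doesn't know about.
def restful_equivalent(endpoint_uri)
  query = URI.parse(endpoint_uri).query.to_s
  params = URI.decode_www_form(query).to_h
  RESTFUL_EQUIVALENTS.fetch(params['REQUEST']).call(params)
end

restful_equivalent('/wfs?REQUEST=GetFeature&typename=places')
# => [:get, "/feature_types/places/features"]
```

Once the operation name disappears from the query string, each feature type and feature collection has a URI of its own that can be linked to, bookmarked and cached like any other web page.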

To do this right, go get a copy of the Atom Publishing Protocol and read it. And then read it again. When it talks about feeds think feature collections. When it talks about entries, think features. And when it talks about anything else, think twice, and then a third time, before doing anything different.


OGC Takes Today's WTF

Posted by Charlie Fri, 20 Jul 2007 07:34:00 GMT

Today's WTF is provided courtesy of the Open Geospatial Consortium (OGC):

At last week's OGC Technical Committee meetings, there was considerable discussion related to developing standard guidance (and rules) for implementing SOAP/WSDL and RESTful bindings for OGC Web Service interface standards. The result of this activity will be a new OGC document that defines a consistent method for using SOAP/WSDL with current and future OGC web service standard.

Ah - takes me back to the good old days of my misguided youth. At least I had the sense to argue that SOAP RPC style web services were stupid (you know a spec is confused when it supports both an RPC style and a literal style). But instead of spilling any more ink on this tired old debate, I'll just ask a question. Can anyone point out a successful SOAP/WSDL service?

A Glimmer of Light

But at least there is hope. While one committee frivolously wastes time and money writing specs no one will ever use, a new subcommittee gets to go battle with the REST dragons:

At the same time, the members felt that REST is also a highly important method. Therefore, we are forming a new OGC Sub-committee to develop similar guidance and recommendations for implementing OGC standards in a RESTful environment. They are to provide an initial report by the Boulder meetings this September. Will be interesting to see what results and what the potential impacts are to the existing OGC web service standards.

I wait with bated breath.

