I know, I know. Not another article about REST – hasn’t this horse been flogged enough? Perhaps – but recently I’ve been watching the GIS industry finally discover REST, and unsurprisingly there is a bit of misunderstanding and a bit of resistance. Kudos to Sean, and others, for their efforts in spreading the word.
But it raises an interesting question – why is it so difficult for most people, including myself, to understand REST at first? It’s not technical complexity – REST is significantly simpler than CORBA, or COM, or just about any programming language. And it’s not for lack of an example – the web is one of the most successful distributed computer systems ever built.
Instead, I think it’s because we are looking at the problem the wrong way. To borrow a metaphor from physics, we are using the wrong frame of reference. I finally figured out how important using the correct frame of reference is during a junior year thermodynamics class (my favorite engineering subject, I have to say). Our task was to calculate various properties (temperature, wind speed, etc.) in front of and behind a shock wave. If you try to do it using a stationary frame of reference, the math is well nigh impossible. But if you change your frame of reference to the shock wave itself, then the math falls out nicely and you can easily solve the problem.
So instead of focusing on the nitty-gritty of REST, including URIs, resources, HTTP verbs, etc., let’s take a look at the big picture.
The Service Oriented Architecture (SOA) Frame of Reference
Let’s start by boiling a program down to its most fundamental parts:
A client uses a service to access or process some data. Note I am using a very broad definition of client and service – this applies to an object calling another object in a program, a program calling a function in a DLL, a browser accessing a web site, etc.
When learning to program, we are taught that the most important parts of a system are the services and the APIs they provide.
In fact, a large part of object oriented design is teasing out the right objects (which are services under my broad definition) and designing their APIs. Design patterns are no more than precanned services that have been discovered via trial and error.
Notice three fundamental characteristics of this approach:
- It’s a design goal to hide data from clients via encapsulation
- Every service has its own API
- Clients are tightly bound to services – if the service API changes the client also has to change
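To make these three characteristics concrete, here is a minimal sketch in Python. The two services, their method names and their data are all hypothetical, invented purely for illustration; the point is only that each service hides its data behind its own API and each client is written against one particular API.

```python
class InventoryService:
    """Hypothetical service: stock levels are hidden behind a bespoke API."""

    def __init__(self):
        self._stock = {"widget": 3}  # encapsulated data, invisible to clients

    def units_in_stock(self, sku):
        return self._stock.get(sku, 0)


class WeatherService:
    """A second hypothetical service whose API shares nothing with the first."""

    def __init__(self):
        self._readings = {"Denver": 21.5}

    def current_temperature(self, city):
        return self._readings[city]


def inventory_client(service):
    # Bound to the method name units_in_stock; rename it and this client breaks.
    return service.units_in_stock("widget")


def weather_client(service):
    # A different service needs a different client written against its API.
    return service.current_temperature("Denver")
```

Handing a WeatherService to inventory_client fails with an AttributeError, which is the tight binding in miniature: every new service means new client code, and every API change ripples out to its clients.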
How well does the SOA approach work in the real world? Object oriented design has become the accepted approach for building large scale applications. It works to a degree – although the failure rate for creating large applications makes one wonder if there is a better way.
It also works, albeit barely, for creating in-process libraries (shared libraries or DLLs). The barely bit has to do with the difficulties of versioning service APIs, as witnessed by DLL hell on the Windows platform.
However, this approach fails when building distributed systems. It fails because clients are too tightly bound to services, because it ignores the unreliability of networks, and because the inherent state held by services keeps it from scaling. Take a look around – when was the last time you saw a large distributed computer system working across multiple organizational boundaries based on COM, CORBA, RMI, SOAP web services, etc.?
The Resource Oriented Architecture (ROA) Frame of Reference
Now let’s change our frame of reference to ROA. In ROA, the most important part of the system is the data.
Anything of interest gets a globally unique identifier, which is done via URIs.
Next, the service becomes uninteresting because every service has the exact same API. On the Web, the service API is defined by the HTTP verbs GET, POST, PUT, DELETE, etc.
So what are the benefits of these changes?
- Since all interesting data has a globally unique identifier, it becomes possible to create a web of links
- Since there is only one service API, it becomes possible to create a single client (think of a browser) that can access an infinite number of services (think of websites)
- Versioning issues between clients and services are greatly reduced (although they still happen when HTTP changes)
- State can be taken out of the system and made into its own resource, thereby making the system much more scalable
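The uniform interface behind these benefits can be sketched in a few lines of Python. This is a toy, not real HTTP: ResourceStore, fetch and the two URIs are hypothetical names chosen for illustration, and the status codes only loosely mirror what a real server would return.

```python
class ResourceStore:
    """Hypothetical in-memory stand-in for a web server."""

    def __init__(self):
        self._resources = {}  # URI -> representation

    def handle(self, verb, uri, body=None):
        """Apply one uniform verb to the resource named by uri."""
        if verb == "GET":
            if uri in self._resources:
                return 200, self._resources[uri]
            return 404, None
        if verb == "PUT":
            created = uri not in self._resources
            self._resources[uri] = body
            return (201 if created else 204), None
        if verb == "DELETE":
            if uri in self._resources:
                del self._resources[uri]
                return 204, None
            return 404, None
        return 405, None  # verb outside this sketch's uniform interface


def fetch(store, uri):
    """One generic client: it works for any resource, whatever the data is."""
    status, body = store.handle("GET", uri)
    return body if status == 200 else None


store = ResourceStore()
store.handle("PUT", "/inventory/widget", {"units": 3})
store.handle("PUT", "/weather/denver", {"temperature_c": 21.5})
```

The same fetch function retrieves both /inventory/widget and /weather/denver without knowing anything about inventory or weather. Adding a new kind of data means adding new URIs, not new verbs or new client code.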
How does interoperability happen with ROA given that the service API is always the same? It’s done through globally agreed upon data formats, known as representations. This might seem to be an impossible task, but in practice it has worked out quite well:
- xhtml/html/css are used for content, layout and presentation
- png/jpeg/gif are used for raster images
- atom/rss are used to model collections of items
- pdf is used to keep documents looking the same online as they do in print
- mp3 and others are used for audio data
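As a rough illustration of one resource being served in more than one agreed-upon format, here is a small Python sketch. The represent function and the widget data are hypothetical, and JSON is my own addition to the formats listed above; the media type names application/json and text/html are the standard ones.

```python
import html
import json


def represent(resource, media_type):
    """Render one resource (a flat dict here) in a widely agreed format."""
    if media_type == "application/json":
        return json.dumps(resource, sort_keys=True)
    if media_type == "text/html":
        items = "".join(
            f"<li>{html.escape(str(key))}: {html.escape(str(value))}</li>"
            for key, value in sorted(resource.items())
        )
        return f"<ul>{items}</ul>"
    raise ValueError(f"no representation available for {media_type}")


widget = {"name": "widget", "units": 3}
as_json = represent(widget, "application/json")
as_html = represent(widget, "text/html")
```

The client and the service never agree on a custom API. They only need to agree on the format, so any client that understands HTML or JSON can consume the resource.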
How well does the ROA approach work in the real world? Well, the web is one of the most successful distributed computer systems ever built, so I would say quite well.
I’m curious to see if understanding the different frames of reference used by SOA and ROA helps clear up some of the mystery surrounding REST – so I’d love to hear your feedback.
For a similar take on the differences between SOA and ROA, there are a number of excellent articles including ones by Stuart Charlton, Alex Bunardzic, Stefan Tilkov and Sanjiva Weerawarana.
And for a more in-depth look at ROA, read the W3C’s Architecture of the Web document and O’Reilly’s new book RESTful Web Services.
Update: In response to a post by Alex Bunardzic, I’ve written a follow-up article that talks more about the data part of the system.