Peter and Tim have been blogging about one of the least understood parts of REST - how state is handled. Tim puts it nicely:
The essence of REST is to make the states of the protocol explicit and addressible by URIs.
Unfortunately, the official REST terminology obscures this fact by using the indecipherable phrase "Hypermedia as the Engine of Application State."
Whatever it is called, having states addressable by URIs is one of the best parts of the Web. When you do a search on Google, the state of your interaction with Google is returned in the web page of results you get back. That page contains many other links, each of which moves you to a new state. Thus it's easy to jump from state to state, which leads directly to the magic of the Web.
The same thing is true for automated systems as Peter explains. However, I disagree with part of what Peter writes:
With a truly REST based architecture you are free to change just about anything about the disposition and naming of resources in the system, except for a few well known start point URIs. You can change the URI shapes (for example, if you decide that you really hate the currently scheme). You can relocate resources to different servers (for example, if you need to partition you data). Best of all, you can do those things without asking any ones permission, because clients will not even notice the difference.
If I understand him correctly, Peter is arguing that as long as you provide an unchanging "start" link, you are free to change the embedded links in the result pages. I understand what Peter is getting at: as long as the top URI doesn't change, clients can pick out the next set of URIs and proceed on their way.
However, I still think this is a bad idea. As Tim Berners-Lee famously wrote, Cool URIs don't change. A URI is a contract between you and the world - people expect to be able to bookmark it, write it down on a napkin, tattoo it on their bodies, and come back many months later and have the URI work.
In a scheme like Peter's, that breaks.
Now, for the work Peter is doing that might be ok - perhaps this system is only for internal consumption. And of course a world of never-changing URIs is nothing but an ideal. In the real world companies go out of business, processes are time-limited (like reserving an airline ticket), etc.
But I still think you should try your best to keep your URIs. If they have to change, then at least redirect the old ones to the new ones, or put a page up saying whatever you were doing is no longer valid.
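As a minimal sketch of that advice, assuming hypothetical lookup tables of moved and retired paths (none of these paths come from Peter's or Tim's systems), a server could decide how to answer a request for an old URI roughly like this:

```javascript
// Hypothetical tables: old paths that moved, and paths that were retired.
var MOVED = { '/orders/42': '/customers/7/orders/42' }
var RETIRED = { '/reservations/abc': true }

function has(table, key) {
  return Object.prototype.hasOwnProperty.call(table, key)
}

// Decide how to answer a request for a path that may be out of date.
// Returns null when the path is not a known legacy path.
function resolveLegacyUri(path) {
  if (has(MOVED, path)) {
    // 301 Moved Permanently: clients and bookmarks can update themselves.
    return { status: 301, location: MOVED[path] }
  }
  if (has(RETIRED, path)) {
    // 410 Gone: the resource existed once and is intentionally unavailable.
    return { status: 410, location: null }
  }
  return null
}
```

A real server would send the Location header for the 301 case and a short explanatory page for the 410 case.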
Chris Tweedie posted an interesting reply to my rant about GIS standards and took me to task for "WMS bashing" and why WMS is a useful Enterprise standard. Perhaps - except I clearly stated in my original article that I was only talking about standards used on the web and not in the Enterprise.
Nevertheless, Chris does a good job of pointing out the flaws of tiled mapping systems. It's the usual litany of suspects - they take a lot of disk space, aren't standardized, don't support arbitrary scales and don't support customized styles.
Having spent four years of my life writing a WMS server, the Smallworld Internet Application Server (SIAS), I'm all too familiar with these issues. But in the end it all boils down to a fundamental conflict - map styling versus performance.
The Importance of Style
Even the slightest hint that you're limiting styling options is enough to send customers into a rage. And no, I'm not kidding.
This is important stuff - so important that there are laws in Germany, and probably other countries, saying exactly how maps should be rendered. I remember long, agitated discussions with German customers about how SIAS rendered circles. Our circles weren't perfect; they were slightly off for reasons I no longer remember and in ways no one would ever notice. But it made our system non-compliant under German law, so it had to be fixed.
Styling in Smallworld is totally customizable. It can vary based on any number of things - here is just a small subset:
Object type - interstates are blue, main roads are red
Object attribute - Steel pipes are gray, copper pipes are yellow
Scale - A pipe is green at less than 1:10,000, blue for 1:10,000 and greater
Area - Roads in England are not rendered the same as in the United States
And of course the kicker - users could override the rendering methods themselves and do whatever they pleased.
Smallworld then adds the concept of "drawing applications" (yeah, horrible name), which basically means that an engineer wants to see one set of styles, while a customer service representative wants to see another.
So when building SIAS, one of the absolute requirements - meet it or don't bother building a product - was faithfully reproducing the styles users had set up in their Smallworld databases.
The Importance of Performance
At the same time, producing a beautifully rendered map counts for diddly if users have to wait a minute for it to show up in their browser.
Now Smallworld's rendering system is fast - after fifteen years of optimization it is the fastest GIS rendering system I've ever seen (it used to blow away ESRI, maybe it still does). And it had a clever caching system that made it possible to keep nearby geographic features in memory, making panning and zooming operations lightning fast for desktop clients.
This is really important - one of the things most people don't appreciate is how much data it takes to render a map. For example, say you want to make a detailed map of Manhattan. It will include tens of thousands of street segments, parcels, buildings, etc. All of which have to be fetched from the database. Then you have to look up the styles for each feature, and finally, render the map. So local, in-memory, caching is key - even today.
But this system breaks down on the Intranet or Web. A SIAS server may support tens or hundreds of users - each interested in a different geographic area. Thus you're almost guaranteed to break the cache, resulting in expensive queries back to the database.
One obvious solution is tying clients to servers based on their bounding box, but we never went down that route because we didn't see a way of doing it in a generic manner that would work out of the box.
So supporting arbitrary styles and scales with decent performance requires a lot of hardware. A whole lot of hardware. And that introduces a whole host of other issues - managing the hardware, client sessions, updates, etc. All of which are solvable, but it still takes time to get right.
And the Twain Shall Meet
So is there a middle ground? I think so - if you're willing to throw away arbitrary scales and don't mind using lots of disk space.
From talking to customers over the years, I don't think arbitrary scales are as important as people think. What is important is what features are displayed at different scales and how they are styled. For example, roads should be visible when the scale is less than 1:20,000 and be drawn in red.
In a typical Smallworld setup, users would create roughly 10 of these rules, which were called display scales. Other GIS systems have similar concepts. Thus, these display scales become the basis for your tiled zoom levels - although you would probably want to make sure you have 15 to 20 of them.
So what are the downsides? There are myriad:
Users can't change map styling on the fly (and if you really want this, then you're a power user and should just install a desktop client)
Updates to the database invalidate tiles, so you need a process to determine which tiles have been invalidated and then regenerate them
In versioned databases, like Smallworld, you can really only support one version (unless you have *lots* of disk space and time, a typical Smallworld database would have hundreds or thousands of alternatives).
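The invalidation step in the list above can be sketched as follows. This is only an illustration under assumptions that are not from SIAS or Smallworld: a square world of WORLD_SIZE units per side, split into 2^zoom by 2^zoom tiles at each fixed zoom level, and an update described by its bounding box.

```javascript
// Assumed coordinate space: a square world, WORLD_SIZE units per side.
var WORLD_SIZE = 1024

// Clamp a tile index into the valid range for a zoom level.
function clampIndex(index, tilesPerSide) {
  return Math.max(0, Math.min(tilesPerSide - 1, index))
}

// List every tile, at each given zoom level, that overlaps the changed
// bounding box and therefore needs to be regenerated.
function invalidatedTiles(bbox, zoomLevels) {
  var tiles = []
  for (var i = 0; i < zoomLevels.length; i++) {
    var zoom = zoomLevels[i]
    var tilesPerSide = Math.pow(2, zoom)
    var tileSize = WORLD_SIZE / tilesPerSide
    var minCol = clampIndex(Math.floor(bbox.minX / tileSize), tilesPerSide)
    var maxCol = clampIndex(Math.floor(bbox.maxX / tileSize), tilesPerSide)
    var minRow = clampIndex(Math.floor(bbox.minY / tileSize), tilesPerSide)
    var maxRow = clampIndex(Math.floor(bbox.maxY / tileSize), tilesPerSide)
    for (var col = minCol; col <= maxCol; col++) {
      for (var row = minRow; row <= maxRow; row++) {
        tiles.push({ zoom: zoom, col: col, row: row })
      }
    }
  }
  return tiles
}
```

A background job could feed the returned list to the renderer, so regeneration stays out of the request path.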
And the advantage? You move map rendering out of the main code path. For an Intranet or Web client, I think it's a no-brainer - you have to do it if you want a scalable system.
A Toy Standard
As you might have guessed by now, I think WMS is a fatally flawed standard. It's a "toy" standard - it's great if you have a few users but it's extremely difficult to scale - whether you are on the Web or in the Enterprise.
It's difficult to scale because it doesn't constrain the problem. It imagines a world of instant map rendering where any client can request any bounding box, any scale, any coordinate system and any styling. Such a world does not exist today. Maybe it will in five or ten years, but that doesn't help us now.
The obvious solution is to constrain the problem. If this seems like a horrible thing to do then just think of the Web. There is only one way to address things (URIs), there are only a few actions you can perform (HTTP has a handful of verbs), there is no central authority (and thus you get broken links), etc.
In the map rendering world, the constraints are fairly clear - fixed scales, fixed bounding boxes (i.e., tiles) and fixed styling via pre-defined style groups. If you're willing to make those three simplifications, then you can create a Web Mapping Standard that really works.
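To make the three constraints concrete, here is a rough sketch of how a constrained request could collapse into a cacheable URI. The world size, the /tiles/ path layout and the style group name are all assumptions for illustration, not part of any existing standard.

```javascript
// Assumed coordinate space: a square world, WORLD_SIZE units per side,
// split into 2^zoom x 2^zoom tiles at each fixed zoom level.
var WORLD_SIZE = 1024

// Fixed scales and fixed bounding boxes: map a point to the tile containing it.
function tileFor(x, y, zoom) {
  var tilesPerSide = Math.pow(2, zoom)
  var tileSize = WORLD_SIZE / tilesPerSide
  // Clamp so points on the far edge still fall in the last tile.
  var col = Math.min(tilesPerSide - 1, Math.floor(x / tileSize))
  var row = Math.min(tilesPerSide - 1, Math.floor(y / tileSize))
  return { zoom: zoom, col: col, row: row }
}

// Fixed styling: the style group is just another path segment, so every
// distinct map image has exactly one URI and can be cached like any file.
function tileUri(styleGroup, tile) {
  return '/tiles/' + encodeURIComponent(styleGroup) + '/' +
         tile.zoom + '/' + tile.col + '/' + tile.row + '.png'
}
```

For example, tileUri('engineering', tileFor(300, 700, 2)) yields '/tiles/engineering/2/1/2.png'.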
Let's face facts - the stable of GIS Web standards is suboptimal. To show you what I mean, let's think about the common things you'd want to do with web mapping and see if there is a successful standard that you can leverage.
We'll define success as the widespread use of the standard on the web - there are thousands, or tens of thousands, of working examples (of a web nearing billions of users, that doesn't seem too much to ask). Of course a standard may be wildly successful in another domain, such as Enterprise software, but that doesn't count as the web.
Rendering Maps
Let's start with the most obvious thing - making maps. This is the domain of Web Map Service (WMS) and Styled Layer Descriptor (SLD).
It doesn't take long to figure out why no mainstream mapping sites actually use WMS. Its design makes it difficult to cache results, and unfortunately rendering maps ain't quick or easy - there is a reason Google/Microsoft/Yahoo use cached tiles. The biggest WMS server I know of is Terraserver, and it seems to have kept Microsoft researchers busy happily writing papers about how to scale it.
Instead, everyone seems to have come up with the same approach for web mapping - use pre-rendered tiles that use the Mercator projection. Or, if you have a desktop client (Google Earth, Virtual Earth), then the engineering is difficult enough that you use your own proprietary algorithms/encodings to actually make the system scalable enough to work.
And as far as styling maps - have you actually read the SLD standard or used it? I didn't think so.
So we'll say no success in the standards world here. Which raises an interesting question - is there any room or need for a web map tiling standard? (And the Web Coverage Service, WCS, doesn't count since it supports arbitrary bounding boxes like WMS.)
Adding Points to a Map
To provide some hooks into Google Earth, Google offers KML, which lets developers add custom information to maps. Behind the might of Google, KML has gained a large market share. Since Google has recently submitted KML to the OGC for standardization, we'll call this a standards win.
Specifying Locations
This is the world of GeoRSS, which lets you specify points/lines/polygons. GeoRSS has clawed its way into importance due to its simplicity and, one would assume, developer fatigue with reading the GML spec. GeoRSS is fine for what it does, but this is the most basic level possible, akin to a first-grade reading level. But we'll call it a win.
Sharing Data
This is the world of Web Feature Service (WFS) and Geography Markup Language (GML). I've previously blogged about why I think GML is too complex for the web, and since WFS depends on GML, the same arguments apply. Just to add some spice to the party, WFS adds its own proprietary (is it fair to call a standard proprietary?) XML query language for reasons that have never been obvious to me (XQuery anyone?).
So do you see these standards used on the Web? I sure don't. Instead I see people using RSS and Atom with GeoRSS, or shoehorning feature information into KML. There is even talk of shoehorning GML into KML.
In my view this is the biggest gap in GIS web standards - something, someday is going to fill it in. If you are a standards wonk, and want to make a difference, I'd say start here.
If it was up to me, such a standard would be designed as an Atom extension that provides a super-simple way of including feature property values. And it would use GeoRSS for geometries.
Remembering Context
There is a remarkable amount of state involved when looking at a map. The most obvious thing is where you are looking - but you also have to remember the scale, the layers that are on or off, the projection, the styles in use, etc. This is the world of the Web Map Context (WMC) standard. Since I've never implemented WMC, I don't have an opinion on its technical merits. But going back to our measure of success, is it used on the web? Not as far as I can see.
De Jure versus De Facto?
So standards strike out on map rendering and sharing geographic features, but have succeeded in specifying locations and custom map content. And there is an interesting pattern here - the de jure standards have failed, while the de facto standards have succeeded. Perhaps more about that in another post.
But after the incredible amount of energy and time spent developing these standards, it feels like precious little success.
A couple of weeks ago I blogged that Safari 3 beta's SVG support was buggy.
Here's a concrete example (if you are using Internet Explorer, unfortunately you won't see anything at all). The SVG drawing below is relatively positioned by 50 pixels (if you are not familiar with CSS terminology, that means it is offset 50 pixels down and right of where it normally would be drawn). When you mouse over the blue rectangle, it should turn red. This works correctly in Opera and Firefox.
However, in Safari, the blue rectangle turns red only when you mouse over the yellow rectangle! It sure looks like Safari isn't taking into account the relative offset of the SVG element. Absolutely positioning the SVG element only makes things worse - only a small part of the blue rectangle gets highlighted.
This is obviously a major bug, so we'll report it to Apple and hope it gets fixed in time for the Safari 3 release.
Silly me - I thought gaudy, in-your-face advertising went out with the Dotcom boom. Apparently not.
As I was reading the Denver Post the other day, I watched in horror, and amazement, as a Frontier ad slowly unfolded across the screen completely obscuring what I was reading:
A bit of experimentation shows that the ad rears its ugly face roughly once per fifteen visits to the site. And just to make sure it couldn't be more annoying, the close button on the top right of the ad doesn't work.
Thus you are forced to stare at this inane ad, getting angrier and angrier with each passing second, until it mercifully closes (which takes about 7 seconds, enough for me to get the screen shots).
The end result - I'm about to give up on the Denver Post (and I love reading newspapers) and Frontier airlines. Nice job guys!
Update - Armin has an alternate implementation based on some fancy regular expressions combined with String's split method. It supports most of ERB and avoids all the string mashing I do. Nice!
Templating engines are the most popular way to generate HTML pages and other web content. First popularized by PHP and ASP, templating engines allow you to mix code and content. The templating engine then takes the combined content, extracts the code, runs it, and combines the results with the remaining content to produce the final output.
Since templating engines are generally used to create HTML that is displayed by a browser, they are almost always run on a server. But now that all modern browsers support the DOM, XML and Ajax, it can be helpful to run a templating engine on the client.
Before continuing, remember that JavaScript templates are often not the right solution. Alternatives include generating HTML on a server or, if you are using XML, using XSL on the client or server to generate HTML.
But if you need something simple and light, perhaps to display a JSON result returned by an Ajax request, then JavaScript templates may fit the bill.
Writing the Templating Engine
A quick search on the Internet found a few existing engines, such as JavaScript Templates, Ajax Pages and the Prototype library. However, I found the first two to be a bit heavyweight while Prototype was a bit too simple (it only supports the replacement of values, not the execution of arbitrary statements such as for loops). So I decided to roll my own.
Creating a template engine in JavaScript is remarkably easy due to the power of String's replace method. One of its lesser known features is that you can specify a function to invoke every time a pattern is matched. The match is replaced by the result of the invoked function. Using replace, you can write a template compiler in ten lines of code (and undoubtedly less if you wanted to).
The whole templating engine weighs in at 90 lines, including a helper function copied from the Prototype library. The engine defines two objects - a template object and a parser object. The template object takes a string that includes mixed code and content, invokes the parser to compile the template, and then evaluates and returns the result.
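To give a feel for the approach, here is a minimal sketch of a compiler built on the function form of replace. It is not the engine described in this post: the compileTemplate name is made up, it uses JSON.stringify (which assumes a modern JavaScript runtime) to quote literal text, and it uses the callback only to walk over each tag while building up generated code.

```javascript
// Compile a template containing <% code %> and <%= expression %> tags
// into a function. Call the result with `this` bound to the data object.
function compileTemplate(source) {
  var code = 'var out = [];\n'
  var cursor = 0

  source.replace(/<%(=?)([\s\S]*?)%>/g, function (match, isExpr, js, offset) {
    // Emit the literal text between the previous tag and this one.
    code += 'out.push(' + JSON.stringify(source.slice(cursor, offset)) + ');\n'
    // <%= %> pushes the value of the expression; <% %> runs the code as-is.
    code += isExpr ? 'out.push(' + js + ');\n' : js + '\n'
    cursor = offset + match.length
    return match
  })

  // Emit whatever literal text follows the last tag.
  code += 'out.push(' + JSON.stringify(source.slice(cursor)) + ');\n'
  code += "return out.join('');"
  return new Function(code)
}

var render = compileTemplate(
  '<ul><% for (var i=0; i<this.colors.length; i++) { %>' +
  '<li><%= this.colors[i] %></li><% } %></ul>')
var html = render.call({ colors: ['Red', 'Blue'] })
```

Because the generated function is built with new Function, templates should only come from trusted sources.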
Using the Templating Engine
To see how this works, I've created a simple example that is online. If you look at the HTML code, you'll see:
function replaceContent() {
  var colorsArray = ['Red', 'Green', 'Blue', 'Orange']

  var source = '<p>Here is a list of <%= this.colors.length %> colors:' +
               ' <ul>' +
               ' <% for (var i=0; i<this.colors.length; i++) { %>' +
               ' <li><%= this.colors[i] %></li>' +
               ' <% } %>' +
               ' </ul>' +
               '</p>'

  var template = new JsTemplate.Template(source)
  var content = template.run({colors: colorsArray})
  var element = document.getElementById('content')
  element.innerHTML = content
}
The first thing to notice is that the source variable specifies the mixed code and content. The syntax is similar to ERB, which is a Ruby templating engine. The two recognized tags are:
<% %> Run JavaScript code
<%= %> Replace JavaScript code with the result
To create your own tags, create a new Parser object with an appropriate regular expression.
The second thing to notice is that the data used by the template engine is specified via a parameter to the run method. The parameter should be a JavaScript object. The properties of the object are copied to the template object, thus allowing the template to refer to them via the this keyword. And that is about it (I said it was lightweight!).
The source code of the templating engine is here, while the example is here. Note the code is released under an MIT license, so you can use it however you would like. Enjoy.
Opera 9 is a great browser - it's small, standards compliant and fast. And not just slightly faster - really fast (I don't believe the browser speed comparison Apple has put up with the Safari 3 Beta - it's not what I've seen in the real world).
For example, out of the box it's ten times faster than Firefox at rendering complex SVG drawings. It's almost fast enough that creating SVG maps is plausible...ah well...skip that...but that is for another post. Anyway, I'll let you in on a little secret - Firefox's rendering performance is salvageable - go read up on SVG's suspendRedraw and unsuspendRedrawAll APIs.
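As a rough sketch of how those two calls fit together, the helper below batches a set of DOM changes between suspendRedraw and unsuspendRedrawAll on the root svg element. The withRedrawSuspended name and its parameters are made up for illustration, and how much this helps depends on the browser; some engines may treat the calls as no-ops.

```javascript
// Run a batch of SVG DOM changes with redrawing suspended.
// svgRoot is the root <svg> element; maxWaitMs caps how long the
// browser may hold off redrawing.
function withRedrawSuspended(svgRoot, maxWaitMs, applyChanges) {
  // Feature-test first: the methods come from the SVG 1.1 DOM and
  // may be missing in other environments.
  var canSuspend = typeof svgRoot.suspendRedraw === 'function' &&
                   typeof svgRoot.unsuspendRedrawAll === 'function'
  if (canSuspend) svgRoot.suspendRedraw(maxWaitMs)
  try {
    applyChanges(svgRoot)
  } finally {
    // Always resume drawing, even if applyChanges throws.
    if (canSuspend) svgRoot.unsuspendRedrawAll()
  }
}
```

Typical use would be wrapping a loop that adds or restyles hundreds of elements, so the browser repaints once at the end instead of after every change.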
Hashing up Ajax
However, Opera has one awful gotcha - you cannot return an HTTP status code other than 200 in an Ajax request and get the response body back. You might be thinking that's awfully obscure, but it's not.
For example, say your server supports the Atom Publishing Protocol. When a user POSTs a new resource to your server, its job is to return a representation of the new resource with an HTTP status code of 201 CREATED. So something like this:
Except Opera won't return the result! If you check the XMLHttpRequest's responseText or responseXML attributes, they are null. This is problematic, because the response contains valuable information - such as the ID the server has assigned the new entry.
It also means you can't use standard error codes if your client needs access to the response body.
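For reference, here is a sketch of the kind of client code that runs into this. The postEntry name, the callback shape and the injectable createXhr parameter are assumptions for illustration; the point is simply that a well-behaved client wants to read the body of a 201 response.

```javascript
// POST an Atom entry and hand the created representation to onDone.
// createXhr is optional and exists only so the transport can be swapped out.
function postEntry(url, atomXml, onDone, createXhr) {
  var xhr = createXhr ? createXhr() : new XMLHttpRequest()
  xhr.open('POST', url, true)
  xhr.setRequestHeader('Content-Type', 'application/atom+xml;type=entry')
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) return
    if (xhr.status === 201) {
      // The body should hold the new entry, including its server-assigned ID.
      // This is exactly the value that comes back empty in the case above.
      onDone(null, {
        location: xhr.getResponseHeader('Location'),
        body: xhr.responseText
      })
    } else {
      onDone(new Error('Unexpected status ' + xhr.status), null)
    }
  }
  xhr.send(atomXml)
}
```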
The Frustration of it All
What's galling about this bug is that it's most likely caused by an if statement deep in the bowels of Opera that intentionally throws away the result when the status code is not 200! Thus, it's probably a simple fix...which leads to the next frustration.
As good as Opera's browser is, its bug tracking system is as bad. I submitted a bug about this over a year ago and obviously it hasn't been fixed. I can live with that, but it sure would be nice to see its current status.
Except Opera hides its bug reports, like many other companies. I've never understood the logic of this. I've heard the argument that a bug report could contain information that could competitively disadvantage a company - but how often does that really happen? And when it does, just delete the information - you do read your bug reports, right?
I think the real reason is that corporations don't like to appear fallible, and bugs are obvious, small failures. However hiding them does not make them go away - instead it just frustrates users. Everyone knows software has bugs - so why not face up to the truth?
So to change the world in my own small way, MapBuzz's bug reports are public and always will be.
It's not every day someone takes the time to write me an open letter - I have to say it's kind of fun. Brian added some additional thoughts to our ongoing conversation about GML. In truth, this is where blogging breaks down a bit; it would be much easier to sit down in a room for an hour and have a great in-depth technical discussion (of course, then our discussion wouldn't be available for the whole world to see, which is a significant downside).
Since it's a bit hard sifting through where things stand in a long discussion, let me recap the points I think we agree on:
GML is a toolkit that provides rules for translating your proprietary data model into XML
Having translated your data model into GML/XML, it is then necessary to code both clients and servers to understand it
Where we disagree is whether this is a good idea or not.
I see at least three very different use cases here:
I want to share within my own organization
I want to share with a preselected set of outside organizations
I want to share with the world
I'll agree with Brian that for the first two use cases, GML 2 (and 3) provides a workable solution (although I think GML 1 was a better solution and that the overhead of GML 2 is prohibitive).
It's item #3 though that really matters. One of the things that makes the Web different is that Metcalfe's Law (and Reed's Law) becomes predominant - the value of something grows the more people use it. Which leads me to the conclusion that everyone has to agree to a shared data model and format. Otherwise you end up with thousands of one-off data integrations, which does nothing to solve the general problem.
There are obvious downsides to agreeing to a general data model - it will always be a lowest common denominator and won't work for many complex integrations that live in the realm of the first two use cases. But there is an obvious upside - it is the only thing that has any chance of working out on the web. If you don't agree, then please show me a real-life example that disproves it.
So where does that leave us? I believe that GML as it is formulated has no chance of success out on the Web because it's simply not designed for it. The obvious consequence is the emergence of the Atom / GeoRSS combination and KML. And truth be told, those standards solve the problem of rendering maps made up of multiple geographic data sources well enough.
What they don't solve is exchanging attribute data between systems. And this leads right into the hornet's nest of the Semantic Web and data modeling - no one has ever come up with a solution to this problem and I doubt anyone ever will.
In today's world, I'd modify this a bit and start with Atom, add in GeoRSS, and then add in a new namespace that encodes properties like above. And I'd stick the same stuff in the KML metadata tag.
Now, I don't expect this to do diddly-squat for machine to machine integration. What I do expect it to do is make it easy for clients to show a nice property browser to users when they mouse over a feature on a map. And for the web, that's good enough since it all comes down to humans in the end anyway.
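To make that concrete, here is a sketch that serializes a feature as an Atom entry with a GeoRSS point and a flat list of properties. The prop namespace, its URI and the shape of the feature object are all hypothetical; only the Atom and GeoRSS element names are real.

```javascript
// Escape the characters that are unsafe in XML text and attribute values.
function escapeXml(value) {
  return String(value).replace(/[&<>"]/g, function (ch) {
    return { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;' }[ch]
  })
}

// Build an Atom entry for a point feature. The prop:property elements use
// a made-up namespace standing in for the "new namespace" described above.
function featureToAtomEntry(feature) {
  var lines = [
    '<entry xmlns="http://www.w3.org/2005/Atom"',
    '       xmlns:georss="http://www.georss.org/georss"',
    '       xmlns:prop="http://example.org/feature-properties">',
    '  <id>' + escapeXml(feature.id) + '</id>',
    '  <title>' + escapeXml(feature.title) + '</title>',
    '  <updated>' + escapeXml(feature.updated) + '</updated>',
    // GeoRSS Simple points are written as "latitude longitude".
    '  <georss:point>' + feature.lat + ' ' + feature.lon + '</georss:point>'
  ]
  for (var name in feature.properties) {
    if (Object.prototype.hasOwnProperty.call(feature.properties, name)) {
      lines.push('  <prop:property name="' + escapeXml(name) + '">' +
                 escapeXml(feature.properties[name]) + '</prop:property>')
    }
  }
  lines.push('</entry>')
  return lines.join('\n')
}
```

A client that understands nothing but name/value pairs can still render these as a property browser, which is all this format is trying to enable.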
Now that you've launched your new killer Web 2.0 website, how do you detect errors in your deployed Javascript?
Using onerror
The standard approach is to hook into the Window object's onerror event handler and use Ajax to log the errors to the server. There are some good tutorials on the web on how to do this, but here is one approach:
// Register global error handler
window.onerror = function(message, uri, line) {
  var fullMessage = message + "\n at " + uri + ": " + line
  remoteLogger.log(fullMessage)
  // Let the browser take it from here
  return false
}
Notice that you have to explicitly set window.onerror using old-style event handlers; attachEvent and addEventListener don't seem to do the trick. Also, Internet Explorer always stupidly sets the URI parameter to the URI of the current page, not the file that caused the problem. Thus you have to hunt through all the linked JavaScript files looking at the specified line number and make a best guess as to which file caused the problem.
Getting a Stack Trace
Oftentimes knowing where an error occurred is not enough - what you really need is the full stack trace. If your user has a Gecko based browser (e.g., Firefox), you're in luck.
Notice the stack parameter in the code above? Gecko has a poorly documented stack property on the global Error object that provides just what we need.
Let's say you have code that looks like this:
function first() {
  second()
}

second = function() {
  blowUp()
}

function blowUp() {
  try {
    foo.bar()
  } catch(e) {
    handleException(e)
  }
}
If you call the first method, say from onload like in this example, the resulting stack trace is:
Notice that the method_name will be blank for anonymous functions, which are quite common if you follow the Prototype JavaScript style.
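If you want to work with the trace on the client before sending it, a small parser helps. The sketch below assumes each line of the stack string looks like name@file:line, optionally with an argument list after the name and a column number at the end; the exact format varies between Gecko versions, so treat the regular expression as a starting point rather than a specification.

```javascript
// Turn a Gecko-style stack string into an array of frame objects.
function parseGeckoStack(stack) {
  var frames = []
  var lines = String(stack || '').split('\n')
  for (var i = 0; i < lines.length; i++) {
    // name (may be empty), optional "(args)", then @file:line and optional :column
    var m = /^(.*?)(?:\(.*\))?@(.*?):(\d+)(?::(\d+))?$/.exec(lines[i])
    if (!m) continue // skip blank or unrecognized lines
    frames.push({
      functionName: m[1] || '(anonymous)',
      fileName: m[2],
      line: parseInt(m[3], 10)
    })
  }
  return frames
}
```

Lines that do not match are skipped rather than reported, so an unexpected format degrades to an empty list instead of an error.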
A Simple Logger
Now let's take a look at some code. First, here is a simple logger that posts errors to a server via Ajax (note this code assumes you are using the Prototype library):
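The sketch below is a stand-in under stated assumptions rather than the Prototype-based logger itself: it uses plain XMLHttpRequest, and the /javascript_errors endpoint and the createRequest hook are hypothetical. It only shows the shape of the remoteLogger.log(message, stack) call that the rest of the code relies on.

```javascript
var remoteLogger = {
  // Hypothetical endpoint; point this at whatever your server exposes.
  url: '/javascript_errors',

  // Separate hook so the transport can be replaced (for example in tests).
  createRequest: function () {
    return new XMLHttpRequest()
  },

  // Post the message, and the stack trace if there is one, to the server.
  log: function (message, stack) {
    try {
      var body = 'message=' + encodeURIComponent(message) +
                 '&stack=' + encodeURIComponent(stack || '')
      var request = this.createRequest()
      request.open('POST', this.url, true)
      request.setRequestHeader('Content-Type',
                               'application/x-www-form-urlencoded')
      request.send(body)
    } catch (e) {
      // A failing logger must never raise a second error on the page.
    }
  }
}
```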
Next, let's define the handleException method we used above. This method extracts out useful information from an Error object and uses the logger above to post the results to a server.
function handleException(exception) {
  /* In FF exception can be a string if it happens when opening the
     xmlHttpRequest. Gah! */
  if (typeof exception == 'string')
    exception = new Error(exception)

  /* If a xmlhttp request is happening in Mozilla and the user navigates
     to another page, then when the first request returns a
     NS_ERROR_NOT_AVAILABLE error will be thrown. So just ignore it. */
  if (exception.name == 'NS_ERROR_NOT_AVAILABLE')
    return

  var fullMessage = ''
  var uri = ''
  var stack = ''
  var line = ''

  try {
    /* Don't use exception.toString since the JS spec does not require it
       to provide the error name or message (haven't tested to see if it
       matters though across browsers) */
    fullMessage = exception.name + ': ' + exception.message
    uri = exception.fileName
    stack = exception.stack
    // Firefox sometimes blows up here
    line = exception.lineNumber
  } catch(e) {}

  fullMessage += "\n at " + uri + ": " + line
  console.info(fullMessage)
  console.info(stack)
  remoteLogger.log(fullMessage, stack)
}
Annoying Gotchas
There are a few annoying gotchas to know about:
Stack traces are only available from Error objects thrown by exceptions and thus are not available from the onerror method.
Uncaught exceptions are not handled by onerror.
There is no global exception handler in JavaScript, so you have to be very careful in the way you write your code. On the positive side, it's easy to implement a global error handler for methods that are invoked due to the results of an Ajax request. On the other hand, it's much harder to do this for methods invoked due to normal events generated by a user interacting with the browser.
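One way to approximate a global handler is to wrap every callback you register, whether it is an Ajax callback or an event handler, in a function that catches and reports. The guarded name below is made up for this sketch; onError would typically be the handleException function from earlier.

```javascript
// Wrap fn so that any exception it throws is passed to onError
// instead of escaping to the browser.
function guarded(fn, onError) {
  return function () {
    try {
      // Preserve `this` and the arguments so the wrapper is transparent.
      return fn.apply(this, arguments)
    } catch (e) {
      onError(e)
    }
  }
}
```

For example, element.onclick = guarded(handleClick, handleException) gives a click handler the same reporting as the Ajax callbacks, at the cost of remembering to wrap each one.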
Anyway, the more information you can get about browser errors the better - you'll often be surprised by the results!
If you write a lot of JavaScript, then you'll find Matthias Miller's website chock-full of useful information. And even better, he's put together two great tools.
The first is JavaScript Lint, a tool that analyzes JavaScript code and reports potential errors. Mozilla's SpiderMonkey JavaScript engine has similar functionality, which you can turn on by typing about:config as the URL and then searching for javascript.options.strict. However, JavaScript Lint detects more types of errors and is configurable. To give it a try, I ran it over the MapBuzz code base and was very impressed by the errors it revealed. I immediately added it to our automated testing processes (i.e., it's spawned by a rake task now) and have come to depend on it.
And if that wasn't enough, Matthias has adopted a second tool, called Drip, that detects memory leaks in Internet Explorer's JavaScript engine. Memory leaks occur in IE because it uses COM based reference counting to delete objects. Reference counting is fast, and easy to understand, but does not handle circular object references. As you use your website in IE, Drip keeps track of these circular references. Once you're done, it prints out a report showing you where the memory leaks occurred.
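For anyone who has not run into one of these cycles, here is an illustrative sketch of the usual culprit and two ways around it. The function names are made up, and the elements are whatever DOM nodes you attach handlers to.

```javascript
// Leaky pattern (for illustration only): the closure captures `element`,
// and `element.onclick` points back at the closure, forming a cycle.
function attachLeaky(element) {
  element.onclick = function () {
    element.className = 'clicked'
  }
}

// Shared handler that relies on `this`, so it captures no element.
function markClicked() {
  this.className = 'clicked'
}

// No closure over the element means no cycle to collect.
function attachWithoutCycle(element) {
  element.onclick = markClicked
}

// Alternatively, break the link explicitly, for example from an unload handler.
function detach(element) {
  element.onclick = null
}
```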
Thanks for the great tools Matthias.
Update 1 - A few people have pointed out some negative articles about JSLint. I've never used JSLint so I don't have any opinions about it. What I'm talking about above is a totally separate project called JavaScript Lint (confusing, isn't it?). I highly recommend JavaScript Lint; it turned up a number of bugs that I hadn't caught previously. In addition, it is highly configurable so it's easy to turn off certain classes of errors. And for the utmost in flexibility, you can annotate your code to tell JavaScript Lint to skip certain sections of it.
Update 2 - A couple of people mentioned FireBug and Venkman as invaluable tools. I couldn't agree more - I posted about Firebug a while back when I first started using it. And Venkman has been an old favorite for years - it was great to see it finally updated to work in Firefox 1.5. Hopefully it will also be quickly updated to work with Firefox 2.0. And if you're debugging Internet Explorer, Visual Studio provides a good JavaScript debugger as does Microsoft's free script debugger package.