Since returning from FOSS4G, I’ve been nagged by a question in the back of my mind – can open source succeed in the geospatial world?
Now you might think I’ve lost my mind. Didn’t hundreds of open source geospatial developers just attend FOSS4G? And haven’t I checked MapServer’s market share? And haven’t I noticed how many people use PostGIS? And don’t I remember that Autodesk open sourced MapGuide last year?
More than Just Software
Sure – I noticed all that. But it’s only part of the story. The geospatial world is different because it requires more than software – it also requires massive amounts of data and lots of hardware to serve that data. That simply is not true in the other industries where open source plays – it doesn’t apply to operating systems, web servers, databases, drawing programs, web browsers, email clients, sales force management, etc.
As in these other problem domains, open source has created some wonderful geospatial software products that stack up well against their commercial brethren. But can open source also solve the data and distribution pieces? Can it recreate Google Maps – which is the ultimate combination of software, data and hardware?
Depending on your political viewpoint you may or may not think this is important. If you’re an open source advocate then developing the software is probably good enough. You’d just leave the data to Navteq and the distribution to Google, Yahoo or Microsoft. But if you’re a free software advocate, then clearly it isn’t. You’ll always be subject to data licensing fees and data distribution charges. And more fundamentally, quoting Jimmy Wales, the founder of Wikipedia:
It is broken for the same reason that proprietary software is always broken: lack of freedom, lack of community, lack of accountability, lack of transparency.
So assuming that the open source world can create the needed software to replicate Google Maps (and I think it already has), then the next challenge is the data. Traditionally, gathering and maintaining geospatial data has been time consuming and hugely expensive. As a result, freely available, high-quality data is only available in countries with enlightened governments – which sadly are few and far between.
But just as the Internet enabled the birth of the open source movement, mass-market GPS devices have enabled the birth of the open source mapping movement. These citizen mappers have congealed around the OpenStreetMap project, which can be thought of as a Wikipedia for maps. Thousands of users upload their GPS data, together creating highly detailed and accurate maps of where they live.
I consider OpenStreetMap wildly successful – it’s proven that citizens can create their own maps. Yet OpenStreetMap provides data for a vanishingly small part of the world. Will it continue to grow? Will it be possible to cover the world? What happens in countries with free data, like the United States? Will OpenStreetMap be able to gather and manage important non-geographic data, such as which roads are one way, where addresses are located, road numbers, etc?
My guess is yes – it feels to me that OpenStreetMap is nearing a tipping point. The more data comes online, the more useful it becomes, and the more people want to get involved. Thus starts a virtuous cycle which will propel OpenStreetMap to cover the globe, just as Wikipedia has become the world’s encyclopedia.
And it seems likely this virtuous cycle will lead to a new competitor for Navteq. Navteq has built a $600 million a year company based on selling mapping data. Of that $600 million, last year $275 million went to creating and distributing their massive database.
A startup that leverages OpenStreetMap data could compete directly against Navteq. Its main advantage would be cost – its main disadvantages would be data availability and data quality. But as the startup grew, it would pour resources into OpenStreetMap by paying top volunteers to add more data, ensuring data quality and funding the mapping of previously unmapped areas.
And of course more money means more data which means more utility which means more people. And the virtuous cycle would continue just as it does in the Linux world where RedHat, IBM and Novell bankroll much of Linux’s development.
When it Costs Real Money
But now we run into trouble. Why do open source projects work? Because participants love what they do and love the recognition they get for doing it. But above all, open source projects are about community and working together with your colleagues and friends.
However, open source projects are predicated on the fact that participation does not entail significant out-of-pocket expenses. If I want to join an open source software project I need a computer and Internet connection – two things I’m likely to have. And if I want to join OpenStreetMap then I need a GPS, which I probably also have, and if I don’t then I can buy one for a couple hundred dollars.
And thus we get to the hard part in duplicating Google Maps. Two parts of it cost money – lots of money. The first part is the satellite and aerial imagery. And the second is the infrastructure to store, process and distribute all this data.
One of the most compelling features of Google Maps is the imagery data. Who hasn’t looked at their house or apartment from above? And who hasn’t seen pictures of B-2 bombers, Navy buildings that look like swastikas and advertisements big-box stores are putting on their roofs?
Can open source duplicate this data? I don’t see how. Very few citizens have the expertise or money to get an airplane, strap on a camera, and start taking high resolution aerial photography. It seems the best that can be done is using freely available government data, just as NASA’s World Wind program does. Is that good enough?
And in this case, I don’t see why a commercial enterprise would come to the rescue. Why would DigitalGlobe provide its imagery data to OpenStreetMap? First, they don’t get the benefit of user contributed data. And second, they’d seriously tick off their commercial customers such as Google and Microsoft. Thus there is no virtuous cycle here.
Update – 80n in the comments pointed out that last year Yahoo announced that OpenStreetMap could use its aerial imagery. Very interesting, I had totally missed that. Of course, I think the same licensing worries as mentioned above still apply, but kudos to Yahoo.
Which leads us to the last point – storing, processing and distributing spatial data takes a whole lot of hardware. Last year I calculated that at zoom level 20 Google Maps requires 1,099,511,627,776 (that would be a cool one trillion) tiles to cover the world. And since then, they’ve added additional zoom levels.
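For anyone who wants to check that number, here is a rough sketch of the arithmetic. It assumes the usual square tile pyramid, where zoom level 0 is a single tile and each further zoom level doubles the tile count along both axes. The function names are made up for illustration and are not part of any mapping library.

```python
def tiles_at_zoom(zoom: int) -> int:
    """Tiles needed to cover the world at one zoom level.

    Assumes a square pyramid: 2**zoom tiles across by 2**zoom tiles down.
    """
    if zoom < 0:
        raise ValueError("zoom must be non-negative")
    return 4 ** zoom


def tiles_through_zoom(max_zoom: int) -> int:
    """Total tiles if every level from 0 up to max_zoom is stored."""
    return sum(tiles_at_zoom(z) for z in range(max_zoom + 1))


# Zoom level 20 alone: 2**20 * 2**20 = 2**40 tiles.
print(f"{tiles_at_zoom(20):,}")  # 1,099,511,627,776
```

Since every extra zoom level multiplies the count by four, the deepest level dominates the total: storing all levels from 0 through 20 only adds about a third on top of the zoom 20 figure.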
In addition to storing the tiles, you’d need machines to store the original vector data, render the tiles and serve them on the Web. And we haven’t even talked about imagery yet, which takes even more space. Even then, having the hardware isn’t enough. You’d also have to have the expertise to run it all, which perhaps is Google’s most treasured secret.
So it seems to me there is absolutely no chance that open source can duplicate this infrastructure. However, there could be two ways around it.
First, use peer-to-peer software to create a vast network of home computers that can render the tiles and serve the data around the world. SETI@home, BitTorrent and Skype have proven the basic concept. Grub is trying to prove it can work for crawling the web. OpenStreetMap is following with its tiles@home project, which is an attempt to distribute tile rendering across hundreds (maybe someday thousands) of machines across the Internet. But can peer-to-peer software also work for serving data on the web (I’m assuming no one is going to download a Google Earth clone to view maps)?
Second, let a commercial company build the infrastructure. The obvious business model is hosting OpenStreetMap data for free and generating revenue through local advertising. Thus such a company would be competing directly against Google, Yahoo and Microsoft – hardly an easy task. And in the end, would it be any different than Google, Yahoo or Microsoft today?
So can open source replicate Google Maps? On the software side – yes. On the map data side – yes. On the imagery data side – no. And on the infrastructure side – it’s a stretch.
So I think the answer is no – open source cannot replicate Google Maps – at least for now. But it can get close. And if WikiSearch succeeds, and proves that even mighty Google is susceptible to open source, then at least there will be a foundation on which to stand and try.