The OGC is Stuck in 1999
All these new OGC standards are sadly designed like it is still 1999.
If you don’t like ramblings, then here is the TL;DR for you:
TL;DR OGC Standards should be written for the future, not the present nor the past
The current focus of the discussion, at the core, is around the issue that the GeoServices REST API is perceived as a set of competing standards (i.e. redundant to the WFS/WMS/WCS/etc. specs) that don’t have an open (read: open source) reference implementation that any organization can refer to when trying to implement the standard.
To me, this discussion is more about politics (which I do agree is a valid topic to discuss) than about the reality of the standards that do come out of the OGC lately.
Reading any of the documents from the OGC makes me feel like I am listening to Prince (before!) 1999.
Standards should be written for the future – not the present
I remember several years back when HTML5 was the hot new thing. The standards were being worked on a lot, but most of the browsers did not support any of the features from HTML5. The table at caniuse.com (a place where you could go to see what HTML5 features were implemented by what browsers) was mostly red (i.e. not implemented).
Fast forward four years, and most of it is green. That is because those standards were adopted and the functionality they brought to life makes everyone’s life better.
Designed in the past, adopted in the future. The way it should be.
Contrast this with the current set of standards that we have “up for comment” from the OGC right now…
Before you think that I am against the OGC, let me tell you that nothing could be further from the truth. In most presentations that I have given in the last 6 years (at several conferences, some keynotes, and even my own GeoMeetups) I have always dedicated slides and time to convince/teach people about the OGC and the importance of standards.
Yet, although I have several friends that do work in these standard bodies (I love you guys – you know who you are!), it still feels that there is an unnecessary amount of bureaucracy at the OGC that is truly killing innovation.
OGC is making “standards” that are outdated and overly complicated, with reference implementations that cannot be used as a reference (read below!), and a whole bunch of protocols that resemble what a protocol would have looked like in 1999. They completely ignore what we have learned in the last decade.
Let me give you some real examples.
- WFS, XML Datastores and queries:
XML Datastores were all the academic hype back in the day. Yet they never took off (with reason). The query language used by OGC (which is another OGC standard) assumes the underlying data store is XML. For those of you not familiar with what this actually means, it is more than a representation of something as text. It assumes you have an XML Infoset.
At the core, the OGC spec for Filters is just an attempt to represent SQL as something equivalent to an XPath expression to be used on the web (?!?).
Anybody that has ever tried implementing WFS will understand what this means. Basically, several months of development to create a parser that grabs an XML document and turns it back into SQL. But to do it correctly, you don’t generate one SQL statement, but several. Why? Because the query language is so expressive that in reality you should be able to create expressions that span multiple underlying tables (it assumes it is an XML Datastore after all), which basically just makes you really sad and go home depressed because it is unnecessarily complex.
I have an idea. Instead of writing a client application that transforms my query into OGC filter speak, which goes through a middleware that grabs the OGC filter speak and turns it back into SQL, which then goes to my underlying database, why don’t I just have one method (ExecuteSQL) and be done with it? Use SQL. Let’s just not reinvent SQL. The CartoDB guys seem to have this right in their API. One tiny little API, you pass a string… it does a lot! You know how long it takes to implement something like this? Hours, not months. And you actually have more power than what is expressed by the OGC equivalent. Problem solved.
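To make the "one method" idea concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. The function name `execute_sql`, the `parcels` table and its rows are all made up for illustration; this is not how CartoDB implements its API, and a real service would also need authentication, a read-only database role and query timeouts.

```python
import sqlite3


def execute_sql(conn, sql, params=()):
    """Run one SQL statement and return its rows as a list of dicts."""
    cur = conn.execute(sql, params)
    if cur.description is None:  # statement produced no result set
        return []
    columns = [col[0] for col in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


# Hypothetical demo data: an in-memory table of parcels.
conn = sqlite3.connect(":memory:")
execute_sql(conn, "CREATE TABLE parcels (id INTEGER PRIMARY KEY, owner TEXT, area REAL)")
conn.executemany(
    "INSERT INTO parcels (owner, area) VALUES (?, ?)",
    [("ana", 120.5), ("luis", 80.0), ("ana", 42.0)],
)

# The whole "query language" is just the SQL string the client sends.
rows = execute_sql(
    conn,
    "SELECT owner, SUM(area) AS total FROM parcels GROUP BY owner ORDER BY owner",
)
```

The grouping and aggregation in that last query come for free from the database, with no filter parser in between.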
- XML/GML as a transport mechanism:
This is the mechanism of transport of choice for OGC standards. I have a two-year-old post about why I think this is far from ideal. Although there are some updates to be made to that answer, at the core, it still remains valid. If somebody tells you that the answer is to ‘gzip’ GML, then they are wrong (a topic for a future post).
Switching to any of these, or including them as serialization options would make all the implementations of these standards better/faster.
- “Reference Implementations” that cannot be referenced:
It is true that a “reference implementation” should be able to be, well, referenced. I cannot currently look at how the GeoServices REST reference implementation (I guess ArcGIS Server) is implemented internally, thus, the specification fails at having a valid reference implementation.
Does that mean that for a reference implementation to be useful it needs to be Open Source?
Maybe that is not enough.
Because you cannot copy/paste or even create “derivative work” from it without also having to GPL/LGPL your work. Is looking at a reference implementation, seeing the internals, and copying how it works considered derivative work? Those are the types of questions that no developer wants to deal with, so IMHO, the OGC has the responsibility to pick reference implementations that are unencumbered by these issues. By picking Spatialite and a proprietary server (ArcGIS Server) as reference implementations for their respective specs, the OGC is showing that they are either 1) really clueless about these types of issues, or 2) don’t care. Either way, this is horrible.
I wish! Where are the implementations that take advantage of full duplex communications? Server side push? Seriously, are we still assuming users are going to poll all the time and transfer state in every request? This is exactly why most real-time tracking implementations are done incorrectly. Why are we still stuck in a stateless architecture design? We can do server push now without Flash or Java (or Silverlight yikes!). Websockets exist. USE THEM.
I touched superficially on this topic at the last PyCon. The video is up if you are interested.
- HTTPS, Authentication/authorization, security
I don’t even want to touch on authentication/authorization around all the OGC specs. OAuth 2.0, or even the first version of OAuth, is far better than what is expected of the current implementations of security. Most people rely on passing USERNAME/PASSWORD in the request. This is horrible.
But wait, I guess if I use HTTPS that means I am secure, right? Hell no!
Look at this video from last year’s Blackhat (quite awesome actually) and then come back with a straight face and tell me this is fine.
Yes, they use a special protocol called SPDY.
It is faster than traditional HTTP (even though it is going over an encrypted channel!). Think about this for a sec.
Does the OGC use SPDY? Of course not. Does it make a difference? Check out this video of last year’s Google I/O about SPDY and you be the judge.
- Non-blocking servers.
The Mapbox guys get it. They use node.js for serving their tiles. Do you want to understand why this is a good thing? Check out this presentation by Ryan Dahl 3 years ago which gives an overview of why this is the case.
I can go on and ramble forever, but I want this to be more constructive.
**The discussion around OGC specs should be about what GIS is going to be like 5 years from now.**
Why did the OGC not work with the W3C to push WebSQL through (now stalled)? Imagine spatial extensions in every browser on the client side?
Why don’t we have a good spec about spatial replication, spatial changesets, optimistic/pessimistic/long-transaction versioning, etc?
Why don’t we have specs that take advantage of web stateful connections?
Why don’t we have a spec for highly compressed geometries?
Why don’t we have a spec that defines better editing capabilities? (hint: WFS-T is not enough for several editing workflows)
Why don’t we have a spec that defines how to truly take advantage of time-based datasets?
The list goes on.
Instead, the community as a whole is having long discussions analogous to the one about whether 3857, 3785 or 900913 was a better number for the Web Mercator definition.
Listen OGC, want to see a good standard definition? Check out MBTiles. Easy to implement, and to the point.
Want to change? Design/Define for the future - not for the present nor the past.
UbuntuGIS - GIS on Linux
I sometimes get surprised to find out that some people don’t know how easy it is to get GIS packages on Ubuntu. UbuntuGIS is really one of the easiest ways to get up and running. You will notice that there are two repos.
UbuntuGIS “Stable” which has really old packages. You most likely don’t want that.
UbuntuGIS “Unstable” which is really not unstable at all. It just has newer packages. You do want that!
To get it working on your Ubuntu system, add the ubuntugis-unstable repo:
sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable -y
sudo apt-get update
sudo apt-get upgrade
Now you can install all kinds of cool packages. For example, to get gdal, postgis and qgis ready to go:
sudo apt-get install postgis gdal qgis
360 Streetview-like views from Perú
I took these a couple of months ago. Man, I miss Perú. The 360 views came out a lot nicer than I expected.
Knight News Challenge
Running a two-person startup is not easy. Regardless, every time there is an opportunity to make a true impact while working on your startup at the same time, it is truly exciting. If you’ve got a sec, we could use some votes in the Knight News Challenge. All we need is some clicks on the “Applause” button.
SQLite Homebrew Errors with Virtualenv
If, like me, you have recently experienced some sqlite errors after updating Homebrew, it is probably related to the fact that sqlite was just changed to a “keg-only” formula.
The issue is that the python virtualenv environment correctly links to the Homebrew-installed sqlite version, but when trying to run it, it doesn’t find it. This is likely to happen if you use virtualenv.
A quick workaround until that bug is fixed is to force the creation of the sqlite lib soft links in /usr/local/lib. You can do this with:
brew link --force sqlite
GDAL/OGR plugin for ArcGIS v0.6.1
I updated the installer. It has GDAL trunk with support for 55 formats.
Enjoy - and please submit bugs!
Blog post about AmigoCloud’s GDAL/OGR plugin for ArcGIS
Want to integrate ArcGIS and various Open Source formats? This is certainly a way to do it.
What is an ESRI GeoDatabase?
For those of us dabbling in geo work, it is not uncommon to hear the term GeoDatabase. Sadly, it is also not uncommon to hear the term GeoDatabase used incorrectly. Sometimes people use it to refer to ArcSDE, sometimes they use it to refer to that “Access mdb file”, sometimes it is just that “FileGDB thing”.
Improving Performance in PostgreSQL
Craig Kerstiens has two articles that are worth reading for those trying to improve query performance in PostgreSQL.
That should help you in understanding how to tweak SQL queries / indexes.
Mapping Rookie Mistakes (#1): Always Geocoding on-the-fly
If you are hitting Google’s (or Bing’s) geocoding limit, you are probably doing something very wrong.
If your first reaction is to look for a workaround, then please STOP.
There is a reason for hitting that limit and you should be caching your results in your db instead.
You have been warned.
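As a rough illustration of what "caching your results in your db" can look like, here is a minimal Python sketch using sqlite3. Everything in it is hypothetical: the `geocode_cache` table, the `make_cached_geocoder` helper, and the `fake_geocode` function, which stands in for whatever call you make to your geocoding provider.

```python
import sqlite3


def make_cached_geocoder(conn, geocode):
    """Wrap `geocode` (address -> (lat, lon)) with a persistent cache table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS geocode_cache ("
        "address TEXT PRIMARY KEY, lat REAL, lon REAL)"
    )

    def cached(address):
        # Normalize case and whitespace so trivial variations share one entry.
        key = " ".join(address.lower().split())
        row = conn.execute(
            "SELECT lat, lon FROM geocode_cache WHERE address = ?", (key,)
        ).fetchone()
        if row is not None:
            return row  # cache hit: the remote service is not called
        lat, lon = geocode(address)  # cache miss: one remote call, then store
        conn.execute("INSERT INTO geocode_cache VALUES (?, ?, ?)", (key, lat, lon))
        conn.commit()
        return (lat, lon)

    return cached


calls = []  # records every time the "remote" geocoder is actually hit


def fake_geocode(address):
    """Stand-in for a real geocoding request (hypothetical)."""
    calls.append(address)
    return (-12.0464, -77.0428)


conn = sqlite3.connect(":memory:")
geocode = make_cached_geocoder(conn, fake_geocode)
first = geocode("Plaza Mayor, Lima")
second = geocode("plaza mayor,  lima")  # same place, served from the cache
```

Only the first lookup reaches the geocoder; every repeat of the same address is a local query.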
Mapping Rookie Mistakes
There is a set of common Mapping Rookie Mistakes (tm) that happen very often. I am hoping I can refer to some of those so that somebody looking for an answer will stumble upon these.
For those experienced with GIS, these are blatantly obvious. Nevertheless, for the uninitiated in geo, these may not be so evident.
These will not be in any particular order, so the numbers only represent the order in which I wrote them - not order of importance.
Let’s start with geocoding on the fly
Your LGPL license is completely destroying iOS adoption
tl;dr If you have LGPL licensed code, please be clear if you want it to be used alongside proprietary libraries in iOS.
I hesitated about writing this because it will surely put some people up in arms (many of them good friends). So please let me clarify before throwing grenades.
This is not FUD. For the uninitiated in Open Source licensing discussions, “FUD” usually refers to when somebody (usually with a hidden agenda to support a particular proprietary technology) starts throwing rocks at Open Source in general. The idea is that this person will create enough confusion, that the proprietary option will be chosen instead.
I don’t have any hidden agenda.
I actually admire Richard Stallman (the GNU father) a lot. It is impossible to deny that without his full-blown activism, sacrifice, and energy dedicated to the Open Source movement, we would not be where we are today. We have Open Source libraries for practically anything. Yes, RMS can do extremely odd things while discussing complex issues, but nobody can take away that he is really the crazy genius that changed the world for the better. Truth is, I think we need people like RMS to create a sort of equilibrium.
In the GIS space, we have several key projects that are LGPL’d. The main one that comes to mind is GEOS, a port of the wonderful JTS library, and this is where the iOS problem begins (JTS is LGPL and thus GEOS has to be LGPL because it is derivative work).
GEOS is really at the heart of most Open Source GIS projects; it provides the core geometry object model defined by a standards organization and it also has an amazing list of algorithms/functionality. Very popular key GIS projects like PostGIS, QGIS, GeoDjango, Spatialite (sqlite with spatial extensions) and others rely on GEOS for all their geometry needs.
Without GEOS, a lot of these applications become extremely crippled. This is why you will (sadly) not see strong adoption of Spatialite on iOS.
OK, good. So why is this a problem for iOS devices?
Without getting into all the details of the obligations/freedoms/restrictions/whatever-you-want-to-call-it that come with all the different versions of LGPL, the LGPL license requires you to release all the source code of any library you link statically to. Let me emphasize the word static linking here.
So fine you say, use a shared library instead.
I can’t. iOS does not allow shared libraries!
They are afraid your app will download a new version of the shared library at runtime and change the behavior and thus circumvent the whole Apple App Store Review process.
Beefing up my anti-DDoS system. Why? Because this is the part where both camps start fighting like their life depends on it.
One side will say that Apple’s App Store is Censorship and that the LGPL clause is actually protecting your freedom. The discussion turns into a philosophical and ethical one. I can actually see how those arguments are very valid.
The other side will say that the App Review process makes it more difficult for rogue code to get through (notice difficult - not impossible). Having a review process reduces the possibility that your code will crash on devices after the first install because of something you did not test. It also reduces “duplication or redundant” apps - with Apple being the sole Ruler in choosing what goes in and out. Big brother I guess. But I also see how the App Store supporter’s point is a valid point.
Note that this problem does not exist in Android because it allows shared libraries to be compiled. Nevertheless, my customers use iOS a lot which means I cannot just “switch” to Android and “ignore iOS”. I need to use some proprietary libraries like Flurry and ESRI’s iOS SDK and many others for reasons outside the scope of this post.
Some people will argue that it is possible to do static linking of LGPL’d code and still be able to abide by the legal terms by providing the necessary binaries to be re-linked. This has no legal precedent and is still a very fuzzy argument. I would rather not bet my company on something so shaky as this.
So of course, some people may think: “Well then, you want to use other people’s hard work to profit in your [insert best insulting word here] startup. Sorry buddy, use something else! This was licensed as philosophical/political statement.”
And that is where they are right… and where they are also wrong.
OK, I should have added sometimes. For certain OS projects, like GCC, this is very true. Nevertheless, from personal experience, I’ve found many times when I am talking to leads/creators of some OS projects that they are completely unaware that the default LGPL creates these complications in iOS.
Some authors choose to add a static linking exception, others choose to re-license completely (think cocos2d that started as LGPL and switched to MIT). I am aware that for some projects with tons of contributors (think ffmpeg) it is impossible.
But if you have an awesome new project that could be extremely useful on iOS - and you have no philosophical or ethical issues with it - please please please please add a clarification to your licensing that covers this use-case. Even the Free Software Foundation doesn’t want you to use LGPL. So, please, pick sides, but don’t leave it in a limbo state. If you don’t clarify, you are just completely destroying iOS adoption.
Hey, if you want me to release the changes / updates I make to your LGPL library, I have no problem with that whatsoever. I do it all the time.
What I refuse to do entirely is to completely ignore the licensing obligations altogether and just start using the library. Sadly, there are examples of Spatialite-with-GEOS/ffmpeg/put-your-favorite-lgpl-library-here code in the Apple App Store with developers who do not care one bit that they are violating those terms.
**Update (1/8/13):** Corrected link to the current JTS site.
Topology and Routing Graphs in GIS
I realize that from a theoretical, formal, standpoint I tend to be a bit relaxed with my terminology. You could argue, I am practically asking to feel the ire of (in this case) somebody classically trained in Mathematics!
Nevertheless, during grad school in Computer Science, an amazingly smart professor explained to me the distinction between Theorists and Practitioners in such a clear way that I could finally make sense of the horrendous chunks of code that I have seen from people that I would otherwise think of as brilliant.
At some point of doing geo-work (hopefully) you will face a problem that has to be solved by creating graphs. User andytilia at GIS-StackExchange asked this very question.
In my answer, I tried to explain how creating a graph from GIS data is more than splitting your features into edges and nodes. It is about creating a graph that is optimized for your particular problem.
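To give a feel for what "optimized for your particular problem" might mean, here is a small Python sketch that is not taken from the StackExchange answer. It assumes the problem is shortest-path routing, that the line features are already split at intersections, and that touching endpoints have exactly equal coordinates. Under those assumptions only endpoints need to become nodes, interior vertices only contribute length, and parallel edges can be collapsed to the shortest one. The `build_graph` function and the `streets` data are hypothetical.

```python
from collections import defaultdict
from math import hypot


def build_graph(lines):
    """Build an undirected, length-weighted adjacency map from polylines.

    Each line is a list of (x, y) tuples. Only the two endpoints become
    nodes; interior vertices are used just to measure the edge length.
    """
    graph = defaultdict(dict)
    for line in lines:
        start, end = line[0], line[-1]
        length = sum(
            hypot(x2 - x1, y2 - y1)
            for (x1, y1), (x2, y2) in zip(line, line[1:])
        )
        # For shortest paths, a longer parallel edge between the same two
        # nodes can never be used, so keep only the shortest one.
        if length < graph[start].get(end, float("inf")):
            graph[start][end] = length
            graph[end][start] = length
    return dict(graph)


streets = [
    [(0, 0), (3, 0), (3, 4)],  # bent street, length 7
    [(3, 4), (3, 10)],         # straight street, length 6
    [(0, 0), (3, 4)],          # diagonal shortcut, length 5
]
graph = build_graph(streets)
```

A different problem (say, turn restrictions or one-way streets) would call for a different graph built from the same features, which is the point.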