Friday, August 30, 2013

A quick way to generate geospatial files

Sometimes I just need to visualize a few geometries right now, so I need a quick and dirty way to generate a KML/Shapefile/FileGDB/GeoJSON/whatever with a few geometries. For those instances, I just create a quick CSV with the data I need and a simple GDAL VRT file.

Create a file named test.csv with the contents that you want. In my case, I just want a couple of rows with some arbitrary IDs and a couple of polygons:

id,geo
1,"POLYGON ((-122.3986816406250000 37.7819824218750000, -122.3931884765625000 37.7819824218750000, -122.3931884765625000 37.7874755859375000, -122.3986816406250000 37.7874755859375000, -122.3986816406250000 37.7819824218750000))"
2,"POLYGON ((-122.3876953125000000 37.7819824218750000, -122.3822021484375000 37.7819824218750000, -122.3822021484375000 37.7874755859375000, -122.3876953125000000 37.7874755859375000, -122.3876953125000000 37.7819824218750000))"

Then, create a file named test.vrt and customize it as you see fit:

<OGRVRTDataSource>
    <OGRVRTLayer name="test">
        <SrcDataSource>test.csv</SrcDataSource>
        <GeometryType>wkbPolygon</GeometryType>
        <LayerSRS>WGS84</LayerSRS>
        <Field name="id" src="id" />
        <GeometryField encoding="WKT" field="geo" />
    </OGRVRTLayer>
</OGRVRTDataSource>

In this example, my SRS will be WGS84 (EPSG:4326). Now I can generate whatever type of file I want.

Generate a KML:

 ogr2ogr -f "KML" mykml.kml test.vrt

Generate an ESRI shapefile:

 ogr2ogr -f "ESRI Shapefile" myshapefile.shp test.vrt

You get the picture. If you want to create points or lines, the VRT documentation has examples.
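
If the geometries come from a script rather than from hand-typed WKT, the CSV itself can be generated too. The following is a minimal sketch using only the Python standard library; the helper names `bbox_to_wkt` and `write_csv` are made up for illustration, and it assumes the same `id,geo` column layout that the VRT above expects.

```python
import csv


def bbox_to_wkt(min_x, min_y, max_x, max_y):
    """Return a closed WKT polygon for an axis-aligned bounding box."""
    corners = [
        (min_x, min_y),
        (max_x, min_y),
        (max_x, max_y),
        (min_x, max_y),
        (min_x, min_y),  # repeat the first corner to close the ring
    ]
    ring = ", ".join(f"{x} {y}" for x, y in corners)
    return f"POLYGON (({ring}))"


def write_csv(rows, out):
    """Write (id, bbox) pairs as CSV with the 'id' and 'geo' columns."""
    writer = csv.writer(out)
    writer.writerow(["id", "geo"])
    for feature_id, bbox in rows:
        # csv.writer quotes the WKT automatically because it contains commas
        writer.writerow([feature_id, bbox_to_wkt(*bbox)])


if __name__ == "__main__":
    boxes = [
        (1, (-122.398681640625, 37.781982421875, -122.3931884765625, 37.7874755859375)),
        (2, (-122.3876953125, 37.781982421875, -122.3822021484375, 37.7874755859375)),
    ]
    with open("test.csv", "w", newline="") as f:
        write_csv(boxes, f)
```

Running the script writes a test.csv that the same test.vrt and ogr2ogr commands can consume.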

Tuesday, August 27, 2013

Skybox, Velodyne LiDAR and 3DR Robotics drones

Last week’s GeoMeetup went well.

This time, our friends at SlidesLive were kind enough to record the three presentations and put them up:

Wolfgang Juchmann (Velodyne LiDAR) brought a LiDAR sensor that displayed realtime point clouds and showed some demos of projects where people were using LiDAR. Just a heads-up: there was a minor conflict between the slide-recorder and the OpenGL software used by the LiDAR sensor, so the realtime nature of the point cloud is not displayed accurately.

Brandon Basso (3DR Robotics) showed how low-cost UAVs/drones are currently being used by farmers to improve crops.

Oliver Guinan (Skybox) talked about the possibilities that open up once you have several low-cost micro-satellites.

And of course, beer.

If you are ever in SF, and you are into geospatial topics, you should check out the GeoMeetup or sign up for the GeoMeetup Discussion List.

Thursday, May 9, 2013

The OGC is Stuck in 1999

All these new OGC standards are sadly designed like it is still 1999.

If you don’t like ramblings, then here is the TL;DR for you:

TL;DR OGC Standards should be written for the future, not the present nor the past

I have to get something off my chest. Yes, it is triggered by the current discussions about accepting GeoServices REST API as an OGC standard.

The current focus of the discussion, at the core, is around the issue that the GeoServices REST API is perceived as a set of competing standards (i.e. redundant to the WFS/WMS/WCS/etc specs) that don’t have an open (read open source) reference implementation that any organization can refer to when trying to implement the standard.

To me, this discussion is more about politics (which I do agree is a valid topic to discuss) than about the reality of the standards that do come out of the OGC lately.

Reading any of the documents from the OGC makes me feel like I am listening to Prince (before!) 1999.

Seriously.

Standards should be written for the future – not the present

I remember several years back when HTML5 was the hot new thing. The standards were being worked on a lot, but most of the browsers did not support any of the features from HTML5. The table at caniuse.com (a place where you could go to see which HTML5 features were implemented by which browsers) was mostly red (i.e. not implemented).

Fast forward four years, and most of it is green. That is because those standards were adopted, and the functionality they brought to life makes everyone’s life better.

Designed in the past, adopted in the future. The way it should be.

Contrast this with the current set of standards that we have “up for comment” from the OGC right now…

Before you think that I am against the OGC, let me tell you that you could not be further from the truth. In most presentations that I have given in the last 6 years (at several conferences, some keynotes, and even my own GeoMeetups) I have always dedicated slides and time to convince/teach people about the OGC and the importance of standards.

Yet, although I have several friends that do work in these standards bodies (I love you guys – you know who you are!), it still feels that there is an unnecessary amount of bureaucracy at the OGC that is truly killing innovation.

OGC is making “standards” that are outdated and unnecessarily complicated, blessing reference implementations that cannot be used as a reference (read below!), and publishing a whole bunch of protocols that resemble what a protocol would have looked like in 1999. They completely ignore what we have learned in the last decade.

Let me give you some real examples.

- WFS, XML Datastores and queries:

XML Datastores were all the academic hype back in the day. Yet they never took off (with reason). The query language used by OGC (which is another OGC standard) assumes the underlying data store is XML. For those of you not familiar with what this actually means, it is more than a representation of something as text. It assumes you have an XML Infoset.

At the core, the OGC spec for Filters is just an attempt to represent SQL as something equivalent to an XPath expression to be used on the web (?!?).

Anybody that has ever tried implementing WFS will understand what this means: basically, several months of development to create a parser that grabs an XML document and turns it back into SQL. And to do it correctly, you have to generate not one SQL statement but several. Why? Because the query language is so expressive that you should, in principle, be able to create expressions that span multiple underlying tables (it assumes an XML datastore, after all). That basically makes you really sad and sends you home depressed, because it is unnecessarily complex.

I have an idea. Instead of writing a client application that transforms my query into OGC filter speak, then going through a middleware that grabs the OGC filter speak and turns it back into SQL, which then goes to my underlying database, why not just have one method (ExecuteSQL) and be done with it? Use SQL. Let’s not reinvent SQL. The CartoDB guys seem to have this right in their API. One tiny little API, you pass a string… it does a lot! You know how long it takes to implement something like this? Hours, not months. And you actually have more power than what is expressed by the OGC equivalent. Problem solved.
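
To give a feel for how small such a method can be, here is a minimal sketch of an ExecuteSQL-style entry point. It is not CartoDB's API or any OGC interface: it assumes an in-memory SQLite database from the Python standard library as a stand-in backend, and the `parcels` table and the `execute_sql` and `demo` names are made up for illustration. A real service would also need to restrict what callers are allowed to run.

```python
import sqlite3


def execute_sql(conn, sql, params=()):
    """Run one SQL statement and return the rows as a list of dicts."""
    cursor = conn.execute(sql, params)
    if cursor.description is None:  # statement produced no result set
        conn.commit()
        return []
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def demo():
    """Create a tiny hypothetical table and query it through execute_sql."""
    conn = sqlite3.connect(":memory:")
    execute_sql(conn, "CREATE TABLE parcels (id INTEGER, wkt TEXT)")
    execute_sql(conn, "INSERT INTO parcels VALUES (?, ?)", (1, "POINT (-122.39 37.78)"))
    execute_sql(conn, "INSERT INTO parcels VALUES (?, ?)", (2, "POINT (-122.38 37.79)"))
    return execute_sql(conn, "SELECT id, wkt FROM parcels WHERE id = ?", (2,))
```

The whole "protocol" is one string in and a list of rows out, which is the point being made above.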

- XML/GML as a transport mechanism:

This is the mechanism of transport of choice for OGC standards. I have a two year old post about why I think this is far from ideal. Although there are some updates to be made to that answer, at the core, it still remains valid. If somebody tells you that the answer is to ‘gzip’ GML, then, they are wrong (a topic for a future post).

I am still baffled as to why none of the standards, yet, refer to things like Protocol Buffers, MessagePack, Thrift, Avro - gosh - anything that we have learned to do better in the past decade!

Switching to any of these, or including them as serialization options would make all the implementations of these standards better/faster.
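
As a rough illustration of the size difference, the sketch below encodes the same ring of coordinates twice: once as a GML-like text coordinate list, and once as packed little-endian doubles. The binary layout is a made-up stand-in built with Python's `struct` module, not Protocol Buffers, MessagePack, Thrift or Avro, and the function names are hypothetical.

```python
import struct


def encode_text(coords):
    """Encode coordinates as a GML-like text coordinate list."""
    body = " ".join(f"{x:.16f},{y:.16f}" for x, y in coords)
    return f"<coordinates>{body}</coordinates>".encode("utf-8")


def encode_binary(coords):
    """Encode coordinates as a uint32 point count followed by doubles."""
    flat = [value for pair in coords for value in pair]
    return struct.pack(f"<I{len(flat)}d", len(coords), *flat)


def decode_binary(payload):
    """Invert encode_binary and return a list of (x, y) tuples."""
    (count,) = struct.unpack_from("<I", payload, 0)
    flat = struct.unpack_from(f"<{count * 2}d", payload, 4)
    return list(zip(flat[0::2], flat[1::2]))


if __name__ == "__main__":
    ring = [
        (-122.398681640625, 37.781982421875),
        (-122.3931884765625, 37.781982421875),
        (-122.3931884765625, 37.7874755859375),
        (-122.398681640625, 37.7874755859375),
        (-122.398681640625, 37.781982421875),
    ]
    print(len(encode_text(ring)), "bytes as text")
    print(len(encode_binary(ring)), "bytes as packed doubles")
```

The packed form costs a fixed 16 bytes per point and round-trips the doubles exactly, without any text parsing on the receiving side.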

- “Reference Implementations” that cannot be referenced

It is true that a “reference implementation” should be able to be, well, referenced. I cannot currently look at how the GeoServices REST reference implementation (I guess ArcGIS Server) is implemented internally, thus, the specification fails at having a valid reference implementation.

Does that mean that for a reference implementation to be useful it needs to be Open Source?

Maybe that is not enough.

Some people would argue that the reference implementation for the OGC GeoPackage spec, Spatialite, is also not a good choice.

Why?

Because you cannot copy/paste or even create “derivative work” from it without also having to GPL/LGPL your work. Is looking at a reference implementation, seeing the internals, and copying how it works considered derivative work? Those are the types of questions that no developer wants to deal with, so IMHO, the OGC has the responsibility to pick a reference implementation that is unencumbered by these issues. By picking Spatialite and a proprietary server (ArcGIS Server) as reference implementations for their respective specs, the OGC is showing that they either 1) are really clueless about these types of issues, or 2) don’t care. Either way, this is horrible.

- Websockets?

I wish! Where are the implementations that take advantage of full duplex communications? Server side push? Seriously, are we still assuming users are going to poll all the time and transfer state in every request? This is exactly why most real-time tracking implementations are done incorrectly. Why are we still stuck in a stateless architecture design? We can do server push now without Flash or Java (or Silverlight yikes!). Websockets exist. USE THEM.

I touched on this topic superficially at the last PyCon. The video is up if you are interested.

- Https, Authentication/authorization, security

I don’t even want to touch on authentication/authorization around all the OGC specs. OAuth 2, or even the first version of it, is far better than what is expected of the current implementations of security. Most people rely on passing USERNAME/PASSWORD in the request. This is horrible.

But wait, I guess if I use https that means I am secure right? Hell no!

Look at this video from last year’s Blackhat (quite awesome actually) and then come back with a straight face and tell me this is fine.

- SPDY

If you do not know what SPDY is, then you will definitely be surprised to learn that all your https traffic to Google, Facebook and Twitter is not going over “traditional” SSL.

Yes, they use a special protocol called SPDY.

It is faster than traditional http (even though it is going over an encrypted channel!). Think about this for a sec.

Does the OGC use SPDY? Of course not. Does it make a difference? Check out this video of last year’s Google I/O about SPDY and you be the judge.

- Non-blocking servers.

The Mapbox guys get it. They use node.js for serving their tiles. Do you want to understand why this is a good thing? Check out this presentation that Ryan Dahl gave 3 years ago, which gives an overview of why this is the case.

Anyway.

I can go on and ramble forever, but I want this to be more constructive.

The discussion around OGC specs should be about what GIS is going to be like 5 years in the future.

  • Why did the OGC not work with the W3C to push WebSQL through (now stalled)? Imagine spatial extensions in every browser on the client side?

  • Why don’t we have a good spec about spatial replication, spatial changesets, optimistic/pessimistic/long-transaction versioning, etc?

  • Why don’t we have specs that take advantage of web stateful connections?

  • Why don’t we have a spec for highly compressed geometries?

  • Why don’t we have a spec that defines better editing capabilities? (hint: WFS-T is not enough for several editing workflows)

  • Why don’t we have a spec that defines how to truly take advantage of time-based datasets?

The list goes on.

Instead, the community as a whole is having long discussions analogous to the one about whether 3857, 3785 or 900913 was a better number for the Web Mercator definition.

Listen OGC, want to see a good standard definition? Check out MBTiles. Easy to implement, and to the point.

Want to change? Design/Define for the future - not for the present nor the past.

Wednesday, May 8, 2013

UbuntuGIS - GIS on Linux

I am sometimes surprised to find out that some people don’t know how easy it is to get GIS packages on Ubuntu. UbuntuGIS is really one of the easiest ways to get up and running. You will notice that there are two repos.

UbuntuGIS “Stable” which has really old packages. You most likely don’t want that.

UbuntuGIS “Unstable” which is really not unstable at all. It just has newer packages. You do want that!

To get it working in your Ubuntu system add the ubuntugis-unstable repo:

sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable -y
sudo apt-get update
sudo apt-get upgrade

Now you can install all kinds of cool packages. For example, to get GDAL, PostGIS and QGIS ready to go (the GDAL command-line tools ship in the gdal-bin package):

sudo apt-get install postgis gdal-bin qgis

Enjoy.

Tuesday, April 2, 2013

360 Streetview-like views from Perú

I took these a couple of months ago. Man, I miss Perú. The 360 views came out a lot nicer than I expected.

https://plus.google.com/photos/100825893460454450520/albums/5862409760288820561

Thursday, March 21, 2013

[My video is up on the PyCon site](http://pyvideo.org/video/1771/realtime-tracking-and-mapping-of-geographic-objec)

My PyCon talk video is up on the PyCon video site.

Realtime Tracking and Mapping of Geographic Objects using Python.

Monday, March 11, 2013

Knight News Challenge

Running a two-person startup is not easy. Still, any chance to make a real impact while working on your startup is truly exciting. If you've got a sec, we could use some votes in the Knight News Challenge. All we need is some clicks on the “Applause” button.

Tuesday, February 12, 2013
Sometimes bugs are beautiful

Monday, February 11, 2013

SQLite Homebrew Errors with Virtualenv

If, like me, you have recently experienced some sqlite errors after updating Homebrew, it is probably related to the fact that sqlite was just changed to a “keg-only” formula.

The issue is that the python virtualenv environment correctly links to the Homebrew-installed sqlite version, but when trying to run it, it doesn’t find it. This is likely to happen if you use virtualenv.

A quick workaround until that bug is fixed is to force the creation of the sqlite lib soft links in /usr/local/lib. You can do this with:

brew link --force sqlite

Tuesday, January 29, 2013

GDAL/OGR plugin for ArcGIS v0.6.1

I updated the installer. It has GDAL trunk with support for 55 formats.

Enjoy - and please submit bugs!

Monday, January 28, 2013

Blog post about AmigoCloud’s GDAL/OGR plugin for ArcGIS

Bill Dollins has a nice blog entry about using the GDAL/OGR plugin for ArcGIS on his GeoMusings blog.

Want to integrate ArcGIS and various Open Source formats? This is certainly a way to do it.

Tuesday, January 15, 2013

What is an ESRI GeoDatabase?

For those of us dabbling in geo work, it is not uncommon to hear the term GeoDatabase. Sadly, it is also not uncommon to hear the term GeoDatabase used incorrectly. Sometimes people use it to refer to ArcSDE, sometimes they use it to refer to that “Access mdb file”, sometimes it is just that “FileGDB thing”.

Somebody asked "What is a GeoDatabase?" last year in my favorite GIS Q/A site - and here is the answer from yours truly.

Thursday, January 10, 2013

Improving Performance in PostgreSQL

Craig Kerstiens has two articles that are worth reading for those trying to improve query performance in PostgreSQL:

- Understanding Postgres Performance
- More on Postgres Performance

That should help you in understanding how to tweak SQL queries / indexes.

Nevertheless, I would hope you at least have an idea what some of the different postgresql.conf settings are for - especially since there are amazing resources out there to help you with this.

One of my favorite tools to see how tweaking affects my Postgres database is pgbench. If you have not used it, try it out.

Wednesday, January 2, 2013

Mapping Rookie Mistakes (#1): Always Geocoding on-the-fly

If you are hitting Google’s (or Bing’s) geocoding limit, you are probably doing something very wrong.

If your first reaction is to look for a workaround, then please STOP.

There is a reason for hitting that limit and you should be caching your results in your db instead.

You have been warned.

Update: Paul Ramsey and others kindly pointed out that you should look at 10.1.3(b) for the currently allowed caching use cases.
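
A minimal sketch of what "caching your results in your db" can look like, assuming a SQLite table as the cache and a caller-supplied `geocode` function that wraps whichever geocoding service you use. Both the function names and the table layout are made up for illustration, and what you may cache is still governed by your provider's terms.

```python
import sqlite3


def make_cached_geocoder(conn, geocode):
    """Wrap `geocode` so each distinct address hits the service at most once."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS geocode_cache ("
        "address TEXT PRIMARY KEY, lon REAL, lat REAL)"
    )

    def cached(address):
        key = " ".join(address.lower().split())  # normalize case and whitespace
        row = conn.execute(
            "SELECT lon, lat FROM geocode_cache WHERE address = ?", (key,)
        ).fetchone()
        if row is not None:
            return row  # cache hit: no request goes out
        lon, lat = geocode(address)  # the only call that counts against the limit
        conn.execute(
            "INSERT INTO geocode_cache (address, lon, lat) VALUES (?, ?, ?)",
            (key, lon, lat),
        )
        conn.commit()
        return (lon, lat)

    return cached
```

Usage is `lookup = make_cached_geocoder(sqlite3.connect("geocode.db"), my_geocode)`, where `my_geocode` is your own function returning a `(lon, lat)` pair; after that, repeated lookups of the same address never leave your database.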

Mapping Rookie Mistakes

There is a set of common Mapping Rookie Mistakes (tm) that happen very often. I am hoping to write up some of them so that somebody looking for an answer will stumble upon these posts.

For those experienced with GIS, these are blatantly obvious. Nevertheless, for the uninitiated in geo, these may not be so evident.

These will not be in any particular order, so the numbers only represent the order in which I wrote them - not order of importance.

Let’s start with geocoding on the fly.