Archive for the ‘gis’ Category

I have a love-hate relationship with ArcGIS. For all its power and simplicity, it is very inconsistent. For example, some things are located in toolboxes which is good, while others are hidden away in various locations in the editing toolbar, for example.

The editting toolbar is bad. Firstly, your feature has to be in editting mode, which is slow and a pain in the neck. Edits made in the editting toolbar aren’t recorded in your spatial data files metadata. They can’t be automated, so you can put make them in model builder or call them from a Python script.

One of the few reasons I use the editting toolbar is for the Planarize tool, tucked away in the “More editing tools” -> “Topology” toolbar. It turns out, that for line shapefiles, the “Feature to line” toolbox does exactly the same thing. No more edit sessions for me!

Advertisements

I’ve long been on the search for a free, national road network GIS dataset for Australia. The search has been surprisingly difficult.

If you expect like me that the government would produce an official, national road network GIS database, then your expections are about the be dashed. In Australia, the maintenance of roads, and therefore their corresponding GIS datasets is a state responsibility. This means that there is no single, national official road network produced by the federal government. Even the data owned by state governments is not generally made freely available. While it is sometimes possible to contact state governments and receive a copy of a statewide road network, if you want to assemble a national road network the process has to be repeated for all states and territories. Furthermore, government derived networks generally have onerous licensing terms that mean they cannot be redistributed.

So what data sources are available? I have experience with four:

1. The canonincal combination of state-government road networks is produced by the Public Sector Mapping Agency (PSMA) Australia. Their Transport and Topography product, released quarterly, has been derived from the relevant bodies from each Australian State and Territory jurisdiction as well as the Commonwealth mapping agency. The major advantage of the Transport & Topography dataset is that it is up to date. PSMA also attempt to maintain consistency in attributes across the state derived networks (with varying degrees of success in my experience). Furthermore, PSMA attempt to correct spatial topology errors and perform automated quality assurance. Transport & Topography also has road names that are compatible with the address names in PSMA’s excellent Geocoded National Address File (G-NAF), which is an additional bonus. The downside is that it is very expensive, and its redistribution terms are restrictive.

2. A time-honoured source of road network data is VMAP0. VMAP0, short for Vector Map level 0, is a global data set, declassified by the US National Geospatial Intelligence Agency. VMAP0 was created from the digitisation of military navigational charts, and is perhaps still the most comprehensive global data set for topographic map data that is freely available. That said, there are two major problems with this dataset. First, because it was created from 1:1,000,000 scale maps, the road network is very sparse, meaning even some major roads are not listed. Second, VMAP0 has not been updated for twenty years! The advantage of this dataset is that it is public domain licensed, meaning you can do what ever you like with it. VMAP0 for Australia is available from the USA National Imagery and Mapping Agency, but requires software that can read its funky format like OGR. I believe ESRI also used to distribute VMAP0 shapefiles branded as the Digital Chart of World.

3. The new kid on the block is OpenStreetMap (OSM). OSM uses data contributed by volunteers (a.k.a. ‘Volunteered Geographical Information’ or VGI) to build up a rich spatial database (think Wikipedia for maps). While they are perhaps best known for delivering Google Maps like tiles on their website, you can also download their raw data. The good people at CloudMade take the raw OSM data, convert it to a number of accessible formats, split it up by country and make it available for download on their website. I am hesitant to use this data at the moment, I as don’t know how about its completeness in Australia, or how to selected out roads and not footpaths, for example. However, this has the potential to be a real competitor with Transport & Topography in terms of its currency and completeness, so I’m keen to learn more from anyone using OSM road network data in GIS analysis.

4. Finally, Geoscience Australia (GA) has a series of vector layers that is uses to produce 1:250,000 topographic maps for the whole continent. These are excellent data, consistent and well-curated, but with a couple of caveats. First, because these layers were created for topographic mapping, they are only at a course level of detail. Not all roads are included, particularly in residential areas, and those that are may have been moved in the cartographic production process to stop them from overlapping. Second, these data were last updated in June 2006, meaning that for some purposes they will be too old. Nevertheless, Australia’s national road system doesn’t change that fast, so for some purposes this will be an ideal dataset. Best of all, in 2009, GA relicensed this previously proprietary dataset to the Creative Commons 3.0 by-attribution license, meaning that you can use it, redistribute it and derive your own products from it, as long as you acknowledge Geoscience Australia as the original custodian. These data are freely available at the national scale in ArcGIS Personal Geodatabase format from GA although if you want pre-made shapefiles you’ll have to download individual map grids or buy their product.

If you know of any other national road networks for Australia, let me know.

Sometimes ArcGIS just can’t put the labels somewhere sensible, and no amount of Maplex options will help you. Here’s what you do if you want to move them manually.

1) If you want to label all features, it’s best to add them at the start. Right click the layer and select Properties, then Labels, then Placement properties and tick Place overlapping labels.

2) Right click on the layer and select Convert labels to annotations... and choose to Store Annotations In the map.

3) Show the Drawing toolbar, click the Select elements (arrow/cursor tool) and then double-click on the dataframe.

4) Click and drag on the labels.

5) If you didn’t do step one and thus need to add some new labels, select the Label tool on the Drawing toolbar (often hidden behind the New text tool) and click on the feature you want to label.

In a previous installment I demonstrated how to use ogr2ogr to reproject shapefiles. It turns out that its a bit trickier when you are reprojecting from AGD66 to GDA94 (and vice versa). This is because the ellipsoids they use to model the earth are different, and it turns out the be quite difficult to calculate the exact equivalent locations between them.

Difference between AGD66 and GDA94 at the ANU

Difference between AGD66 and GDA94 at the ANU

The most accurate way to convert between the two coordinate reference systems is to use a pre-calculated distortion grid. Fortunately, some clever surveyors have made one for us, available here (Webcite archive).

To download the distortion grid and put it in the right place for us to use, you can paste the following into your bash shell:

wget http://www.icsm.gov.au/gda/gdatm/national66.zip
unzip national66.zip
mkdir -p ~/bin
mv "A66 National (13.09.01).gsb" ~/bin/a66_national.gsb
rm national66.zip

I’ve also archived the grid files using Webcite, if the ICSM link is dead replace please see the archive.

Please note that if you change the filenames to something that contains spaces or funky punctuation ogr2ogr might not find it, and will not perform the grid transformation, leaving you with incorrect coordinates. So I would recommend just running the above code verbatim.

Then, to transform a shapefile from AGD66 to GDA94 type the following:

ogr2ogr -f "ESRI Shapefile" -s_srs "+proj=longlat +ellps=aust_SA +nadgrids=~/bin/a66_national.gsb +wktext" -t_srs EPSG:4283 outputgda94.shp inputagd66.shp

You will need to change the filenames outputgda94.shp and inputagd66.shp to whatever suits you.

Yesterday’s post expounded the virtues of doing GIS analysis within SQL transactions. One annoyance of that approach is that whenever an error is made within a transaction, the whole thing must be rolled back and redone. Any further queries won’t be processed within the transaction:

ERROR: current transaction is aborted, commands ignored until end of transaction block

Fortunately, this can be avoided, at least within recent versions of psql. All that needs to be done is to set a special variable from within psql and commands can continue to be issued as per normal:

gis=# \set ON_ERROR_ROLLBACK interactive

This variable is unset everytime you close psql. If you want this to be default behaviour within psql, close your psql session and issue the follow command from the shell:

fmark@fmark-laptop:~$ echo '\set ON_ERROR_ROLLBACK interactive' >> ~/.psqlrc

Thanks to the hivemind at stackoverflow.com for this tip.

Yep, its one of those small annoyances – like keeping up-to-date backups – that you should always do: always use transactions when doing analysis in PostGIS.

In case you don’t know, transactions let you wrap a bunch of database modifications such as UPDATE, INSERT, CREATE, etc. in an undo-able block. This means you can recover from mistakes, like you would use undo when using Excel. When you are happy with your changes, you can then commit them to the database to store them permanently.

You start a transaction by issuing a BEGIN comand, and end it by either undoing the transaction with ROLLBACK or saving it with COMMIT. Here’s an example of how to do it, using pqsl:

Let’s say you realise you have a table two columns with latitudes that are incorrectly entered as positive numbers, although they refer to locations south of the equator:

gis=# SELECT lat_to, lat_from FROM closures ORDER BY closure_id DESC LIMIT 10;

 lat_to  | lat_from 
---------+----------
       0 | -14.9819
 13.8779 | -13.9802
 13.6661 | -14.0998
 14.6906 |        0
 13.6661 | -14.5559
       0 |        0
       0 | -13.9802
       0 |        0
 25.4141 |        0
 22.2898 | -22.9968
(10 rows)

So we, issue some updates, wrapped in a transaction:

gis=# BEGIN;
BEGIN
gis=# UPDATE closures SET lat_to = (-(abs(lat_to))) WHERE lat_to != 0;
UPDATE 1866
gis=# UPDATE closures SET lat_to = (-(abs(lat_from))) WHERE lat_from != 0;
UPDATE 1912

Now, we check our result and realise we have made a mistake, by updating the wrong field in our second query:

gis=# SELECT lat_to, lat_from FROM closures ORDER BY closure_id DESC LIMIT 10;

  lat_to  | lat_from 
----------+----------
 -14.9819 | -14.9819
 -13.9802 | -13.9802
 -14.0998 | -14.0998
 -14.6906 |        0
 -14.5559 | -14.5559
        0 |        0
 -13.9802 | -13.9802
        0 |        0
 -25.4141 |        0
 -22.9968 | -22.9968
(10 rows)

If we hadn’t used a transaction, we would now be stuffed. Fortunately, we can revert the changes:

gis=# ROLLBACK;
ROLLBACK
gis=# SELECT lat_to, lat_from FROM closures ORDER BY closure_id DESC LIMIT 10;

 lat_to  | lat_from 
---------+----------
       0 | -14.9819
 13.8779 | -13.9802
 13.6661 | -14.0998
 14.6906 |        0
 13.6661 | -14.5559
       0 |        0
       0 | -13.9802
       0 |        0
 25.4141 |        0
 22.2898 | -22.9968
(10 rows)

Now we can try again, fixing the broken query:

gis=# BEGIN;
BEGIN
gis=# UPDATE closures SET lat_to = (-(abs(lat_to))) WHERE lat_to != 0;
UPDATE 1866
gis=# UPDATE closures SET lat_from = (-(abs(lat_from))) WHERE lat_from != 0;
UPDATE 1912
gis=# SELECT lat_to, lat_from FROM closures ORDER BY closure_id DESC LIMIT 10;

  lat_to  | lat_from 
----------+----------
        0 | -14.9819
 -13.8779 | -13.9802
 -13.6661 | -14.0998
 -14.6906 |        0
 -13.6661 | -14.5559
        0 |        0
        0 | -13.9802
        0 |        0
 -25.4141 |        0
 -22.2898 | -22.9968
(10 rows)

Yay, the changes we wanted have been made, so we can now commit the transaction to save the changes permanently:

gis=# COMMIT;
COMMIT
gis=# SELECT lat_to, lat_from FROM closures ORDER BY closure_id DESC LIMIT 10;

  lat_to  | lat_from 
----------+----------
        0 | -14.9819
 -13.8779 | -13.9802
 -13.6661 | -14.0998
 -14.6906 |        0
 -13.6661 | -14.5559
        0 |        0
        0 | -13.9802
        0 |        0
 -25.4141 |        0
 -22.2898 | -22.9968
(10 rows)

UPDATE:In order to avoid rolling back transactions every time you make a typo inside them, set ON_ERROR_ROLLBACK to interactive.

Firstly, you might want to view the projection information in the shapefile, ROAD_CENTRELINES.shp in this case.  The perl one-liner stops ogrinfo from printing copious data about each vector feature in the shapefile:

fmark@fmark-laptop:~$ ogrinfo -ro -al ROAD_CENTERLINES.shp | perl -e 'while (<STDIN>) { if (/OGRFeature/) { exit } print; }'
INFO: Open of `ROAD_CENTERLINES.shp'
using driver `ESRI Shapefile' successful.

Layer name: ROAD_CENTERLINES
Geometry: Line String
Feature Count: 4596
Extent: (128.999678, -25.998883) - (138.000000, -11.067068)
Layer SRS WKT:
GEOGCS["Longitude / Latitude (GDA 94)",
DATUM["Unknown",
SPHEROID["GRS_80",6378137,298.2572221010595]],
PRIMEM["Greenwich",0],
UNIT["Degree",0.017453292519943295]]
ROAD_NAME: String (100.0)
ROAD_TYPE: String (15.0)
ROAD_SUFFI: String (15.0)
ROAD_LABEL: String (130.0)
Length: Real (19.11)

We can see from this that the geographic coordinate system of the shapefile is GDA94.

In order to specify the projection we want to use, we need to know its EPSG code.  What’s an EPSG code?  Its a number that uniquely defines some well-known projections (according to Wikipedia, EPSG stands for the European Petroleum Survey Group, so we have big oil to thank for our daily GIS tools, as well as big military). Some useful Australian EPSG codes are:

4326 WGS84
4283 GDA94
4202 AGD66
3577 GDA94 / Australian Albers
3112 GDA94 / Geoscience Australia Lambert
28348 GDA94 / MGA zone 48
28349 GDA94 / MGA zone 49
28350 GDA94 / MGA zone 50
28351 GDA94 / MGA zone 51
28352 GDA94 / MGA zone 52
28353 GDA94 / MGA zone 53
28354 GDA94 / MGA zone 54
28355 GDA94 / MGA zone 55
28356 GDA94 / MGA zone 56
28357 GDA94 / MGA zone 57
28358 GDA94 / MGA zone 58

More can be found at http://spatialreference.org/.

In this case, I want MGA zone 53, or EPSG:28353.  So to reproject:

ogr2ogr -t_srs 'EPSG:28353' ROAD_CENTERLINES_MGAZ53.shp ROAD_CENTERLINES.shp

Note that the output filename comes before the input filename.  Also, for some reason on ubuntu karmic there was a missing dependency, so I needed to:

sudo apt-get install libproj-dev