Edit and Convert Geospatial Data with OGR (and Python)

Nikolai Janakiev @njanakiev

Geospatial File Formats

Simple Features

Open Geospatial Consortium (OGC) and ISO standard for two-dimensional geometries by geographic information systems.

  • POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING, ...

GeoJSON

Simple format based on JSON for simple features with additional attributes

{
  "type": "FeatureCollection",
  "name": "ideaslab",
  "features": [ { 
    "type": "Feature", 
    "geometry": { 
       "type": "Point", 
        "coordinates": [ 13.03951621055603, 47.823647260665894 ] 
    },
    "properties": { 
       "title": "IDEAS:lab" 
    }
  } ]
}

TopoJSON

TopoJSON by Mike Bostock is a extension of GeoJSON encoding topology. Commonly used with D3.js data visualization library.

ESRI Shapefile

(Mostly) open and commonly used specification by Esri. Stores vector features and attributes.

In [1]:
!tree chapters/
chapters/
├── chapters.dbf
├── chapters.prj
├── chapters.shp
└── chapters.shx

0 directories, 4 files
  • .shp - shape format: the feature geometry
  • .shx - shape index format
  • .dbf - attribute format; columnar attributes for each shape
  • .prj - projection format
  • and many others

GeoPackage and Spatialite

  • File formats based on SQLite files.
  • Well-known binary (WKB) as geometry column
  • SpatiaLite requires additional libraries, but has more features

PostGIS

Geography Markup Language (GML) and KML

<gml:featureMember>
  <ogr:chapters fid="chapters.104">
    <ogr:geometryProperty>
      <gml:Point srsName="EPSG:4326">
        <gml:coordinates>13.03171,47.80918</gml:coordinates>
      </gml:Point>
    </ogr:geometryProperty>
    <ogr:location>Salzburg, Austria</ogr:location>
    <ogr:title>MaptimeSalzburg</ogr:title>
    <ogr:twitter>MaptimeSalzburg</ogr:twitter>
  </ogr:chapters>
</gml:featureMember>

GPS Exchange Format (GPX)

Common format for GPS trackers describing waypoints, tracks and routes.

<gpx version="1.0" 
     creator="GPSLogger 95 - http://gpslogger.mendhak.com/" 
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
     xmlns="http://www.topografix.com/GPX/1/0" 
     xsi:schemaLocation="http://www.topografix.com/GPX/1/0 http://www.topografix.com/GPX/1/0/gpx.xsd">
<time>2018-10-23T07:12:32.932Z</time>
<trk>
  <trkseg>
    <trkpt lat="47.7809697" lon="13.0670005">
      <time>2018-10-23T07:12:32.932Z</time>
      <src>network</src>
    </trkpt>
    ...

Comma Seperated Values (CSV)

OGR Simple Features Library

  • Vector Data Access
  • Part of GDAL
  • Supports the majority of geospatial vector formats and spatial databases

ogrinfo

Lists information about an OGR supported data source.

In [2]:
!ogrinfo chapters.json
INFO: Open of `chapters.json'
      using driver `GeoJSON' successful.
1: chapters (Point)
In [3]:
!ogrinfo --formats | head -n 15
Supported Formats:
  PCIDSK -raster,vector- (rw+v): PCIDSK Database File
  netCDF -raster,vector- (rw+s): Network Common Data Format
  JP2OpenJPEG -raster,vector- (rwv): JPEG-2000 driver based on OpenJPEG library
  PDF -raster,vector- (rw+vs): Geospatial PDF
  ESRI Shapefile -vector- (rw+v): ESRI Shapefile
  MapInfo File -vector- (rw+v): MapInfo File
  UK .NTF -vector- (ro): UK .NTF
  OGR_SDTS -vector- (ro): SDTS
  S57 -vector- (rw+v): IHO S-57 (ENC)
  DGN -vector- (rw+): Microstation DGN
  OGR_VRT -vector- (rov): VRT - Virtual Datasource
  REC -vector- (ro): EPIInfo .REC 
  Memory -vector- (rw+): Memory
  BNA -vector- (rw+v): Atlas BNA

Get Information about the file/database

In [4]:
!ogrinfo chapters.json -sql "SELECT * FROM chapters LIMIT 1"
INFO: Open of `chapters.json'
      using driver `GeoJSON' successful.

Layer name: chapters
Geometry: Point
Feature Count: 1
Extent: (-149.900000, -37.815751) - (153.026490, 69.682778)
Layer SRS WKT:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.0174532925199433,
        AUTHORITY["EPSG","9122"]],
    AUTHORITY["EPSG","4326"]]
Geometry Column = _ogr_geometry_
location: String (0.0)
title: String (0.0)
twitter: String (0.0)
website: String (0.0)
meetup: String (0.0)
comingSoon: String (0.0)
organizers: String (0.0)
moreInfo: String (0.0)
github: String (0.0)
facebook: String (0.0)
OGRFeature(chapters):0
  location (String) = Paris, France
  title (String) = MaptimeParis
  twitter (String) = 
  website (String) = http://maptime.io/paris/
  meetup (String) = 
  comingSoon (String) = true
  organizers (String) = [ { "name": "nerik", "twitter": "" } ]
  moreInfo (String) = 
  github (String) = 
  facebook (String) = 
  POINT (2.35176086425781 48.8629079198646)

Get the Number of Features with SQL

In [5]:
!ogrinfo chapters.json -sql "SELECT COUNT(*) FROM chapters"
INFO: Open of `chapters.json'
      using driver `GeoJSON' successful.

Layer name: chapters
Geometry: None
Feature Count: 1
Layer SRS WKT:
(unknown)
COUNT_*: Integer (0.0)
OGRFeature(chapters):0
  COUNT_* (Integer) = 111

Get a single Feature

In [6]:
!ogrinfo -q chapters.json \
    -sql "SELECT * FROM chapters WHERE title = 'MaptimeSalzburg'"
Layer name: chapters
OGRFeature(chapters):104
  location (String) = Salzburg, Austria
  title (String) = MaptimeSalzburg
  twitter (String) = MaptimeSalzburg
  website (String) = 
  meetup (String) = 
  comingSoon (String) = 
  organizers (String) = [object Object],[object Object]
  moreInfo (String) = 
  github (String) = 
  facebook (String) = 
  POINT (13.03171 47.80918)

Swiss Army Knife of GIS: ogr2ogr

Converts simple features data between file formats.

ogr2ogr -f FORMAT outputfile inputfile

  • -f "ESRI Shapefile"
  • -f "GeoJSON"
  • -f "CSV"
  • -f "GML"
  • -f "PostgreSQL"
  • ...

Convert GeoJSON to GeoPackage

In [7]:
!ogr2ogr -f "GPKG" chapters.gpkg chapters.json \
    -progress
0...10...20...30...40...50...60...70...80...90...100 - done.

Convert GeoJSON to CSV

In [8]:
!ogr2ogr -f "CSV" chapters.csv chapters.json \
    -lco GEOMETRY=AS_XY \
    -progress
0...10...20...30...40...50...60...70...80...90...100 - done.

Convert CSV to GeoJSON

In [9]:
!ogr2ogr -f "GeoJSON" chapters.json chapters.csv \
    -oo X_POSSIBLE_NAMES=X \
    -oo Y_POSSIBLE_NAMES=Y \
    -oo KEEP_GEOM_COLUMNS=NO \
    -progress
Warning 6: Progress turned off as fast feature count is not available.

Convert GeoJSON to PostGIS

In [10]:
!ogr2ogr -f "PostgreSQL" PG:"dbname='postgres' host='localhost' user='postgres' password='postgres'" \
    chapters.json \
    -nln chapters \
    -progress
0...10...20...30...40...50...60...70...80...90...100 - done.

OGR with Python

OGR Data Drivers

Driver code from OGR Vector Formats documentation.

In [11]:
from osgeo import ogr
driver = ogr.GetDriverByName('GeoJSON')
source = driver.Open('chapters.json')

# Do something with that file

source.Destroy()

Getting a Layer

In [12]:
from osgeo import ogr

driver = ogr.GetDriverByName('GeoJSON')
source = driver.Open('chapters.json')

layer = source.GetLayer()
print('Layer :', layer.GetName())
print('Number of features on Layer :', layer.GetFeatureCount())
print('Extent :', layer.GetExtent())

source.Destroy()
Layer : chapters
Number of features on Layer : 111
Extent : (-149.9, 153.02649, -37.8157510415635, 69.682778)

Getting Features and Geometry

In [13]:
from osgeo import ogr

driver = ogr.GetDriverByName('GeoJSON')
source = driver.Open('chapters.json')

layer = source.GetLayer()
feature = layer.GetNextFeature()
print('Title : ', feature.GetField('title'))
print('Number of Fields :', feature.GetFieldCount())

# Get geometry
geometry = feature.GetGeometryRef()
x = geometry.GetX()
y = geometry.GetY()
print('X:', x, 'Y:', y)

source.Destroy()
Title :  MaptimeParis
Number of Fields : 10
X: 2.35176086425781 Y: 48.8629079198646

Iterating over all Features

feature = layer.GetNextFeature()
while feature:
    # do something here
    feature = layer.GetNextFeature()

Create a File

In [14]:
from osgeo import ogr

x, y = 13.03951621055603, 47.823647260665894
title = "IDEAS:lab"

driver = ogr.GetDriverByName('GeoJSON')
source = driver.CreateDataSource('ideaslab.json')

# Create layer
layer = source.CreateLayer('ideaslab', geom_type=ogr.wkbPoint)
# Create feature
feature = ogr.Feature(layer.GetLayerDefn())

# Create field
fieldDefn = ogr.FieldDefn('title', ogr.OFTString)
layer.CreateField(fieldDefn)

# Create geometry
point = ogr.Geometry(ogr.wkbPoint)
point.AddPoint(x, y)

# Set geometry and field of feature
feature.SetGeometry(point)
feature.SetField('title', title)

layer.CreateFeature(feature)
source.Destroy()