Lambda Tile Generation with Tegola

Multi Technology Mix (MTM) at its finest. OSM Layer ©MapTile

Inspired by the excellent NBN MTM Alpha, I’ve recently been playing around with the NBN connection type data released on data.gov.au in July 2020. Combined with the Geocoded National Address File (G-NAF), it lets you produce maps that surface address-level connection details. For NBN enthusiasts (such as myself), it’s pretty exciting, as the actual NBN map is purposely evasive.

So once you’ve loaded all that juicy data into PostGIS, how do you get that information out? The open source mapping space has changed significantly since I played around with it in the early 2010s. Tile generation no longer revolves around muttering incantations to summon life out of Mapnik for raster tile generation. Now there is the Mapbox Vector Tile Specification, which allows you to serve piping hot geo data into the complementary Mapbox GL JS mapping library. For serving vector tiles in that format, Tegola caught my eye as a lightweight, well-supported application. Most excitingly, it supports Lambda-based generation.

The Idea

The majority of articles (such as Running a Serverless Vector Tile Backend with AWS Lambda and Go) talk about running Tegola with a Postgres backend, and the connection pool issues that come with it. If I were doing this at work, I’d potentially be looking at an Amazon RDS Proxy with a smallish RDS instance. Even on a db.t3.small RDS Postgres instance, you’d be looking at $25.92 for the proxy and $40.32 for the database. That’s $66.24 USD a month, which is $93.15 Australian Dollarydoos. That’s too much money. I could run a Lightsail/DigitalOcean instance, but I’d want to put a load balancer in front ($18 USD for a Lightsail LB). So how can we run it on Lambda, but not need a database?
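As a quick check of the arithmetic above, here is a minimal sketch that totals the quoted monthly figures. The exchange rate is simply the one implied by the two totals in this post (93.15 / 66.24), not a live rate, and the function names are made up for illustration.

```python
from decimal import Decimal

# Monthly figures quoted in the post (USD).
RDS_PROXY_USD = Decimal("25.92")
RDS_INSTANCE_USD = Decimal("40.32")

# Assumption: the rate implied by the post's own totals, not a current rate.
USD_TO_AUD = Decimal("93.15") / Decimal("66.24")


def monthly_total(*line_items):
    """Sum monthly line items, staying in exact decimal arithmetic."""
    return sum(line_items, Decimal("0"))


def to_aud(usd, rate=USD_TO_AUD):
    """Convert a USD amount to AUD, rounded to cents."""
    return (usd * rate).quantize(Decimal("0.01"))


rds_total_usd = monthly_total(RDS_PROXY_USD, RDS_INSTANCE_USD)
rds_total_aud = to_aud(rds_total_usd)
print(f"RDS + proxy: ${rds_total_usd} USD / ${rds_total_aud} AUD per month")
```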

Enter Lambda’s support for Amazon Elastic File System (EFS), which was announced in June. No longer do you need to use nested layers, or some S3 tomfoolery, to load data into the 512 MB of temporary storage available to Lambda. An expandable filesystem really opens up the possibilities for making larger files available to Lambda.

Tegola primarily supports PostGIS, but it also supports GeoPackage, an SQLite-based standard for storing geospatial data. Now, SQLite and NFS don’t really mix. But this will be read-only, so we’ll just give it a go. So: Tegola for tile generation, with a GeoPackage on NFS for the data storage. What could go wrong?
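To make the read-only idea concrete, here is a minimal sketch of opening a GeoPackage with SQLite's read-only URI mode from Python. This is not how Tegola itself opens the file (I haven't checked its internals); it only illustrates the SQLite-level behaviour. The file name, the layer name, and the stand-in `gpkg_contents` table are assumptions for the demo.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_geopackage_readonly(path, immutable=False):
    """Open a GeoPackage (an SQLite file) without permission to write."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    if immutable:
        # immutable=1 tells SQLite the file will not change, so it skips
        # locking and change detection. Only safe if nothing writes to it.
        uri += "&immutable=1"
    return sqlite3.connect(uri, uri=True)


def list_feature_tables(conn):
    """Return the feature layer names recorded in gpkg_contents."""
    rows = conn.execute(
        "SELECT table_name FROM gpkg_contents "
        "WHERE data_type = 'features' ORDER BY table_name"
    )
    return [row[0] for row in rows]


def demo():
    # Stand-in file with only the columns this sketch queries; a real
    # GeoPackage has more tables and columns than this.
    with tempfile.TemporaryDirectory() as tmp:
        gpkg = Path(tmp) / "nbn.gpkg"  # hypothetical file name
        rw = sqlite3.connect(gpkg)
        rw.execute("CREATE TABLE gpkg_contents (table_name TEXT, data_type TEXT)")
        rw.execute("INSERT INTO gpkg_contents VALUES ('nbn_addresses', 'features')")
        rw.commit()
        rw.close()

        ro = open_geopackage_readonly(gpkg, immutable=True)
        try:
            layers = list_feature_tables(ro)
            try:
                ro.execute("CREATE TABLE scratch (id INTEGER)")
                write_blocked = False
            except sqlite3.OperationalError:
                write_blocked = True
        finally:
            ro.close()
        return layers, write_blocked


if __name__ == "__main__":
    print(demo())
```

The `immutable` flag is the interesting part for a network filesystem: it asks SQLite not to take file locks at all, which is the bit that tends to misbehave on NFS. It is only appropriate when the file genuinely never changes while it is being read.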

The Reality

My initial plan was to run the NBN dataset, as well as an OSM extract of Australia. I loaded them into PostGIS using Tegola OSM, and once I had that working, I fumbled through using ogr2ogr to export the data from PostGIS into GeoPackage. Make sure you use the correct spatial reference system (SRS), otherwise bad things will happen; Tegola prefers 4326 for its GeoPackage configurations. Some of the data types also caused issues with the conversion, so I commented out a number of the layers in the Tegola OSM config. All up, my OSM data was about 4 GB, and my NBN data was about 3.5 GB. Once I had it working on my local computer, it was time to load it into Lambda.
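For reference, a PostGIS-to-GeoPackage export with ogr2ogr looks roughly like the sketch below. It only builds and prints the command rather than running it, and it is not the exact invocation I used. The database name, table name, and output path are hypothetical; the flags themselves (`-f`, `-t_srs`, `-nln`) are standard ogr2ogr options.

```python
import shlex


def build_ogr2ogr_command(pg_dsn, source_table, gpkg_path, layer_name,
                          target_srs="EPSG:4326"):
    """Build an ogr2ogr command that exports one PostGIS table to a GeoPackage.

    pg_dsn is a libpq-style connection string, e.g. "dbname=nbn".
    """
    return [
        "ogr2ogr",
        "-f", "GPKG",           # output driver: GeoPackage
        "-t_srs", target_srs,   # reproject to the SRS Tegola prefers
        "-nln", layer_name,     # name of the layer inside the GeoPackage
        gpkg_path,              # destination datasource
        f"PG:{pg_dsn}",         # source datasource (PostGIS)
        source_table,           # source table to export
    ]


# Hypothetical names, for illustration only.
cmd = build_ogr2ogr_command(
    pg_dsn="dbname=nbn",
    source_table="nbn_addresses",
    gpkg_path="nbn.gpkg",
    layer_name="nbn_addresses",
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list keeps quoting out of the picture if you later hand it to `subprocess.run`, which matters once connection strings contain spaces.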

Tegola has a pre-built Lambda version, so that was simple, as was wiring in the EFS volume. Of note, I did need to use an EC2 instance to load my 7.5 GB of data into EFS, which was unfortunate, as I was hoping I could use an S3-style console for uploading. The only struggle I had was getting the API Gateway set up, along with CORS, and that’s mainly because I didn’t read the documentation correctly about setting the `application/octet-stream` media type; this GitHub issue also helped. So with that corrected, I wired it up to my web app and I was good to go…except…

Melbourne CBD with NBN Mapping Data

It didn’t work. I bumped the memory allocation. Still didn’t work. Looking into the logs, it appeared the function was timing out during execution, hitting the roughly 30-second API Gateway timeout. Using EFS requires connectivity via your VPC, but recent improvements have drastically reduced those cold-start times. The biggest issue seemed to be the size of the GeoPackage files. Giving up on the dream of serving the OSM data myself, I kept only the NBN GeoPackage file, and it worked significantly better. It still choked on denser areas of the map, for example the Melbourne CBD, and for those I’ve reverted to using an S3-based cache mechanism. Tegola has an excellent cache seeding function, so data can be served from a cache instead. A word of warning on caching: it can be very expensive, in both time and storage, to cache tiles at lower zoom levels. Cache seed only where something is expensive to generate on the fly.
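To get a feel for how big a seed run might be before kicking one off, a rough tile count helps. The sketch below uses the standard Web Mercator (slippy map) tile formulas to count the tiles covering a bounding box at each zoom level. The bounding box is a loose, hypothetical box around Australia, and the count says nothing about per-tile size or generation time, which depend on how much data falls inside each tile.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) slippy-map tile containing a WGS84 lon/lat."""
    n = 2 ** zoom
    x = math.floor((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edges still map to a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_in_bbox(west, south, east, north, zoom):
    """Count the tiles needed to cover a bounding box at one zoom level."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left corner
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right corner
    return (x_max - x_min + 1) * (y_max - y_min + 1)


if __name__ == "__main__":
    # Assumption: a loose bounding box around Australia, for illustration.
    australia = (112.9, -43.7, 153.7, -9.1)
    for zoom in range(0, 15, 2):
        print(zoom, tiles_in_bbox(*australia, zoom))
```

Multiply the count by a measured average tile size and generation time from a small sample area, and you have a rough estimate of what seeding a given zoom range will cost.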

Component Overview

Wiring it all together

Conclusion

So, can you serve Mapbox vector tiles on the cheap using Lambda + Tegola + EFS? Yes you can. But caveat emptor. You’ll need to make sure your GeoPackage files aren’t too big, potentially splitting them into specific layers instead of the giant single layer I used. You’ll also want to make liberal use of the cache-seeding function where possible, but that does come with a rather large storage overhead.