KNN queries with SpatiaLite

May 20, 2021 ยท View on GitHub

The latest version of SpatiaLite adds KNN support, which makes it easy to efficiently answer the question "what are the X closest records to this point".

The USGS earthquakes GeoJSON is a great dataset for experimenting with these features.

Documentation for that is here: https://earthquake.usgs.gov/earthquakes/feed/v1.0/geojson.php

https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.geojson currently contains 10,642 features.

To turn that into a SpatiaLite database using the latest version of geojson-to-sqlite:

curl 'https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.geojson' | \
  geojson-to-sqlite usgs.db quakes - --spatial-index

This will create a usgs.db SpatiaLite database with a spatial index.

This should have a knn virtual table already. If one does not exist you can create one like this:

sqlite-utils --load-extension=spatialite usgs.db 'CREATE VIRTUAL TABLE knn USING VirtualKNN();'

Open it in Datasette like this:

datasette --load-extension=spatialite usgs.db

(Running datasette install datasette-cluster-map first will let you see them on a map.)

Use this SQL query to run KNN searches returning the ten closest earthquakes to a point:

SELECT
  knn.distance,
  quakes.title,
  quakes.mag,
  quakes.time,
  quakes.url,
  y(quakes.geometry) as latitude,
  x(quakes.geometry) as longitude
FROM
  knn
  join quakes on knn.fid = quakes.rowid
WHERE
  f_table_name = 'quakes'
  AND ref_geometry = MakePoint(cast(:longitude as real), cast(:latitude as real))
  AND max_items = 10