Calculating the maximum diagonal distance in a given collection of GeoJSON features using turf.js and lots of speed
I worked really hard to write an engaging, catchy, intelligent, and funny introduction to this article, but it was a hard struggle. Being a problem solver, I solved this problem with modern tooling. Behold, I present to you the introduction to this article written by ChatGPT in the style of Monty Python:
“Well, well, well, what do we have here? It seems we’ve stumbled upon a problem that needs solving — how to find the distance between any two given points in a given set of GeoJSON features using JavaScript. But fear not, brave knights of coding, for we have a solution that will make this task as easy as a jester’s jest. So gather round, and let us embark on a journey of mathematical hilarity and GeoJSON glory!”
In the words of a mediocre deputy chief engineer, “not great, not terrible”.
The motivation behind the article is a feature we implemented recently at CodeComply.ai. The feature in question requires the calculation of the maximum distance between any two points on a given floor plan. We usually work with objects of the floor plan represented as GeoJSON features, but since it’s much easier to visualize actual geographic data, I will try to measure the longest diagonal distance inside a random country, let’s say, Poland!
Luckily, there’s a great GeoJSON representation of Poland in Piotr Patrzyk’s polska-geojson repository. Let’s use that!
First, we need a map to see what’s actually going on. Let’s use Leaflet. First, we need to add some markup in our HTML file:
<head>
<link
rel="stylesheet"
<link
rel="stylesheet"
href="https://unpkg.com/leaflet@1.9.3/dist/leaflet.css"
integrity="sha256-kLaT2GOSpHechhsozzB+flnD+zUyjE2LlfWPgU04xyI="
crossorigin=""
/>
<script
src="https://unpkg.com/leaflet@1.9.3/dist/leaflet.js"
integrity="sha256-WBkoXOwTeyKclOHuWtc+i2uENFpDZ9YPdf5Hf+D7ewM="
crossorigin=""
></script>
</head>
<body>
<div id="map"></div>
</body>
Then, we can proceed with setting up the map in JavaScript:
import L from "leaflet";
const map = L.map("map").setView([52.069330831388974, 19.480308502370697], 5);
A curious reader might ask — what are those coordinates? Well, this happens to be the geometrical center of Poland, which in turn happens to be in a town, whose name translates to “Friday”, which is neither the middle point of the working week nor the normal one.
This is not impressive, and for it to become truly impressive, we need some map tiles. Let’s use the OpenStreetMap ones:
import L from "leaflet";
const map = L.map("map").setView([52.069330831388974, 19.480308502370697], 5);
L.tileLayer("https://tile.openstreetmap.org/{z}/{x}/{y}.png", {
maxZoom: 19,
attribution:
'© <a href="http://www.openstreetmap.org/copyright">OpenStreetMap</a>'
}).addTo(map);
That’s better. Next, let’s load the GeoJSON data. To be able to do it, we need a GeoJSON layer in our map:
import L from "leaflet";
const map = L.map("map").setView([52.069330831388974, 19.480308502370697], 5);
L.tileLayer("https://tile.openstreetmap.org/{z}/{x}/{y}.png", {
maxZoom: 19,
attribution:
'© <a href="http://www.openstreetmap.org/copyright">OpenStreetMap</a>'
}).addTo(map);
const geoJSONLayer = L.geoJSON().addTo(map);
Then, assuming that we copied the Poland data to a data.json file nearby, we can add it to the map:
import L from "leaflet";
import data from "./data.json";
const map = L.map("map").setView([52.069330831388974, 19.480308502370697], 5);
L.tileLayer("https://tile.openstreetmap.org/{z}/{x}/{y}.png", {
maxZoom: 19,
attribution:
'© <a href="http://www.openstreetmap.org/copyright">OpenStreetMap</a>'
}).addTo(map);
const geoJSONLayer = L.geoJSON().addTo(map);
geoJSONLayer.addData(data);
Now we have a map with 16 GeoJSON features representing Polish Voivodeships. Our features comprise precisely 8192 points, a number comparable to the number of points that we usually see in a medium-sized floor plan. Let’s proceed with finding the longest diagonal distance.
What would be the simplest way to do it? As mentioned before, we have a set of features — polygons, each being a set of points. The most straightforward option would be to iterate over all the points, and for each one, iterate over all the others, calculate all the distances, and find the longest ones. Let’s try that! First, we need to extract all the points. Usually, we’d need to extract those from the geometry.coordinates
property of each feature, but since we’re using Turf.js, we can use a helper that does that for us:
import { coordAll } from "@turf/meta";
import data from "./data.json";
const points = coordAll(data);
On an M1 Max MacBook Pro, this takes around half of a millisecond for 8192 points, which is negligible.
To calculate the greatest distance, we first need to calculate any distance between the two points. An important note needs to be made here: we originally calculated distances on a floor plan, where, in practice, the Earth’s curvature is irrelevant. When doing the same on actual geographic features, we should account for it, and use, for example, the Turf.js helper that uses the Haversine formula, and is much, much, much slower. Since we really want to be fast, we assume the flatness of the Earth (oh boy) and use the Pythagorean Theorem to calculate the distance, cheap and dirty:
const calculatePointsDistance = ([x1, y1], [x2, y2]) => {
return Math.sqrt(Math.pow(x1 - x2, 2) + Math.pow(y1 - y2, 2));
};
Next, we can iterate over the points:
const { maxDistance, p1, p2 } = points.reduce(
(acc, curr, _, array) => {
for (let i = 0; i < array.length; i++) {
const distance = calculatePointsDistance(curr, array[i]);
if (distance > acc.maxDistance) {
acc.maxDistance = distance;
acc.p1 = curr;
acc.p2 = array[i];
}
}
return acc;
},
{
maxDistance: 0,
p1: null,
p2: null
}
);
As a result of the reduce
call, we receive the maxDistance
(which actually makes little sense on it’s own, but is some kind of representation of distance), and coordinates two points that the maximum distance separates.
Since the above implementation is O(n²)
, and we run it on 8192 entries, it takes around 430 ms to run, which is a lot, if you’re aiming to do it in real-time in the browser.
There’s a simple improvement that we can perform — half of the iterations that we do above are repetitions. This happens, because we run the checks on a Cartesian product of the array of points with itself. This means that for each pair of points p1, p2
, we also do the same check for p2, p1
, which doesn’t make much sense. Let’s remove the repetitions:
const { maxDistance, p1, p2 } = points.reduce(
(acc, curr, index, array) => {
for (let i = index + 1; i < array.length; i++) {
const distance = calculatePointsDistance(curr, array[i]);
if (distance > acc.maxDistance) {
acc.maxDistance = distance;
acc.p1 = curr;
acc.p2 = array[i];
}
}
return acc;
},
{
maxDistance: 0,
p1: null,
p2: null
}
);
This came as no surprise; this takes more or less around half of what the previous implementation did, around 220 ms. Which is still much, much too long. But hey, this is still an O(n²/2)
algorithm.
To make our solution run faster, we really need to think about what makes it run for that long. Surprisingly, it’s not the fact that the check we’re performing is O(n²/2)
. It’s because of the number of points! If you take a look at the map, you can see that lots of points are actually unnecessary — with the boundaries between voivodeships coming to mind in the first place. There’s a simple way to remove those points — we can merge our polygons using a Turf.js helper which takes two polygons and returns their union:
import union from "@turf/union";
const mergedPolygon = data.features.reduce((acc, curr) => {
if (!acc) return curr;
return union(acc, curr);
}, null);
const mergedPoints = coordAll(mergedPolygon);
Then, after applying the non-repetition approach to the new set of points (only 1699 of those left after merging the polygons), we can see that it’s around 45% faster (it takes around 120 ms). Better, but still no cigar.
OK, but do we still need all those points to calculate the maximum distance? Consider a banana:
Intuition tells us that some of the points that make up the shape of this 2D representation of a banana are actually irrelevant. In fact, maybe we could wrap the banana (pun somewhat intended) into some kind of a simplified shape that would still allow us to calculate the maximum diagonal distance in a precise way? Let’s start with the most straightforward options.
The first one is to calculate a bounding box. That, though, would only work for cases where the maximum diagonal distance is actually a diagonal of the resulting rectangle, so it’s too specific to apply. We could also try to plot the smallest circle that encloses the banana, but that doesn’t make any sense. Luckily there is a kind of encompassing shape that is precisely what we need: a convex hull. Why a convex hull and not a concave one? Well, consider a banana again. Intuition tells us that the points on the inner curve of the banana are insignificant. If so, our hull doesn’t need any concavity, which actually takes more computing power to calculate.
Let’s calculate Poland’s convex hull, then, using the appropriate Turf.js helper:
import { featureCollection, point } from "@turf/helpers";
import convex from "@turf/convex";
// convex accepts a featureCollection made up of actual GeoJSON points
const collection = featureCollection(points.map((p) => point(p)));
const convexHull = convex(collection);
const convexPoints = coordAll(convexHull);
Now, all we need to do is iterate over the new set of points without repetitions. The result? Well, it’s quite nice actually, as it’s 13% of the time of the “merge the polygons” approach, and 3.7% of the original brute force approach, at 16 ms! This is pretty acceptable in terms of running it in the browser.
Why is it so much faster than the brute-force approach? Because out of the original 8192 points, the convex hull only needs 40 to preserve the encompassing shape. Why is the calculation of the convex hull faster than the O(n²/2)
algorithm? Well, because it’s actually closer to O(n log n)
, and that’s for the worst case! The algorithm only needs to check if the currently iterated point is inside of the intermediate polygon, and if not — expand the polygon.
For further reading, it’s also worth mentioning that the number of vertices differs between types of distributions of the points in question. For example, for Gaussian distribution, the number really, really small — it would be O(√(log n))
! Another interesting fact is that the convex hull approach is not the most effective one, but it’s simple and good enough for this use case.
OK, seems that our work here is done! Is it, though? What is the actual distance? How does it look on the map? Let’s see. First, we need to modify the map a bit. Let’s add some color to the GeoJSON layer:
const geoJSONLayer = L.geoJSON(null, {
style: (feature) => {
if (feature.geometry.type === "LineString") {
return {
color: "red"
};
}
}
}).addTo(map);
We will plot a line connecting the two points separated by the maximum distance, and we want it to contrast with the polygons we already have, so we’re painting all the lines red. Next, we can prepare the line, retrieve its length and add it to the GeoJSON layer:
import { lineString } from "@turf/helpers";
import length from "@turf/length";
// ...
const line = lineString([convexP1, convexP2]);
const len = length(line);
geoJSONLayer.addData(line);
Well, the distance (without accounting for the curvature of the Earth) is 768 and a quarter of a kilometer. As for the graphic representation:
EDIT: Based on a comment on HackerNews, there’s another way to speed up the calculation: removing the square root call from the function that calculates the distance between the two points inside of the loop. The thing is that this isn’t actually calculating the distance; it’s merely a function calculating a synthetic value that can be used to compare the pairs of points in terms of distance between them. It isn’t even used to calculate the actual distance. So the Math.sqrt
call is not really used for anything, and it’s surprisingly expensive. Removing it lowered the execution time to around 10 ms, which is a massive gain!
The code for everything above can be found in this CodeSandbox. For precise timing on your machine, be sure to check the browser's console panel, not the CodeSandbox one!
EDIT2: As stated in the beginning, I used the Euclidean approach for simplicity, but it’s also quite easy to get the results that consider the Earth’s curvature. Instead of using a function to calculate a value based on the distance (for which I used the sum of squares of differences between coordinates), we can use the Turf.js distance helper, which takes the curvature into account and calculates the distance using the haversine formula. This approach is a bit less performant (for the same setup as above, it takes around 16 ms), but it gives results that actually make sense for a spherical map.
The results are quite different from the ones above — the lower-right point is completely different (as shown on the map below), and the distance on a sphere is longer by more than the distance of a marathon! It’s longer by almost 43 kilometers, at 811.240 km.
The code can be found in this CodeSandbox.
I really hope you enjoyed this article. Follow me for more articles on geometry, geography, computer vision, optimization, and all of that (and more) in JavaScript!
I mentor software developers. Drop me a line on MentorCruise for long-term mentorship or on CodeMentor for individual sessions.