geokdtree
GeoKDTree
GeoKDTree
Ultra-fast nearest-neighbor lookup for latitude/longitude data
GeoKDTree is a lightweight, high-performance spatial indexing library for Python designed to find the nearest geographic coordinate from massive datasets in nanoseconds.
It wraps a highly optimized KD-Tree with a geographic interface, allowing you to work directly with (latitude, longitude) pairs. No projections, no external dependencies, and no heavy GIS stacks.

Documentation
- Docs: https://connor-makowski.github.io/geokdtree/geokdtree.html
- Git Repo: https://github.com/connor-makowski/geokdtree
Installation
pip install geokdtree
If you are having trouble building the C++ extension during the pip installation process, you can run:
On Mac / Linux / WSL2:
export SKBUILD_CMAKE_ARGS="-DSKIP_CPP_BUILD=ON" pip install geokdtreeOn Windows:
# POWERHELL: $env:SKBUILD_CMAKE_ARGS="-DSKIP_CPP_BUILD=ON" # CMD: set SKBUILD_CMAKE_ARGS=-DSKIP_CPP_BUILD=ON pip install scgraph
Getting Started
from geokdtree import GeoKDTree
example_points = [
(34.0522, -118.2437), # Los Angeles
(40.7128, -74.0060), # New York
(37.7749, -122.4194), # San Francisco
(51.5074, -0.1278), # London
(48.8566, 2.3522), # Paris
]
geo_kd_tree = GeoKDTree(points=example_points)
test_point = (47.6062, -122.3321) # Seattle
# Find the index of the closest point in the original dataset
closest_idx = geo_kd_tree.closest_idx(test_point) #=> 2
# Find the closest point itself
closest_point = geo_kd_tree.closest_point(test_point) #=> (37.7749, -122.4194)
Why Use GeoKDTree?
GeoKDTree is designed to solve one focused problem extremely well:
Fast nearest-neighbor lookup for latitude/longitude data at scale.
It is worth noting that the closest point found may not be the true closest point, but should be very close for most practical applications. See KD-Tree limitations for more details.
Extremely Fast Lookups
Once constructed, nearest-neighbor queries consistently complete in tens of nanoseconds, even with very large datasets.
Typical benchmark results from the included tests:
| Number of Points | Build Time | Query Time |
|---|---|---|
| 1,000 | ~1.7 ms | ~0.02 ms |
| 10,000 | ~25 ms | ~0.05 ms |
| 100,000 | ~350 ms | ~0.05 ms |
| 1,000,000 | ~6.8 s | ~0.07 ms |
This makes GeoKDTree well-suited for:
- Real-time proximity queries
- Matching incoming coordinates against large reference datasets
- High-throughput geospatial APIs
- Pre-filtering before more expensive geospatial calculations
Exact timings depend on hardware, Python version, and data distribution. These values reflect typical results from the repository’s benchmarks.
Built for Geographic Coordinates
GeoKDTree works directly with (latitude, longitude) pairs.
You do not need to:
- Project coordinates into planar space
- Use heavyweight GIS libraries
- Maintain custom spatial indexing code
Just pass geographic coordinates and query.
Simple API, Minimal Overhead
GeoKDTree intentionally keeps the API small and focused.
- Build once from a list of coordinates
- Query nearest neighbors with a single method call
- Retrieve indices or points directly from your original dataset
There are no external C extensions or heavy dependencies, keeping installation and deployment simple.
Deterministic and Predictable Performance
- Tree construction scales at approximately
O(n log n) - Query performance scales at approximately
O(log n) - No probabilistic approximations
- No background indexing or caching
This predictability is valuable for production systems where latency and reproducibility matter.
Supported Features
See: https://connor-makowski.github.io/geokdtree/geokdtree.html
Contributing
Issues, feature requests, and pull requests are welcome. Please open an issue to discuss changes or enhancements.
Development
Running Tests, Prettifying Code, and Updating Docs
Make sure Docker is installed and running on a Unix system (Linux, MacOS, WSL2).
- Create a docker container and drop into a shell
./run.sh
- Run all tests (see ./utils/test.sh)
./run.sh test
- Prettify the code (see ./utils/prettify.sh)
./run.sh prettify
Update the docs (see ./utils/docs.sh)
./run.sh docs
Note: You can and should modify the
Dockerfileto test different python versions.
1""" 2# GeoKDTree 3[](https://badge.fury.io/py/geokdtree) 4[](https://opensource.org/licenses/MIT) 5[](https://pypi.org/project/geokdtree/) 6<!-- [](https://pypi.org/project/geokdtree/) --> 7 8# GeoKDTree 9 10## Ultra-fast nearest-neighbor lookup for latitude/longitude data 11 12**GeoKDTree** is a lightweight, high-performance spatial indexing library for Python designed to find the *nearest geographic coordinate* from massive datasets in nanoseconds. 13 14It wraps a highly optimized KD-Tree with a geographic interface, allowing you to work directly with `(latitude, longitude)` pairs. No projections, no external dependencies, and no heavy GIS stacks. 15 16 17 18### Documentation 19 20- Docs: https://connor-makowski.github.io/geokdtree/geokdtree.html 21- Git Repo: https://github.com/connor-makowski/geokdtree 22 23## Installation 24 25```bash 26pip install geokdtree 27``` 28 29If you are having trouble building the C++ extension during the pip installation process, you can run: 30 31- On Mac / Linux / WSL2: 32 ```bash 33 export SKBUILD_CMAKE_ARGS="-DSKIP_CPP_BUILD=ON" 34 pip install geokdtree 35 ``` 36- On Windows: 37 ```bash 38 # POWERHELL: 39 $env:SKBUILD_CMAKE_ARGS="-DSKIP_CPP_BUILD=ON" 40 # CMD: 41 set SKBUILD_CMAKE_ARGS=-DSKIP_CPP_BUILD=ON 42 pip install scgraph 43 ``` 44 45## Getting Started 46 47```python 48from geokdtree import GeoKDTree 49 50example_points = [ 51 (34.0522, -118.2437), # Los Angeles 52 (40.7128, -74.0060), # New York 53 (37.7749, -122.4194), # San Francisco 54 (51.5074, -0.1278), # London 55 (48.8566, 2.3522), # Paris 56] 57 58geo_kd_tree = GeoKDTree(points=example_points) 59 60test_point = (47.6062, -122.3321) # Seattle 61# Find the index of the closest point in the original dataset 62closest_idx = geo_kd_tree.closest_idx(test_point) #=> 2 63# Find the closest point itself 64closest_point = geo_kd_tree.closest_point(test_point) #=> (37.7749, -122.4194) 65``` 66 67## Why Use GeoKDTree? 68 69GeoKDTree is designed to solve one focused problem extremely well: 70 71**Fast nearest-neighbor lookup for latitude/longitude data at scale.** 72 73It is worth noting that the closest point found may not be the true closest point, but should be very close for most practical applications. See KD-Tree limitations for more details. 74 75### Extremely Fast Lookups 76 77Once constructed, nearest-neighbor queries consistently complete in **tens of nanoseconds**, even with very large datasets. 78 79Typical benchmark results from the included tests: 80 81| Number of Points | Build Time | Query Time | 82| ---------------: | ---------: | ---------: | 83| 1,000 | ~1.7 ms | ~0.02 ms | 84| 10,000 | ~25 ms | ~0.05 ms | 85| 100,000 | ~350 ms | ~0.05 ms | 86| 1,000,000 | ~6.8 s | ~0.07 ms | 87 88This makes GeoKDTree well-suited for: 89 90* Real-time proximity queries 91* Matching incoming coordinates against large reference datasets 92* High-throughput geospatial APIs 93* Pre-filtering before more expensive geospatial calculations 94 95> Exact timings depend on hardware, Python version, and data distribution. These values reflect typical results from the repository’s benchmarks. 96 97### Built for Geographic Coordinates 98 99GeoKDTree works directly with `(latitude, longitude)` pairs. 100 101You do **not** need to: 102 103* Project coordinates into planar space 104* Use heavyweight GIS libraries 105* Maintain custom spatial indexing code 106 107Just pass geographic coordinates and query. 108 109### Simple API, Minimal Overhead 110 111GeoKDTree intentionally keeps the API small and focused. 112 113* Build once from a list of coordinates 114* Query nearest neighbors with a single method call 115* Retrieve indices or points directly from your original dataset 116 117There are no external C extensions or heavy dependencies, keeping installation and deployment simple. 118 119### Deterministic and Predictable Performance 120 121* Tree construction scales at approximately `O(n log n)` 122* Query performance scales at approximately `O(log n)` 123* No probabilistic approximations 124* No background indexing or caching 125 126This predictability is valuable for production systems where latency and reproducibility matter. 127 128## Supported Features 129 130See: https://connor-makowski.github.io/geokdtree/geokdtree.html 131 132## Contributing 133 134Issues, feature requests, and pull requests are welcome. 135Please open an issue to discuss changes or enhancements. 136 137# Development 138## Running Tests, Prettifying Code, and Updating Docs 139 140Make sure Docker is installed and running on a Unix system (Linux, MacOS, WSL2). 141 142- Create a docker container and drop into a shell 143 - `./run.sh` 144- Run all tests (see ./utils/test.sh) 145 - `./run.sh test` 146- Prettify the code (see ./utils/prettify.sh) 147 - `./run.sh prettify` 148- Update the docs (see ./utils/docs.sh) 149 - `./run.sh docs` 150 151- Note: You can and should modify the `Dockerfile` to test different python versions. 152 153""" 154 155try: 156 from geokdtree.cpp import GeoKDTree, KDTree 157except ImportError: 158 from geokdtree.geokdtree import GeoKDTree 159 from geokdtree.kdtree import KDTree