scgraph

SCGraph

PyPI version License: MIT PyPI Downloads

A Supply chain graph package for Python

scgraph

Quick Start:

Get the shortest maritime path length between Shanghai, China and Savannah, Georgia, USA

# Use a maritime network geograph
from scgraph.geographs.marnet import marnet_geograph
output = marnet_geograph.get_shortest_path(
    origin_node={"latitude": 31.23,"longitude": 121.47},
    destination_node={"latitude": 32.08,"longitude": -81.09},
    output_units='km',
)
print('Length: ',output['length']) #=> Length:  19596.4653

Documentation

How to Cite SCGraph in your Research

If you use SCGraph for your research, please consider citing the following paper:

Makowski, C., Saragih, A., Guter, W., Russell, T., Heinold, A., & Lekkakos, S. (2025). SCGraph: A dependency-free Python package for road, rail, and maritime shortest path routing generation and distance estimation. MIT Center for Transportation & Logistics Research Paper Series, (2025-028). https://ssrn.com/abstract=5388845

Or by using the BibTeX entry:

@article{makowski2025scgraph,
  title={SCGraph: A Dependency-Free Python Package for Road, Rail, and Maritime Shortest Path Routing Generation and Distance Estimation},
  author={Makowski, Connor and Saragih, Austin and Guter, Willem and Russell, Tim and Heinold, Arne and Lekkakos, Spyridon},
  journal={MIT Center for Transportation & Logistics Research Paper Series},
  number={2025-028},
  year={2025},
  url={https://ssrn.com/abstract=5388845}
}

Getting Started

Installation

pip install scgraph

Basic Geograph Usage

Get the shortest path between two points on earth using a latitude / longitude pair.

In this case, calculate the shortest maritime path between Shanghai, China and Savannah, Georgia, USA.

# Use a maritime network geograph
from scgraph.geographs.marnet import marnet_geograph

# Note: The origin and destination nodes can be any latitude / longitude pair
output = marnet_geograph.get_shortest_path(
    origin_node={"latitude": 31.23,"longitude": 121.47},
    destination_node={"latitude": 32.08,"longitude": -81.09},
    output_units='km',
    # Optional: Cache the origin node's shortest path tree for faster calculations on future calls from the same origin node when cache=True
    # Note: This will make the first call slower, but future calls using this origin node will be substantially faster.
    cache=True,
)
print('Length: ',output['length']) #=> Length:  19596.4653

Adding in a few additional parameters to the get_shortest_path function can change the output format as well as performance of the calculation.

# Use a maritime network geograph
from scgraph.geographs.marnet import marnet_geograph

# Get the shortest maritime path between Shanghai, China and Savannah, Georgia, USA
output = marnet_geograph.get_shortest_path(
    origin_node={"latitude": 31.23,"longitude": 121.47},
    destination_node={"latitude": 32.08,"longitude": -81.09},
    output_units='km',
    node_addition_lat_lon_bound=180, # Optional: The maximum distance in degrees to consider nodes when attempting to add the origin and destination nodes to the graph
    node_addition_type='quadrant', # Optional: Instead of connecting the origin node to the graph by the closest node, connect it to the closest node in each direction (NE, NW, SE, SW) if found within the node_addition_lat_lon_bound
    destination_node_addition_type='all', # Optional: Instead of connecting the destination node to the graph by the closest node, connect it to all nodes found within the node_addition_lat_lon_bound
    # When destination_node_addition_type='all' is set with a node_addition_lat_lon_bound=180, this will guarantee a solution can be found since the destination node will also connect to the origin node
)
print('Length: ',output['length']) #=> Length:  19596.4653

In the above example, the output variable is a dictionary with two keys: length and coordinate_path.

  • length: The distance between the passed origin and destination when traversing the graph along the shortest path
    • Notes:
      • This will be in the units specified by the output_units parameter.
      • output_units options:
        • km (kilometers - default)
        • m (meters)
        • mi (miles)
        • ft (feet)
  • coordinate_path: A list of lists [latitude,longitude] that make up the shortest path

You can also select a different algorithm function for the shortest_path:

from scgraph.geographs.marnet import marnet_geograph
from scgraph import Graph
output = marnet_geograph.get_shortest_path(
    origin_node={"latitude": 31.23,"longitude": 121.47},
    destination_node={"latitude": 32.08,"longitude": -81.09},
    # Optional: Specify an algorithm_fn to call when solving the shortest_path
    algorithm_fn=Graph.bmssp,
)

Don't neglect the very efficient distance matrix function to quickly get the distances between multiple points on the graph. Each origin graph entry point and shortest path tree is cached so you can generate massive distance matricies incredibly quickly (approaching 500 nano seconds per distance for large enough distance matricies).

from scgraph.geographs.us_freeway import us_freeway_geograph

cities = [
    {"latitude": 34.0522, "longitude": -118.2437},  # Los Angeles
    {"latitude": 40.7128, "longitude": -74.0060},   # New York City
    {"latitude": 41.8781, "longitude": -87.6298},   # Chicago
    {"latitude": 29.7604, "longitude": -95.3698},   # Houston
]

distance_matrix = us_freeway_geograph.distance_matrix(cities, output_units='km')
# [
#  [0.0, 4510.965665644833, 3270.3864033755776, 2502.886438995942],
#  [4510.9656656448415, 0.0, 1288.473118634311, 2637.5821542546687],
#  [3270.3864033755744, 1288.4731186343113, 0.0, 1913.1928919854067],
#  [2502.886438995935, 2637.5821542546687, 1913.1928919854076, 0.0],
# ]

For more examples including viewing the output on a map, see the example notebook.

Examples with Google Colab

GridGraph usage

Example:

  • Create a grid of 20x20 cells.
    • This creates a grid based graph with connections to all 8 neighbors for each grid item.
    • Each grid item has 4 cardinal connections at length 1 and 4 diagonal connections at length sqrt(2)
  • Create a wall from (10,5) to (10,19).
    • This would foce any path to go to the bottom of the graph to get around the wall.
  • Get the shortest path between (2,10) and (18,10)
    • Note: The length of this path should be 16 without the wall and 20.9704 with the wall.
from scgraph import GridGraph

x_size = 20
y_size = 20
blocks = [(10, i) for i in range(5, y_size)]

# Create the GridGraph object
gridGraph = GridGraph(
    x_size=x_size,
    y_size=y_size,
    blocks=blocks,
    add_exterior_walls=True,
)

# Solve the shortest path between two points
output = gridGraph.get_shortest_path(
    origin_node={"x": 2, "y": 10},
    destination_node={"x": 18, "y": 10},
    # Optional: Specify the output coordinate format (default is 'list_of_dicts)
    output_coordinate_path="list_of_lists",
    # Optional: Cache the origin point shortest path tree for faster calculations on future calls
    cache=True,
)

print(output)
#=> {'length': 20.9704, 'coordinate_path': [[2, 10], [3, 9], [4, 8], [5, 8], [6, 7], [7, 6], [8, 5], [9, 4], [10, 4], [11, 4], [12, 5], [13, 6], [14, 7], [15, 7], [16, 8], [17, 9], [18, 10]]}

About

Key Features

  • Graph:
    • A low level graph object that has methods for validating graphs, calculating shortest paths, and more.
    • See: Graph Documentation
    • Contains the following methods:
      • validate_graph: Validates symmetry and connectedness of a graph.
      • dijkstra: Calculates the shortest path between two nodes using Dijkstra's algorithm.
      • dijkstra_makowski: Calculates the shortest path between two nodes using a modified version of Dijkstra's algorithm designed for real world performance
      • dijkstra_negative: Calculates the shortest path between two nodes using a modified version of Dijkstra's algorithm that supports negative edge weights and detects negative cycles.
      • a_star: Modified version of dijkstra_makowski that incorporates a heuristic function to guide the search.
      • bellman_ford: Calculates the shortest path between two nodes using the Bellman-Ford algorithm.
      • bmssp: Calculates the shortest path between two nodes using a modified version of the BMSSP Algorithm. See the BMSSPy
        • Note: To use this, you must install BMSSPy as well. If not installed, the algorithm will default to solve with dijkstra_makowski.
        • You can install the needed packages with pip install scgraph[bmsspy].
  • GeoGraphs:
    • A geographic graph data structure that allows for the calculation of shortest paths between two points on earth
    • Uses latitude / longitude pairs to represent points on earth
    • Supports maritime, rail, road and other geographic networks
    • Uses a sparse network data structure to represent the graph
    • How to use it - Calculate the shortest path between two points on earth
      • Inputs:
        • A latitude / longitude pair for the origin
        • A latitude / longitude pair for the destination
      • Calculation:
        • See the Graph documentation above for available algorithms.
      • Returns:
        • path:
          • A list of lists [latitude, longitude] that make up the shortest path
        • length:
          • The distance (in the units requested) between the two points
    • Precompiled Geographs offer Antimeridian support
    • Arbitrary start and end points are supported
      • Start and end points do not need to be in the graph
    • Cached shortest path calculations can be used for very fast repetative calculations from the same origin node in a GeoGraph.
      • This is done by caching the origin node's shortest path tree
      • The first call will be slower, but future calls using this origin node will be substantially faster.
  • GridGraphs:
    • A grid based graph data structure that allows for the calculation of shortest paths between two points on a grid
    • See: GridGraph Documentation
    • Supports arbitrary grid sizes and blockages
    • Uses a sparse network data structure to represent the graph
    • How to use it - Calculate the shortest path between two points on a grid
      • Inputs:
        • A (x,y) coordinate on the grid for the origin
        • A (x,y) coordinate on the grid for the destination
      • Calculation:
        • Algorithms:
          • Dijkstra's algorithm
          • Modified Dijkstra algorithm
          • A* algorithm (Extension of the Modified Dijkstra)
      • Returns:
        • length:
          • The distance between the two points on the grid
        • coordinate_path:
          • A list of dicts {"x": x, "y": y} representing the path taken through the grid
    • Arbitrary start and end points are supported
      • Start and end points do not need to be in the graph
    • Arbitrary connection matricies are supported
      • Cardinal connections (up, down, left, right) and diagonal connections (up-left, up-right, down-left, down-right) are used by default
      • Custom connection matricies can be used to change the connections between grid items
    • Cached shortest path calculations can be used for very fast repetative calculations to or from the same point in a GridGraph.
  • Other Useful Features:
    • CacheGraphs:
      • A graph extension that caches shortest path trees for fast shortest path calculations on repeat calls from the same origin node
      • See: CacheGraphs Documentation
    • SpanningTrees:
      • Note: In general, we calculate the shortest path tree instead of the minimum spanning tree, however the name is retained for backwards compatibility.
      • See: Spanning Trees Documentation

Included GeoGraphs

  • marnet_geograph:
    • What: A maritime network data set from searoute
    • Use: from scgraph.geographs.marnet import marnet_geograph
    • Attribution: searoute
    • Length Measurement: Kilometers
    • Marnet Picture
  • oak_ridge_maritime_geograph:
  • north_america_rail_geograph:
  • us_freeway_geograph:
  • scgraph_data geographs:
    • What: Additional geographs are available in the scgraph_data package
      • Note: These include larger geographs like the world highways geograph and world railways geograph.
    • Installation: pip install scgraph_data
    • Use: from scgraph_data.world_highways import world_highways_geograph
    • See: scgraph_data for more information and all available geographs.
  • Custom Geographs:

Advanced Usage

Using scgraph_data geographs

Using scgraph_data geographs:

  • Note: Make sure to install the scgraph_data package before using these geographs
from scgraph_data.world_railways import world_railways_geograph
from scgraph import Graph

# Get the shortest path between Kalamazoo Michigan and Detroit Michigan by Train
output = world_railways_geograph.get_shortest_path(
    origin_node={"latitude": 42.29,"longitude": -85.58},
    destination_node={"latitude": 42.33,"longitude": -83.05},
    # Optional: Use the A* algorithm
    algorithm_fn=Graph.a_star,
    # Optional: Pass the haversine function as the heuristic function to the A* algorithm
    algorithm_kwargs={"heuristic_fn": world_railways_geograph.haversine},
)

Building your own Geographs from Open Source Data

You can build your own geographs using various tools and data sources. For example, you can use OpenStreetMap data to create a high fidelity geograph for a specific area.

Follow this step by step guide on how to create a geograph from OpenStreetMap data. For this example, we will use some various tools to create a geograph for highways (including seconday highways) in Michigan, USA.

Download an OSM PBF file using the AWS CLI:

  • Geofabrik is a good source for smaller OSM PBF files. See: https://download.geofabrik.de/
  • To keep things generalizable, you can also download the entire planet OSM PBF file using AWS. But you should consider downloading a smaller region if you are only interested in a specific area.
    • Note: For this, you will need to install the AWS CLI.
    • Note: The planet OSM PBF file is very large (About 100GB)
    
aws s3 cp s3://osm-pds/planet-latest.osm.pbf .
  • Use Osmium to filter and extract the highways from the OSM PBF file.

    • Install osmium on macOS:

      brew install osmium-tool
      
    • Install osmium on Ubuntu:

      sudo apt-get install osmium-tool
      
  • Download a Poly file for the area you are interested in. This is a polygon file that defines the area you want to extract from the OSM PBF file.

    • For Michigan, you can download the poly file from Geofabrik:

      curl https://download.geofabrik.de/north-america/us/michigan.poly > michigan.poly
      
    • Google around to find an appropriate poly file for your area of interest.

  • Filter and extract as GeoJSON (EG: Michigan) substituting the poly and pbf file names as needed:

    osmium extract -p michigan.poly --overwrite -o michigan.osm.pbf planet-latest.osm.pbf
    
  • Filter the OSM PBF file to only areas of interest and export to GeoJSON:

    osmium tags-filter michigan.osm.pbf w/highway=motorway,trunk,primary,motorway_link,trunk_link,primary_link,secondary,secondary_link,tertiary,tertiary_link -t --overwrite -o michigan_roads.osm.pbf
    osmium export michigan_roads.osm.pbf -f geojson --overwrite -o michigan_roads.geojson
    
  • Simplify the geojson

    • This uses some tools in the SCGraph library as well as Mapshaper to simplify the geojson files.
    • Mapshaper is a CLI and web tool for simplifying and editing geojson files.
    • To install Mapshaper for CLI use, use NPM:

      npm install -g mapshaper
      
    • Mapshaper is particularly helpful since it repairs intersections in the lines which is crutial for geographs to work properly.

    • Mapshaper, however, does not handle larger files very well, so it is recommended to simplify the geojson file first using the scgraph.helpers.geojson.simplify_geojson function first to reduce the size of the file.
    • Make sure to tailor the parameters to your needs.
    python -c "from scgraph.helpers.geojson import simplify_geojson; simplify_geojson('michigan_roads.geojson', 'michigan_roads_simple.geojson', precision=4, pct_to_keep=100, min_points=3, silent=False)"
    mapshaper michigan_roads_simple.geojson -simplify 10% -filter-fields -o force michigan_roads_simple.geojson
    mapshaper michigan_roads_simple.geojson -snap -clean -o force michigan_roads_simple.geojson
    
  • Load the newly created geojson file as a geograph:

    • Note: The GeoGraph.load_from_geojson function is used to load the geojson file as a geograph.
    • This will create a geograph that can be used to calculate shortest paths between points on the graph.
    from scgraph import GeoGraph
    michigan_roads_geograph = GeoGraph.load_from_geojson('michigan_roads_simple.geojson')
    

Custom Graphs and Geographs

Modify an existing geograph: See the notebook here

You can specify your own custom graphs for direct access to the solving algorithms. This requires the use of the low level Graph class

from scgraph import Graph

# Define an arbitrary graph
# See the graph definitions here:
# https://connor-makowski.github.io/scgraph/scgraph/graph.html#Graph.validate_graph
graph = [
    {1: 5, 2: 1},
    {0: 5, 2: 2, 3: 1},
    {0: 1, 1: 2, 3: 4, 4: 8},
    {1: 1, 2: 4, 4: 3, 5: 6},
    {2: 8, 3: 3},
    {3: 6}
]

# Optional: Validate your graph
Graph.validate_graph(graph=graph)

# Get the shortest path between idx 0 and idx 5
output = Graph.dijkstra_makowski(graph=graph, origin_id=0, destination_id=5)
#=> {'path': [0, 2, 1, 3, 5], 'length': 10}

You can also use a slightly higher level GeoGraph class to work with latitude / longitude pairs

from scgraph import GeoGraph

# Define nodes
# See the nodes definitions here:
# https://connor-makowski.github.io/scgraph/scgraph/geograph.html#GeoGraph.__init__
nodes = [
    # London
    [51.5074, -0.1278],
    # Paris
    [48.8566, 2.3522],
    # Berlin
    [52.5200, 13.4050],
    # Rome
    [41.9028, 12.4964],
    # Madrid
    [40.4168, -3.7038],
    # Lisbon
    [38.7223, -9.1393]
]
# Define a graph
# See the graph definitions here:
# https://connor-makowski.github.io/scgraph/scgraph/graph.html#Graph.validate_graph
graph = [
    # From London
    {
        # To Paris
        1: 311,
    },
    # From Paris
    {
        # To London
        0: 311,
        # To Berlin
        2: 878,
        # To Rome
        3: 1439,
        # To Madrid
        4: 1053
    },
    # From Berlin
    {
        # To Paris
        1: 878,
        # To Rome
        3: 1181,
    },
    # From Rome
    {
        # To Paris
        1: 1439,
        # To Berlin
        2: 1181,
    },
    # From Madrid
    {
        # To Paris
        1: 1053,
        # To Lisbon
        5: 623
    },
    # From Lisbon
    {
        # To Madrid
        4: 623
    }
]

# Create a GeoGraph object
my_geograph = GeoGraph(nodes=nodes, graph=graph)

# Optional: Validate your graph
my_geograph.validate_graph()

# Optional: Validate your nodes
my_geograph.validate_nodes()

# Get the shortest path between two points
# In this case, Birmingham England and Zaragoza Spain
# Since Birmingham and Zaragoza are not in the graph,
# the algorithm will add them into the graph.
# See: https://connor-makowski.github.io/scgraph/scgraph/geograph.html#GeoGraph.get_shortest_path
# Expected output would be to go from
# Birmingham -> London -> Paris -> Madrid -> Zaragoza

output = my_geograph.get_shortest_path(
    # Birmingham England
    origin_node = {'latitude': 52.4862, 'longitude': -1.8904},
    # Zaragoza Spain
    destination_node = {'latitude': 41.6488, 'longitude': -0.8891}
)
print(output)
# {
#     'length': 1799.4323,
#     'coordinate_path': [
#         [52.4862, -1.8904],
#         [51.5074, -0.1278],
#         [48.8566, 2.3522],
#         [40.4168, -3.7038],
#         [41.6488, -0.8891]
#     ]
# }

Development

Running Tests, Prettifying Code, and Updating Docs

Make sure Docker is installed and running on a Unix system (Linux, MacOS, WSL2).

  • Create a docker container and drop into a shell
    • ./run.sh
  • Run all tests (see ./utils/test.sh)
    • ./run.sh test
  • Prettify the code (see ./utils/prettify.sh)
    • ./run.sh prettify
  • Update the docs (see ./utils/docs.sh)

    • ./run.sh docs
  • Note: You can and should modify the Dockerfile to test different python versions.

Attributions and Thanks

Originally inspired by searoute including the use of one of their datasets that has been modified to work properly with this package.

  1"""
  2# SCGraph
  3[![PyPI version](https://badge.fury.io/py/scgraph.svg)](https://badge.fury.io/py/scgraph)
  4[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
  5[![PyPI Downloads](https://pepy.tech/badge/scgraph)](https://pypi.org/project/scgraph/)
  6<!-- [![PyPI Downloads](https://img.shields.io/pypi/dm/scgraph.svg?label=PyPI%20downloads)](https://pypi.org/project/scgraph/) -->
  7
  8
  9### A Supply chain graph package for Python
 10
 11
 12![scgraph](https://raw.githubusercontent.com/connor-makowski/scgraph/main/static/scgraph.png)
 13
 14## Quick Start:
 15Get the shortest maritime path length between Shanghai, China and Savannah, Georgia, USA
 16```py
 17# Use a maritime network geograph
 18from scgraph.geographs.marnet import marnet_geograph
 19output = marnet_geograph.get_shortest_path(
 20    origin_node={"latitude": 31.23,"longitude": 121.47},
 21    destination_node={"latitude": 32.08,"longitude": -81.09},
 22    output_units='km',
 23)
 24print('Length: ',output['length']) #=> Length:  19596.4653
 25```
 26
 27### Documentation
 28
 29- Docs: https://connor-makowski.github.io/scgraph/scgraph.html
 30- Git Repo: https://github.com/connor-makowski/scgraph
 31- Paper: https://ssrn.com/abstract=5388845
 32- Awards: [2025 MIT Prize for Open Data](https://libraries.mit.edu/opendata/open-data-mit-home/mit-prize/)
 33
 34### How to Cite SCGraph in your Research
 35
 36If you use SCGraph for your research, please consider citing the following paper:
 37
 38> Makowski, C., Saragih, A., Guter, W., Russell, T., Heinold, A., & Lekkakos, S. (2025). SCGraph: A dependency-free Python package for road, rail, and maritime shortest path routing generation and distance estimation. MIT Center for Transportation & Logistics Research Paper Series, (2025-028). https://ssrn.com/abstract=5388845
 39
 40Or by using the BibTeX entry:
 41
 42```
 43@article{makowski2025scgraph,
 44  title={SCGraph: A Dependency-Free Python Package for Road, Rail, and Maritime Shortest Path Routing Generation and Distance Estimation},
 45  author={Makowski, Connor and Saragih, Austin and Guter, Willem and Russell, Tim and Heinold, Arne and Lekkakos, Spyridon},
 46  journal={MIT Center for Transportation & Logistics Research Paper Series},
 47  number={2025-028},
 48  year={2025},
 49  url={https://ssrn.com/abstract=5388845}
 50}
 51```
 52
 53# Getting Started
 54
 55## Installation
 56
 57```
 58pip install scgraph
 59```
 60
 61## Basic Geograph Usage
 62
 63Get the shortest path between two points on earth using a latitude / longitude pair.
 64
 65In this case, calculate the shortest maritime path between Shanghai, China and Savannah, Georgia, USA.
 66
 67```py
 68# Use a maritime network geograph
 69from scgraph.geographs.marnet import marnet_geograph
 70
 71# Note: The origin and destination nodes can be any latitude / longitude pair
 72output = marnet_geograph.get_shortest_path(
 73    origin_node={"latitude": 31.23,"longitude": 121.47},
 74    destination_node={"latitude": 32.08,"longitude": -81.09},
 75    output_units='km',
 76    # Optional: Cache the origin node's shortest path tree for faster calculations on future calls from the same origin node when cache=True
 77    # Note: This will make the first call slower, but future calls using this origin node will be substantially faster.
 78    cache=True,
 79)
 80print('Length: ',output['length']) #=> Length:  19596.4653
 81```
 82
 83Adding in a few additional parameters to the `get_shortest_path` function can change the output format as well as performance of the calculation.
 84```py
 85# Use a maritime network geograph
 86from scgraph.geographs.marnet import marnet_geograph
 87
 88# Get the shortest maritime path between Shanghai, China and Savannah, Georgia, USA
 89output = marnet_geograph.get_shortest_path(
 90    origin_node={"latitude": 31.23,"longitude": 121.47},
 91    destination_node={"latitude": 32.08,"longitude": -81.09},
 92    output_units='km',
 93    node_addition_lat_lon_bound=180, # Optional: The maximum distance in degrees to consider nodes when attempting to add the origin and destination nodes to the graph
 94    node_addition_type='quadrant', # Optional: Instead of connecting the origin node to the graph by the closest node, connect it to the closest node in each direction (NE, NW, SE, SW) if found within the node_addition_lat_lon_bound
 95    destination_node_addition_type='all', # Optional: Instead of connecting the destination node to the graph by the closest node, connect it to all nodes found within the node_addition_lat_lon_bound
 96    # When destination_node_addition_type='all' is set with a node_addition_lat_lon_bound=180, this will guarantee a solution can be found since the destination node will also connect to the origin node
 97)
 98print('Length: ',output['length']) #=> Length:  19596.4653
 99```
100
101In the above example, the `output` variable is a dictionary with two keys: `length` and `coordinate_path`.
102
103- `length`: The distance between the passed origin and destination when traversing the graph along the shortest path
104    - Notes:
105        - This will be in the units specified by the `output_units` parameter.
106        - `output_units` options:
107            - `km` (kilometers - default)
108            - `m` (meters)
109            - `mi` (miles)
110            - `ft` (feet)
111- `coordinate_path`: A list of lists [`latitude`,`longitude`] that make up the shortest path
112
113
114You can also select a different algorithm function for the shortest_path:
115```py
116from scgraph.geographs.marnet import marnet_geograph
117from scgraph import Graph
118output = marnet_geograph.get_shortest_path(
119    origin_node={"latitude": 31.23,"longitude": 121.47},
120    destination_node={"latitude": 32.08,"longitude": -81.09},
121    # Optional: Specify an algorithm_fn to call when solving the shortest_path
122    algorithm_fn=Graph.bmssp,
123)
124```
125
126Don't neglect the very efficient distance matrix function to quickly get the distances between multiple points on the graph. Each origin graph entry point and shortest path tree is cached so you can generate massive distance matricies incredibly quickly (approaching 500 nano seconds per distance for large enough distance matricies).
127```py
128from scgraph.geographs.us_freeway import us_freeway_geograph
129
130cities = [
131    {"latitude": 34.0522, "longitude": -118.2437},  # Los Angeles
132    {"latitude": 40.7128, "longitude": -74.0060},   # New York City
133    {"latitude": 41.8781, "longitude": -87.6298},   # Chicago
134    {"latitude": 29.7604, "longitude": -95.3698},   # Houston
135]
136
137distance_matrix = us_freeway_geograph.distance_matrix(cities, output_units='km')
138# [
139#  [0.0, 4510.965665644833, 3270.3864033755776, 2502.886438995942],
140#  [4510.9656656448415, 0.0, 1288.473118634311, 2637.5821542546687],
141#  [3270.3864033755744, 1288.4731186343113, 0.0, 1913.1928919854067],
142#  [2502.886438995935, 2637.5821542546687, 1913.1928919854076, 0.0],
143# ]
144```
145
146For more examples including viewing the output on a map, see the [example notebook](https://colab.research.google.com/github/connor-makowski/scgraph/blob/main/examples/getting_started.ipynb).
147
148
149### Examples with Google Colab
150
151- [Getting Started](https://colab.research.google.com/github/connor-makowski/scgraph/blob/main/examples/getting_started.ipynb)
152- [Creating A Multi Path Geojson](https://colab.research.google.com/github/connor-makowski/scgraph/blob/main/examples/multi_path_geojson.ipynb)
153- [Modifying A Geograph](https://colab.research.google.com/github/connor-makowski/scgraph/blob/main/examples/geograph_modifications.ipynb)
154
155## GridGraph usage
156
157Example:
158- Create a grid of 20x20 cells.
159    - This creates a grid based graph with connections to all 8 neighbors for each grid item.
160    - Each grid item has 4 cardinal connections at length 1 and 4 diagonal connections at length sqrt(2)
161- Create a wall from (10,5) to (10,19).
162    - This would foce any path to go to the bottom of the graph to get around the wall.
163- Get the shortest path between (2,10) and (18,10)
164    - Note: The length of this path should be 16 without the wall and 20.9704 with the wall.
165
166```py
167from scgraph import GridGraph
168
169x_size = 20
170y_size = 20
171blocks = [(10, i) for i in range(5, y_size)]
172
173# Create the GridGraph object
174gridGraph = GridGraph(
175    x_size=x_size,
176    y_size=y_size,
177    blocks=blocks,
178    add_exterior_walls=True,
179)
180
181# Solve the shortest path between two points
182output = gridGraph.get_shortest_path(
183    origin_node={"x": 2, "y": 10},
184    destination_node={"x": 18, "y": 10},
185    # Optional: Specify the output coordinate format (default is 'list_of_dicts)
186    output_coordinate_path="list_of_lists",
187    # Optional: Cache the origin point shortest path tree for faster calculations on future calls
188    cache=True,
189)
190
191print(output)
192#=> {'length': 20.9704, 'coordinate_path': [[2, 10], [3, 9], [4, 8], [5, 8], [6, 7], [7, 6], [8, 5], [9, 4], [10, 4], [11, 4], [12, 5], [13, 6], [14, 7], [15, 7], [16, 8], [17, 9], [18, 10]]}
193```
194
195
196# About
197## Key Features
198
199- `Graph`:
200    - A low level graph object that has methods for validating graphs, calculating shortest paths, and more.
201    - See: [Graph Documentation](https://connor-makowski.github.io/scgraph/scgraph/graph.html)
202    - Contains the following methods:
203        - `validate_graph`: Validates symmetry and connectedness of a graph.
204        - `dijkstra`: Calculates the shortest path between two nodes using Dijkstra's algorithm.
205        - `dijkstra_makowski`: Calculates the shortest path between two nodes using a modified version of Dijkstra's algorithm designed for real world performance
206        - `dijkstra_negative`: Calculates the shortest path between two nodes using a modified version of Dijkstra's algorithm that supports negative edge weights and detects negative cycles.
207        - `a_star`: Modified version of `dijkstra_makowski` that incorporates a heuristic function to guide the search.
208        - `bellman_ford`: Calculates the shortest path between two nodes using the Bellman-Ford algorithm.
209        - `bmssp`: Calculates the shortest path between two nodes using a modified version of the [BMSSP Algorithm](https://arxiv.org/pdf/2504.17033). See the [BMSSPy](https://github.com/connor-makowski/bmsspy)
210            - Note: To use this, you must install `BMSSPy` as well. If not installed, the algorithm will default to solve with `dijkstra_makowski`.
211            - You can install the needed packages with `pip install scgraph[bmsspy]`.
212- `GeoGraph`s:
213    - A geographic graph data structure that allows for the calculation of shortest paths between two points on earth
214    - Uses latitude / longitude pairs to represent points on earth
215    - Supports maritime, rail, road and other geographic networks
216    - Uses a sparse network data structure to represent the graph
217    - How to use it - Calculate the shortest path between two points on earth
218        - Inputs:
219            - A latitude / longitude pair for the origin
220            - A latitude / longitude pair for the destination
221        - Calculation:
222            - See the `Graph` documentation above for available algorithms.
223        - Returns:
224            - `path`:
225                - A list of lists `[latitude, longitude]` that make up the shortest path
226            - `length`:
227                - The distance (in the units requested) between the two points
228    - Precompiled Geographs offer Antimeridian support
229    - Arbitrary start and end points are supported
230        - Start and end points do not need to be in the graph
231    - Cached shortest path calculations can be used for very fast repetative calculations from the same origin node in a GeoGraph.
232        - This is done by caching the origin node's shortest path tree
233        - The first call will be slower, but future calls using this origin node will be substantially faster.
234- `GridGraph`s:
235    - A grid based graph data structure that allows for the calculation of shortest paths between two points on a grid
236    - See: [GridGraph Documentation](https://connor-makowski.github.io/scgraph/scgraph/grid.html)
237    - Supports arbitrary grid sizes and blockages
238    - Uses a sparse network data structure to represent the graph
239    - How to use it - Calculate the shortest path between two points on a grid
240        - Inputs:
241            - A (x,y) coordinate on the grid for the origin
242            - A (x,y) coordinate on the grid for the destination
243        - Calculation:
244            - Algorithms:
245                - Dijkstra's algorithm
246                - Modified Dijkstra algorithm
247                - A* algorithm (Extension of the Modified Dijkstra)
248        - Returns:
249            - `length`:
250                - The distance between the two points on the grid
251            - `coordinate_path`:
252                - A list of dicts `{"x": x, "y": y}` representing the path taken through the grid
253    - Arbitrary start and end points are supported
254        - Start and end points do not need to be in the graph
255    - Arbitrary connection matricies are supported
256        - Cardinal connections (up, down, left, right) and diagonal connections (up-left, up-right, down-left, down-right) are used by default
257        - Custom connection matricies can be used to change the connections between grid items
258    - Cached shortest path calculations can be used for very fast repetative calculations to or from the same point in a GridGraph.
259- Other Useful Features:
260    - `CacheGraph`s:
261        - A graph extension that caches shortest path trees for fast shortest path calculations on repeat calls from the same origin node
262        - See: [CacheGraphs Documentation](https://connor-makowski.github.io/scgraph/scgraph/cache.html)
263    - `SpanningTree`s:
264        - Note: In general, we calculate the shortest path tree instead of the  minimum spanning tree, however the name is retained for backwards compatibility.
265        - See: [Spanning Trees Documentation](https://connor-makowski.github.io/scgraph/scgraph/spanning.html)
266
267## Included GeoGraphs
268
269- marnet_geograph:
270    - What: A maritime network data set from searoute
271    - Use: `from scgraph.geographs.marnet import marnet_geograph`
272    - Attribution: [searoute](https://github.com/genthalili/searoute-py)
273    - Length Measurement: Kilometers
274    - [Marnet Picture](https://raw.githubusercontent.com/connor-makowski/scgraph/main/static/marnet.png)
275- oak_ridge_maritime_geograph:
276    - What: A maritime data set from the Oak Ridge National Laboratory campus
277    - Use: `from scgraph.geographs.oak_ridge_maritime import oak_ridge_maritime_geograph`
278    - Attribution: [Oak Ridge National Laboratory](https://www.ornl.gov/) with data from [Geocommons](http://geocommons.com/datasets?id=25)
279    - Length Measurement: Kilometers
280    - [Oak Ridge Maritime Picture](https://raw.githubusercontent.com/connor-makowski/scgraph/main/static/oak_ridge_maritime.png)
281- north_america_rail_geograph:
282    - What: Class 1 Rail network for North America
283    - Use: `from scgraph.geographs.north_america_rail import north_america_rail_geograph`
284    - Attribution: [U.S. Department of Transportation: ArcGIS Online](https://geodata.bts.gov/datasets/usdot::north-american-rail-network-lines-class-i-freight-railroads-view/about)
285    - Length Measurement: Kilometers
286    - [North America Rail Picture](https://raw.githubusercontent.com/connor-makowski/scgraph/main/static/north_america_rail.png)
287- us_freeway_geograph:
288    - What: Freeway network for the United States
289    - Use: `from scgraph.geographs.us_freeway import us_freeway_geograph`
290    - Attribution: [U.S. Department of Transportation: ArcGIS Online](https://hub.arcgis.com/datasets/esri::usa-freeway-system-over-1500k/about)
291    - Length Measurement: Kilometers
292    - [US Freeway Picture](https://raw.githubusercontent.com/connor-makowski/scgraph/main/static/us_freeway.png)
293- `scgraph_data` geographs:
294    - What: Additional geographs are available in the `scgraph_data` package
295        - Note: These include larger geographs like the world highways geograph and world railways geograph.
296    - Installation: `pip install scgraph_data`
297    - Use: `from scgraph_data.world_highways import world_highways_geograph`
298    - See: [scgraph_data](https://github.com/connor-makowski/scgraph_data) for more information and all available geographs.
299- Custom Geographs:
300    - What: Users can create their own geographs from various data sources
301    - See: [Building your own Geographs from Open Source Data](https://github.com/connor-makowski/scgraph#building-your-own-geographs-from-open-source-data)
302
303# Advanced Usage
304
305## Using scgraph_data geographs
306
307Using `scgraph_data` geographs:
308- Note: Make sure to install the `scgraph_data` package before using these geographs
309```py
310from scgraph_data.world_railways import world_railways_geograph
311from scgraph import Graph
312
313# Get the shortest path between Kalamazoo Michigan and Detroit Michigan by Train
314output = world_railways_geograph.get_shortest_path(
315    origin_node={"latitude": 42.29,"longitude": -85.58},
316    destination_node={"latitude": 42.33,"longitude": -83.05},
317    # Optional: Use the A* algorithm
318    algorithm_fn=Graph.a_star,
319    # Optional: Pass the haversine function as the heuristic function to the A* algorithm
320    algorithm_kwargs={"heuristic_fn": world_railways_geograph.haversine},
321)
322```
323
324## Building your own Geographs from Open Source Data
325You can build your own geographs using various tools and data sources. For example, you can use OpenStreetMap data to create a high fidelity geograph for a specific area.
326
327Follow this step by step guide on how to create a geograph from OpenStreetMap data.
328For this example, we will use some various tools to create a geograph for highways (including seconday highways) in Michigan, USA.
329
330Download an OSM PBF file using the AWS CLI:
331- Geofabrik is a good source for smaller OSM PBF files. See: https://download.geofabrik.de/
332- To keep things generalizable, you can also download the entire planet OSM PBF file using AWS. But you should consider downloading a smaller region if you are only interested in a specific area.
333    - Note: For this, you will need to install the AWS CLI.
334    - Note: The planet OSM PBF file is very large (About 100GB)
335        ```
336        aws s3 cp s3://osm-pds/planet-latest.osm.pbf .
337        ```
338- Use Osmium to filter and extract the highways from the OSM PBF file.
339    - Install osmium on macOS:
340        ```
341        brew install osmium-tool
342        ```
343    - Install osmium on Ubuntu:
344        ```
345        sudo apt-get install osmium-tool
346        ```
347- Download a Poly file for the area you are interested in. This is a polygon file that defines the area you want to extract from the OSM PBF file.
348    - For Michigan, you can download the poly file from Geofabrik:
349        ```
350        curl https://download.geofabrik.de/north-america/us/michigan.poly > michigan.poly
351        ```
352    - Google around to find an appropriate poly file for your area of interest.
353- Filter and extract as GeoJSON (EG: Michigan) substituting the poly and pbf file names as needed:
354    ```
355    osmium extract -p michigan.poly --overwrite -o michigan.osm.pbf planet-latest.osm.pbf
356    ```
357- Filter the OSM PBF file to only areas of interest and export to GeoJSON:
358    - See: https://wiki.openstreetmap.org/wiki/
359    - EG For Highways, see: https://wiki.openstreetmap.org/wiki/Key:highway
360    ```
361    osmium tags-filter michigan.osm.pbf w/highway=motorway,trunk,primary,motorway_link,trunk_link,primary_link,secondary,secondary_link,tertiary,tertiary_link -t --overwrite -o michigan_roads.osm.pbf
362    osmium export michigan_roads.osm.pbf -f geojson --overwrite -o michigan_roads.geojson
363    ```
364
365- Simplify the geojson
366    - This uses some tools in the SCGraph library as well as Mapshaper to simplify the geojson files.
367    - Mapshaper is a CLI and web tool for simplifying and editing geojson files.
368    - To install Mapshaper for CLI use, use NPM:
369        ```
370        npm install -g mapshaper
371        ```
372    - Mapshaper is particularly helpful since it repairs intersections in the lines which is crutial for geographs to work properly.
373    - Mapshaper, however, does not handle larger files very well, so it is recommended to simplify the geojson file first using the `scgraph.helpers.geojson.simplify_geojson` function first to reduce the size of the file.
374    - Make sure to tailor the parameters to your needs.
375    ```
376    python -c "from scgraph.helpers.geojson import simplify_geojson; simplify_geojson('michigan_roads.geojson', 'michigan_roads_simple.geojson', precision=4, pct_to_keep=100, min_points=3, silent=False)"
377    mapshaper michigan_roads_simple.geojson -simplify 10% -filter-fields -o force michigan_roads_simple.geojson
378    mapshaper michigan_roads_simple.geojson -snap -clean -o force michigan_roads_simple.geojson
379    ```
380- Load the newly created geojson file as a geograph:
381    - Note: The `GeoGraph.load_from_geojson` function is used to load the geojson file as a geograph.
382    - This will create a geograph that can be used to calculate shortest paths between points on the graph.
383    ```
384    from scgraph import GeoGraph
385    michigan_roads_geograph = GeoGraph.load_from_geojson('michigan_roads_simple.geojson')
386    ```
387
388## Custom Graphs and Geographs
389Modify an existing geograph: See the notebook [here](https://colab.research.google.com/github/connor-makowski/scgraph/blob/main/examples/geograph_modifications.ipynb)
390
391
392You can specify your own custom graphs for direct access to the solving algorithms. This requires the use of the low level `Graph` class
393
394```py
395from scgraph import Graph
396
397# Define an arbitrary graph
398# See the graph definitions here:
399# https://connor-makowski.github.io/scgraph/scgraph/graph.html#Graph.validate_graph
400graph = [
401    {1: 5, 2: 1},
402    {0: 5, 2: 2, 3: 1},
403    {0: 1, 1: 2, 3: 4, 4: 8},
404    {1: 1, 2: 4, 4: 3, 5: 6},
405    {2: 8, 3: 3},
406    {3: 6}
407]
408
409# Optional: Validate your graph
410Graph.validate_graph(graph=graph)
411
412# Get the shortest path between idx 0 and idx 5
413output = Graph.dijkstra_makowski(graph=graph, origin_id=0, destination_id=5)
414#=> {'path': [0, 2, 1, 3, 5], 'length': 10}
415```
416
417You can also use a slightly higher level `GeoGraph` class to work with latitude / longitude pairs
418
419```py
420from scgraph import GeoGraph
421
422# Define nodes
423# See the nodes definitions here:
424# https://connor-makowski.github.io/scgraph/scgraph/geograph.html#GeoGraph.__init__
425nodes = [
426    # London
427    [51.5074, -0.1278],
428    # Paris
429    [48.8566, 2.3522],
430    # Berlin
431    [52.5200, 13.4050],
432    # Rome
433    [41.9028, 12.4964],
434    # Madrid
435    [40.4168, -3.7038],
436    # Lisbon
437    [38.7223, -9.1393]
438]
439# Define a graph
440# See the graph definitions here:
441# https://connor-makowski.github.io/scgraph/scgraph/graph.html#Graph.validate_graph
442graph = [
443    # From London
444    {
445        # To Paris
446        1: 311,
447    },
448    # From Paris
449    {
450        # To London
451        0: 311,
452        # To Berlin
453        2: 878,
454        # To Rome
455        3: 1439,
456        # To Madrid
457        4: 1053
458    },
459    # From Berlin
460    {
461        # To Paris
462        1: 878,
463        # To Rome
464        3: 1181,
465    },
466    # From Rome
467    {
468        # To Paris
469        1: 1439,
470        # To Berlin
471        2: 1181,
472    },
473    # From Madrid
474    {
475        # To Paris
476        1: 1053,
477        # To Lisbon
478        5: 623
479    },
480    # From Lisbon
481    {
482        # To Madrid
483        4: 623
484    }
485]
486
487# Create a GeoGraph object
488my_geograph = GeoGraph(nodes=nodes, graph=graph)
489
490# Optional: Validate your graph
491my_geograph.validate_graph()
492
493# Optional: Validate your nodes
494my_geograph.validate_nodes()
495
496# Get the shortest path between two points
497# In this case, Birmingham England and Zaragoza Spain
498# Since Birmingham and Zaragoza are not in the graph,
499# the algorithm will add them into the graph.
500# See: https://connor-makowski.github.io/scgraph/scgraph/geograph.html#GeoGraph.get_shortest_path
501# Expected output would be to go from
502# Birmingham -> London -> Paris -> Madrid -> Zaragoza
503
504output = my_geograph.get_shortest_path(
505    # Birmingham England
506    origin_node = {'latitude': 52.4862, 'longitude': -1.8904},
507    # Zaragoza Spain
508    destination_node = {'latitude': 41.6488, 'longitude': -0.8891}
509)
510print(output)
511# {
512#     'length': 1799.4323,
513#     'coordinate_path': [
514#         [52.4862, -1.8904],
515#         [51.5074, -0.1278],
516#         [48.8566, 2.3522],
517#         [40.4168, -3.7038],
518#         [41.6488, -0.8891]
519#     ]
520# }
521
522```
523
524# Development
525## Running Tests, Prettifying Code, and Updating Docs
526
527Make sure Docker is installed and running on a Unix system (Linux, MacOS, WSL2).
528
529- Create a docker container and drop into a shell
530    - `./run.sh`
531- Run all tests (see ./utils/test.sh)
532    - `./run.sh test`
533- Prettify the code (see ./utils/prettify.sh)
534    - `./run.sh prettify`
535- Update the docs (see ./utils/docs.sh)
536    - `./run.sh docs`
537
538- Note: You can and should modify the `Dockerfile` to test different python versions.
539
540
541## Attributions and Thanks
542Originally inspired by [searoute](https://github.com/genthalili/searoute-py) including the use of one of their [datasets](https://github.com/genthalili/searoute-py/blob/main/searoute/data/marnet_densified_v2_old.geojson) that has been modified to work properly with this package.
543"""
544
545from .graph import Graph
546from .geograph import GeoGraph
547from .grid import GridGraph