scgraph

scgraph

PyPI version License: MIT PyPI Downloads

Supply chain graph package for Python

scgraph

Documentation

Getting Started: https://github.com/connor-makowski/scgraph

Low Level: https://connor-makowski.github.io/scgraph/scgraph.html

Key Features

  • GeoGraphs:
    • A geographic graph data structure that allows for the calculation of shortest paths between two points on earth
    • Uses latitude / longitude pairs to represent points on earth
    • Supports maritime, rail, road and other geographic networks
    • Uses a sparse network data structure to represent the graph
    • How to use it - Calculate the shortest path between two points on earth
      • Inputs:
        • A latitude / longitude pair for the origin
        • A latitude / longitude pair for the destination
      • Calculation:
        • Algorithms:
          • Dijkstra's algorithm
            • Modified to support sparse network data structures
          • Modified Dijkstra algorithm
            • Modified for O((n+m)log(n)) performance where n is the number of nodes and m is the number of edges
            • Uses a priority queue and other improvements to run fast on large graphs
          • A* algorithm (Extension of the Modified Dijkstra)
            • Uses a heuristic function to improve performance on large graphs
          • Possible future support for other algorithms
      • Returns:
        • path:
          • A list of lists [latitude, longitude] that make up the shortest path
        • length:
          • The distance (in the units requested) between the two points
    • Precompiled Geographs offer Antimeridian support
    • Arbitrary start and end points are supported
      • Start and end points do not need to be in the graph
    • Cached shortest path calculations can be used for very fast repetative calculations from the same origin node in a GeoGraph.
      • This is done by caching the origin node's spanning tree
      • The first call will be slower, but future calls using this origin node will be substantially faster.
  • GridGraphs:
    • A grid based graph data structure that allows for the calculation of shortest paths between two points on a grid
    • Supports arbitrary grid sizes and blockages
    • Uses a sparse network data structure to represent the graph
    • How to use it - Calculate the shortest path between two points on a grid
      • Inputs:
        • A (x,y) coordinate on the grid for the origin
        • A (x,y) coordinate on the grid for the destination
      • Calculation:
        • Algorithms:
          • Dijkstra's algorithm
          • Modified Dijkstra algorithm
          • A* algorithm (Extension of the Modified Dijkstra)
      • Returns:
        • length:
          • The distance between the two points on the grid
        • coordinate_path:
          • A list of dicts {"x": x, "y": y} representing the path taken through the grid
    • Arbitrary start and end points are supported
      • Start and end points do not need to be in the graph
    • Arbitrary connection matricies are supported
      • Cardinal connections (up, down, left, right) and diagonal connections (up-left, up-right, down-left, down-right) are used by default
      • Custom connection matricies can be used to change the connections between grid items
    • Cached shortest path calculations can be used for very fast repetative calculations to or from the same point in a GridGraph.
  • Other Useful Features:
    • Graph
      • A low level graph object that has methods for validating graphs, calculating shortest paths, and more
    • CacheGraphs
      • A graph extension that caches spanning trees for fast shortest path calculations on repeat calls from the same origin node

Setup

Make sure you have Python 3.10.x (or higher) installed on your system. You can download it here.

Note: Support for python3.6-python3.9 is available up to version 2.2.0.

Installation

pip install scgraph

Use with Google Colab

Getting Started

Basic Geograph Usage

Get the shortest path between two points on earth using a latitude / longitude pair.

In this case, calculate the shortest maritime path between Shanghai, China and Savannah, Georgia, USA.

# Use a maritime network geograph
from scgraph.geographs.marnet import marnet_geograph

# Get the shortest maritime path between Shanghai, China and Savannah, Georgia, USA
# Note: The origin and destination nodes can be any latitude / longitude pair
output = marnet_geograph.get_shortest_path(
    origin_node={"latitude": 31.23,"longitude": 121.47},
    destination_node={"latitude": 32.08,"longitude": -81.09},
    output_units='km',
    # Optional: Cache the origin node's spanning tree for faster calculations on future calls from the same origin node when cache=True
    # Note: This will make the first call slower, but future calls using this origin node will be substantially faster.
    cache=True,
)
print('Length: ',output['length']) #=> Length:  19596.4653

Adding in a few additional parameters to the get_shortest_path function can change the output format as well as performance of the calculation.

# Use a maritime network geograph
from scgraph.geographs.marnet import marnet_geograph

# Get the shortest maritime path between Shanghai, China and Savannah, Georgia, USA
output = marnet_geograph.get_shortest_path(
    origin_node={"latitude": 31.23,"longitude": 121.47},
    destination_node={"latitude": 32.08,"longitude": -81.09},
    output_units='km',
    node_addition_lat_lon_bound=180, # Optional: The maximum distance in degrees to consider nodes when attempting to add the origin and destination nodes to the graph
    node_addition_type='quadrant', # Optional: Instead of connecting the origin node to the graph by the closest node, connect it to the closest node in each direction (NE, NW, SE, SW) if found within the node_addition_lat_lon_bound
    destination_node_addition_type='all', # Optional: Instead of connecting the destination node to the graph by the closest node, connect it to all nodes found within the node_addition_lat_lon_bound
    # When destination_node_addition_type='all' is set with a node_addition_lat_lon_bound=180, this will guarantee a solution can be found since the destination node will also connect to the origin node
)
print('Length: ',output['length']) #=> Length:  19596.4653

In the above example, the output variable is a dictionary with two keys: length and coordinate_path.

  • length: The distance between the passed origin and destination when traversing the graph along the shortest path
    • Notes:
      • This will be in the units specified by the output_units parameter.
      • output_units options:
        • km (kilometers - default)
        • m (meters)
        • mi (miles)
        • ft (feet)
  • coordinate_path: A list of lists [latitude,longitude] that make up the shortest path

You can also use the efficient distance matrix function to quickly get the distances between multiple points on the graph.

from scgraph.geographs.us_freeway import us_freeway_geograph

cities = [
    {"latitude": 34.0522, "longitude": -118.2437},  # Los Angeles
    {"latitude": 40.7128, "longitude": -74.0060},   # New York City
    {"latitude": 41.8781, "longitude": -87.6298},   # Chicago
    {"latitude": 29.7604, "longitude": -95.3698},   # Houston
]

distance_matrix = us_freeway_geograph.distance_matrix(cities, output_units='km')
# [
#  [0.0, 4510.965665644833, 3270.3864033755776, 2502.886438995942],
#  [4510.9656656448415, 0.0, 1288.473118634311, 2637.5821542546687],
#  [3270.3864033755744, 1288.4731186343113, 0.0, 1913.1928919854067],
#  [2502.886438995935, 2637.5821542546687, 1913.1928919854076, 0.0],
# ]

For more examples including viewing the output on a map, see the example notebook.

Included GeoGraphs

  • marnet_geograph:
    • What: A maritime network data set from searoute
    • Use: from scgraph.geographs.marnet import marnet_geograph
    • Attribution: searoute
    • Length Measurement: Kilometers
    • Marnet Picture
  • oak_ridge_maritime_geograph:
  • north_america_rail_geograph:
  • us_freeway_geograph:
  • scgraph_data geographs:
    • What: Additional geographs are available in the scgraph_data package
      • Note: These include larger geographs like the world highways geograph and world railways geograph.
    • Installation: pip install scgraph_data
    • Use: from scgraph_data.world_highways import world_highways_geograph
    • See: scgraph_data for more information and all available geographs.

GridGraph usage

Example:

  • Create a grid of 20x20 cells.
    • This creates a grid based graph with connections to all 8 neighbors for each grid item.
    • Each grid item has 4 cardinal connections at length 1 and 4 diagonal connections at length sqrt(2)
  • Create a wall from (10,5) to (10,19).
    • This would foce any path to go to the bottom of the graph to get around the wall.
  • Get the shortest path between (2,10) and (18,10)
    • Note: The length of this path should be 16 without the wall and 20.9704 with the wall.
from scgraph import GridGraph

x_size = 20
y_size = 20
blocks = [(10, i) for i in range(5, y_size)]

# Create the GridGraph object
gridGraph = GridGraph(
    x_size=x_size,
    y_size=y_size,
    blocks=blocks,
    add_exterior_walls=True,
)

# Solve the shortest path between two points
output = gridGraph.get_shortest_path(
    origin_node={"x": 2, "y": 10},
    destination_node={"x": 18, "y": 10},
    # Optional: Specify the output coodinate format (default is 'list_of_dicts)
    output_coordinate_path="list_of_lists",
    # Optional: Cache the origin point spanning_tree for faster calculations on future calls
    cache=True,
)

print(output)
#=> {'length': 20.9704, 'coordinate_path': [[2, 10], [3, 9], [4, 8], [5, 8], [6, 7], [7, 6], [8, 5], [9, 4], [10, 4], [11, 4], [12, 5], [13, 6], [14, 7], [15, 7], [16, 8], [17, 9], [18, 10]]}

Advanced Usage

Using scgraph_data geographs

Using scgraph_data geographs:

  • Note: Make sure to install the scgraph_data package before using these geographs
from scgraph_data.world_railways import world_railways_geograph
from scgraph import Graph

# Get the shortest path between Kalamazoo Michigan and Detroit Michigan by Train
output = world_railways_geograph.get_shortest_path(
    origin_node={"latitude": 42.29,"longitude": -85.58},
    destination_node={"latitude": 42.33,"longitude": -83.05},
    # Optional: Use the A* algorithm
    algorithm_fn=Graph.a_star,
    # Optional: Pass the haversine function as the heuristic function to the A* algorithm
    algorithm_kwargs={"heuristic_fn": world_railways_geograph.haversine},
)

Using Geographs for Visualization

Get a geojson line path of an output for easy visualization:

  • Note: mapshaper.org and geojson.io are good tools for visualizing geojson files
from scgraph.geographs.marnet import marnet_geograph
from scgraph.utils import get_line_path

 # Get the shortest sea path between Sri Lanka and Somalia
output = marnet_geograph.get_shortest_path(
    origin_node={"latitude": 7.87,"longitude": 80.77},
    destination_node={"latitude": 5.15,"longitude": 46.20}
)
# Write the output to a geojson file
get_line_path(output, filename='output.geojson')

Building your own Geographs from Open Source Data

You can build your own geographs using various tools and data sources. For example, you can use OpenStreetMap data to create a high fidelity geograph for a specific area.

Expand the secion below for a step by step guide on how to create a geograph from OpenStreetMap data.

Click to see an example for Michigan, USA

For this example, we will use some various tools to create a geograph for highways (including seconday highways) in Michigan, USA.

Download an OSM PBF file using the AWS CLI:

  • Geofabrik is a good source for smaller OSM PBF files. See: https://download.geofabrik.de/
  • To keep things generalizable, you can also download the entire planet OSM PBF file using AWS. But you should consider downloading a smaller region if you are only interested in a specific area.
    • Note: For this, you will need to install the AWS CLI.
    • Note: The planet OSM PBF file is very large (About 100GB)
    
aws s3 cp s3://osm-pds/planet-latest.osm.pbf .
  • Use Osmium to filter and extract the highways from the OSM PBF file.

    • Install osmium on macOS:

      brew install osmium-tool
      
    • Install osmium on Ubuntu:

      sudo apt-get install osmium-tool
      
  • Download a Poly file for the area you are interested in. This is a polygon file that defines the area you want to extract from the OSM PBF file.

    • For Michigan, you can download the poly file from Geofabrik:

      curl https://download.geofabrik.de/north-america/us/michigan.poly > michigan.poly
      
    • Google around to find an appropriate poly file for your area of interest.

  • Filter and extract as GeoJSON (EG: Michigan) substituting the poly and pbf file names as needed:

    osmium extract -p michigan.poly --overwrite -o michigan.osm.pbf planet-latest.osm.pbf
    
  • Filter the OSM PBF file to only areas of interest and export to GeoJSON:

    osmium tags-filter michigan.osm.pbf w/highway=motorway,trunk,primary,motorway_link,trunk_link,primary_link,secondary,secondary_link,tertiary,tertiary_link -t --overwrite -o michigan_roads.osm.pbf
    osmium export michigan_roads.osm.pbf -f geojson --overwrite -o michigan_roads.geojson
    
  • Simplify the geojson

    • This uses some tools in the SCGraph library as well as Mapshaper to simplify the geojson files.
    • Mapshaper is a CLI and web tool for simplifying and editing geojson files.
    • To install Mapshaper for CLI use, use NPM:

      npm install -g mapshaper
      
    • Mapshaper is particularly helpful since it repairs intersections in the lines which is crutial for geographs to work properly.

    • Mapshaper, however, does not handle larger files very well, so it is recommended to simplify the geojson file first using the scgraph.helpers.geojson.simplify_geojson function first to reduce the size of the file.
    • Make sure to tailor the parameters to your needs.
    python -c "from scgraph.helpers.geojson import simplify_geojson; simplify_geojson('michigan_roads.geojson', 'michigan_roads_simple.geojson', precision=4, pct_to_keep=100, min_points=3, silent=False)"
    mapshaper michigan_roads_simple.geojson -simplify 10% -filter-fields -o force michigan_roads_simple.geojson
    mapshaper michigan_roads_simple.geojson -snap -clean -o force michigan_roads_simple.geojson
    
  • Load the newly created geojson file as a geograph:

    • Note: The GeoGraph.load_from_geojson function is used to load the geojson file as a geograph.
    • This will create a geograph that can be used to calculate shortest paths between points on the graph.
    from scgraph import GeoGraph
    michigan_roads_geograph = GeoGraph.load_from_geojson('michigan_roads_simple.geojson')
    

Custom Graphs and Geographs

Modify an existing geograph: See the notebook here

You can specify your own custom graphs for direct access to the solving algorithms. This requires the use of the low level Graph class

from scgraph import Graph

# Define an arbitrary graph
# See the graph definitions here:
# https://connor-makowski.github.io/scgraph/scgraph/graph.html#Graph.validate_graph
graph = [
    {1: 5, 2: 1},
    {0: 5, 2: 2, 3: 1},
    {0: 1, 1: 2, 3: 4, 4: 8},
    {1: 1, 2: 4, 4: 3, 5: 6},
    {2: 8, 3: 3},
    {3: 6}
]

# Optional: Validate your graph
Graph.validate_graph(graph=graph)

# Get the shortest path between idx 0 and idx 5
output = Graph.dijkstra_makowski(graph=graph, origin_id=0, destination_id=5)
#=> {'path': [0, 2, 1, 3, 5], 'length': 10}

You can also use a slightly higher level GeoGraph class to work with latitude / longitude pairs

from scgraph import GeoGraph

# Define nodes
# See the nodes definitions here:
# https://connor-makowski.github.io/scgraph/scgraph/geograph.html#GeoGraph.__init__
nodes = [
    # London
    [51.5074, -0.1278],
    # Paris
    [48.8566, 2.3522],
    # Berlin
    [52.5200, 13.4050],
    # Rome
    [41.9028, 12.4964],
    # Madrid
    [40.4168, -3.7038],
    # Lisbon
    [38.7223, -9.1393]
]
# Define a graph
# See the graph definitions here:
# https://connor-makowski.github.io/scgraph/scgraph/graph.html#Graph.validate_graph
graph = [
    # From London
    {
        # To Paris
        1: 311,
    },
    # From Paris
    {
        # To London
        0: 311,
        # To Berlin
        2: 878,
        # To Rome
        3: 1439,
        # To Madrid
        4: 1053
    },
    # From Berlin
    {
        # To Paris
        1: 878,
        # To Rome
        3: 1181,
    },
    # From Rome
    {
        # To Paris
        1: 1439,
        # To Berlin
        2: 1181,
    },
    # From Madrid
    {
        # To Paris
        1: 1053,
        # To Lisbon
        5: 623
    },
    # From Lisbon
    {
        # To Madrid
        4: 623
    }
]

# Create a GeoGraph object
my_geograph = GeoGraph(nodes=nodes, graph=graph)

# Optional: Validate your graph
my_geograph.validate_graph()

# Optional: Validate your nodes
my_geograph.validate_nodes()

# Get the shortest path between two points
# In this case, Birmingham England and Zaragoza Spain
# Since Birmingham and Zaragoza are not in the graph,
# the algorithm will add them into the graph.
# See: https://connor-makowski.github.io/scgraph/scgraph/geograph.html#GeoGraph.get_shortest_path
# Expected output would be to go from
# Birmingham -> London -> Paris -> Madrid -> Zaragoza

output = my_geograph.get_shortest_path(
    # Birmingham England
    origin_node = {'latitude': 52.4862, 'longitude': -1.8904},
    # Zaragoza Spain
    destination_node = {'latitude': 41.6488, 'longitude': -0.8891}
)
print(output)
# {
#     'length': 1799.4323,
#     'coordinate_path': [
#         [52.4862, -1.8904],
#         [51.5074, -0.1278],
#         [48.8566, 2.3522],
#         [40.4168, -3.7038],
#         [41.6488, -0.8891]
#     ]
# }

Development

Running Tests, Prettifying Code, and Updating Docs

Make sure Docker is installed and running on a Unix system (Linux, MacOS, WSL2).

  • Create a docker container and drop into a shell
    • ./run.sh
  • Run all tests (see ./utils/test.sh)
    • ./run.sh test
  • Prettify the code (see ./utils/prettify.sh)
    • ./run.sh prettify
  • Update the docs (see ./utils/docs.sh)

    • ./run.sh docs
  • Note: You can and should modify the Dockerfile to test different python versions.

Attributions and Thanks

Originally inspired by searoute including the use of one of their datasets that has been modified to work properly with this package.

  1"""
  2# scgraph
  3[![PyPI version](https://badge.fury.io/py/scgraph.svg)](https://badge.fury.io/py/scgraph)
  4[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
  5[![PyPI Downloads](https://pepy.tech/badge/scgraph)](https://pypi.org/project/scgraph/)
  6<!-- [![PyPI Downloads](https://img.shields.io/pypi/dm/scgraph.svg?label=PyPI%20downloads)](https://pypi.org/project/scgraph/) -->
  7
  8
  9Supply chain graph package for Python
 10
 11
 12![scgraph](https://raw.githubusercontent.com/connor-makowski/scgraph/main/static/scgraph.png)
 13
 14## Documentation
 15
 16Getting Started: https://github.com/connor-makowski/scgraph
 17
 18Low Level: https://connor-makowski.github.io/scgraph/scgraph.html
 19
 20
 21## Key Features
 22
 23- `GeoGraph`s:
 24    - A geographic graph data structure that allows for the calculation of shortest paths between two points on earth
 25    - Uses latitude / longitude pairs to represent points on earth
 26    - Supports maritime, rail, road and other geographic networks
 27    - Uses a sparse network data structure to represent the graph
 28    - How to use it - Calculate the shortest path between two points on earth
 29        - Inputs:
 30            - A latitude / longitude pair for the origin
 31            - A latitude / longitude pair for the destination
 32        - Calculation:
 33            - Algorithms:
 34                - Dijkstra's algorithm
 35                    - Modified to support sparse network data structures
 36                - Modified Dijkstra algorithm
 37                    - Modified for O((n+m)log(n)) performance where n is the number of nodes and m is the number of edges
 38                    - Uses a priority queue and other improvements to run fast on large graphs
 39                - A* algorithm (Extension of the Modified Dijkstra)
 40                    - Uses a heuristic function to improve performance on large graphs
 41                - Possible future support for other algorithms
 42        - Returns:
 43            - `path`:
 44                - A list of lists `[latitude, longitude]` that make up the shortest path
 45            - `length`:
 46                - The distance (in the units requested) between the two points
 47    - Precompiled Geographs offer Antimeridian support
 48    - Arbitrary start and end points are supported
 49        - Start and end points do not need to be in the graph
 50    - Cached shortest path calculations can be used for very fast repetative calculations from the same origin node in a GeoGraph.
 51        - This is done by caching the origin node's spanning tree
 52        - The first call will be slower, but future calls using this origin node will be substantially faster.
 53- `GridGraph`s:
 54    - A grid based graph data structure that allows for the calculation of shortest paths between two points on a grid
 55    - Supports arbitrary grid sizes and blockages
 56    - Uses a sparse network data structure to represent the graph
 57    - How to use it - Calculate the shortest path between two points on a grid
 58        - Inputs:
 59            - A (x,y) coordinate on the grid for the origin
 60            - A (x,y) coordinate on the grid for the destination
 61        - Calculation:
 62            - Algorithms:
 63                - Dijkstra's algorithm
 64                - Modified Dijkstra algorithm
 65                - A* algorithm (Extension of the Modified Dijkstra)
 66        - Returns:
 67            - `length`:
 68                - The distance between the two points on the grid
 69            - `coordinate_path`:
 70                - A list of dicts `{"x": x, "y": y}` representing the path taken through the grid
 71    - Arbitrary start and end points are supported
 72        - Start and end points do not need to be in the graph
 73    - Arbitrary connection matricies are supported
 74        - Cardinal connections (up, down, left, right) and diagonal connections (up-left, up-right, down-left, down-right) are used by default
 75        - Custom connection matricies can be used to change the connections between grid items
 76    - Cached shortest path calculations can be used for very fast repetative calculations to or from the same point in a GridGraph.
 77- Other Useful Features:
 78    - Graph
 79        - A low level graph object that has methods for validating graphs, calculating shortest paths, and more
 80    - CacheGraphs
 81        - A graph extension that caches spanning trees for fast shortest path calculations on repeat calls from the same origin node
 82
 83
 84## Setup
 85
 86Make sure you have Python 3.10.x (or higher) installed on your system. You can download it [here](https://www.python.org/downloads/).
 87
 88Note: Support for python3.6-python3.9 is available up to version 2.2.0.
 89
 90## Installation
 91
 92```
 93pip install scgraph
 94```
 95
 96## Use with Google Colab
 97
 98- [Getting Started](https://colab.research.google.com/github/connor-makowski/scgraph/blob/main/examples/getting_started.ipynb)
 99- [Creating A Multi Path Geojson](https://colab.research.google.com/github/connor-makowski/scgraph/blob/main/examples/multi_path_geojson.ipynb)
100- [Modifying A Geograph](https://colab.research.google.com/github/connor-makowski/scgraph/blob/main/examples/geograph_modifications.ipynb)
101
102
103# Getting Started
104
105## Basic Geograph Usage
106
107Get the shortest path between two points on earth using a latitude / longitude pair.
108
109In this case, calculate the shortest maritime path between Shanghai, China and Savannah, Georgia, USA.
110
111```py
112# Use a maritime network geograph
113from scgraph.geographs.marnet import marnet_geograph
114
115# Get the shortest maritime path between Shanghai, China and Savannah, Georgia, USA
116# Note: The origin and destination nodes can be any latitude / longitude pair
117output = marnet_geograph.get_shortest_path(
118    origin_node={"latitude": 31.23,"longitude": 121.47},
119    destination_node={"latitude": 32.08,"longitude": -81.09},
120    output_units='km',
121    # Optional: Cache the origin node's spanning tree for faster calculations on future calls from the same origin node when cache=True
122    # Note: This will make the first call slower, but future calls using this origin node will be substantially faster.
123    cache=True,
124)
125print('Length: ',output['length']) #=> Length:  19596.4653
126```
127
128Adding in a few additional parameters to the `get_shortest_path` function can change the output format as well as performance of the calculation.
129```py
130# Use a maritime network geograph
131from scgraph.geographs.marnet import marnet_geograph
132
133# Get the shortest maritime path between Shanghai, China and Savannah, Georgia, USA
134output = marnet_geograph.get_shortest_path(
135    origin_node={"latitude": 31.23,"longitude": 121.47},
136    destination_node={"latitude": 32.08,"longitude": -81.09},
137    output_units='km',
138    node_addition_lat_lon_bound=180, # Optional: The maximum distance in degrees to consider nodes when attempting to add the origin and destination nodes to the graph
139    node_addition_type='quadrant', # Optional: Instead of connecting the origin node to the graph by the closest node, connect it to the closest node in each direction (NE, NW, SE, SW) if found within the node_addition_lat_lon_bound
140    destination_node_addition_type='all', # Optional: Instead of connecting the destination node to the graph by the closest node, connect it to all nodes found within the node_addition_lat_lon_bound
141    # When destination_node_addition_type='all' is set with a node_addition_lat_lon_bound=180, this will guarantee a solution can be found since the destination node will also connect to the origin node
142)
143print('Length: ',output['length']) #=> Length:  19596.4653
144```
145
146In the above example, the `output` variable is a dictionary with two keys: `length` and `coordinate_path`.
147
148- `length`: The distance between the passed origin and destination when traversing the graph along the shortest path
149    - Notes:
150        - This will be in the units specified by the `output_units` parameter.
151        - `output_units` options:
152            - `km` (kilometers - default)
153            - `m` (meters)
154            - `mi` (miles)
155            - `ft` (feet)
156- `coordinate_path`: A list of lists [`latitude`,`longitude`] that make up the shortest path
157
158
159You can also use the efficient distance matrix function to quickly get the distances between multiple points on the graph.
160```py
161from scgraph.geographs.us_freeway import us_freeway_geograph
162
163cities = [
164    {"latitude": 34.0522, "longitude": -118.2437},  # Los Angeles
165    {"latitude": 40.7128, "longitude": -74.0060},   # New York City
166    {"latitude": 41.8781, "longitude": -87.6298},   # Chicago
167    {"latitude": 29.7604, "longitude": -95.3698},   # Houston
168]
169
170distance_matrix = us_freeway_geograph.distance_matrix(cities, output_units='km')
171# [
172#  [0.0, 4510.965665644833, 3270.3864033755776, 2502.886438995942],
173#  [4510.9656656448415, 0.0, 1288.473118634311, 2637.5821542546687],
174#  [3270.3864033755744, 1288.4731186343113, 0.0, 1913.1928919854067],
175#  [2502.886438995935, 2637.5821542546687, 1913.1928919854076, 0.0],
176# ]
177```
178
179For more examples including viewing the output on a map, see the [example notebook](https://colab.research.google.com/github/connor-makowski/scgraph/blob/main/examples/getting_started.ipynb).
180
181## Included GeoGraphs
182
183- marnet_geograph:
184    - What: A maritime network data set from searoute
185    - Use: `from scgraph.geographs.marnet import marnet_geograph`
186    - Attribution: [searoute](https://github.com/genthalili/searoute-py)
187    - Length Measurement: Kilometers
188    - [Marnet Picture](https://raw.githubusercontent.com/connor-makowski/scgraph/main/static/marnet.png)
189- oak_ridge_maritime_geograph:
190    - What: A maritime data set from the Oak Ridge National Laboratory campus
191    - Use: `from scgraph.geographs.oak_ridge_maritime import oak_ridge_maritime_geograph`
192    - Attribution: [Oak Ridge National Laboratory](https://www.ornl.gov/) with data from [Geocommons](http://geocommons.com/datasets?id=25)
193    - Length Measurement: Kilometers
194    - [Oak Ridge Maritime Picture](https://raw.githubusercontent.com/connor-makowski/scgraph/main/static/oak_ridge_maritime.png)
195- north_america_rail_geograph:
196    - What: Class 1 Rail network for North America
197    - Use: `from scgraph.geographs.north_america_rail import north_america_rail_geograph`
198    - Attribution: [U.S. Department of Transportation: ArcGIS Online](https://geodata.bts.gov/datasets/usdot::north-american-rail-network-lines-class-i-freight-railroads-view/about)
199    - Length Measurement: Kilometers
200    - [North America Rail Picture](https://raw.githubusercontent.com/connor-makowski/scgraph/main/static/north_america_rail.png)
201- us_freeway_geograph:
202    - What: Freeway network for the United States
203    - Use: `from scgraph.geographs.us_freeway import us_freeway_geograph`
204    - Attribution: [U.S. Department of Transportation: ArcGIS Online](https://hub.arcgis.com/datasets/esri::usa-freeway-system-over-1500k/about)
205    - Length Measurement: Kilometers
206    - [US Freeway Picture](https://raw.githubusercontent.com/connor-makowski/scgraph/main/static/us_freeway.png)
207- `scgraph_data` geographs:
208    - What: Additional geographs are available in the `scgraph_data` package
209        - Note: These include larger geographs like the world highways geograph and world railways geograph.
210    - Installation: `pip install scgraph_data`
211    - Use: `from scgraph_data.world_highways import world_highways_geograph`
212    - See: [scgraph_data](https://github.com/connor-makowski/scgraph_data) for more information and all available geographs.
213
214## GridGraph usage
215
216Example:
217- Create a grid of 20x20 cells.
218    - This creates a grid based graph with connections to all 8 neighbors for each grid item.
219    - Each grid item has 4 cardinal connections at length 1 and 4 diagonal connections at length sqrt(2)
220- Create a wall from (10,5) to (10,19).
221    - This would foce any path to go to the bottom of the graph to get around the wall.
222- Get the shortest path between (2,10) and (18,10)
223    - Note: The length of this path should be 16 without the wall and 20.9704 with the wall.
224
225```py
226from scgraph import GridGraph
227
228x_size = 20
229y_size = 20
230blocks = [(10, i) for i in range(5, y_size)]
231
232# Create the GridGraph object
233gridGraph = GridGraph(
234    x_size=x_size,
235    y_size=y_size,
236    blocks=blocks,
237    add_exterior_walls=True,
238)
239
240# Solve the shortest path between two points
241output = gridGraph.get_shortest_path(
242    origin_node={"x": 2, "y": 10},
243    destination_node={"x": 18, "y": 10},
244    # Optional: Specify the output coodinate format (default is 'list_of_dicts)
245    output_coordinate_path="list_of_lists",
246    # Optional: Cache the origin point spanning_tree for faster calculations on future calls
247    cache=True,
248)
249
250print(output)
251#=> {'length': 20.9704, 'coordinate_path': [[2, 10], [3, 9], [4, 8], [5, 8], [6, 7], [7, 6], [8, 5], [9, 4], [10, 4], [11, 4], [12, 5], [13, 6], [14, 7], [15, 7], [16, 8], [17, 9], [18, 10]]}
252```
253
254## Advanced Usage
255
256### Using scgraph_data geographs
257
258Using `scgraph_data` geographs:
259- Note: Make sure to install the `scgraph_data` package before using these geographs
260```py
261from scgraph_data.world_railways import world_railways_geograph
262from scgraph import Graph
263
264# Get the shortest path between Kalamazoo Michigan and Detroit Michigan by Train
265output = world_railways_geograph.get_shortest_path(
266    origin_node={"latitude": 42.29,"longitude": -85.58},
267    destination_node={"latitude": 42.33,"longitude": -83.05},
268    # Optional: Use the A* algorithm
269    algorithm_fn=Graph.a_star,
270    # Optional: Pass the haversine function as the heuristic function to the A* algorithm
271    algorithm_kwargs={"heuristic_fn": world_railways_geograph.haversine},
272)
273```
274
275### Using Geographs for Visualization
276Get a geojson line path of an output for easy visualization:
277- Note: `mapshaper.org` and `geojson.io` are good tools for visualizing geojson files
278```py
279from scgraph.geographs.marnet import marnet_geograph
280from scgraph.utils import get_line_path
281
282 # Get the shortest sea path between Sri Lanka and Somalia
283output = marnet_geograph.get_shortest_path(
284    origin_node={"latitude": 7.87,"longitude": 80.77},
285    destination_node={"latitude": 5.15,"longitude": 46.20}
286)
287# Write the output to a geojson file
288get_line_path(output, filename='output.geojson')
289```
290
291### Building your own Geographs from Open Source Data
292You can build your own geographs using various tools and data sources. For example, you can use OpenStreetMap data to create a high fidelity geograph for a specific area.
293
294Expand the secion below for a step by step guide on how to create a geograph from OpenStreetMap data.
295<details>
296<summary>Click to see an example for Michigan, USA</summary>
297
298For this example, we will use some various tools to create a geograph for highways (including seconday highways) in Michigan, USA.
299
300Download an OSM PBF file using the AWS CLI:
301- Geofabrik is a good source for smaller OSM PBF files. See: https://download.geofabrik.de/
302- To keep things generalizable, you can also download the entire planet OSM PBF file using AWS. But you should consider downloading a smaller region if you are only interested in a specific area.
303    - Note: For this, you will need to install the AWS CLI.
304    - Note: The planet OSM PBF file is very large (About 100GB)
305        ```
306        aws s3 cp s3://osm-pds/planet-latest.osm.pbf .
307        ```
308- Use Osmium to filter and extract the highways from the OSM PBF file.
309    - Install osmium on macOS:
310        ```
311        brew install osmium-tool
312        ```
313    - Install osmium on Ubuntu:
314        ```
315        sudo apt-get install osmium-tool
316        ```
317- Download a Poly file for the area you are interested in. This is a polygon file that defines the area you want to extract from the OSM PBF file.
318    - For Michigan, you can download the poly file from Geofabrik:
319        ```
320        curl https://download.geofabrik.de/north-america/us/michigan.poly > michigan.poly
321        ```
322    - Google around to find an appropriate poly file for your area of interest.
323- Filter and extract as GeoJSON (EG: Michigan) substituting the poly and pbf file names as needed:
324    ```
325    osmium extract -p michigan.poly --overwrite -o michigan.osm.pbf planet-latest.osm.pbf
326    ```
327- Filter the OSM PBF file to only areas of interest and export to GeoJSON:
328    - See: https://wiki.openstreetmap.org/wiki/
329    - EG For Highways, see: https://wiki.openstreetmap.org/wiki/Key:highway
330    ```
331    osmium tags-filter michigan.osm.pbf w/highway=motorway,trunk,primary,motorway_link,trunk_link,primary_link,secondary,secondary_link,tertiary,tertiary_link -t --overwrite -o michigan_roads.osm.pbf
332    osmium export michigan_roads.osm.pbf -f geojson --overwrite -o michigan_roads.geojson
333    ```
334
335- Simplify the geojson
336    - This uses some tools in the SCGraph library as well as Mapshaper to simplify the geojson files.
337    - Mapshaper is a CLI and web tool for simplifying and editing geojson files.
338    - To install Mapshaper for CLI use, use NPM:
339        ```
340        npm install -g mapshaper
341        ```
342    - Mapshaper is particularly helpful since it repairs intersections in the lines which is crutial for geographs to work properly.
343    - Mapshaper, however, does not handle larger files very well, so it is recommended to simplify the geojson file first using the `scgraph.helpers.geojson.simplify_geojson` function first to reduce the size of the file.
344    - Make sure to tailor the parameters to your needs.
345    ```
346    python -c "from scgraph.helpers.geojson import simplify_geojson; simplify_geojson('michigan_roads.geojson', 'michigan_roads_simple.geojson', precision=4, pct_to_keep=100, min_points=3, silent=False)"
347    mapshaper michigan_roads_simple.geojson -simplify 10% -filter-fields -o force michigan_roads_simple.geojson
348    mapshaper michigan_roads_simple.geojson -snap -clean -o force michigan_roads_simple.geojson
349    ```
350- Load the newly created geojson file as a geograph:
351    - Note: The `GeoGraph.load_from_geojson` function is used to load the geojson file as a geograph.
352    - This will create a geograph that can be used to calculate shortest paths between points on the graph.
353    ```
354    from scgraph import GeoGraph
355    michigan_roads_geograph = GeoGraph.load_from_geojson('michigan_roads_simple.geojson')
356    ```
357</details>
358
359### Custom Graphs and Geographs
360Modify an existing geograph: See the notebook [here](https://colab.research.google.com/github/connor-makowski/scgraph/blob/main/examples/geograph_modifications.ipynb)
361
362
363You can specify your own custom graphs for direct access to the solving algorithms. This requires the use of the low level `Graph` class
364
365```py
366from scgraph import Graph
367
368# Define an arbitrary graph
369# See the graph definitions here:
370# https://connor-makowski.github.io/scgraph/scgraph/graph.html#Graph.validate_graph
371graph = [
372    {1: 5, 2: 1},
373    {0: 5, 2: 2, 3: 1},
374    {0: 1, 1: 2, 3: 4, 4: 8},
375    {1: 1, 2: 4, 4: 3, 5: 6},
376    {2: 8, 3: 3},
377    {3: 6}
378]
379
380# Optional: Validate your graph
381Graph.validate_graph(graph=graph)
382
383# Get the shortest path between idx 0 and idx 5
384output = Graph.dijkstra_makowski(graph=graph, origin_id=0, destination_id=5)
385#=> {'path': [0, 2, 1, 3, 5], 'length': 10}
386```
387
388You can also use a slightly higher level `GeoGraph` class to work with latitude / longitude pairs
389
390```py
391from scgraph import GeoGraph
392
393# Define nodes
394# See the nodes definitions here:
395# https://connor-makowski.github.io/scgraph/scgraph/geograph.html#GeoGraph.__init__
396nodes = [
397    # London
398    [51.5074, -0.1278],
399    # Paris
400    [48.8566, 2.3522],
401    # Berlin
402    [52.5200, 13.4050],
403    # Rome
404    [41.9028, 12.4964],
405    # Madrid
406    [40.4168, -3.7038],
407    # Lisbon
408    [38.7223, -9.1393]
409]
410# Define a graph
411# See the graph definitions here:
412# https://connor-makowski.github.io/scgraph/scgraph/graph.html#Graph.validate_graph
413graph = [
414    # From London
415    {
416        # To Paris
417        1: 311,
418    },
419    # From Paris
420    {
421        # To London
422        0: 311,
423        # To Berlin
424        2: 878,
425        # To Rome
426        3: 1439,
427        # To Madrid
428        4: 1053
429    },
430    # From Berlin
431    {
432        # To Paris
433        1: 878,
434        # To Rome
435        3: 1181,
436    },
437    # From Rome
438    {
439        # To Paris
440        1: 1439,
441        # To Berlin
442        2: 1181,
443    },
444    # From Madrid
445    {
446        # To Paris
447        1: 1053,
448        # To Lisbon
449        5: 623
450    },
451    # From Lisbon
452    {
453        # To Madrid
454        4: 623
455    }
456]
457
458# Create a GeoGraph object
459my_geograph = GeoGraph(nodes=nodes, graph=graph)
460
461# Optional: Validate your graph
462my_geograph.validate_graph()
463
464# Optional: Validate your nodes
465my_geograph.validate_nodes()
466
467# Get the shortest path between two points
468# In this case, Birmingham England and Zaragoza Spain
469# Since Birmingham and Zaragoza are not in the graph,
470# the algorithm will add them into the graph.
471# See: https://connor-makowski.github.io/scgraph/scgraph/geograph.html#GeoGraph.get_shortest_path
472# Expected output would be to go from
473# Birmingham -> London -> Paris -> Madrid -> Zaragoza
474
475output = my_geograph.get_shortest_path(
476    # Birmingham England
477    origin_node = {'latitude': 52.4862, 'longitude': -1.8904},
478    # Zaragoza Spain
479    destination_node = {'latitude': 41.6488, 'longitude': -0.8891}
480)
481print(output)
482# {
483#     'length': 1799.4323,
484#     'coordinate_path': [
485#         [52.4862, -1.8904],
486#         [51.5074, -0.1278],
487#         [48.8566, 2.3522],
488#         [40.4168, -3.7038],
489#         [41.6488, -0.8891]
490#     ]
491# }
492
493```
494
495# Development
496## Running Tests, Prettifying Code, and Updating Docs
497
498Make sure Docker is installed and running on a Unix system (Linux, MacOS, WSL2).
499
500- Create a docker container and drop into a shell
501    - `./run.sh`
502- Run all tests (see ./utils/test.sh)
503    - `./run.sh test`
504- Prettify the code (see ./utils/prettify.sh)
505    - `./run.sh prettify`
506- Update the docs (see ./utils/docs.sh)
507    - `./run.sh docs`
508
509- Note: You can and should modify the `Dockerfile` to test different python versions.
510
511
512## Attributions and Thanks
513Originally inspired by [searoute](https://github.com/genthalili/searoute-py) including the use of one of their [datasets](https://github.com/genthalili/searoute-py/blob/main/searoute/data/marnet_densified_v2_old.geojson) that has been modified to work properly with this package.
514"""
515
516from .graph import Graph
517from .geograph import GeoGraph
518from .grid import GridGraph