According to Fermat’s principle, seismic waves follow paths of least resistance. Thus, shortest path algorithms such as Dijkstra’s can be used to locate seismic sources. Conversely, inferring the parameters of a mathematical program from an observed optimal path defines the inverse shortest path problem, a key area within inverse optimization.
With its wide range of applicability, inverse optimization has attracted considerable interest. One of the earliest topics in this field was the inverse shortest path problem, with Burton and Toint (1992) laying its foundations. Since then, this problem has been explored across several applications and areas, with various mathematical formulations and algorithms proposed.
This repository provides attempts to solve the Inverse Shortest Path Problem, where the goal is to infer edge weights of an undirected graph which makes given desired paths optimal. It makes use of three algorithms:
- Column Generation as proposed by Zhang et al. (1995)
- A quadratic programming algorithm developed by Burton and Toint (1992)
- A deep inverse optimization technique, based on deep learning, developed by Tan et al. (2018).
The repository contains multiple scripts implementing these algorithms. It also includes parsing and preprocessing of data from the ISC bulletin. Additionally, it features visualization of earthquake-to-station paths along with outputs such as plots and statistical tests for all algorithms. The code is modularized into several components to enhance clarity and maintainability.
1. Parsing.py
- Reads and parses a seismic event file (isc.txt) line by line.
- Detects new earthquake events and extracts origin time, latitude, and longitude.
- Extracts station picks (P and S seismic phases) along with their coordinates.
- Aggregates event and station pick data into a structured format using pandas.
- Saves the parsed data into an Excel file (Nodes.xlsx) with separate sheets for all data, P picks, and S picks.
2. Preprocessing.py
- Reads seismic P and S phase picks from Nodes.xlsx and filters nodes by distance from a fixed point.
- Deduplicates picks to retain only one pick per Event-Node combination.
- Constructs complete graphs of all nodes by calculating pairwise distances between them.
- Builds per-event “desired paths” connecting epicenters and stations when at least two nodes are present.
- Saves filtered picks, graphs, paths, and summary into a Graph.xlsx with multiple sheets.
3. Plots.py
- Loads filtered seismic pick and path summary data from Graph.xlsx.
- Converts path strings into geospatial line segments (LineStrings).
- Creates GeoDataFrames for epicenters (sources) and seismic stations.
- Builds a fully connected graph showing all possible node-to-node links.
- Plots multiple maps showing epicenters, stations, shortest paths, and the full network graph with basemaps.
4. Column_Generation.py
- Uses an undirected graph with weighted edges representing distances.
- Builds and solves an initial RMP which minimizes the slack variables related to edge weight adjustments.
- Detects violated shortest paths to add new columns.
- Iterates until no further violated paths are found.
5. Burton_Toint.py
- Initialize edge weights and empty active set.
- Detect the most violated constraint (desired path not shortest).
- Project and normalize the constraint vector.
- Decide between primal step (adding a new constraint) or dual step (dropping a constraint).
- Update edge weights accordingly.
- Optionally postprocess weights (commented out by default).
- Return the optimized weights and a detailed log.
6. Deep_Inverse_Optimization.py
- Parse edges and desired paths from dataframes.
- Represent paths as flow vectors (x_targets).
- Build LP constraints.
- Initialize weights (w0) and edge list.
- Hyperparameter Selection (CV)
- Initialize Training - Clone w0 as learnable w.
- Setup optimizer (Adam) and LR scheduler.
- Create batch loader from desired paths.
- For each epoch and batch:
- Solve LPs (via linprog_ip_batch) using current weights.
- Compute loss: MSE between LP solution and target path
- Backpropagate and update weights
- Compute full dataset loss
- Check if all desired paths are shortest under current weights
- Apply early stopping if the loss improvement is below the tolerance for a given number of patience epochs
- Save Results
7. Output.py
- Performs a comprehensive statistical and visual analysis of the results obtained from all algorithms.
- Computes global statistics, including Friedman's tests and Nemenyi's post-hoc test.
- Generates boxplots, correlation matrices.
- Produces iteration plots for each algorithm’s metrics.
- Interpolates and visualizes geographic edge weight distributions on maps.
- Exports final weight and difference maps.
To run the code, make sure you have Python installed (recommended 3.7+). The project requires the following packages:
numpypandasnetworkxscipygeopandasshapelymatplotlibcontextilyreostimeitertoolsseabornscikit-posthocsstatsmodelscartopyYou can install missing packages via pip, for example:
pip install numpy pandas networkx scipy geopandas shapely matplotlib seaborn scikit-posthocs statsmodels cartopy
To reproduce the full pipeline, follow these steps:
1. Prepare Seismic Data
- Download raw data from the ISC Bulletin
- Save the file as isc.txt in the project directory
- Run Parsing.py:
python Parsing.py
- Output:
Nodes.xlsxwhich contains parsed earthquake events and seismic station picks
2. Filter Data and Build Graphs
- Input: Nodes.xlsx
- Run Preprocessing.py:
python Preprocessing.py
- Output: Graph.xlsx - includes graphs, nodes, and desired paths per event
3. Visualize Earthquake-Station Paths
- Input: Graph.xlsx
- Run Plots.py:
python Plots.py
- Output: plots of the earthquake-station paths for each P and S phases and the full connected graph when considering all stations and sources
4. Solve Inverse Shortest Path Problem
- Run any one of the following:
-- Column_Generation.py:
python Column_Generation.py
-- Burton_Toint.py:
python Burton_Toint.py
-- Deep_Inverse_Optimization.py
python Deep_Inverse_Optimization.py
- Outputs: Results (inferred weights, logs, and stats) are saved into Excel files within output subfolders
5. Analyze and Visualize Algorithm Outputs
- Input: Results from the selected optimization algorithm, along with Nodes.xlsx
- Run Output.py:
python Output.py
- Output:
- Global statistical tests and boxplots comparing algorithm performance
- Per-algorithm exploratory statistics, normality checks, and regression diagnostics
- Smoothed iteration plots over training
- Interpolated geographic maps of final edge weights and differences
- Figures saved as .png files and printed summaries/statistics
Burton, D. and Toint, Ph., 1997. On the use of an inverse shortest path algorithm for recovering linearly correlated costs. Mathematical Programming, 63(1-3), pp.1-22.
Duan, X., 2022. Modifying a path into the shortest path, dissertation. MSc thesis. Delft University of Technology, Netherlands.
Storchak, D.A., Harris, J., Brown, L., Lieser, K., Shumba, B., Verney, R., Di Giacomo, D. and Korger, E.I.M., 2017. Rebuild of the bulletin of the International Seismological Centre (ISC) – part 2: 1964 -1979. Geoscience Letters, 4(32), pp. 1-14.
Storchak, D.A., Harris, J., Brown, L., Lieser, K., Shumba, B. and Di Giacomo, D., 2020. Rebuild of the bulletin of the International Seismological Centre (ISC) – part 2: 1980 -2010. Geoscience Letters, 7(18), pp. 1-21.
Tan, Y., Delong, A. and Terekhov, D., 2018. Deep inverse optimization. arXiv preprint,arXiv:1812.00804.
Zhang, J, Zhongfan, M. and Yang, C., 1995. A column generation method for inverse shortest path problems. Zeitschrift für Operations Research, 41, pp. 347-358.
MIT License (See License)
For questions or feedback, please contact Jeremy Farrugia on jeremy.farrugia.13@um.edu.mt