ICS 311 #18: Single Source Shortest Paths


Outline

Today's Theme: Relax!

  1. Shortest Paths Problems
  2. Bellman-Ford Algorithm
  3. Shortest Paths in a DAG
  4. Dijkstra's Algorithm

Readings and Screencasts


Shortest Paths Problems

or how to get there from here ...

Definition

Input is a directed graph G = (V, E) and a weight function w: E → ℝ.

Define the path weight w(p) of path p = ⟨v0, v1, ... vk⟩ to be the sum of edge weights on the path:

w(p) = ∑i=1..k w(vi−1, vi) = w(v0,v1) + w(v1,v2) + ⋯ + w(vk−1,vk)

Then the shortest path weight from u to v is:

δ(u, v) = min { w(p) : p is a path from u to v } if such a path exists, and ∞ otherwise.

A shortest path from u to v is any path p such that w(p) = δ(u, v).

Note that every vertex has a path of length 0 to itself. Therefore, when all weights are positive, δ(v, v) = 0 (not ∞), since the empty sum of edge weights is 0. (Later we discuss the case where there are other paths leaving and returning to v that have negative weights.)

Examples

In our examples the shortest paths will always start from s, the source. The δ values will appear inside the vertices, and shaded edges show the shortest paths.

The image on the left shades in shortest paths from s to all other vertices. The image on the right shows an alternate path of weight 11 from s to z (and also alternative paths from s to y and x): shortest paths are not unique.

Variations on the Shortest Paths Problem

CLRS describes several variants, all closely related to the single-source problem:

  Single-source: find shortest paths from a given source vertex s to every vertex v ∈ V (our focus in this lecture).
  Single-destination: find shortest paths from every vertex to a given destination t (reverse the edge directions and solve single-source).
  Single-pair: find a shortest path from u to v for given u and v (no algorithm is known that is asymptotically faster in the worst case than solving single-source).
  All-pairs: find shortest paths between every pair of vertices (Floyd-Warshall, discussed in a later lecture).

Negative Weight Edges

Negative weight edges are OK as long as no negative-weight cycles are reachable from the source s. Fill in the blanks for the vertices on the right:

If a negative-weight cycle is reachable, it can be iterated to make δ(s, v) arbitrarily small (that is, −∞) for all v on the cycle: see the cycle involving e and f and its effect on g below:

Therefore shortest paths are undefined when there are negative-weight cycles (as in the graph above). When there are no negative-weight cycles, we still have δ(v, v) = 0, since the only way it could be less is via a negative-weight cycle.

If s is the start vertex, vertices on cycle ⟨h, i, j⟩ above remain at ∞ cost, as they are not reachable from s.

Some algorithms can detect negative-weight cycles and others cannot, but when they are present shortest paths are not well defined.

Cycles

Shortest paths cannot contain cycles: negative-weight cycles make shortest paths undefined, as just discussed, and any cycle of zero or positive weight can be removed from a path without increasing the path's weight. So we may restrict attention to simple paths, which have at most |V| − 1 edges.

Optimal Substructure

The shortest paths problem exhibits optimal substructure, suggesting that greedy algorithms and dynamic programming may apply. It turns out we will see examples of both: Dijkstra's algorithm for single-source shortest paths is greedy, and Floyd-Warshall for all-pairs shortest paths uses dynamic programming.

Lemma: Any subpath of a shortest path is a shortest path.

Proof is by cut and paste. Suppose puv is a shortest path from u to v and that it includes subpath pxy (the subpath represents a subproblem):

Then δ(u, v) = w(puv) = w(pux) + w(pxy) + w(pyv).

Now, for proof by contradiction, suppose that substructure is not optimal, meaning that for some choice of these paths there exists a path p'xy from x to y that is shorter than pxy: w(p'xy) < w(pxy).

From this, we can construct a path p' by splicing p'xy in place of pxy:

Then w(p') = w(pux) + w(p'xy) + w(pyv) < w(pux) + w(pxy) + w(pyv) = w(puv) = δ(u, v),

which contradicts the assumption that puv is a shortest path.

Algorithms

All the algorithms we consider will have the following in common.

Output

For each vertex v ∈ V, we maintain these attributes:

v.d = the weight of the shortest path from s to v found so far, initially ∞; v.d is called the shortest-path estimate.

v.π = the predecessor of v by which it was reached on the shortest path known so far.

Initialization

All the single-source shortest-paths algorithms start with this:
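
INITIALIZE-SINGLE-SOURCE(G, s)
1 for each vertex v ∈ G.V
2     v.d = ∞
3     v.π = NIL
4 s.d = 0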

Relaxation

They all apply the relaxation procedure, which essentially asks: can we improve the current shortest-path estimate for v by going through u and taking (u, v)? (The parameters u and v are vertices but w is a weight function.)
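
RELAX(u, v, w)
1 if v.d > u.d + w(u,v)
2     v.d = u.d + w(u,v)
3     v.π = u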

The algorithms differ in the order in which they relax each edge and how many times they do that.

Shortest Paths Properties

All but the first of these properties assume that INITIALIZE-SINGLE-SOURCE has been called once, and then RELAX is called zero or more times.

Triangle Inequality
For any edge (u,v) ∈ E, we have δ(s,v) ≤ δ(s,u) + w(u,v).

Upper-Bound Property
We always have v.d ≥ δ(s,v) for all vertices v ∈ V, and once v.d achieves the value δ(s,v), it never changes.

No-Path Property
If there is no path from s to v, then we always have v.d = δ(s,v) = ∞.

Convergence Property
If s ⇝ u → v is a shortest path in G for some u, v ∈ V, and if u.d = δ(s,u) at any time prior to relaxing edge (u,v), then v.d = δ(s,v) at all times afterward.

Path-Relaxation Property
If p = ⟨v0, v1, ..., vk⟩ is a shortest path from s = v0 to vk, and we relax the edges of p in the order (v0, v1), (v1, v2), ..., (vk−1, vk), then vk.d = δ(s,vk). This property holds regardless of any other relaxation steps that occur, even if they are intermixed with relaxations of the edges of p.

Predecessor-Subgraph Property
Once v.d = δ(s,v) for all v ∈ V, the predecessor subgraph is a shortest-paths tree rooted at s.

Proofs are available in the CLRS text. Try to explain informally why these are necessarily true.


Bellman-Ford Algorithm

The path-relaxation property leads to a brute force strategy: relax systematically enough times that you can be sure you are done. The Bellman-Ford algorithm takes this approach. It handles negative edge weights, and returns TRUE if and only if there is no negative-weight cycle reachable from the source:
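
BELLMAN-FORD(G, w, s)
1 INITIALIZE-SINGLE-SOURCE(G, s)
2 for i = 1 to |G.V| − 1
3     for each edge (u,v) ∈ G.E
4         RELAX(u, v, w)
5 for each edge (u,v) ∈ G.E
6     if v.d > u.d + w(u,v)
7         return FALSE
8 return TRUE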

The first nested for loops relax all edges enough times that there is a subsequence of relaxations corresponding to every possible shortest path, since the longest possible simple path has |G.V| − 1 edges. Therefore, by the Path-Relaxation Property, all shortest paths must be found (proof sketched below).

What does the last for loop do, and how does it work?

Analysis:

RELAX is O(1), and the nested for loops relax all |E| edges |V| - 1 times, so BELLMAN-FORD is Θ(VE).
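
For concreteness, here is a minimal runnable sketch in Python (an illustration, not from the text), assuming the graph is given as a collection of vertex names and a list of (u, v, weight) triples:

from math import inf

def bellman_ford(vertices, edges, s):
    # vertices: collection of vertex names; edges: list of (u, v, w) triples
    d = {v: inf for v in vertices}      # Initialize-Single-Source
    pi = {v: None for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):  # |V| - 1 passes
        for u, v, w in edges:           # relax every edge
            if d[u] + w < d[v]:
                d[v] = d[u] + w
                pi[v] = u
    for u, v, w in edges:               # a still-relaxable edge means a
        if d[u] + w < d[v]:             # reachable negative-weight cycle
            return d, pi, False
    return d, pi, True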

Examples:

Example from the text, relaxed in order (t,x), (t,y), (t,z), (x,t), (y,x), (y,z), (z,x), (z,s), (s,t), (s,y):

Try this other example (click for answer):

Correctness

The values for v.d and v.π are guaranteed to converge on shortest paths after |V| - 1 passes, assuming no negative-weight cycles.

This can be proven with the Path-Relaxation Property, which states that if we relax the edges of a shortest path ⟨v0, v1, ... vk⟩ in order, even if interleaved with other edges, then vk.d = δ(s,vk) after edge (vk−1, vk) is relaxed.

Since the edge list is relaxed as many times as the maximum number of edges on any shortest path (|V| − 1), the estimates must converge by this property.

The Bellman-Ford algorithm can also be considered a bottom-up dynamic programming algorithm that stores subproblem solutions directly in the graph annotations v.d and v.π: after pass i of the outer loop, each v.d is at most the weight of a shortest path from s to v that uses at most i edges, for i up until |V| − 1, the largest number of edges on any simple path.

We must also show that the TRUE/FALSE return value is correct. Informally, if some v.d can still be made smaller in line 6 after it should have converged (by the argument above), then there must be a negative-weight cycle that keeps decreasing the path weight.

The full proof of correctness may be found in the CLRS book.

The values computed on each pass, and how quickly the algorithm converges, depend on the order in which the edges are relaxed: it may converge in fewer than |V| − 1 passes.

How can we use this fact to speed the algorithm up a bit?


Shortest Paths in a DAG

Life is easy when you are a DAG ...

There are no cycles in a Directed Acyclic Graph, so negative weights are not a problem: there can be no negative-weight cycles. Also, vertices must occur on shortest paths in an order consistent with a topological sort.

We can do something like Bellman-Ford, but don't need to do it as many times, and don't need to check for negative weight cycles:
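
DAG-SHORTEST-PATHS(G, w, s)
1 topologically sort the vertices of G
2 INITIALIZE-SINGLE-SOURCE(G, s)
3 for each vertex u, taken in topologically sorted order
4     for each vertex v ∈ G.Adj[u]
5         RELAX(u, v, w)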

Analysis:

Given that topological sort is Θ(V + E), what's the run-time complexity of DAG-SHORTEST-PATHS? This one's on you: use aggregate analysis on line 4 ...

Correctness:

Because we process vertices in topologically sorted order, edges of any path must be relaxed in order of appearance in the path.

Therefore the edges of any shortest path are relaxed in order during the single pass over each vertex's outgoing edges.

Therefore, by the Path-Relaxation Property, the algorithm terminates with correct values.

Examples

From the text:

Notice we could not reach r!

Let's try another example (click for answer):


Dijkstra's Algorithm

The algorithm is essentially a weighted version of breadth-first search: BFS uses a FIFO queue, while this version of Dijkstra's algorithm uses a priority queue.

It also has similarities to Prim's algorithm: both are greedy and iterate similarly. The essential difference is that Dijkstra's orders the vertices on the priority queue by the estimated distance v.d of the total path to reach each vertex v, not just the weight w(u,v) of a single edge that reaches it.

Importantly (and unlike Prim's algorithm), it assumes there are no negative-weight edges.

Algorithm

Here is the algorithm as given by CLRS (the code for Relax was shown earlier), with MST-Prim following for comparison.
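
DIJKSTRA(G, w, s)
1 INITIALIZE-SINGLE-SOURCE(G, s)
2 S = ∅
3 Q = G.V
4 while Q ≠ ∅
5     u = EXTRACT-MIN(Q)
6     S = S ∪ {u}
7     for each vertex v ∈ G.Adj[u]
8         RELAX(u, v, w)

MST-PRIM(G, w, r)
1  for each u ∈ G.V
2      u.key = ∞
3      u.π = NIL
4  r.key = 0
5  Q = G.V
6  while Q ≠ ∅
7      u = EXTRACT-MIN(Q)
8      for each v ∈ G.Adj[u]
9          if v ∈ Q and w(u,v) < v.key
10             v.π = u
11             v.key = w(u,v)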

Notice that lines 1-4 of Prim's are nearly identical to Initialize-Single-Source, and lines 9-11 of Prim's play the same role as Relax. Line 3 of Dijkstra and line 5 of MST-Prim insert all the vertices in the priority queue, and the remaining while loops to extract vertices from the queue and for loops to access adjacent vertices serve the same functions.

Dijkstra's algorithm is greedy in choosing the closest vertex in V − S to add to S each iteration, where "closest" means smallest shortest-path estimate v.d, rather than (as in Prim's) lightest connecting edge.

The above CLRS specification of the algorithm can be improved. Relax(u,v,w) updates the shortest path estimates v.d of the vertices that are in the priority queue. To make sure the keys of the priority queue are updated properly, we must call DecreaseKey on the vertex being updated. We can only know whether to do this if Relax tells us whether there was a change. Therefore the following modifications are needed:

Relax(u,v,w)
1 if v.d > u.d + w(u,v)
2     v.d = u.d + w(u,v)
3     v.π = u
4     return TRUE
5 else
6     return FALSE

Then change Dijkstra as follows:

8        if Relax(u, v, w)
9            DecreaseKey(Q, v, v.d)  
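
The standard binary-heap priority queue supports DecreaseKey, but some libraries' priority queues do not (e.g., Python's heapq). A common workaround, sketched below under that assumption, is "lazy deletion": push a fresh (estimate, vertex) entry on each successful relaxation, and skip stale entries when they are popped:

import heapq
from math import inf

def dijkstra(adj, s):
    # adj: dict mapping each vertex u to a list of (v, w) pairs;
    # assumes all edge weights are nonnegative
    d = {v: inf for v in adj}
    pi = {v: None for v in adj}
    d[s] = 0
    pq = [(0, s)]                        # (shortest-path estimate, vertex)
    while pq:
        du, u = heapq.heappop(pq)
        if du > d[u]:
            continue                     # stale entry; u was already finalized
        for v, w in adj[u]:
            if d[u] + w < d[v]:          # Relax(u, v, w) succeeded
                d[v] = d[u] + w
                pi[v] = u
                heapq.heappush(pq, (d[v], v))  # in place of DecreaseKey
    return d, pi

The trade-off: the queue can grow to O(E) entries instead of O(V), but since |E| ≤ |V|² each push and pop is still O(lg V), so the asymptotic bound is the same as with DecreaseKey on a binary heap.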

Examples

From the text (black vertices are set S; white vertices are on Q; shaded vertex is the min valued one chosen next iteration):

Let's try another example (click for answer):

Here's a graph with a negative weight: try it from s and see what happens:

What went wrong?

It's not simply a question of whether vertices are updated after they are taken off Q. If you traced it correctly, you will see one vertex updated after it is put in S, but this update is not propagated to another vertex already in S.

Correctness

As you just saw, we need to assume that there are no negative weights.

The proof is based on the following loop invariant at the start of the while loop:

v.d = δ(s, v) for all v ∈ S.

Initialization: Initially S = ∅, so trivially true.

Maintenance: We just sketch this part (see the text for details). We need to show that u.d = δ(s, u) when u is added to S in each iteration. The Upper-Bound Property says it will stay the same thereafter.

Suppose (for proof by contradiction) that ∃ u such that u.d ≠ δ(s, u) when added to S. Without loss of generality, let u be the first such vertex added to S.

Termination: At the end, Q is empty, so S = V, so v.d = δ(s, v) for all v ∈ V.

Analysis

The run time depends on the implementation of the priority queue.

If binary min-heaps are used: building the heap takes O(V); each of the |V| EXTRACT-MIN calls and each of the up to |E| DECREASE-KEY calls takes O(lg V); so the total is O((V + E) lg V), which is O(E lg V) if all vertices are reachable from the source.

With Fibonacci heaps (which were developed specifically to speed up this algorithm), O(V lg V + E) is possible. (Do not use this result on homeworks and exams unless you are specifically using Fibonacci heaps!)


Dan Suthers (with edits suggested by Nodari Sitchinava)
Last modified: Sat Nov 14 02:41:06 HST 2020
Images are from the instructor's material for Cormen et al. Introduction to Algorithms, Third Edition.