Today's Theme: Relax!

- Shortest Paths Problems
- Bellman-Ford Algorithm
- Shortest Paths in a DAG
- Dijkstra's Algorithm

- Required: CLRS 3rd Ed., Sections 24.1-24.3.
- See also: Sedgewick (1984) Chapter 31 for light conceptual introduction (in Laulima), or Sedgewick & Wayne (2001) Algorithms Chapter 4 for code and application examples.
- Screencasts 18 A Intro, 18 B Bellman-Ford, 18 C Dijkstra.

or how to get there from here ...

Input is a directed graph *G* = (*V*, *E*) and a **weight function** *w*: *E* → ℜ.

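Define the **path weight** *w*(*p*) of path *p* = ⟨*v*_{0}, *v*_{1}, ..., *v*_{k}⟩ to be the sum of the weights of its edges: *w*(*p*) = Σ_{i=1..k} *w*(*v*_{i−1}, *v*_{i}).

Then the **shortest path weight** from *u* to *v* is δ(*u*, *v*) = min{*w*(*p*) : *p* is a path from *u* to *v*} if such a path exists, and δ(*u*, *v*) = ∞ otherwise.

A **shortest path** from *u* to *v* is any path *p* such that *w*(*p*) =
δ(*u*, *v*).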

Note that every vertex has a path of length 0 to itself, so δ(*v*, *v*) = 0 (not ∞), even
if there are other paths leaving and returning to *v*.
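As a small illustration of the path weight definition, here is a minimal Python sketch. The representation (a dict `w` keyed by edge pairs, a path as a list of vertices) and the sample weights are assumptions made for this example, not the graph from the figures.

```python
def path_weight(w, path):
    """Sum the weights of consecutive edges along path (a list of vertices)."""
    return sum(w[(u, v)] for u, v in zip(path, path[1:]))

# Hypothetical edge weights, chosen only for illustration.
w = {("s", "t"): 3, ("t", "x"): 6, ("s", "y"): 5, ("y", "x"): 4}

print(path_weight(w, ["s", "t", "x"]))  # 3 + 6 = 9
print(path_weight(w, ["s", "y", "x"]))  # 5 + 4 = 9, so two different paths tie
print(path_weight(w, ["s"]))            # a single vertex has no edges: weight 0
```

The last call mirrors the note above: the empty path from a vertex to itself has weight 0.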

In our examples the shortest paths will always start from *s*, the **source**. The
δ values will appear inside the vertices, and shaded edges show the shortest paths.

The image on the left shades in shortest paths from *s* to all other vertices. The image on
the right shows an alternate path of weight 11 from *s* to *z* (and also alternative
paths from *s* to *y* and *x*): shortest paths are not unique.

- **Single-Source:** from *s* to __every__ *v* ∈ *V* (the version we consider)
- **Single-Destination:** from every *v* ∈ *V* to some *d*. (Solve by reversing the links and solving single source.)
- **Single-Pair:** from some *u* to __some__ *v*. Every known algorithm takes just as long as solving Single-Source. (*Why might that be the case?*)
- **All-Pairs:** for every pair *u*, *v* ∈ *V*. Next lecture.
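The "reverse the links" idea for the single-destination version can be sketched as follows. This assumes an adjacency-dict representation (`adj[u][v]` is the weight of edge (*u*, *v*)); the helper name `reverse_graph` is hypothetical.

```python
def reverse_graph(adj):
    """Return a new graph with every edge (u, v) replaced by (v, u), same weight."""
    rev = {u: {} for u in adj}
    for u, neighbors in adj.items():
        for v, weight in neighbors.items():
            rev.setdefault(v, {})[u] = weight
    return rev

# Hypothetical graph: a -> b -> d
adj = {"a": {"b": 2}, "b": {"d": 1}, "d": {}}
print(reverse_graph(adj))  # {'a': {}, 'b': {'a': 2}, 'd': {'b': 1}}
```

Running any single-source algorithm from *d* on the reversed graph then gives the weight of the shortest path from each vertex to *d* in the original graph.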

**Negative weight edges** are OK as long as no negative-weight cycles are reachable from the
source *s*. Fill in the blanks for the vertices on the right:

If a **negative-weight cycle** is reachable from *s*, it can be iterated to make the path
weight arbitrarily small, so δ(*s*, *v*) = −∞ for all *v* on the cycle and for every vertex
reachable from it: see the cycle involving *e* and *f* and its effect on *g* below:

Vertices on cycle *h*, *i*, *j* remain ∞ cost as they are not reachable from
*s*.

Some algorithms can detect negative-weight cycles and others cannot, but when they are present shortest paths are not well defined.

Shortest paths cannot contain cycles.

- We already ruled out negative-weight cycles.
- If there is a positive-weight cycle we can get a shorter path by omitting the cycle, so a path containing the cycle can't be a shortest path.
- If there is a zero-weight cycle, omitting it does not affect the cost, so we will assume that solutions won't use such cycles.

The shortest paths problem exhibits **optimal substructure**, suggesting that greedy
algorithms and dynamic programming may apply. It turns out we will see examples of both: Dijkstra's
algorithm for single-source shortest paths is greedy, and Floyd-Warshall for all-pairs shortest
paths uses dynamic programming.

*Lemma:* Any subpath of a shortest path is a shortest path.

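**Proof** is by cut and paste. Let path *p* = *p _{uv}* be a shortest path from *u* to *v* that includes subpath *p _{xy}*, so that *p _{uv}* consists of *p _{ux}* from *u* to *x*, then *p _{xy}* from *x* to *y*, then *p _{yv}* from *y* to *v*.

Then δ(*u*, *v*) = *w*(*p*) = *w*(*p _{ux}*) + *w*(*p _{xy}*) + *w*(*p _{yv}*).

Now, for proof by contradiction, suppose that substructure is not optimal, meaning that for some
choice of these paths there exists a shorter path *p' _{xy}* from *x* to *y*, that is, *w*(*p' _{xy}*) < *w*(*p _{xy}*).

From this, we can construct *p'* by following *p _{ux}* from *u* to *x*, then *p' _{xy}* from *x* to *y*, then *p _{yv}* from *y* to *v*.

Then *w*(*p'*) = *w*(*p _{ux}*) + *w*(*p' _{xy}*) + *w*(*p _{yv}*) < *w*(*p _{ux}*) + *w*(*p _{xy}*) + *w*(*p _{yv}*) = *w*(*p*),

which contradicts the assumption that *p _{uv}* is a shortest path.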

All the algorithms we consider will have the following in common.

For each vertex *v* ∈ *V*, we maintain these attributes:

**v.d** is called the **shortest path estimate**.

- Initially, *v.d* = ∞ (except for the source, where *s.d* = 0).
- *v.d* may be reduced as the algorithm progresses, but *v.d* ≥ δ(*s*, *v*) is always true.
- We want to show that at the conclusion of our algorithms, *v.d* = δ(*s*, *v*).

**v.π** = the predecessor of *v* on the best path from *s* found so far.

- If there is no predecessor, *v.*π = NIL.
- We want to show that at the conclusion of our algorithms, *v.*π = the predecessor of *v* on the shortest path from *s*.
- If that is true, π induces a **shortest path tree** on *G*. (See CLRS text for proofs of properties of π.)
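To see how the π attributes encode a shortest path tree, here is a minimal sketch that walks predecessor links back from a vertex to the source. It assumes π is stored in a dict `pred` with `None` standing in for NIL and that the links contain no cycle; the function name `path_to` is hypothetical.

```python
def path_to(pred, s, v):
    """Follow predecessor links from v back to s; return the path s..v, or None."""
    path = [v]
    while path[-1] != s:
        parent = pred.get(path[-1])
        if parent is None:   # NIL predecessor: v is not reachable from s
            return None
        path.append(parent)
    path.reverse()           # links were collected from v back to s
    return path

# Hypothetical predecessor values after some algorithm has finished.
pred = {"s": None, "t": "s", "x": "t", "r": None}
print(path_to(pred, "s", "x"))  # ['s', 't', 'x']
print(path_to(pred, "s", "r"))  # None
```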

All the single-source shortest-paths algorithms start with this:
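A minimal Python sketch of this initialization step, in the spirit of `INITIALIZE-SINGLE-SOURCE`. Storing *v.d* and *v.*π in two dicts (`d` and `pred`) rather than as vertex attributes is an assumption of this sketch.

```python
import math

def initialize_single_source(vertices, s):
    """Set every estimate to infinity and every predecessor to NIL, then s.d = 0."""
    d = {v: math.inf for v in vertices}   # shortest path estimates v.d
    pred = {v: None for v in vertices}    # predecessors v.pi (None plays the role of NIL)
    d[s] = 0
    return d, pred

d, pred = initialize_single_source(["s", "t", "x"], "s")
print(d)     # {'s': 0, 't': inf, 'x': inf}
print(pred)  # {'s': None, 't': None, 'x': None}
```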

They all apply the relaxation procedure, which essentially asks: can we improve the current
shortest-path estimate for *v* by going through *u* and taking (*u*, *v*)? (The
parameters *u* and *v* are vertices but *w* is a weight function.)
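The relaxation step can be sketched in Python as below. As assumptions of this sketch, the weight function is a dict `w` keyed by edge pairs, and the estimates and predecessors live in dicts `d` and `pred` passed in explicitly.

```python
import math

def relax(u, v, w, d, pred):
    """If going through u and taking edge (u, v) improves v's estimate, record it."""
    if d[v] > d[u] + w[(u, v)]:
        d[v] = d[u] + w[(u, v)]
        pred[v] = u

# Hypothetical two-vertex example.
w = {("s", "t"): 3}
d = {"s": 0, "t": math.inf}
pred = {"s": None, "t": None}
relax("s", "t", w, d, pred)
print(d["t"], pred["t"])  # 3 s
```

Relaxing the same edge again changes nothing, because the estimate can no longer be improved through that edge.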

The algorithms differ in the order in which they relax each edge and how many times they do that.

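The correctness arguments below rely on the following properties of shortest paths and relaxation, as stated in CLRS Chapter 24:

- **Triangle inequality:** for all (*u*, *v*) ∈ *E*, δ(*s*, *v*) ≤ δ(*s*, *u*) + *w*(*u*, *v*).
- **Upper-bound property:** *v*.*d* ≥ δ(*s*, *v*) always, and once *v*.*d* = δ(*s*, *v*) it never changes.
- **No-path property:** if there is no path from *s* to *v*, then *v*.*d* = δ(*s*, *v*) = ∞ always.
- **Convergence property:** if *s* ⇝ *u* → *v* is a shortest path and *u*.*d* = δ(*s*, *u*) at any time before relaxing (*u*, *v*), then *v*.*d* = δ(*s*, *v*) at all times afterward.
- **Path-relaxation property:** if ⟨*v*_{0}, *v*_{1}, ... *v*_{k}⟩ is a shortest path from *s* = *v*_{0} and its edges are relaxed in order, even if interleaved with other relaxations, then *v*_{k}.*d* = δ(*s*, *v*_{k}).
- **Predecessor-subgraph property:** once *v*.*d* = δ(*s*, *v*) for all *v* ∈ *V*, the predecessor subgraph is a shortest-paths tree rooted at *s*.

All but the first of these properties assume that `INITIALIZE-SINGLE-SOURCE` has been called
once, and then `RELAX` is called zero or more times.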

Proofs are available in the text. Try to explain informally why these are necessarily true.

The path-relaxation property leads to a **brute force strategy**: relax systematically enough
times that you can be sure you are done. The Bellman-Ford algorithm takes this approach. This algorithm:

- Allows negative-weight edges
- Computes *v*.*d* and *v*.π for all *v* ∈ *V*.
- Returns True (and a solution embedded in the graph) if no negative-weight cycles are reachable from *s*, and False otherwise.
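A minimal Python sketch of the Bellman-Ford approach just described. It is not the CLRS pseudocode verbatim: as assumptions of this sketch, edges are given as a list of `(u, v, weight)` triples, the result is returned as a tuple rather than stored in the graph, and the sample edge weights are chosen for illustration.

```python
import math

def bellman_ford(vertices, edges, s):
    """Return (ok, d, pred); ok is False if a negative-weight cycle is reachable from s."""
    d = {v: math.inf for v in vertices}
    pred = {v: None for v in vertices}
    d[s] = 0
    # Relax every edge |V| - 1 times.
    for _ in range(len(vertices) - 1):
        for u, v, weight in edges:
            if d[u] + weight < d[v]:
                d[v] = d[u] + weight
                pred[v] = u
    # One more sweep: any edge that can still be relaxed reveals a negative cycle.
    for u, v, weight in edges:
        if d[u] + weight < d[v]:
            return False, d, pred
    return True, d, pred

# Assumed sample graph with negative edges but no negative-weight cycle.
vertices = ["s", "t", "x", "y", "z"]
edges = [("t", "x", 5), ("t", "y", 8), ("t", "z", -4), ("x", "t", -2),
         ("y", "x", -3), ("y", "z", 9), ("z", "x", 7), ("z", "s", 2),
         ("s", "t", 6), ("s", "y", 7)]
ok, d, pred = bellman_ford(vertices, edges, "s")
print(ok, d)  # True {'s': 0, 't': 2, 'x': 4, 'y': 7, 'z': -2}
```

Calling it on a two-vertex graph with edges `("a", "b", 1)` and `("b", "a", -2)` returns `False` in the first position, since that cycle has weight −1.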

The first nested `for` loops relax edges a sufficient number of times such that there is a
subsequence of relaxations corresponding to *every possible* path, and so by the path-relaxation
property all shortest paths must be found.

*What does the last for loop do, and how does it work?*

`RELAX` is O(1), and the nested `for` loops relax all |*E*| edges |*V*| -
1 times, so `BELLMAN-FORD` is Θ(*VE*).

Example from the text, relaxed in order (t,x), (t,y), (t,z), (x,t), (y,x), (y,z), (z,x), (z,s), (s,t), (s,y):


The values for *v*.*d* and *v*.π are guaranteed to converge on shortest paths
after |*V*| - 1 passes, assuming no negative-weight cycles.

This can be proven with the path-relaxation property, which states that if we relax the edges of
a shortest path ⟨*v*_{0}, *v*_{1}, ... *v _{k}*⟩ in
order, even if interleaved with other edges, then *v _{k}*.*d* = δ(*s*, *v _{k}*).

Since the whole list of edges is relaxed |*V*|−1 times, which is the largest number of edges a
shortest path can have, each pass supplies the next edge of every shortest path, and the estimates must converge by this property.

- First iteration relaxes (*v*_{0}, *v*_{1})
- Second iteration relaxes (*v*_{1}, *v*_{2})
- ...
- *k*th iteration relaxes (*v*_{k-1}, *v*_{k})

The Bellman-Ford algorithm can also be considered to be a bottom-up dynamic programming algorithm
that stores subproblem solutions directly in the graph annotations *v*.*d* and *v*.π:

- After the first pass, all shortest paths with at most 1 edge are guaranteed to be correct and are used to construct longer paths;
- after the second pass, shortest paths with at most 2 edges are guaranteed to be correct and are used to construct longer paths;
- ... etc.

up until *n*−1 edges (where *n* = |*V*|), which is the longest possible shortest path.

We also must show that the True/False values are correct. Informally, we can see that if
*v*.*d* can still be made smaller in line 6 after it should have converged (by the argument
above), then there must be a negative-weight cycle that keeps decreasing the path weight.

The full proof of correctness may be found in the text.

The values computed on each pass, and how quickly they converge, depend on the order of relaxation: the algorithm may converge earlier.

*How can we use this fact to speed the algorithm up a
bit?*

Life is easy when you are a DAG ...

There are no cycles in a Directed Acyclic Graph. Thus, negative weights are not a problem. Also, vertices must occur on shortest paths in an order consistent with a topological sort.

We can do something like Bellman-Ford, but don't need to do it as many times, and don't need to check for negative weight cycles:
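A minimal Python sketch of this idea: topologically sort, then relax the outgoing edges of each vertex once, in that order. As assumptions of this sketch, the graph is an adjacency dict (`adj[u][v]` is the weight of (*u*, *v*)), the topological sort is a simple recursive DFS, and the sample weights are chosen for illustration.

```python
import math

def topological_order(adj):
    """Return the vertices of a DAG so that every edge goes from earlier to later."""
    visited, order = set(), []

    def visit(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                visit(v)
        order.append(u)  # u finishes only after everything reachable from it

    for u in adj:
        if u not in visited:
            visit(u)
    order.reverse()
    return order

def dag_shortest_paths(adj, s):
    """Single pass of relaxations in topological order; returns (d, pred)."""
    d = {v: math.inf for v in adj}
    pred = {v: None for v in adj}
    d[s] = 0
    for u in topological_order(adj):
        for v, weight in adj[u].items():
            if d[u] + weight < d[v]:
                d[v] = d[u] + weight
                pred[v] = u
    return d, pred

# Assumed sample DAG; r precedes s, so r is not reachable from s.
adj = {"r": {"s": 5, "t": 3}, "s": {"t": 2, "x": 6},
       "t": {"x": 7, "y": 4, "z": 2}, "x": {"y": -1, "z": 1},
       "y": {"z": -2}, "z": {}}
d, pred = dag_shortest_paths(adj, "s")
print(d)  # {'r': inf, 's': 0, 't': 2, 'x': 6, 'y': 5, 'z': 3}
```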

Given that topological sort is Θ(*V* + *E*), what's the run-time complexity of
`DAG-SHORTEST-PATHS`? *This one's on you.* Use aggregate
analysis on line 4 ...

Because we process vertices in topologically sorted order, edges of *any* path must be
relaxed in order of appearance in the path.

Therefore edges on any shortest path are relaxed in order by one pass over the out-incident edges.

Therefore, by the path-relaxation property, the algorithm terminates with correct values.

From the text:

Notice we could not reach *r*!


Dijkstra's algorithm is essentially a weighted version of breadth-first search: BFS uses a FIFO queue, while this version of Dijkstra's algorithm uses a priority queue.

It also has similarities to Prim's algorithm, being greedy, and
with similar iteration. The essential difference is that Dijkstra's orders the vertices on the
priority queue by estimated distance *v*.*d* of the __total path__ to reach each vertex
*v*, not just the weight *w*(*u*,*v*) of a single edge that reaches each
vertex.

Importantly, it assumes there are no negative-weight edges.

- *S* = set of vertices whose final shortest-path weights have been determined.
- *Q* = *V* - *S* is the priority queue.
- Priority queue keys are shortest path estimates *v*.*d*.

Here is the algorithm as given by CLRS (and the code for `Relax`), with `Prim` on
the right for comparison.

Notice that lines 1-4 of Prim's are nearly identical to `Initialize-Single-Source`, and
lines 9-11 of Prim's play the same role as `Relax`. Line 3 of `Dijkstra` and line 5
of `MST-Prim` insert all the vertices in the priority queue, and the remaining `while`
loops to extract vertices from the queue and `for` loops to access adjacent vertices serve
the same functions.

Dijkstra's algorithm is greedy in choosing the closest vertex in *V* - *S* to add to
*S* each iteration. The difference is that

- For `MST-Prim` "close" means the cost to take one step to include the next cheapest vertex: `if` *w*(*u*,*v*) < *v*.key
- For `Dijkstra` "close" means the cost from the source vertex *s* to *v*: this is in the RELAX `if` *v*.*d* > *u*.*d* + *w*(*u*,*v*).

The above CLRS specification of the algorithm can be improved. Relax(*u*,*v*,*w*)
updates the shortest path estimates *v.d* of the vertices that are in the priority queue. To
make sure the keys of the priority queue are updated properly, we must call `DecreaseKey` on
the vertex being updated. We can only know whether to do this if Relax tells us whether there was a
change. Therefore the following modifications are needed:

    Relax(u, v, w)
        if v.d > u.d + w(u, v)
            v.d = u.d + w(u, v)
            v.π = u
            return TRUE
        else
            return FALSE

Then change Dijkstra as follows:

    8      if Relax(u, v, w)
    9          DecreaseKey(Q, v, v.d)
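A minimal Python sketch of Dijkstra's algorithm along these lines. One substitution should be named plainly: Python's `heapq` module has no decrease-key operation, so instead of `DecreaseKey` this sketch pushes a fresh `(distance, vertex)` entry whenever an estimate improves and skips stale entries when they are popped (often called lazy deletion). The adjacency-dict representation and the sample weights are also assumptions of this sketch.

```python
import heapq
import math

def dijkstra(adj, s):
    """Shortest paths from s for non-negative weights; returns (d, pred)."""
    d = {v: math.inf for v in adj}
    pred = {v: None for v in adj}
    d[s] = 0
    done = set()              # the set S of finished vertices
    heap = [(0, s)]           # priority queue keyed by shortest path estimate
    while heap:
        _, u = heapq.heappop(heap)
        if u in done:
            continue          # stale entry left over from an earlier, larger estimate
        done.add(u)
        for v, weight in adj[u].items():
            if d[u] + weight < d[v]:                 # the RELAX test
                d[v] = d[u] + weight
                pred[v] = u
                heapq.heappush(heap, (d[v], v))      # stands in for DecreaseKey
    return d, pred

# Assumed sample graph with non-negative weights.
adj = {"s": {"t": 10, "y": 5}, "t": {"x": 1, "y": 2},
       "x": {"z": 4}, "y": {"t": 3, "x": 9, "z": 2},
       "z": {"x": 6, "s": 7}}
d, pred = dijkstra(adj, "s")
print(d)  # {'s': 0, 't': 8, 'x': 9, 'y': 5, 'z': 7}
```

With lazy deletion the heap can hold up to one entry per edge, so the bound is O(*E* lg *E*), which is O(*E* lg *V*) since |*E*| ≤ |*V*|².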

From the text (black vertices are set *S*; white vertices are on *Q*; shaded vertex is
the min valued one chosen next iteration):


Here's a graph with a negative weight: try it from *s* and see what happens:

*What went wrong?*

As you just saw, we need to assume that there are no negative weights.

The proof is based on the following loop invariant at the start of the `while` loop:

*v*.*d* = δ(*s*, *v*) for all *v* ∈ *S*.

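**Initialization:** Initially *S* = ∅, so the invariant is trivially true.

**Maintenance:** We just sketch this part (see text for details). We need to show that *u*.*d* = δ(*s*, *u*) when *u* is added to *S* in each iteration.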

Suppose (for proof by contradiction) that ∃ *u* such that *u*.*d* ≠
δ(*s*, *u*) when added to *S*. Without loss of generality, let *u* be the
first such vertex added to *S*.

- *u* ≠ *s*, since *s*.*d* = δ(*s*, *s*) = 0. Therefore *s* ∈ *S* ≠ ∅ (the set *S* is not empty, and includes *s*).
- There must be a path from *s* to *u*: otherwise *u*.*d* = δ(*s*, *u*) = ∞ by the no-path property, contradicting the assumption. This means there must be a shortest path *p* from *s* to *u*.
- The proof decomposes *p* into a path *s* to *x*, which is in *S*; the single edge (*x*, *y*), where *y* is the first vertex on *p* that is not in *S*; and a path from *y* to *u*. (Some but not all of these can be null, e.g., *s* can be *x*.)
- *y*.*d* = δ(*s*, *y*) when *u* is added to *S*. (By hypothesis, *x*.*d* = δ(*s*, *x*) when *x* was added. Relaxation of (*x*, *y*) extends this to *y* by the convergence property.)
- Since *y* appears before *u* on a shortest path with non-negative weights, δ(*s*, *y*) ≤ δ(*s*, *u*), and so *y*.*d* = δ(*s*, *y*) ≤ δ(*s*, *u*) ≤ *u*.*d*, where the last step is the upper-bound property.
- But *u* being chosen first from *Q* means *u*.*d* ≤ *y*.*d*; so it must be that *u*.*d* = *y*.*d*.
- Therefore *y*.*d* = δ(*s*, *y*) = δ(*s*, *u*) = *u*.*d*.
- This contradicts the assumption that *u*.*d* ≠ δ(*s*, *u*).

**Termination:** At the end, *Q* = ∅, so *S* = *V*, and therefore *v*.*d* = δ(*s*, *v*) for all *v* ∈ *V*.

The run time depends on the implementation of the priority queue.

If **binary min-heaps** are used:

- Queue initialization in line 3 requires O(*V*) if we use `Build-Min-Heap` (and indeed Ω(*V*) to enqueue |*V*| elements).
- The `EXTRACT-MIN` in line 5 and the implicit `DECREASE-KEY` operation that results from relaxation in line 8 are each O(lg *V*).
- The `while` loop over |*V*| elements of *Q* invokes |*V*| O(lg *V*) `EXTRACT-MIN` operations.
- Switching to aggregate analysis for the `for` loop in lines 7-8, there is a call to `RELAX` for each of O(*E*) edges, and each call may result in an O(lg *V*) `DECREASE-KEY`.
- The total is **O((*V* + *E*) lg *V*)**.
- If every vertex is reachable from *s*, then |*E*| ≥ |*V*| − 1, and this can be simplified to **O(*E* lg *V*)**, which is faster than `BELLMAN-FORD`'s O(*E* *V*).

With **Fibonacci heaps** (which were developed specifically to speed up this
algorithm), O(*V* lg *V* + *E*) is possible.

Dan Suthers (with edits suggested by Nodari Sitchinava) Last modified: Sun Nov 5 03:45:15 HST 2017

Images are from the instructor's material for Cormen et al. Introduction to Algorithms, Third Edition.