- More examples of Dynamic Programming: Longest Common Subsequence
- Summary and Comments on Optimal Substructure

- Read CLRS 15.4-15.5. The focus is on the problem solving strategy: Read the examples primarily to understand the Dynamic Programming strategy rather than to memorize the specifics of each problem (although you might be asked to trace through some of the algorithms).
- Screencasts 12B (5:00-end), 12C, 12D (6:17-end) (also in Laulima)

A **subsequence** of sequence *S* leaves out zero or more elements but preserves
order.

*Z* is a **common subsequence** of *X* and *Y* if *Z* is a subsequence
of both *X* and *Y*.

*Z* is a **longest common subsequence** if it is a
common subsequence of maximal length.

Given two sequences
*X* = ⟨ *x*_{1}, ..., *x*_{m}
⟩ and
*Y* = ⟨ *y*_{1}, ..., *y*_{n}
⟩, find a subsequence common to both whose length is longest. Solutions to this problem have
applications to DNA analysis in bioinformatics. The analysis of optimal substructure is elegant.

For every subsequence of *X* = ⟨ *x*_{1}, ..., *x*_{m}
⟩, check whether it is a subsequence of *Y* = ⟨ *y*_{1}, ...,
*y*_{n} ⟩, and record it if it is longer than the longest previously
found.

- There are 2^{m} subsequences of *X* to check.
- For each subsequence, scan *Y* for the first letter. From there scan for the second letter, etc., up to the *n* letters of *Y*.
- Therefore, Θ(*n*2^{m}).

This involves a lot of redundant work.
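The brute-force strategy above can be sketched in Python (the function names are mine, and this is only an illustration of the Θ(*n*2^{m}) approach, not something you would run on long strings):

```python
from itertools import combinations

def is_subsequence(z, y):
    """Check whether z is a subsequence of y with one left-to-right scan of y."""
    it = iter(y)
    return all(ch in it for ch in z)   # 'in' consumes the iterator up to each match

def lcs_brute_force(x, y):
    """Try all 2^m subsequences of x, longest first: Theta(n * 2^m) overall."""
    for length in range(len(x), -1, -1):
        for idxs in combinations(range(len(x)), length):
            z = "".join(x[i] for i in idxs)
            if is_subsequence(z, y):
                return z
    return ""
```

Running it on short strings (e.g., the "spanking"/"amputation" pair mentioned later) works, but each added character of *X* doubles the number of candidates.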

- If a subsequence *Z* of *X* fails to match *Y*, then any subsequence having *Z* as a prefix will also fail.
- If a subsequence *Z* of *X* matches *Y*, then there is no need to check prefixes of *Z*.

Many problems to which dynamic programming applies have exponential brute force solutions that can be improved on by exploiting redundancy in subproblem solutions.

The first step is to characterize the structure of an optimal solution, hopefully to show it exhibits optimal substructure.

Often when solving a problem we start with what is known and then figure out how to construct a
solution. The optimal substructure analysis takes the reverse strategy: __assume you have
found an optimal solution__ (*Z* below) and reason about the structure it must have.

Notation:

*X*_{i} = prefix ⟨*x*_{1}, ..., *x*_{i}⟩
*Y*_{i} = prefix ⟨*y*_{1}, ..., *y*_{i}⟩

**Theorem:**
Let *Z* = ⟨*z*_{1}, ..., *z*_{k}⟩ be any LCS of *X*_{m} and *Y*_{n}. Then:

- If *x*_{m} = *y*_{n}, then *z*_{k} = *x*_{m} = *y*_{n}, and *Z*_{k-1} is an LCS of *X*_{m-1} and *Y*_{n-1}.
- If *x*_{m} ≠ *y*_{n}, then *z*_{k} ≠ *x*_{m} ⇒ *Z* is an LCS of *X*_{m-1} and *Y*.
- If *x*_{m} ≠ *y*_{n}, then *z*_{k} ≠ *y*_{n} ⇒ *Z* is an LCS of *X* and *Y*_{n-1}.

*Sketch of proofs:*

(1) can be proven by contradiction: if *x*_{m} = *y*_{n} but this last character is not
included in *Z*, then a longer common subsequence can be constructed by appending this character to *Z*, a
contradiction.

(2) and (3) have symmetric proofs: Suppose there exists a subsequence *W* of
*X*_{m-1} and *Y* (or of *X* and *Y*_{n-1}) with
length > *k*. Then *W* is a common subsequence of *X* and *Y*, contradicting
*Z* being an LCS.

Therefore, **an LCS of two sequences contains as prefix an LCS of prefixes of the sequences.**
We can now use this fact to construct a recursive formula for the value of an LCS.

Let *c*[*i*, *j*] be the length of the LCS of prefixes *X*_{i} and
*Y*_{j}. The optimal substructure above gives the recursive definition:

*c*[*i*, *j*] = 0 if *i* = 0 or *j* = 0
*c*[*i*, *j*] = *c*[*i*-1, *j*-1] + 1 if *i*, *j* > 0 and *x*_{i} = *y*_{j}
*c*[*i*, *j*] = max(*c*[*i*-1, *j*], *c*[*i*, *j*-1]) if *i*, *j* > 0 and *x*_{i} ≠ *y*_{j}

We want to find *c*[*m*, *n*].
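As a minimal Python sketch of evaluating this recurrence top-down (with 1-indexed *x*_{i} corresponding to `x[i - 1]` in 0-indexed strings), memoization stores each subproblem the first time it is solved:

```python
from functools import lru_cache

def lcs_length(x, y):
    """Top-down memoized evaluation of the LCS recurrence c[i, j]."""
    @lru_cache(maxsize=None)          # table of already-solved subproblems
    def c(i, j):
        if i == 0 or j == 0:
            return 0                  # an empty prefix has an empty LCS
        if x[i - 1] == y[j - 1]:      # case x_i = y_j
            return c(i - 1, j - 1) + 1
        return max(c(i - 1, j), c(i, j - 1))
    return c(len(x), len(y))
```

Without the `lru_cache` line this is exactly the naive recursion with its repeated subproblems; with it, each of the Θ(*mn*) subproblems is computed once.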

A recursive algorithm based on this formulation would have lots of repeated subproblems, for example, on strings of length 4 and 3:

Dynamic programming avoids the redundant computations by storing the results in a table. We use
*c*[*i*, *j*] for the length of the LCS of prefixes *X*_{i} and
*Y*_{j}.
Try to find the correspondence between the code below and the recursive definition shown in the box above.
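In place of the textbook's LCS-Length pseudocode, here is a Python sketch of the bottom-up computation (the function name is mine):

```python
def lcs_length_bottom_up(x, y):
    """Bottom-up computation of the full table c; Theta(mn) time and space."""
    m, n = len(x), len(y)
    # c[i][j] = length of an LCS of prefixes x[:i] and y[:j];
    # row 0 and column 0 stay 0 (base case: an empty prefix).
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:                  # x_i = y_j
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c                                          # answer is c[m][n]
```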

This is a bottom-up solution: Indices *i* and *j* increase through the loops, and
references to *c* always involve either *i*-1 or *j*-1, so the needed subproblems
have already been computed.

It is clearly **Θ(*mn*)**: each of the *mn* table entries takes constant time to compute.

In the process of computing the *value* of the optimal solution we can also record the
*choices* that led to this solution. Step 4 is to add this latter record of choices and a way
of recovering the optimal solution at the end.

Table *b*[*i*, *j*] is updated above to remember whether each entry was obtained by extending

- a common subsequence of *X*_{i-1} and *Y*_{j-1} (diagonal arrow), in which case the common character *x*_{i} = *y*_{j} is included in the LCS;
- a common subsequence of *X*_{i-1} and *Y* (↑); or
- a common subsequence of *X* and *Y*_{j-1} (←).

We reconstruct the path by calling Print-LCS(*b*, *X*, *m*, *n*) and
following the arrows, printing out characters of *X* that correspond to the diagonal
arrows (a Θ(*m* + *n*) traversal from the lower right of the matrix to the
origin):
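A Python rendering of this step: `lcs_with_choices` fills both tables, and `print_lcs` follows the arrows (the string labels `"diag"`, `"up"`, and `"left"` stand in for the textbook's ↖, ↑, and ← arrows):

```python
def lcs_with_choices(x, y):
    """Compute c and a table b of arrows recording which subproblem was used."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "diag"                  # x_i = y_j is in the LCS
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "up"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "left"
    return c, b

def print_lcs(b, x, i, j):
    """Follow the arrows from b[m][n] back to the origin, printing the LCS."""
    if i == 0 or j == 0:
        return
    if b[i][j] == "diag":
        print_lcs(b, x, i - 1, j - 1)
        print(x[i - 1], end="")                   # diagonal arrow: emit x_i
    elif b[i][j] == "up":
        print_lcs(b, x, i - 1, j)
    else:
        print_lcs(b, x, i, j - 1)
```

For the textbook example, `c, b = lcs_with_choices("ABCBDAB", "BDCABA")` followed by `print_lcs(b, "ABCBDAB", 7, 6)` prints `BCBA`.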

What do "spanking" and "amputation" have in common?

Another application of Dynamic Programming is covered in the Cormen et al. textbook (Section 15.5). I briefly describe the problem here, but you are responsible for reading the details of the solution in the book. Many more applications are listed in the problems at the end of Chapter 15.

We saw in Topic 8 that an unfortunate order of insertions of keys
into a binary search tree (BST) can result in poor performance (e.g., linear in *n*). If we
know all the keys in advance and also the probability that they will be searched, we can optimize
the construction of the BST to minimize search time in the aggregate over a series of queries. An
example application is when we want to construct a dictionary from a set of terms that are known in
advance along with their frequency in the language. The reader need only try problem 15.5-2 from the
Cormen et al. text (manual simulation of the algorithm) to appreciate why we want to leave this
tedium to computers!
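As a rough sketch of the idea (simplified relative to Section 15.5: it ignores the book's dummy-key probabilities, though the names `e`, `w`, and `r` follow the text's notation):

```python
def optimal_bst_cost(p):
    """Expected search cost of an optimal BST over keys k_1..k_n with search
    probabilities p[1..n] (p[0] is unused so indices match the math)."""
    n = len(p) - 1
    # e[i][j] = cost of an optimal BST on keys k_i..k_j (0 when j = i - 1);
    # w[i][j] = p[i] + ... + p[j], the weight added when the range gains a level.
    e = [[0.0] * (n + 2) for _ in range(n + 2)]
    w = [[0.0] * (n + 2) for _ in range(n + 2)]
    for length in range(1, n + 1):          # smaller ranges solved first
        for i in range(1, n - length + 2):
            j = i + length - 1
            w[i][j] = w[i][j - 1] + p[j]
            e[i][j] = float("inf")
            for r in range(i, j + 1):       # j - i + 1 choices of root k_r
                cost = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if cost < e[i][j]:
                    e[i][j] = cost
    return e[1][n]
```

The triply nested loop is where the O(*n*^{3}) running time quoted below comes from: Θ(*n*^{2}) ranges, each trying O(*n*) roots.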

To use dynamic programming, we must show that any optimal solution involves making a choice that leaves one or more subproblems to solve, and the solutions to the subproblems used within the optimal solution must themselves be optimal.

We may not know what that first choice is. Consequently:

- To show that there is optimal substructure, we suppose that the choice has been made, and show that the subproblems that result must also be solved optimally. This argument is often made using a cut-and-paste proof by contradiction.
- Then when writing the code, we must ensure that enough potential choices and hence their subproblems are considered that we find the optimal first choice. This usually shows up as iteration in which we find the maximum or minimum according to some objective function across all choices.

How many subproblems are used in an optimal solution may vary:

- Rod Cutting: 1 subproblem (of size *n* - *i*)
- LCS: 1 subproblem (LCS of the prefix sequence(s))
- Optimal BST: 2 subproblems (given *k*_{r} has been chosen as the root, *k*_{i}, ..., *k*_{r-1} and *k*_{r+1}, ..., *k*_{j})

How many choices in determining which subproblem(s) to use may vary:

- Rod cutting: *n* choices (for each value of *i*)
- LCS: either 1 choice (if *x*_{i} = *y*_{j}, take LCS of *X*_{i-1} and *Y*_{j-1}), or 2 choices (if *x*_{i} ≠ *y*_{j}, check both LCS of *X*_{i-1} and *Y*, and LCS of *X* and *Y*_{j-1})
- Optimal BST: *j* - *i* + 1 choices for the root *k*_{r} in *k*_{i}, ..., *k*_{j}: see text.

Informally, running time depends on (# of subproblems overall) × (# of choices).

- Rod Cutting: Θ(*n*) subproblems overall, ≤ *n* choices for each ⇒ O(*n*^{2}) running time.
- LCS: Θ(*mn*) subproblems overall; ≤ 2 choices for each ⇒ O(*mn*) running time.
- Optimal BST: Θ(*n*^{2}) subproblems overall; O(*n*) choices for each ⇒ O(*n*^{3}) running time.

(We'll have a better understanding of "overall" when we cover amortized analysis.)

When we study graphs, we'll see that finding the **shortest path** between two vertices in a
graph has optimal substructure: if *p* = *p*_{1} + *p*_{2} is a
shortest path between *u* and *v* passing through *w*, then *p*_{1} must be a shortest path
between *u* and *w* (etc.). Proof by cut and paste.

But finding the **longest simple path** (the longest path not repeating any vertices) between two
vertices is not likely to have optimal substructure.

For example, *q* → *s* → *t* → *r* is the
longest simple path from *q* to *r*, and *r* → *q* → *s* →
*t* is the longest simple path from *r* to *t*, but the composed path is not even
legal: the criterion of simplicity is violated.

Dynamic programming requires *overlapping* yet *independently solvable*
subproblems.

Longest simple path is NP-complete, a topic we will cover at the end of the semester, and so it is unlikely to have any efficient solution.

Although we wrote the code both ways, in terms of the order in which solutions are found, dynamic
programming *first* finds optimal solutions to subproblems and *then* chooses which to
use in an optimal solution to the problem. It applies when one cannot make the top-level choice
until subproblem solutions are known.

In Topic 13, we'll see that **greedy algorithms** work top down:
*first* make a choice that looks best, *then* solve the resulting subproblem. Greedy
algorithms apply when one can make the top level choice without knowing how subproblems will be
solved.

Dynamic Programming applies when the problem has these characteristics:

- **Recursive Decomposition**: The problem has recursive structure: it breaks down into smaller problems of the same type. *This characteristic is shared with divide and conquer, but dynamic programming is distinguished from divide and conquer by the next item.*
- **Overlapping Subproblems**: The subproblems solved by a recursive solution overlap (the same subproblems are revisited more than once). *This means we can save time by preventing the redundant computations.*
- **Optimal Substructure**: Any optimal solution involves making a choice that leaves one or more subproblems to solve, and the solutions to the subproblems used within the optimal solution must themselves be optimal. *This means that optimized recursive solutions can be used to construct optimized larger solutions.*

Dynamic programming can be approached top-down or bottom-up:

- **Top-Down with memoization**: Write a recursive procedure to solve the problem, computing subproblems as needed. Each time a subproblem is encountered, see whether you have stored it in a table, and if not, solve it and store the solution.
- **Bottom-Up**: Order the subproblems such that "smaller" problems are solved first, so their solutions are available in the table before "larger" problems need them. (This ordering need not be based on literal size.)

Both have the same asymptotic running time. The top-down procedure has the overhead of recursion, but computes only the subproblems that are actually needed. Bottom-up is used the most in practice.

We problem solve with dynamic programming in four steps:

1. **Characterize the structure of an optimal solution**: How are optimal solutions composed of optimal solutions to subproblems?
2. **Recursively define the value of an optimal solution**: Write a recursive cost function that reflects the above structure.
3. **Compute the value of an optimal solution**: Write code to compute the recursive values, memoizing or solving smaller problems first to avoid redundant computation.
4. **Construct an optimal solution from the computed information**: Augment the code as needed to record the structure of the solution.

There is an online presentation focusing on LCS at http://www.csanimated.com/animation.php?t=Dynamic_programming.

In the next topic, Topic 13, we look at a related optimization strategy: greedy algorithms.

Nodari Sitchinava (based on material by Dan Suthers)

Images are from the instructor's material for Cormen et al. Introduction to Algorithms, Third Edition.