An automatic spelling checker works as follows. Given a word $w$, first check if $w$ is found in the dictionary. If $w$ is not in the dictionary, compute a dictionary entry that is close to $w$. For instance if the user types $\mathsf{ocurrance}$, the spelling checker should suggest $\mathsf{occurence}$, which belongs to the dictionary. Similarity between words such as $\mathsf{occurrence}$ and $\mathsf{occurrance}$ is quantified in terms of $alignment$.
An alignment between two strings $w1$ and $w2$ (over the alphabet $\{ \mathsf{a, b, c, ...., z} \}$) is obtained by inserting hyphens in the two strings such that the modified strings $align$ (i.e.,the modified strings are of equal length, and at each position, either both strings have the same letter or one of the strings has a hyphen).
here are three examples of alignments. The first is between $\mathsf{ocurrance}$ and $\mathsf{occurrence}$ and the second and third are between $\mathsf{ctatg}$ and $\mathsf{ttaagc}$.
(1) |
oc-urr-ance
occurre-nce
|
(2) |
ct-at-g-
-tta-agc
|
(3) |
ctat---g-
---ttaagc
|
A $mismatch$ in an alignment is a position where one of modified strings has a hyphen and the other does not. There are three mismatches in the first alignment given above, five mismatches in the second, and seven mismatches in the third.
Use dynamic programming to give an efficient algorithm that takes two strings $x$ and $y$ (over the alphabet $\{ \mathsf{a, b, c, ... , z} \}$ as its input, and computes the minimum number of mismatches among all alignments of $x$ and $y$. What is the running time of your algorithm (in terms of the lengths of $x$ and $y)?$