72. Edit Distance
Given two words word1 and word2, find the minimum number of operations required to convert word1 to word2.
You have the following 3 operations permitted on a word:
- Insert a character
- Delete a character
- Replace a character
Example 1:
Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation:
horse -> rorse (replace 'h' with 'r')
rorse -> rose (remove 'r')
rose -> ros (remove 'e')
Example 2:
Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation:
intention -> inention (remove 't')
inention -> enention (replace 'i' with 'e')
enention -> exention (replace 'n' with 'x')
exention -> exection (replace 'n' with 'c')
exection -> execution (insert 'u')
本题要求两个字符串的编辑距离,很经典的一道动态规划题。 两个字符串的编辑距离是指通过插入、删除和替换这三种操作,将word1变换为word2(或者将word2变换为word1)。 编辑距离本科算法课就学过了,把word1和word2摆成一个表格的形式,然后从左往右,从上往下填表格,然后编辑距离就是表格右下角的值。 假设word1水平方向摆放,word2竖直方向摆放,令dp[row][col]表示把word2[:row]变换为word1[:col]需要的编辑距离,则这种变换可以通过三种方法获得:
- 删掉字符word2[row],然后把word2[:row-1]变换为word1[:col],删除代价为1,所以dp[row][col]=dp[row-1][col]+1
- 在word2[:row]变换为word1[:col-1]的基础上,在word2[:row]末尾插入字符word1[col],这样就把word2[:row]变为了word1[:col],插入代价为1,所以dp[row][col]=dp[row][col-1]+1
- 如果word1[col]==word2[row],则word2[:row]变为word1[:col]就相当于word2[:row-1]变为word1[:col-1],此时dp[row][col]=dp[row-1][col-1];如果word1[col]!=word2[row],则可以在word2[:row-1]变为word1[:col-1]的基础上,将word2[row]替换为word1[col],替换代价为1,所以dp[row][col]=dp[row-1][col-1]+1
综合以上三种情况(实际是四种情况),得到的递归公式为:
dp[row][col]=min(min(dp[row-1][col],dp[row][col-1])+1,dp[row-1][col-1]+(word1[col]==word2[row]?0:1))
完整的代码如下:
class Solution {
public:
int minDistance(string word1, string word2)
{
if (word1.size() == 0 || word2.size() == 0)
return max(word1.size(), word2.size());
vector<vector<int> > dp(word2.size() + 1, vector<int>(word1.size() + 1, 0));
for (int row = 1; row <= word2.size(); row++)
dp[row][0] = row;
for (int col = 1; col <= word1.size(); col++)
dp[0][col] = col;
for (int row = 1; row <= word2.size(); row++) {
for (int col = 1; col <= word1.size(); col++) {
dp[row][col] = min(dp[row – 1][col], dp[row][col – 1]) + 1;
dp[row][col] = min(dp[row][col], dp[row – 1][col – 1] + (word1[col – 1] == word2[row – 1] ? 0 : 1));
}
}
return dp[word2.size()][word1.size()];
}
};
本代码提交AC,用时28MS。
DP题一般都要填表格,很多情况下,填第row行表格可能只需要利用第row-1行的信息,所以可以只保留两行的结果,用于缩减空间复杂度。下面这种实现只利用了额外的2*min(row,col)空间:
class Solution {
public:
int minDistance(string word1, string word2)
{
if (word2.size() < word1.size()) {
swap(word1, word2);
}
if (word1.size() == 0)
return word2.size();
vector<int> dp1(word1.size() + 1, 0), dp2(word1.size() + 1, 0);
for (int col = 1; col <= word1.size(); col++)
dp1[col] = col;
for (int row = 1; row <= word2.size(); row++) {
dp1[0] = row – 1; // caution
dp2[0] = row; // caution
for (int col = 1; col <= word1.size(); col++) {
dp2[col] = min(dp1[col], dp2[col – 1]) + 1;
dp2[col] = min(dp2[col], dp1[col – 1] + (word1[col – 1] == word2[row – 1] ? 0 : 1));
}
swap(dp1, dp2);
}
return dp1[word1.size()];
}
};
本代码提交AC,用时19MS,确实要比之前的快一点。