Tag Archives: 字符串

LeetCode Add Binary

Given two binary strings, return their sum (also a binary string).

The input strings are both non-empty and contains only characters 1 or 0.

Example 1:

Input: a = "11", b = "1"
Output: "100"

Example 2:

Input: a = "1010", b = "1011"
Output: "10101"

Constraints:

Each string consists only of '0' or '1' characters.
1 <= a.length, b.length <= 10^4
Each string is either "0" or doesn’t contain any leading zero.

把两个二进制字符串相加，注意进位即可。把while (i >= 0 && j >= 0)改成while (i >= 0 || j >= 0)，这样就只需要一个while循环了，只不过循环体里面求sum需要判断一下i,j。完整代码如下：

class Solution {
public:
    string addBinary(string a, string b)
    {
        int i = a.size() – 1, j = b.size() – 1;
        string ans = "";
        bool carry = false;
        while (i >= 0 || j >= 0) {
            int sum = (i >= 0 ? (a[i] – ‘0’) : 0) + (j >= 0 ? (b[j] – ‘0’) : 0);
            sum += carry ? 1 : 0;
            if (sum >= 2) {
                ans.push_back(sum – 2 + ‘0’);
                carry = true;
            }
            else {
                ans.push_back(‘0’ + sum);
                carry = false;
            }
            i–;
            j–;
        }
        if (carry)
            ans.push_back(‘1’);
        reverse(ans.begin(), ans.end());
        return ans;
    }
};

本代码提交AC，用时3MS。

二刷。更简洁的代码：

class Solution {
public:
	string addBinary(string a, string b) {
		int m = a.size(), n = b.size();
		string ans = "";
		int i = m - 1, j = n - 1;
		int carry = 0;
		while (i >= 0 || j >= 0) {
			int u = i >= 0 ? a[i] - '0' : 0;
			int v = j >= 0 ? b[j] - '0' : 0;
			int sum = u + v + carry;
			carry = sum / 2;
			ans.push_back('0' + (sum % 2));
			--i;
			--j;
		}
		if (carry != 0)ans.push_back('1');
		reverse(ans.begin(), ans.end());
		return ans;
	}
};

本代码提交AC，用时4MS。

LeetCode Group Anagrams

Leave a reply

49. Group Anagrams

Given an array of strings, group anagrams together.

Example:

Input: ["eat", "tea", "tan", "ate", "nat", "bat"],
Output:
[
  ["ate","eat","tea"],
  ["nat","tan"],
  ["bat"]
]

Note:

All inputs will be in lowercase.
The order of your output does not matter.

Anagrams是指颠倒字母而成的字，比如ate和eat，颠倒字母包括随意乱序颠倒，比如nat和tan。
本题要求把所有anagrams分组找出来，思路是对每个单词内部排序，然后对排序完的所有单词再外部排序，这样相同的anagrams就会排在一起，只要把anagrams相同的放在一个group就可以。比如ate、eat、tea都内部排序成aet，然后3个aet外部排序时会排在一起，这样就能把ate、eat、tea组合成一个group。

完整代码如下：

struct Word {
    string key, value;
    bool operator<(const Word& w) { return this->key < w.key; }
};
class Solution {
public:
    vector<vector<string> > groupAnagrams(vector<string>& strs)
    {
        vector<Word> words;
        for (int i = 0; i < strs.size(); i++) {
            Word tmp;
            tmp.value = strs[i];
            sort(strs[i].begin(), strs[i].end());
            tmp.key = strs[i];
            words.push_back(tmp);
        }
        sort(words.begin(), words.end());
        vector<string> group;
        vector<vector<string> > ans;
        group.push_back(words[0].value);
        for (int i = 1; i < words.size(); i++) {
            if (words[i].key == words[i – 1].key) {
                group.push_back(words[i].value);
            }
            else {
                ans.push_back(group);
                group.clear();
                group.push_back(words[i].value);
            }
        }
        ans.push_back(group);
        return ans;
    }
};

本代码提交AC，用时64MS。

二刷。不自定义数据结构，使用map<string, vector<string=””>>，完整代码如下：

class Solution {
public:
    vector<vector<string> > groupAnagrams(vector<string>& strs)
    {
        unordered_map<string, vector<string> > hash;
        for (const auto& s : strs) {
            string tmp = s;
            sort(tmp.begin(), tmp.end());
            hash[tmp].push_back(s);
        }
        vector<vector<string> > ans;
        for (const auto& it : hash) {
            ans.push_back(it.second);
        }
        return ans;
    }
};

本代码提交AC，用时26MS。

LeetCode Length of Last Word

Leave a reply

58. Length of Last Word

Given a string s consists of upper/lower-case alphabets and empty space characters ' ', return the length of last word (last word means the last appearing word if we loop from left to right) in the string.

If the last word does not exist, return 0.

Note: A word is defined as a maximal substring consisting of non-space characters only.

Example:

Input: "Hello World"
Output: 5

本题要求一个字符串中最后一个单词的长度，比如”Hello World”，则返回5。需要注意结尾包含空格的情况，比如”Hello World “，还是要返回5。所以需要先删除尾部空格，然后再从后往前查找第一个空格，返回空格到结尾的长度。完整代码如下：

class Solution {
public:
    int lengthOfLastWord(string s)
    {
        int e = s.size() – 1;
        while (e >= 0 && s[e] == ‘ ‘) {
            e–;
        }
        if (e < s.size() – 1) {
            s = s.substr(0, e + 1);
        }
        size_t tPos = s.find_last_of(‘ ‘);
        if (tPos == string::npos)
            return s.size();
        return s.size() – tPos – 1;
    }
};

本代码提交AC，用时4MS。

二刷。上述代码也过于复杂，还有字符串的substr，下面是更简洁的版本：

class Solution {
public:
    int lengthOfLastWord(string s)
    {
        int i = s.size() – 1;
        while (i >= 0 && s[i] == ‘ ‘)
            –i;
        if (i < 0)
            return 0;
        int j = i – 1;
        while (j >= 0 && s[j] != ‘ ‘)
            –j;
        return i – j;
    }
};

本代码提交AC，用时3MS。

LeetCode Longest Valid Parentheses

Leave a reply

32. Longest Valid Parentheses

Given a string containing just the characters '(' and ')', find the length of the longest valid (well-formed) parentheses substring.

Example 1:

Input: "(()"
Output: 2
Explanation: The longest valid parentheses substring is "()"

Example 2:

Input: ")()())"
Output: 4
Explanation: The longest valid parentheses substring is "()()"

此题要求一个括号字符串中最长的合法的括号串的长度。遇到括号问题，想当然的用栈来解决，可以尝试每一个以'(‘开头的字符串，用栈求出其最长合法括号串的长度，然后求全局最大值，如下：

class Solution {
public:
    int longestValidParentheses(string s)
    {
        int n = s.size(), maxval = 0;
        for (int i = 0; i < n; i++) { if (s[i] == ‘(‘) {
                stack<char> sc;
                sc.push(s[i]);
                int cur = 0;
                for (int j = i + 1; j < n; j++) {
                    char t = sc.top();
                    if (s[j] == ‘)’)
                        { if (t == ‘(‘) {
                                sc.pop();
                                cur++; } else break;
                        }
                    sc.push(s[j]);
                }
                if (cur > maxval)
                    maxval = cur; }
        }
        return maxval * 2;
    }
};

但是这个算法是$O(n^2)$的，TLE。

后来看题解，居然有$O(n)$的解法，厉害，如下：

class Solution {
public:
	int longestValidParentheses(string s)
	{
		int n = s.size(), maxval = 0, start = -1; // start初始化为-1
		stack<int> si;
		for (int i = 0; i < n; i++) {
			if (s[i] == '(') {
				si.push(i); //左括号无条件压栈 
			} else { //右括号不压栈
				if (si.empty()) {
					start = i; //以右括号作为新的起点
				} else {
					si.pop();
					if (si.empty()) // ()()() or (()())
						maxval = max(maxval, i - start);
					else // (() 用当前栈顶索引作为起点
						maxval = max(maxval, i - si.top());
				}
			}
		}
		return maxval;
	}
};

本算法提交AC，用时12MS。

它也是用堆栈，但是堆栈里存的不是括号，而是括号的索引。左括号无条件压栈，右括号不压栈，只是作为计算括号对的长度的起点（其实起点应该是右括号索引+1，但是计算长度的时候用右括号索引更方便）。

所以栈里面只有左括号，当右括号来了，且栈为空时，说明之前已经配对完了，要开始新的配对，所以设置新的起点。

如果栈不为空，则pop。如果pop完之后栈为空，也说明配对完成，可以更新最佳长度；如果pop完之后栈里还有左括号，则是(((()这种类型，左括号更多，则匹配的起点应该是栈顶的索引，而不是之前设置的右括号索引+1，如下：

(((() ^ ^

二刷。比较容易理解的是动态规划的方法。设dp[i]表示以字符i结尾的最长合法括号串的长度，则所有左括号对应的dp[i]都等于0，因为合法的括号串都是以右括号结尾的。

如果s[i]==’)’，有两种情况。第一种情况是s[i-1]='(‘，则s[i]和s[i-1]配对，所以dp[i]=dp[i-2]+2。第二种情况是s[i-1]也=’)’，则s[i]无法和s[i-1]配对，但是如果向前跨过s[i-1]对应的合法字符串后有左括号出现，则s[i]有望作为一个外围的右大括号配对成功，如下图所示，最右边的是第s[i]=’)’，向前跨过了(xxx)，和前一个左括号配对成功。这种情况下，dp[i]=dp[i-1]+2，又因为配对的左括号前面还有可能是合法的括号串，所以还要加上dp[i-dp[i-1]-1-1]。

xxx((xxxx))

完整代码如下：

class Solution {
public:
	int longestValidParentheses(string s) {
		int n = s.size();
		vector<int> dp(n, 0);
		int ans = 0;
		for (int i = 0; i < n; ++i) {
			if (s[i] == ')') {
				if (i > 0 && s[i - 1] == '(') {
					dp[i] += 2;
					if (i >= 2)dp[i] += dp[i - 2];
				}
				else if (i > 0 && s[i - 1] == ')') {
					int left = i - dp[i - 1] - 1;
					if (left >= 0 && s[left] == '(') {
						dp[i] = dp[i - 1] + 2;
						if (left > 0)dp[i] += dp[left - 1];
					}
				}
			}
			ans = max(ans, dp[i]);
		}
		return ans;
	}
};

本代码提交AC，用时4MS。

至于用栈的方法，是这样的：栈中只插入左括号的下标，以便来了右括号后，把配对的左括号弹出，剩下栈顶左括号下标作为之前弹出的合法括号串的起始位置。如果栈顶为空，且来了右括号，则这个右括号肯定配对失败，所以也把它的下标入栈，用来充当后续合法括号串的起始位置。

更简洁易懂的代码如下：

class Solution {
public:
	int longestValidParentheses(string s) {
		int n = s.size();
		int ans = 0;
		stack<int> stk;
		stk.push(-1);
		for (int i = 0; i < n; ++i) {
			if (s[i] == '(') {
				stk.push(i);
			}
			else {
				if (!stk.empty())stk.pop();
				if (stk.empty()) {
					stk.push(i);
				}
				else {
					ans = max(ans, i - stk.top());
				}
			}
		}
		return ans;
	}
};

本代码提交AC，用时4MS。

LeetCode Implement strStr()

1 Reply

28. Implement strStr()

Implement strStr().

Return the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.

Example 1:

Input: haystack = "hello", needle = "ll"
Output: 2

Example 2:

Input: haystack = "aaaaa", needle = "bba"
Output: -1

Clarification:

What should we return when needle is an empty string? This is a great question to ask during an interview.

For the purpose of this problem, we will return 0 when needle is an empty string. This is consistent to C’s strstr() and Java’s indexOf().

串的模式匹配问题，首先献上暴力方法：

class Solution {
public:
    int strStr(string haystack, string needle)
    {
        int n1 = haystack.size(), n2 = needle.size();
        if (n2 > n1)
            return -1;
        for (int i = 0; i < n1 – n2 + 1; i++) {
            bool found = true;
            for (int j = 0; j < n2; j++) {
                if (needle[j] != haystack[i + j]) {
                    found = false;
                    break;
                }
            }
            if (found)
                return i;
        }
        return -1;
    }
};

本代码提交AC了，用时6MS。

然后使用神器KMP算法，如下：

class Solution {
public:
    void getNextVal(vector<int>& nextval, string t)
    {
        int n = t.size();
        nextval.resize(n);
        nextval[0] = -1;
        int j = 0, k = -1;
        while (j < n – 1) {
            if (k == -1 || t[j] == t[k]) {
                j++;
                k++;
                if (t[j] != t[k])
                    nextval[j] = k;
                else
                    nextval[j] = nextval[k];
            }
            else
                k = nextval[k];
        }
    }
    int strStr(string haystack, string needle)
    {
        if (needle == "")
            return 0;
        int n1 = haystack.size(), n2 = needle.size();
        if (n2 > n1)
            return -1;
        vector<int> nextval;
        getNextVal(nextval, needle);
        int i = 0, j = 0;
        while (i < n1 && j < n2) {
            if (j == -1 || haystack[i] == needle[j]) {
                i++;
                j++;
            }
            else
                j = nextval[j];
        }
        if (j >= n2)
            return i – n2;
        else
            return -1;
    }
};

本代码提交AC，用时4MS。

下面还是仔细回顾一下KMP算法，免得以后忘了不好学。

假设主串为s，模式串为t，暴力的方法是依次检查以s[i]开头能否和t匹配成功，如果匹配不成功，则从s的下一个位置s[i+1]开始重新匹配。

但是如果遇到下图的情况：

很显然，从s的下一个位置’B’开始和t的开头重新匹配时，一定会失败，因为之前匹配的串是”ABCDAB”，要想再次成功匹配，可以把AB的前缀移到AB后缀的地方，也就是指向主串的i指针不变，j指针指向’C’重新开始下一轮匹配。

那么怎样才能知道此时应该j指针应该调整到指向’C’呢？KMP算法的关键就是计算next数组了，next数组的作用就是当s[i]和t[j]失配时，i指针不动，j指针调整为next[j]继续匹配。

通过上面两张图的例子，我们知道next[j]的含义就是在t[j]之前的串中，相同的前缀和后缀的最大长度。比如最开始的图匹配到ABCDABD时失配，此时j指向最后一个字母’D’，’D’之前的串ABCDAB中最长的相同的前缀后缀是’AB’，所以next[j]=2，下一次，我们令j=next[j]=2，这样就直接从t[j]=t[2]=’C’开始匹配了。

（t[0,…,j]的前缀要以t[0]开头，后缀要以t[j]结尾，但都不能完全等于t[0,…,j]，即前缀不能以t[j]结尾，后缀不能以t[0]开头。）

$$next[j]=\begin{cases} \max\{k|0<k<j|t_0t_1…t_{k-1}==t_{j-k}t_{j-k+1}…t_{j-1}\}, & \mbox{if this set is nonempty}\\-1, & \mbox{if }j==0\\ 0, & \mbox{otherwise}\end{cases}$$

既然next[j]的含义就是在t[j]之前的串中，相同的前缀和后缀的最大长度。假设我们已经知道next[0,…,j]了，怎样求解next[j+1]呢？假设next[j]=k，说明t[0,…,j-1]最长的相同前缀后缀长度为k。如果t[j]==t[k]，则t[0,…,j]的最长的相同前缀后缀长度就是k+1了，即next[j+1]=k+1，这个很好理解。

如果t[j]!=t[k]，则以t[j]结尾的长度为k的后缀就没戏了，因为t[0,…,k]!=t[j-k,…,j]，但是后缀又必须以t[j]结尾，所以我们只能缩短长度，从next[k]处继续递归往前查找。即k=next[k]递归查找，注意，递归时j不变，只是k往前递归。

常规的next数组求解就是这样的，但是有一个地方还可以改进，下面引用july的博客：
比如图2和图3，如果用之前的next 数组方法求模式串“abab”的next 数组，可得其next 数组为-1 0 0 1（0 0 1 2整体右移一位，初值赋为-1），当它在图2的文本串去匹配的时候，发现b跟c失配，于是模式串右移j – next[j] = 3 – 1 =2位。

右移2位后，b又跟c失配（图3）。事实上，因为在上一步的匹配中，已经得知t[3] = b，与s[3] = c失配，而右移两位之后，让t[ next[3] ] = t[1] = b 再跟s[3]匹配时，必然失配。问题出在哪呢？

问题出在不该出现t[j] = t[ next[j] ]。为什么呢？理由是：当t[j] != s[i] 时，下次匹配必然是t[ next [j]] 跟s[i]匹配，如果t[j] = t[ next[j] ]，必然导致后一步匹配失败（因为t[j]已经跟s[i]失配，然后你还用跟t[j]等同的值t[next[j]]去跟s[i]匹配，很显然，必然失配），所以不能允许t[j] = t[ next[j ]]。如果出现了t[j] = t[ next[j] ]咋办呢？如果出现了，则需要再次递归，即令next[j] = next[ next[j] ]。
所以，咱们得修改下求next 数组的代码。

//优化过后的next 数组求法
void GetNextval(char* p, int next[])
{
    int pLen = strlen(p);
    next[0] = -1;
    int k = -1;
    int j = 0;
    while (j < pLen – 1) {
        //p[k]表示前缀，p[j]表示后缀
        if (k == -1 || p[j] == p[k]) {
            ++j;
            ++k;
            //较之前next数组求法，改动在下面4行
            if (p[j] != p[k])
                next[j] = k; //之前只有这一行
            else //因为不能出现p[j] = p[ next[j ]]，所以当出现时需要继续递归，k = next[k] = next[next[k]]
                next[j] = next[k];
        }
        else {
            k = next[k];
        }
    }
}

得到next数组就好办了，一旦t[j]!=s[i]失配，i不动，j=next[j]继续匹配。

二刷。上面的解读理解起来太费劲了，也记不住。在知乎上看到一个很好的解读：https://www.zhihu.com/question/21923021/answer/281346746

首先，KMP算法在主串和模式串的匹配阶段的原理很好理解，这个不用多讲。关键是如何求解模式串的next数组。

请看下图。核心思想是，要把求next数组的过程理解为模式串自己和自己匹配的过程，更具体来说，是模式串的后缀和前缀匹配的过程，因为next数组的含义就是后缀和前缀相等的最大长度。

如下图所示，首先令i和j错一个位置开始匹配，这就保证了j是从头开始匹配，匹配前缀；而i是从中间某个位置匹配，表示后缀。

下图就可以很直观的看出来j匹配的是前缀，i匹配的是后缀，只要i和j一直匹配，则j的长度就表示主串以i结尾时前缀和后缀相等的长度。所以，只要一直匹配，则next[i]=j。

如果匹配失败，则根据KMP的思路和next数组的定义，直接令j=next[j]进行跳转，i保持不动就行了，这就是KMP主串和模式串匹配的原理。

基于这个思想，其实求解next数组的过程和KMP匹配的过程很相似。完整代码如下：

class Solution {
public:
	vector<int> GetNext(string s) {
		int n = s.size();
		vector<int> next(n, 0);
		next[0] = -1;
		int i = 0, j = -1;
		while (i < n - 1) {
			if (j == -1 || s[i] == s[j]) {
				++i;
				++j;
				next[i] = j;
			}
			else {
				j = next[j];
			}
		}
		return next;
	}
	int strStr(string haystack, string needle) {
		if (needle == "")return 0;
		vector<int> next = GetNext(needle);
		int m = haystack.size(), n = needle.size();
		if (n > m)return -1;
		int i = 0, j = 0;
		while (i < m&&j < n) {
			if (j == -1 || haystack[i] == needle[j]) {
				++i;
				++j;
			}
			else {
				j = next[j];
			}
		}
		if (j == n)return i - n;
		else return -1;
	}
};

hihoCoder week 84-1-Lucky Substrings

Leave a reply

hihoCoder week 84-1-Lucky Substrings 题目1 : Lucky Substrings 时间限制:10000ms 单点时限:1000ms 内存限制:256MB 描述 A string s is LUCKY if and only if the number of different characters in s is a fibonacci number. Given a string consisting of only lower case letters, output all its lucky non-empty substrings in lexicographical order. Same substrings should be printed once. 输入 A string consisting no more than 100 lower case letters. 输出 Output the lucky substrings in lexicographical order, one per line. Same substrings should be printed once. 样例输入 aabcd 样例输出 a aa aab aabc ab abc b bc bcd c cd d

如果一个字符串中不同字母的个数是斐波那契数，则称这个字符串是Lucky的，问字符串S中哪些子串是Lucky子串。要正确理解Lucky的定义，是不同字母的个数，而不是不同字母的个数序列。比如aabc是Lucky的，因为其有a,b,c共3种字母，而3在斐波那契数列中，所以aabc是Lucky的。不能理解为aabc中a出现2次，b,c分别出现1次，构成1,1,3是斐波那契数列，所以aabc是Lucky的。所以只能枚举所有子串，判断每个子串是否为Lucky的。如果知道了S[i,…,j]中不同字母的个数，判断S[i,…,j+1]时，只要判断S[j+1]这个字母是否在之前出现过。所以通过保存之前不同字母的个数，可以快速判断S[i,…,j+1]的情况。完整代码如下： [cpp] #include<iostream> #include<string> #include<set> #include<vector> using namespace std; set<int> FIB_NUM = { 1,2,3,5,8,13,21 }; int main() { string s; cin >> s; set<string> ans; int n = s.size(); for (int i = 0; i < n; i++) { vector<bool> alphabet(26, false); int cnt = 0; for (int j = i; j < n; j++) { if (!alphabet[s[j] – ‘a’]) { alphabet[s[j] – ‘a’] = true; cnt++; } if (FIB_NUM.find(cnt) != FIB_NUM.end()) { ans.insert(s.substr(i, j – i + 1)); } } } set<string>::iterator it = ans.begin(); while (it != ans.end()) { cout << *it << endl; it++; } return 0; } [/cpp] 本代码提交AC，用时17MS，内存0MB。]]>

LeetCode Longest Common Prefix

Leave a reply

14. Longest Common Prefix

Write a function to find the longest common prefix string amongst an array of strings.

If there is no common prefix, return an empty string "".

Example 1:

Input: ["flower","flow","flight"]
Output: "fl"

Example 2:

Input: ["dog","racecar","car"]
Output: ""
Explanation: There is no common prefix among the input strings.

Note:

All given inputs are in lowercase letters a-z.

class Solution {
public:
    string longestCommonPrefix(vector<string>& strs)
    {
        if (strs.size() == 0)
            return "";
        int max_common = INT_MAX;
        string longest_common = "";
        for (int i = 0; i < strs.size(); i++)
            if (strs[i].size() < max_common)
                max_common = strs[i].size();
        for (int i = 0; i < max_common; i++) {
            for (int j = 1; j < strs.size(); j++) {
                if (strs[j][i] != strs[0][i])
                    return longest_common;
            }
            longest_common += strs[0][i];
        }
        return longest_common;
    }
};

本代码提交AC，用时4MS。

二刷。简洁版代码如下：

class Solution {
public:
	string longestCommonPrefix(vector<string>& strs) {
		int n = strs.size(), i = 0;
		if (n == 0)return "";
		while (true) {
			bool valid = true;
			for (int j = 0; j < n; ++j) {
				if (i >= strs[j].size() || strs[j][i] != strs[0][i]) {
					valid = false;
					break;
				}
			}
			if (!valid)break;
			else ++i;
		}
		return strs[0].substr(0, i);
	}
};

本代码提交AC，用时4MS。

LeetCode Roman to Integer

Leave a reply

13. Roman to Integer

Roman numerals are represented by seven different symbols: I, V, X, L, C, D and M.

Symbol       Value
I             1
V             5
X             10
L             50
C             100
D             500
M             1000

For example, two is written as II in Roman numeral, just two one’s added together. Twelve is written as, XII, which is simply X + II. The number twenty seven is written as XXVII, which is XX + V + II.

Roman numerals are usually written largest to smallest from left to right. However, the numeral for four is not IIII. Instead, the number four is written as IV. Because the one is before the five we subtract it making four. The same principle applies to the number nine, which is written as IX. There are six instances where subtraction is used:

I can be placed before V (5) and X (10) to make 4 and 9.
X can be placed before L (50) and C (100) to make 40 and 90.
C can be placed before D (500) and M (1000) to make 400 and 900.

Given a roman numeral, convert it to an integer. Input is guaranteed to be within the range from 1 to 3999.

Example 1:

Input: "III"
Output: 3

Example 2:

Input: "IV"
Output: 4

Example 3:

Input: "IX"
Output: 9

Example 4:

Input: "LVIII"
Output: 58
Explanation: L = 50, V= 5, III = 3.

Example 5:

Input: "MCMXCIV"
Output: 1994
Explanation: M = 1000, CM = 900, XC = 90 and IV = 4.

这一题是LeetCode Integer to Roman的姊妹篇，只需从左往右依次解析字符串即可，优雅版本如下：

class Solution {
public:
    int romanToInt(string s)
    {
        vector<int> nums = { 1, 4, 5, 9, 10, 40, 50, 90, 100, 400, 500, 900, 1000 };
        vector<string> symbol = { "I", "IV", "V", "IX", "X", "XL", "L", "XC", "C", "CD", "D", "CM", "M" };
        int ans = 0;
        for (int i = nums.size() – 1; i >= 0; i–) {
            int sz = symbol[i].size();
            while (s.size() >= sz && s.substr(0, sz) == symbol[i]) {
                ans += nums[i];
                s = s.substr(sz);
            }
        }
        return ans;
    }
};

本代码提交AC，用时68MS。

LeetCode Distinct Subsequences

Leave a reply

115. Distinct Subsequences

Given a string S and a string T, count the number of distinct subsequences of S which equals T.

A subsequence of a string is a new string which is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (ie, "ACE" is a subsequence of "ABCDE" while "AEC" is not).

It’s guaranteed the answer fits on a 32-bit signed integer.

Example 1:

Input: S = "rabbbit", T = "rabbit"
Output: 3
Explanation:
As shown below, there are 3 ways you can generate "rabbit" from S.
(The caret symbol ^ means the chosen letters)

rabbbit
^^^^ ^^
rabbbit
^^ ^^^^
rabbbit
^^^ ^^^

Example 2:

Input: S = "babgbag", T = "bag"
Output: 5
Explanation:
As shown below, there are 5 ways you can generate "bag" from S.
(The caret symbol ^ means the chosen letters)

babgbag
^^ ^
babgbag
^^    ^
babgbag
^    ^^
babgbag
  ^  ^^
babgbag
    ^^^

这问题描述不清，题目想问S中有多少个不同的subsequence等于T，这里的不同是指形成subsequence的方法不同，比如样例中，rabbbit可以去掉中间任意一个b都能形成T，所以不同的方法有3种。这是典型的动态规划问题。假设dp[i][j]表示由S[1,…,i]变到T[1,…,j]有多少种subsequence方法。此时我们有两种选择，如果S[i]!=T[j]，则等价于由S[1,…,i-1]变到T[1,…,j]，所以dp[i][j]=dp[i-1][j]；如果S[i]==T[j]，则还可以由S[1,…,i-1]变到T[1,…,j-1]，所以dp[i][j]+=dp[i-1][j-1]。
DP转移公式如下：
$$dp[i][j]=\begin{cases}dp[i-1][j] & \text{if $S[i]\neq T[j]$}\\dp[i-1][j]+dp[i-1][j-1] & \text{if $S[i]=T[j]$}\\\end{cases}$$
为了便于理解，我们先在s和t的开头添加一个哨兵字符’^’，初始值dp[i][0]=1，因为由任意字符串转换为空字符串只有一种方法，就是把s中的字符全删了；dp[0][j]=0，因为空字符串的subsequence不可能是是某个非空字符串。
完整代码如下：

class Solution {
public:
    int numDistinct(string s, string t)
    {
        if (s == "")
            return 0;
        if (t == "")
            return 1;
        s = "^" + s;
        t = "^" + t;
        int n = s.size(), m = t.size();
        vector<vector<int> > dp(n, vector<int>(m, 0));
        for (int i = 0; i < n; i++)
            dp[i][0] = 1; // ‘*’ -> ” 1 sub
        for (int i = 1; i < n; i++) {
            for (int j = 1; j < m; j++) {
                dp[i][j] = dp[i – 1][j];
                if (s[i] == t[j])
                    dp[i][j] += dp[i – 1][j – 1];
            }
        }
        return dp[n – 1][m – 1];
    }
};

本代码提交AC，用时20MS。

二刷。不添加哨兵，vector增加1也是可以的，代码如下：

class Solution {
public:
    int numDistinct(string s, string t) {
        int m = s.size(), n = t.size();
        if(n > m) return 0;
        vector<vector<long long>> dp(m + 1, vector<long long>(n + 1, 0));
        for(int i = 0; i < m; ++i) dp[i][0] = 1;
        for(int i = 1; i <= m; ++i) {
            for(int j = 1; j <= n; ++j) {
                if(s[i - 1] == t[j - 1]) {
                    dp[i][j] = dp[i - 1][j - 1] + dp[i - 1][j];
                } else {
                    dp[i][j] = dp[i - 1][j];
                }
            }
        }
        return dp[m][n];
    }
};

本代码提交AC，用时20MS。

LeetCode Longest Substring Without Repeating Characters

Leave a reply

3. Longest Substring Without Repeating Characters

Given a string, find the length of the longest substring without repeating characters.

Example 1:

Input: "abcabcbb"
Output: 3 
Explanation: The answer is "abc", with the length of 3.

Example 2:

Input: "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.

Example 3:

Input: "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3. 
             Note that the answer must be a substring, "pwke" is a subsequence and not a substring.

本题要求找出字符串中没有重复字母的最长连续子串。最快想到的是两重遍历i,j，每次用set判断是否重复，重复就进行下一次遍历i++，但是这样显然超时。因为字符串中只有ASCII，所以可以用bool hash[128]来判断是否有重复字符，比set要快（用空间换时间）。
另外，假设遍历状态是abcdefgc，发现有两个’c’，下一次遍历的时候没必要从b开始遍历，因为从b开始肯定也会遇到后面的两个’c’，但此时字符串长度比之前的少1，所以可以直接从上次的’c’的下一个字母’d’开始重新遍历，这样可以节省不少时间。
完整代码如下：

class Solution {
public:
    int lengthOfLongestSubstring(string s)
    {
        int longest = 0, n = s.size(), cur = 0;
        vector<int> hash(128, 0);
        for (int i = 0; i < n; i++) {
            if (hash[s[i]]) {
                if (cur > longest)
                    longest = cur;
                cur = 0;
                i = hash[s[i]]; //从重复字符下一个字符开始遍历
                //memset(&hash[0], 0, 128*sizeof(int));//比fill更快，但是没fill安全
                fill(hash.begin(), hash.end(), 0);
            }
            hash[s[i]] = i + 1; //记录下一个字符下标
            cur++;
        }
        return (longest > cur) ? longest : cur;
    }
};

本代码提交AC，用时68MS。
二刷，上面我的解法太过复杂。其实我们可以用一个map保存出现字符的最大下标，start保存一个下标，使得从start到当前字符这个区间不存在重复字符。
每来一个字符，我们去map中找一下，如果该字符之前存在，则更新start为最大下标（为了满足从start到当前字符这个区间不存在重复字符），然后不重复子串的长度就是i-start。
完整代码如下，因为s只包含字符，所以也可以用一个256长的数组表示map。

class Solution {
public:
    int lengthOfLongestSubstring(string s)
    {
        map<char, int> visited;
        int ans = 0, start = -1;
        for (int i = 0; i < s.size(); ++i) {
            if (visited.find(s[i]) != visited.end())
                start = max(start, visited[s[i]]);
            visited[s[i]] = i;
            ans = max(ans, i – start);
        }
        return ans;
    }
};

本代码提交AC，用时22MS。

三刷。首尾指针法，令[i,j)为不含重复字符的子串的区间，用一个hash数组存储某个字符是否出现过。开始的时候，j不断往前走，直到遇到区间[i,j)中出现过的字符，假设这个字符下标为k的话，则需要把[i,k]之间的hash全部重置为0，i跳到k+1，此后j继续往前走。

完整代码如下：

class Solution {
public:
	int lengthOfLongestSubstring(string s) {
		int n = s.size();
		if (n == 0 || n == 1)return n;
		vector<int> hash(256, 0);
		int i = 0, j = 1;
		hash[s[0]] = 1;
		int ans = 0;
		while (j < n) {
			while (j < n&&hash[s[j]] == 0) {
				hash[s[j]] = 1;
				++j;
			}
			if (j - i > ans)ans = j - i;
			if (hash[s[j]] == 1) {
				while (s[i] != s[j]) {
					hash[s[i]] = 0;
					++i;
				}
				hash[s[i]] = 0;
				++i;
			}
		}
		return ans;
	}
};

本代码提交AC，用时8MS，虽然比解法2快，但解法2代码最简单了。

bitJoy > code

Keep calm and carry on coding

Tag Archives: 字符串

LeetCode Add Binary

LeetCode Group Anagrams

LeetCode Length of Last Word

LeetCode Longest Valid Parentheses

LeetCode Implement strStr()

hihoCoder week 84-1-Lucky Substrings

LeetCode Longest Common Prefix

LeetCode Roman to Integer

LeetCode Distinct Subsequences

LeetCode Longest Substring Without Repeating Characters