Tag Archives: 字符串

LeetCode Isomorphic Strings

205. Isomorphic Strings

Given two strings s and t, determine if they are isomorphic.

Two strings are isomorphic if the characters in s can be replaced to get t.

All occurrences of a character must be replaced with another character while preserving the order of characters. No two characters may map to the same character but a character may map to itself.

Example 1:

Input: s = "egg", t = "add"
Output: true

Example 2:

Input: s = "foo", t = "bar"
Output: false

Example 3:

Input: s = "paper", t = "title"
Output: true

Note:
You may assume both and have the same length.


判断两个字符串是否同构。同构的意思是字符串s和t中的字母能找到一种一一映射的关系,使得通过这个映射之后s能变成t,t也能变成s。 简单题,维护两个hash表,一个保存s到t的映射,另一个保存t到s的映射。在遍历字符串的过程中,同时判断两个映射是否满足一一映射。 注意不能只用一个hash,因为可能出现s=ab,t=aa的情况,看s->t,a映射到a,b映射到b,没有问题。但是看t->s时,有问题,出现了a既映射到a又映射到b的情况。所以需要同时保存s到t和t到s的映射。 代码如下:

class Solution {
public:
    bool isIsomorphic(string s, string t)
    {
        unordered_map<char, char> hash1, hash2;
        for (int i = 0; i < s.size(); ++i) {
            if (hash1.find(s[i]) == hash1.end())
                hash1[s[i]] = t[i];
            else {
                if (hash1[s[i]] != t[i])
                    return false;
            }
            if (hash2.find(t[i]) == hash2.end())
                hash2[t[i]] = s[i];
            else {
                if (hash2[t[i]] != s[i])
                    return false;
            }
        }
        return true;
    }
};

本代码提交AC,用时13MS。

LeetCode Reverse String II

LeetCode Reverse String II Given a string and an integer k, you need to reverse the first k characters for every 2k characters counting from the start of the string. If there are less than k characters left, reverse all of them. If there are less than 2k but greater than or equal to k characters, then reverse the first k characters and left the other as original. Example:

Input: s = "abcdefg", k = 2
Output: "bacdfeg"
Restrictions:
  1. The string consists of lower English letters only.
  2. Length of the given string and k will in the range [1, 10000]

LeetCode Reverse String的基础上,稍微增加了一点难度,要求每2k个字母逆序其前k个字母。当剩余字母少于k个时,都逆序。当剩余字母在k个~2k个之间时,逆序前k个。 直接借用STL的reverse函数,迭代器的结束位置为min(s.begin() + start + k,s.end()),因为string是线性存储的,其迭代器可以比较大小。 代码如下: [cpp] class Solution { public: string reverseStr(string s, int k) { int n = s.size(); for (int start = 0; start < n; start += 2 * k) { reverse(s.begin() + start, min(s.begin() + start + k,s.end())); } return s; } }; [/cpp] 本代码提交AC,用时9MS。]]>

LeetCode Relative Ranks

LeetCode Relative Ranks Given scores of N athletes, find their relative ranks and the people with the top three highest scores, who will be awarded medals: “Gold Medal”, “Silver Medal” and “Bronze Medal”. Example 1:

Input: [5, 4, 3, 2, 1]
Output: ["Gold Medal", "Silver Medal", "Bronze Medal", "4", "5"]
Explanation: The first three athletes got the top three highest scores, so they got "Gold Medal", "Silver Medal" and "Bronze Medal".
For the left two athletes, you just need to output their relative ranks according to their scores.
Note:
  1. N is a positive integer and won’t exceed 10,000.
  2. All the scores of athletes are guaranteed to be unique.

给定N个运动员的得分,要求输出他们的相对排序,前三名还需要颁发金银铜奖牌。 简单题,构造分数和下标的pair,然后对他们排序,得到相对排名之后,转换为string输出。 代码如下: [cpp] class Solution { public: vector<string> findRelativeRanks(vector<int>& nums) { vector<pair<int, int>> scores; int n = nums.size(); for (int i = 0; i < n; ++i)scores.push_back(pair<int, int>(nums[i], i)); sort(scores.begin(), scores.end(), greater<pair<int,int>>()); vector<string> ans(n, ""); vector<string> medals = { "Gold Medal", "Silver Medal", "Bronze Medal" }; for (int i = 0; i < min(3, n); ++i)ans[scores[i].second] = medals[i]; for (int i = 3; i < n; ++i)ans[scores[i].second] = to_string(i + 1); return ans; } }; [/cpp] 本代码提交AC,用时13MS。]]>

LeetCode Detect Capital

LeetCode Detect Capital Given a word, you need to judge whether the usage of capitals in it is right or not. We define the usage of capitals in a word to be right when one of the following cases holds:

  1. All letters in this word are capitals, like “USA”.
  2. All letters in this word are not capitals, like “leetcode”.
  3. Only the first letter in this word is capital if it has more than one letter, like “Google”.
Otherwise, we define that this word doesn’t use capitals in a right way. Example 1:
Input: "USA"
Output: True
Example 2:
Input: "FlaG"
Output: False
Note: The input will be a non-empty word consisting of uppercase and lowercase latin letters.
判断一个单词的大写形式是否正确,大写的规则有三:
  1. 所有字母都是大写
  2. 所有字母都是小写
  3. 首字母大写,剩余字母全小写
简单题,直接根据规则写代码就好。如果首字母是小写,则剩余字母要全是小写。如果首字母是大写,则剩余字母要么都是大写,要么都是小写。 代码如下: [cpp] class Solution { private: inline bool myisupper(const char& c) { return c >= ‘A’&&c <= ‘Z’; } public: bool detectCapitalUse(string word) { int n = word.size(); if (n == 1)return true; bool firstUpper = myisupper(word[0]); if (firstUpper) { bool second = myisupper(word[1]); for (int i = 2; i < n; ++i) { if (second != myisupper(word[i]))return false; } return true; } else { for (int i = 1; i < n; ++i) { if (myisupper(word[i]))return false; } return true; } } }; [/cpp] 本代码提交AC,用时12MS。 奇怪的是,我用系统的isupper遇到“USA”单词时会判断错误,但是用自己实现的myisupper就好了。 网上还有一种简洁的方法,统计单词中大写字母的个数upper_cnt,如果upper_cnt==0,说明全是小写,正确;如果upper_cnt==n,说明全是大写,正确;如果upper_cnt==1,且首字母是大写,也是正确;其他情况都是错误。逻辑清楚,代码简洁: [cpp] class Solution { public: bool detectCapitalUse(string word) { int n = word.size(), upper_cnt = 0; for (int i = 0; i < n; ++i) { if (isupper(word[i]))++upper_cnt; } if (upper_cnt == 0 || n == upper_cnt)return true; if (upper_cnt == 1 && isupper(word[0]))return true; return false; } }; [/cpp] 本代码提交AC,用时6MS。]]>

LeetCode Summary Ranges

228. Summary Ranges2

Given a sorted integer array without duplicates, return the summary of its ranges.

Example 1:

Input:  [0,1,2,4,5,7]
Output: ["0->2","4->5","7"]
Explanation: 0,1,2 form a continuous range; 4,5 form a continuous range.

Example 2:

Input:  [0,2,3,4,6,8,9] Output: ["0","2->4","6","8->9"] Explanation: 2,3,4 form a continuous range; 8,9 form a continuous range.28. Summary Ranges

给定一个数组,把数组汇总成若干个连续的区间,以字符串的形式给出。
方法是判断两个相邻的数之差是否为1,不是1则前面的构成一个区间,转换为字符串输出。判断前一个区间只有一个数还是有多个数的方法是区间的起止位置i,j是否相差1,如果是,则只有一个数,否则有多个数。
注意最后一个区间需要在while循环外部判断。代码如下:

class Solution {
public:
    vector<string> summaryRanges(vector<int>& nums)
    {
        vector<string> ans;
        int n = nums.size(), i = 0, j = 1;
        if (n == 0)
            return ans;
        while (j < n) {
            if (nums[j] – nums[j – 1] == 1)
                ++j;
            else {
                if (j – i == 1)
                    ans.push_back(to_string(nums[i]));
                else
                    ans.push_back(to_string(nums[i]) + "->" + to_string(nums[j – 1]));
                i = j++;
            }
        }
        if (j – i == 1)
            ans.push_back(to_string(nums[i]));
        else
            ans.push_back(to_string(nums[i]) + "->" + to_string(nums[j – 1]));
        return ans;
    }
};

本代码提交AC,用时0MS。
还有一种解法利用了二分查找的思路,很有意思,参考讨论区
假如给定的数组是[1,2,3,4,5,…,10000,20000],如果还是用上述方法,时间复杂度是O(n),至少需要遍历一遍数组。但是数组的前10000个数是严格有序且连续递增的,如果能把这一部分识别出来,直接转换为”1->10000″,则速度会大大提高。
二分查找的思路是,对于给定区间[a,…,b],假设其在数组中的下标起点分别为[i,…,j],则如果b-a==j-i,说明[a,b]之间是类似上面的[1,2,…,10000]的情况的,因为数组中没有重复元素,所以区间有j-i个元素,且元素最大值和最小值的差b-a也是j-i,说明他们是一个连续的有序区间,我们直接把这个区间转换为”a->b”。
否则如果j-i!=b-a,则取中点mid=(i+j)/2,递归在[i,mid]和[mid+1,j]进行。
有一种情况需要注意的是,分割出来的区间可能需要合并,比如[1,2,3,4,5,6,8],初始i[1,..,8]不满足b-a==j-i,所以递归在[1,2,3,4]和[5,6,8]进行。左边转换为了”1->4″,右边还需要递归,假设转换为了”5->6″和”8″,那么”1->4″和”5->6″是需要合并的。所以我们在插入”5->6″时,看看之前得到的区间”1->4″的end是否和当前区间”5->6″的start只差1,如果是,则需要合并,更新之前区间的end为现在要插入区间的end,变成了”1->6″。
完整代码如下:

class Solution {
private:
    struct Range {
        int start, end;
        Range(int s, int e)
            : start(s)
            , end(e){};
    };
    void add2Ans(int a, int b, vector<Range>& ranges)
    {
        if (ranges.empty() || ranges[ranges.size() – 1].end != a – 1) {
            ranges.push_back(Range(a, b));
        }
        else {
            ranges[ranges.size() – 1].end = b;
        }
    }
    void helper(vector<int>& nums, int start, int end, vector<Range>& ranges)
    {
        if (start == end || end – start == nums[end] – nums[start]) {
            add2Ans(nums[start], nums[end], ranges);
            return;
        }
        int mid = start + (end – start) / 2;
        helper(nums, start, mid, ranges); // 先左区间
        helper(nums, mid + 1, end, ranges); // 后右区间
    }

public:
    vector<string> summaryRanges(vector<int>& nums)
    {
        vector<string> ans;
        if (nums.empty())
            return ans;
        vector<Range> ranges;
        helper(nums, 0, nums.size() – 1, ranges);
        for (const auto& r : ranges) {
            if (r.start == r.end)
                ans.push_back(to_string(r.start));
            else
                ans.push_back(to_string(r.start) + "->" + to_string(r.end));
        }
        return ans;
    }
};

本代码提交AC,用时0MS。
这个代码在数据有很多连续区间时,接近O(lgn)的复杂度。但是当数据全是不连续的时,比如[1,3,5,7,9],则只有到递归到最深层start==end(即针对单个数字)时,才得到一个区间,所以退化为O(n)的算法。
如果再follow up可能有重复元素时,上述二分查找的方法就不管用了,因为属于一个区间的不一定满足b-a==j-i,比如[1,2,2,2,3],b-a=2,而j-i=4。此时只能用第一种O(n)的方法:

class Solution {
public:
    vector<string> summaryRanges(vector<int>& nums)
    {
        vector<string> ans;
        if (nums.empty())
            return ans;
        int i = 0, j = 1, n = nums.size();
        while (j < n) {
            while (j < n && (nums[j] == nums[j – 1] || nums[j] – nums[j – 1] == 1))
                ++j; // 考虑重复元素
            if (j >= n)
                break;
            if (j – 1 == i)
                ans.push_back(to_string(nums[i]));
            else
                ans.push_back(to_string(nums[i]) + "->" + to_string(nums[j – 1]));
            i = j++;
        }
        if (j – 1 == i)
            ans.push_back(to_string(nums[i]));
        else
            ans.push_back(to_string(nums[i]) + "->" + to_string(nums[j – 1]));
        return ans;
    }
};

本代码提交AC,用时0MS。

二刷。更加鲁邦容易理解的代码:

class Solution {
public:
	vector<string> summaryRanges(vector<int>& nums) {
		vector<vector<int>> ranges;
		int i = 0, n = nums.size();
		while (i < n) {
			int j = i + 1;
			while (j < n&&nums[j] == nums[j - 1] + 1)++j;
			ranges.push_back({ nums[i],nums[j - 1] });
			i = j;
		}
		vector<string> ans;
		for (int i = 0; i < ranges.size(); ++i) {
			if (ranges[i][0] == ranges[i][1]) {
				ans.push_back(to_string(ranges[i][0]));
			}
			else {
				ans.push_back(to_string(ranges[i][0]) + "->" + to_string(ranges[i][1]));
			}
		}
		return ans;
	}
};

本代码提交AC,用时0MS。

LeetCode Longest Substring with At Least K Repeating Characters

LeetCode Longest Substring with At Least K Repeating Characters Find the length of the longest substring T of a given string (consists of lowercase letters only) such that every character in T appears no less than k times. Example 1:

Input:
s = "aaabb", k = 3
Output:
3
The longest substring is "aaa", as 'a' is repeated 3 times.
Example 2:
Input:
s = "ababbc", k = 2
Output:
5
The longest substring is "ababb", as 'a' is repeated 2 times and 'b' is repeated 3 times.

给定一个字符串s和数字k,求s的子串中每个字母都至少出现k次的最长子串的长度。 因为题目规定字符串中只有26个小写字母,所以可以用一个int来Hash某个子串,如果某字母出现次数小于k次,则对应位为1,否则对应位为0。最后,在遍历的过程中,判断子串的Hash值是否等于0,如果是则该子串满足要求,更新最大长度。 完整代码如下: [cpp] class Solution { public: int longestSubstring(string s, int k) { int maxLen = 0; for (int i = 0; i < s.size(); ++i) { vector<int> mask(26, 0); int j = i, flag = 0; while (j < s.size()) { ++mask[s[j] – ‘a’]; if (mask[s[j] – ‘a’] < k)flag |= (1 << (s[j] – ‘a’)); else flag &= (~(1 << (s[j] – ‘a’))); if (flag == 0)maxLen = max(maxLen, j – i + 1); ++j; } } return maxLen; } }; [/cpp] 本代码提交AC,用时196MS。]]>

LeetCode Maximum Product of Word Lengths

LeetCode Maximum Product of Word Lengths Given a string array words, find the maximum value of length(word[i]) * length(word[j]) where the two words do not share common letters. You may assume that each word will contain only lower case letters. If no such two words exist, return 0. Example 1: Given ["abcw", "baz", "foo", "bar", "xtfn", "abcdef"] Return 16 The two words can be "abcw", "xtfn". Example 2: Given ["a", "ab", "abc", "d", "cd", "bcd", "abcd"] Return 4 The two words can be "ab", "cd". Example 3: Given ["a", "aa", "aaa", "aaaa"] Return 0 No such pair of words.


给定一个字符串数组,问两个没有共同字母的字符串的长度之积最大是多少,如果不存在这样两个字符串,则返回0。 常规思路是两层循环,判断两个字符串是否有共同字符,没有就把它俩的长度乘起来,更新最大值。 问题的关键就是如果快速的判断两个字符串是否有共同字符,常规思路是把一个字符串Hash,然后对另一个字符串的每个字符,看是否在前一个字符串中出现。快速方法是,因为题目说明字符串中只有小写字母,也就是只有26种情况,所以我们可以对这26个字母编码成一个26长的bit位,用一个int存储足够。这样判断两个字符串是否有共同字符,只需要把两个int相与,等于0表示没有共同字符。 完整代码如下: [cpp] class Solution { public: int maxProduct(vector<string>& words) { int ans = 0; vector<int> mask(words.size(), 0); for (int i = 0; i < words.size(); ++i) { for (int j = 0; j < words[i].size(); ++j) { mask[i] |= 1 << (words[i][j] – ‘a’); } for (int k = 0; k < i; ++k) { if ((mask[i] & mask[k]) == 0) // 注意 == 的优先级高于 & ans = max(ans, int(words[i].size()*words[k].size())); } } return ans; } }; [/cpp] 本代码提交AC,用时69MS。]]>

LeetCode Find All Anagrams in a String

LeetCode Find All Anagrams in a String Given a string s and a non-empty string p, find all the start indices of p‘s anagrams in s. Strings consists of lowercase English letters only and the length of both strings s and p will not be larger than 20,100. The order of output does not matter. Example 1:

Input:
s: "cbaebabacd" p: "abc"
Output:
[0, 6]
Explanation:
The substring with start index = 0 is "cba", which is an anagram of "abc".
The substring with start index = 6 is "bac", which is an anagram of "abc".
Example 2:
Input:
s: "abab" p: "ab"
Output:
[0, 1, 2]
Explanation:
The substring with start index = 0 is "ab", which is an anagram of "ab".
The substring with start index = 1 is "ba", which is an anagram of "ab".
The substring with start index = 2 is "ab", which is an anagram of "ab".

给定两个字符串s和p,要从s中找出所有和p是变位词的子串的起始位置。 判断两个字符串是否为变位词,可以用Hash的方法,在LeetCode Group Anagrams中已经介绍过了。 本题的思路也很简单,遍历s,每次从左端删掉一个字符,加入右端的一个字符,判断s和p的Hash表是否相等,相等就说明是一个变位词。 代码如下: [cpp] class Solution { public: vector<int> findAnagrams(string s, string p) { int slen = s.size(), plen = p.size(); if (slen < plen)return vector<int>(); vector<int> ans; vector<int> hash1(26, 0), hash2(26, 0); for (int i = 0; i < plen; ++i) { ++hash1[s[i] – ‘a’]; ++hash2[p[i] – ‘a’]; } if (hash1 == hash2)ans.push_back(0); for (int i = plen; i < slen; ++i) { –hash1[s[i – plen] – ‘a’]; ++hash1[s[i] – ‘a’]; if (hash1 == hash2)ans.push_back(i – plen + 1); } return ans; } }; [/cpp] 本代码提交AC,用时33MS。]]>

LeetCode Repeated DNA Sequences

187. Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: “ACGAATTCCG”. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

Example:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"

Output: ["AAAAACCCCC", "CCCCCAAAAA"]

题意:给定一个DNA序列,问有哪些长度为10的子串出现过至少两次。 解法:遍历字符串,用Hash表记录所有长度为10的子串的出现频率,如果频率等于2,则把该子串加入到结果数组中。时间复杂度为O(n),空间复杂度为Hash表占用的空间。 完整代码如下:

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s)
    {
        vector<string> ans;
        unordered_map<string, int> hash;
        for (size_t i = 0; i + 10 <= s.size(); ++i) {
            string seq = s.substr(i, 10);
            ++hash[seq];
            if (hash[seq] == 2)
                ans.push_back(seq);
        }
        return ans;
    }
};

本代码提交AC,用时73MS。
优化思路:
因为碱基只有4个:A,T,G,C,所以可以用两个bit表示所有碱基,子串的长度为10,所以可以用20bits表示一个子串,也就是用一个int就可以表示一个子串了。所以我们在将长度为10的子串插入Hash表时,可以先将其编码成一个int。这样Hash表的key就变成一个int了,Hash存储空间应该会更小,而且int也更利于Hash的查找。
完整代码如下:

class Solution {
private:
    unsigned int encode(const string& s)
    {
        unsigned int code = 0;
        for (size_t i = 0; i < s.size(); ++i) {
            code <<= 2;
            switch (s[i]) {
            case ‘A’:
                code += 0;
                break;
            case ‘T’:
                code += 1;
                break;
            case ‘C’:
                code += 2;
                break;
            case ‘G’:
                code += 3;
                break;
            }
        }
        return code;
    }
public:
    vector<string> findRepeatedDnaSequences(string s)
    {
        vector<string> ans;
        unordered_map<unsigned int, int> hash;
        for (size_t i = 0; i + 10 <= s.size(); ++i) {
            string seq = s.substr(i, 10);
            unsigned int code = encode(seq);
            ++hash[code];
            if (hash[code] == 2)
                ans.push_back(seq);
        }
        return ans;
    }
};

本代码提交AC,用时59MS,快了一点。

LeetCode Largest Number

179. Largest Number

Given a list of non negative integers, arrange them such that they form the largest number.

Example 1:

Input: [10,2]
Output: "210"

Example 2:

Input: [3,30,34,5,9]
Output: "9534330"

Note: The result may be very large, so you need to return a string instead of an integer.


给定一个数组,要把数组中所有的数拼成一个长的数,问能够拼成的最大的数是多少,要求返回这个最大数的字符串形式。

要想长数越大,则数组中“越大的数”应该排在前面,这里的“大”是指字典序,比如9就比34大,所以9应该在字符串中排前面。所以大的思路是先把数字数组转换为对应的字符串数组,然后按字典序对字符串数组排序,最后按先后顺序拼接起来。

还有一个问题是,遇到30和34时,应该把34排在30的前面,即3430>3034,这符合字典序,没什么问题。

另一个问题是,遇到一个数是另一个数的前缀的情况,应该怎么排序呢,比如34和349时,应该是349排在34的前面,因为34934>34349,这也符合字典序。但是,如果是34和341,正确的排序应该是34排在341的前面,因为34341>34134,此时就不符合字典的降序了。所以需要自定义一个对字符串的排序算法,此算法要对字符串s1和s2进行比较,方法是判断s1+s2和s2+s1的字典序,因为这就是这两个字符串按不同顺序拼接之后的大小。

最后一个问题,如果遇到数组是[0,0],即两个0时,能拼成的最大数是0,但是如果先转换成字符数组之后,拼成的数就是00了。所以在最后,我们判断一下开头的字符是否为’0’,如果是,说明这个数就是0了,不管后面是否还有多余的字符’0’,我们都只返回”0″。如果开头的字符不是’0’,则多个’0’是有效的,比如30和300是不同的两个数。

完整代码如下:

bool comp(const string& s1, const string& s2)
{
    string tmp1 = s1 + s2, tmp2 = s2 + s1;
    return tmp1 > tmp2;
}
class Solution {
public:
    string largestNumber(vector<int>& nums)
    {
        vector<string> snums;
        for (size_t i = 0; i < nums.size(); ++i)
            snums.push_back(to_string(nums[i]));
        sort(snums.begin(), snums.end(), comp);
        string ans = "";
        for (size_t i = 0; i < snums.size(); ++i)
            ans += snums[i];
        if (!ans.empty() && ans[0] == ‘0’)
            return "0";
        return ans;
    }
};

本代码提交AC,用时6MS。

代码中注意一点,自定义的比较函数不能是类的成员函数,要不然会报错“非标准语法;请使用 “&” 来创建指向成员的指针”,这个错误提示很奇怪。后来才知道比较函数必须定义成全局函数或者静态函数,因为sort自己就是全局或静态函数,无法调用成员函数