LeetCode Design Search Autocomplete System
Design a search autocomplete system for a search engine. Users may input a sentence (at least one word and end with a special character '#'
). For each character they type except ‘#’, you need to return the top 3 historical hot sentences that have prefix the same as the part of sentence already typed. Here are the specific rules:
- The hot degree for a sentence is defined as the number of times a user typed the exactly same sentence before.
- The returned top 3 hot sentences should be sorted by hot degree (The first is the hottest one). If several sentences have the same degree of hot, you need to use ASCII-code order (smaller one appears first).
- If less than 3 hot sentences exist, then just return as many as you can.
- When the input is a special character, it means the sentence ends, and in this case, you need to return an empty list.
Your job is to implement the following functions:
The constructor function:
AutocompleteSystem(String[] sentences, int[] times):
This is the constructor. The input is
historical data.
Sentences
is a string array consists of previously typed sentences.
Times
is the corresponding times a sentence has been typed. Your system should record these historical data.
Now, the user wants to input a new sentence. The following function will provide the next character the user types:
List<String> input(char c):
The input
c
is the next character typed by the user. The character will only be lower-case letters (
'a'
to
'z'
), blank space (
' '
) or a special character (
'#'
). Also, the previously typed sentence should be recorded in your system. The output will be the
top 3 historical hot sentences that have prefix the same as the part of sentence already typed.
Example:
Operation: AutocompleteSystem([“i love you”, “island”,”ironman”, “i love leetcode”], [5,3,2,2])
The system have already tracked down the following sentences and their corresponding times:
"i love you"
:
5
times
"island"
:
3
times
"ironman"
:
2
times
"i love leetcode"
:
2
times
Now, the user begins another search:
Operation: input(‘i’)
Output: [“i love you”, “island”,”i love leetcode”]
Explanation:
There are four sentences that have prefix
"i"
. Among them, “ironman” and “i love leetcode” have same hot degree. Since
' '
has ASCII code 32 and
'r'
has ASCII code 114, “i love leetcode” should be in front of “ironman”. Also we only need to output top 3 hot sentences, so “ironman” will be ignored.
Operation: input(‘ ‘)
Output: [“i love you”,”i love leetcode”]
Explanation:
There are only two sentences that have prefix
"i "
.
Operation: input(‘a’)
Output: []
Explanation:
There are no sentences that have prefix
"i a"
.
Operation: input(‘#’)
Output: []
Explanation:
The user finished the input, the sentence
"i a"
should be saved as a historical sentence in system. And the following input will be counted as a new search.
Note:
- The input sentence will always start with a letter and end with ‘#’, and only one blank space will exist between two words.
- The number of complete sentences that to be searched won’t exceed 100. The length of each sentence including those in the historical data won’t exceed 100.
- Please use double-quote instead of single-quote when you write test cases even for a character input.
- Please remember to RESET your class variables declared in class AutocompleteSystem, as static/class variables are persisted across multiple test cases. Please see herefor more details.
本题要求实现一个搜索引擎的自动补全功能,正好是我想在我的
新闻搜索引擎里实现的功能。首先会给出一些历史数据,就是哪些句子被访问了多少次。现在模拟用户输入,每输入一个字符,给出以输入的所有字符为前缀的历史句子,并且只给出被访频率前3的句子作为自动补全的候选。
Trie树的应用题。Trie树的节点属性包含is_sentence_以该字符结尾是否是一个句子,cnt_该句子的频率。首先把历史数据插入到Trie树中,记录好相应的is_sentence_和cnt_。
用一个成员变量cur_sentence_记录当前输入的字符串前缀。查找的时候,用cur_sentence_在Trie树中找,先找到这个前缀,然后在26个字母+一个空格中递归查找所有孩子节点,把能形成句子的字符串及其频率插入到一个优先队列中。该优先队列先以句子频率排序,如果频率相等,再以字典序排列。这样我们直接从优先队列中取出前三个堆顶句子,就是自动补全的候选。
如果遇到#,说明输入结束,把cur_sentence_及其频率1也插入到Trie树中。注意插入之后树中节点的频率是累加的,即第34行。
注意AutocompleteSystem类初始化的时候需要清空cur_sentence_和Trie树,更多的细节请看代码中的caution。
[cpp]
const int N = 26;
class AutocompleteSystem {
private:
struct Node {
bool is_sentence_;
int cnt_;
vector<Node*> children_;
Node() :is_sentence_(false), cnt_(0){
for (int i = 0; i < N + 1; ++i)children_.push_back(NULL);
}
};
Node *root;
string cur_sentence_;
struct Candidate {
int cnt_;
string sentence_;
Candidate(int &cnt, string &sentence) :cnt_(cnt), sentence_(sentence) {};
bool operator<(const Candidate& cand) const {
return (cnt_ < cand.cnt_) || (cnt_ == cand.cnt_&&sentence_ > cand.sentence_); // caution
}
};
void AddSentence(const string& sentence, const int& time) {
Node *cur = root;
for (const auto& c : sentence) {
int idx = c – ‘a’;
if (c == ‘ ‘)idx = N;
if (cur->children_[idx] == NULL)cur->children_[idx] = new Node();
cur = cur->children_[idx];
}
cur->is_sentence_ = true;
cur->cnt_ += time; // caution
}
void FindSentences(Node *root, string &sentence, priority_queue<Candidate>& candidates) {
if (root != NULL&&root->is_sentence_)candidates.push(Candidate(root->cnt_, sentence));
if (root == NULL)return;
for (int i = 0; i < N + 1; ++i) {
if (root->children_[i] != NULL) {
if (i == N)
sentence.push_back(‘ ‘);
else
sentence.push_back(‘a’ + i);
FindSentences(root->children_[i], sentence, candidates);
sentence.pop_back();
}
}
}
void StartWith(priority_queue<Candidate>& candidates) {
Node *cur = root;
for (const auto& c : cur_sentence_) {
int idx = c – ‘a’;
if (c == ‘ ‘)idx = N;
if (cur->children_[idx] == NULL)return;
cur = cur->children_[idx];
}
string sentence = cur_sentence_;
FindSentences(cur, sentence, candidates);
}
public:
AutocompleteSystem(vector<string> sentences, vector<int> times) {
root = new Node(); // caution
cur_sentence_ = ""; // caution
for (int i = 0; i < sentences.size(); ++i)AddSentence(sentences[i], times[i]);
}
vector<string> input(char c) {
if (c == ‘#’) {
AddSentence(cur_sentence_, 1); // caution
cur_sentence_ = ""; // caution
return{};
}
else {
cur_sentence_.push_back(c);
priority_queue<Candidate> candidates;
StartWith(candidates);
if (candidates.empty())return{};
vector<string> ans;
while (!candidates.empty()) {
ans.push_back(candidates.top().sentence_);
candidates.pop();
if (ans.size() == 3)break;
}
return ans;
}
}
};
[/cpp]
本代码提交AC,用时329MS。]]>