Tag Archives: Hash

LeetCode How Many Numbers Are Smaller Than the Current Number

1365. How Many Numbers Are Smaller Than the Current Number

Given the array nums, for each nums[i] find out how many numbers in the array are smaller than it. That is, for each nums[i] you have to count the number of valid j's such that j != i and nums[j] < nums[i].

Return the answer in an array.

Example 1:

Input: nums = [8,1,2,2,3]
Output: [4,0,1,1,3]
Explanation: 
For nums[0]=8 there exist four smaller numbers than it (1, 2, 2 and 3). 
For nums[1]=1 does not exist any smaller number than it.
For nums[2]=2 there exist one smaller number than it (1). 
For nums[3]=2 there exist one smaller number than it (1). 
For nums[4]=3 there exist three smaller numbers than it (1, 2 and 2).

Example 2:

Input: nums = [6,5,4,8]
Output: [2,1,0,3]

Example 3:

Input: nums = [7,7,7,7]
Output: [0,0,0,0]

Constraints:

  • 2 <= nums.length <= 500
  • 0 <= nums[i] <= 100

给定一个数组,对于数组中的每个元素,计算数组中有多少个数小于这个元素。

简单题。因为数组元素的范围很小,在[0, 100]之间,所以直接用hash计算出每个元素出现的次数,然后算出前n项累加和,即可得出小于当前项的元素个数之和,最后查表即可。

完整代码如下:

class Solution {
public:
	vector<int> smallerNumbersThanCurrent(vector<int>& nums) {
		const int MAXN = 101;
		vector<int> ans(nums.size(), 0);
		vector<int> cnt(MAXN, 0);
		for (int i = 0; i < nums.size(); ++i) {
			++cnt[nums[i]];
		}
		for (int i = 1; i < MAXN; ++i) {
			cnt[i] += cnt[i - 1];
		}
		for (int i = 0; i < nums.size(); ++i) {
			if (nums[i] > 0) {
				ans[i] = cnt[nums[i] - 1];
			}
		}
		return ans;
	}
};

本代码提交AC,用时12MS。

LeetCode Find Duplicate Subtrees

LeetCode Find Duplicate Subtrees Given a binary tree, return all duplicate subtrees. For each kind of duplicate subtrees, you only need to return the root node of any one of them. Two trees are duplicate if they have the same structure with same node values. Example 1: 

        1
       / \
      2   3
     /   / \
    4   2   4
       /
      4
The following are two duplicate subtrees:
      2
     /
    4
and
    4
Therefore, you need to return above trees’ root in the form of a list.
给定一棵二叉树,要求找出其中的重复子树,每组重复子树输出其中一个子树即可。 去重最好的办法就是hash,所以思想是把每个子树hash到一个unordered_map中,如果有多个子树hash到同一个桶,则出现了重复。 所以问题的关键是怎样对一个子树hash。我们可以把一棵子树表示成 (left_hash, root, right_hash),其中left_hash和right_hash分别是左子树和右子树的hash字符串,root表示当前根节点的hash字符串。所以这是一个递归定义。规定空节点的hash值为空串””。 根据子树hash的定义,叶子节点的hash值是(“”,val,””)。求解子树hash值的过程可以用中序遍历,左根右。为什么子树hash中需要逗号和括号呢,就是为了保证不同严格区分不同的子树,因为单纯的中序遍历是无法区分某些子树的,比如下面的两个子树,中序遍历结果都是2,1,无法区分;但是用本文的hash方法,得到的hash值分别是(2,1,)和(,2,1),这两个字符串是有区别的。
  1
 /
2
 2
  \
   1
在DFS的过程中,把每个子树的hash值记录到一个unordered_map中,然后遍历unordered_map,如果该桶放了多个子树,则取出其中一个即可。 完整代码如下: [cpp] class Solution { private: string DFS(TreeNode* root, unordered_map<string, vector<TreeNode*>>& inorders) { if (root == NULL)return ""; string left = DFS(root->left, inorders); string right = DFS(root->right, inorders); string cur = "(" + left + "," + to_string(root->val) + "," + right + ")"; inorders[cur].push_back(root); return cur; } public: vector<TreeNode*> findDuplicateSubtrees(TreeNode* root) { unordered_map<string, vector<TreeNode*>> inorders; DFS(root, inorders); vector<TreeNode*> ans; for (auto &it : inorders) { if (it.second.size() > 1) ans.push_back(it.second[0]); } return ans; } }; [/cpp] 本代码提交AC,用时52MS。]]>

LeetCode Set Mismatch

LeetCode Set Mismatch The set S originally contains numbers from 1 to n. But unfortunately, due to the data error, one of the numbers in the set got duplicated to another number in the set, which results in repetition of one number and loss of another number. Given an array nums representing the data status of this set after the error. Your task is to firstly find the number occurs twice and then find the number that is missing. Return them in the form of an array. Example 1:

Input: nums = [1,2,2,4]
Output: [2,3]
Note:
  1. The given array size will in the range [2, 10000].
  2. The given array’s numbers won’t have any order.

一个大小为n的数组,本来包含1~n这n个数,但是某个数x“突变”成y了,这就导致数组中丢了x,而y重复出现了两次,现在要找出丢掉的x和重复的y。 简单题,首先用hash可以找到重复的y,然后根据数学关系可以找到x,本来1~n的和是S=n*(n+1)/2,设现在的和是T,则有x=y-T+S。 代码如下: [cpp] class Solution { public: vector<int> findErrorNums(vector<int>& nums) { vector<int> ans(2, 0); unordered_set<int> hash; int sum = 0, n = nums.size(); for (const auto& num : nums) { if (hash.find(num) != hash.end())ans[0] = num; hash.insert(num); sum += num; } ans[1] = ans[0] – sum + n*(n + 1) / 2; return ans; } }; [/cpp] 本代码提交AC,用时59MS。 还有一种空间复杂度O(1)的方法,不过需要改变原有数组。 因为数组元素是1~n的,所以遍历数组,把数值-1当作下标访问数组,使该处的元素变为原来的相反数,如果遇到某个数y-1访问的那个数是负数,说明y重复了,之前有一个y把y-1处的数变成了负数。 遍历完一遍之后,丢失的那个数x指向的x-1处的数应该是正数,因为x丢掉了,没有经过上面的过程,所以再遍历一遍可以找到x。 整个过程不需要借助额外的空间,代码如下: [cpp] class Solution { public: vector<int> findErrorNums(vector<int>& nums) { vector<int> ans(2, 0); for (int i = 0; i < nums.size(); ++i) { int idx = nums[i] < 0 ? -nums[i] : nums[i]; if (nums[idx – 1] < 0) { ans[0] = idx; } else { nums[idx – 1] = -nums[idx – 1]; } } for (int i = 0; i < nums.size(); ++i) { if (nums[i] > 0) { ans[1] = i + 1; break; } } return ans; } }; [/cpp] 本代码提交AC,用时36MS。]]>

LeetCode Continuous Subarray Sum

LeetCode Continuous Subarray Sum Given a list of non-negative numbers and a target integer k, write a function to check if the array has a continuous subarray of size at least 2 that sums up to the multiple of k, that is, sums up to n*k where n is also an integer. Example 1:

Input: [23, 2, 4, 6, 7],  k=6
Output: True
Explanation: Because [2, 4] is a continuous subarray of size 2 and sums up to 6.
Example 2:
Input: [23, 2, 6, 4, 7],  k=6
Output: True
Explanation: Because [23, 2, 6, 4, 7] is an continuous subarray of size 5 and sums up to 42.
Note:
  1. The length of the array won’t exceed 10,000.
  2. You may assume the sum of all the numbers is in the range of a signed 32-bit integer.

给定一个非负整数数组和k,问数组中是否存在长度至少为2的连续子数组,子数组的和是k的n倍,n也是一个整数。 这个题之前好像在哪遇到过。遇到连续子数组的问题,首先要想到前缀和。因为这里要求和是k的n倍,有点麻烦。 首先我们需要知道一个数学知识,如果到i的前缀和accusum1和到j的前缀和accusum2除以k的余数相等,那么(i,j]的连续子数组的和就是k的整数倍。比如第一个样例中,连续子数组的和以及连续子数组的和除以k的余数:
  • [23,25,29,35,42]
  • [5,1,5,5,0]
如果下标从0开始,则下标为0和2的accusum%k都等于5,说明(0,2]的连续子数组的和是k的整数倍。accusum(0,2]=2+4=6,确实是6的整数倍。 这个道理其实很好理解,accusum0%k=5,到了accusum2%k还等于5,说明中间只加了整数倍的k,才导致余数没变,因为整数倍的k 模k是等于0的。 知道了这个道理就好办了,我们用map保存accusum%k的值和计算到现在的accusum的下标,如果后来遇到一个accusum模k的余数在map中,说明找到了一个可能,我们再判断一下他们之间的距离是否至少为2即可。 还有一个特殊的地方需要注意的是,如果k等于0,则不能取模运算。 最后还要注意的一点是,第12行,把当前余数和下标插入map是在else分支的,只有当map中不存在这个余数才插入,否则不插入。比如上面的例子,我们遇到多个accusum模k的余数是5,但是我们map中保存的应该是第0个下标,后续的余数等于5的下标不能更新0这个下标,因为这样才能使得后续的余数相等的下标和map中的下标的差越大,即更好的满足距离至少为2的约束。 完整代码如下: [cpp] class Solution { public: bool checkSubarraySum(vector<int>& nums, int k) { unordered_map<int, int> reminders; reminders[0] = -1; int accusum = 0; for (int i = 0; i < nums.size(); ++i) { accusum += nums[i]; if (k != 0)accusum %= k; // 注意k为0时,不能取模 if (reminders.find(accusum) != reminders.end()) { if (i – reminders[accusum] >= 2)return true; } else reminders[accusum] = i; // 注意必须是在else分支 } return false; } }; [/cpp] 本代码提交AC,用时26MS。]]>

LeetCode Design Twitter

LeetCode Design Twitter

Design a simplified version of Twitter where users can post tweets, follow/unfollow another user and is able to see the 10 most recent tweets in the user’s news feed. Your design should support the following methods:
  1. postTweet(userId, tweetId): Compose a new tweet.
  2. getNewsFeed(userId): Retrieve the 10 most recent tweet ids in the user’s news feed. Each item in the news feed must be posted by users who the user followed or by the user herself. Tweets must be ordered from most recent to least recent.
  3. follow(followerId, followeeId): Follower follows a followee.
  4. unfollow(followerId, followeeId): Follower unfollows a followee.
Example:
Twitter twitter = new Twitter();
// User 1 posts a new tweet (id = 5).
twitter.postTweet(1, 5);
// User 1's news feed should return a list with 1 tweet id -> [5].
twitter.getNewsFeed(1);
// User 1 follows user 2.
twitter.follow(1, 2);
// User 2 posts a new tweet (id = 6).
twitter.postTweet(2, 6);
// User 1's news feed should return a list with 2 tweet ids -> [6, 5].
// Tweet id 6 should precede tweet id 5 because it is posted after tweet id 5.
twitter.getNewsFeed(1);
// User 1 unfollows user 2.
twitter.unfollow(1, 2);
// User 1's news feed should return a list with 1 tweet id -> [5],
// since user 1 is no longer following user 2.
twitter.getNewsFeed(1);

设计一个Twitter系统,实现三个接口,postTweet发文,follow某人,unfollow某人,getNewsFeed从自己以及follow的人中取出最近发表的三篇推文,按时间从近到远排序。比较麻烦的是getNewsFeed函数的实现。 先来个暴力版本,用unordered_map<int, unordered_set<int>>保存某个人follow的所有人,这样follow和unfollow的操作都可以在O(1)实现。 用vector<Tweet>保存所有推文,而且是最新postTweet的推文都push_back到末尾,所以该容器保存了全局从远到近的所有推文。那么getNewsFeed时,直接从容器末尾往前遍历所有推文,判断该推文是否是自己或者自己follow的用户发表的,如果是,则收集起来,直到收集到10篇推文。 思路和实现都非常简单,代码如下: [cpp] class Twitter { private: struct Tweet { int userId; int tweetId; Tweet(int u, int t) :userId(u), tweetId(t) {}; }; vector<Tweet> Tweets; unordered_map<int, unordered_set<int>> following; public: /** Initialize your data structure here. */ Twitter() { } /** Compose a new tweet. */ void postTweet(int userId, int tweetId) { Tweets.push_back({ userId,tweetId }); } /** Retrieve the 10 most recent tweet ids in the user’s news feed. Each item in the news feed must be posted by users who the user followed or by the user herself. Tweets must be ordered from most recent to least recent. */ vector<int> getNewsFeed(int userId) { vector<int> ans; for (int i = Tweets.size() – 1; i >= 0; –i) { int &uid = Tweets[i].userId; if (uid == userId || following[userId].find(uid) != following[userId].end()) { ans.push_back(Tweets[i].tweetId); if (ans.size() == 10)break; } } return ans; } /** Follower follows a followee. If the operation is invalid, it should be a no-op. */ void follow(int followerId, int followeeId) { following[followerId].insert(followeeId); } /** Follower unfollows a followee. If the operation is invalid, it should be a no-op. */ void unfollow(int followerId, int followeeId) { following[followerId].erase(followeeId); } }; [/cpp] 本代码提交AC,用时143MS。 上面的版本比较慢,主要是每次getNewsFeed时都要遍历全局所有的推文。讨论区有一种做法是把每个人发表的推文也保存到一个unordered_map<int, vector<Tweet>>中,那么在getNewsFeed时,只需要拿出自己或自己follow的用户发表的推文,放到一个优先队列中,优先队列自动按时间先后顺序排序了,从优先队列中取出10条推文即可。 为了按时间排序,需要一个全局时间戳,并且重载优先队列的比较运算符。 代码如下: [cpp] class Twitter { private: int timestamp; struct Tweet { int tweetId; int timestamp; Tweet(int tid, int ts) :tweetId(tid), timestamp(ts) {}; bool operator<(const Tweet& t)const { return timestamp < t.timestamp; } }; unordered_map<int, vector<Tweet>> Tweets; unordered_map<int, unordered_set<int>> following; public: /** Initialize your data structure here. */ Twitter() :timestamp(0) { } /** Compose a new tweet. */ void postTweet(int userId, int tweetId) { Tweets[userId].push_back({ tweetId,timestamp++ }); } /** Retrieve the 10 most recent tweet ids in the user’s news feed. Each item in the news feed must be posted by users who the user followed or by the user herself. Tweets must be ordered from most recent to least recent. */ vector<int> getNewsFeed(int userId) { priority_queue<Tweet> pq; for (const auto& uid : following[userId]) { for (const auto& t : Tweets[uid]) { pq.push(t); } } for (const auto& t : Tweets[userId]) { pq.push(t); } vector<int> ans; while (!pq.empty() && ans.size() < 10) { ans.push_back(pq.top().tweetId); pq.pop(); } return ans; } /** Follower follows a followee. If the operation is invalid, it should be a no-op. */ void follow(int followerId, int followeeId) { if (followerId != followeeId)following[followerId].insert(followeeId); // map中不考虑自己follow自己,因为getNewsFeed单独考虑了 } /** Follower unfollows a followee. If the operation is invalid, it should be a no-op. */ void unfollow(int followerId, int followeeId) { following[followerId].erase(followeeId); } }; [/cpp] 本代码提交AC,用时33MS,快了好几倍!]]>

LeetCode Clone Graph

133. Clone Graph

Given a reference of a node in a connected undirected graph.

Return a deep copy (clone) of the graph.

Each node in the graph contains a val (int) and a list (List[Node]) of its neighbors.

class Node {
    public int val;
    public List<Node> neighbors;
}

Test case format:

For simplicity sake, each node’s value is the same as the node’s index (1-indexed). For example, the first node with val = 1, the second node with val = 2, and so on. The graph is represented in the test case using an adjacency list.

Adjacency list is a collection of unordered lists used to represent a finite graph. Each list describes the set of neighbors of a node in the graph.

The given node will always be the first node with val = 1. You must return the copy of the given node as a reference to the cloned graph.

Example 1:

Input: adjList = [[2,4],[1,3],[2,4],[1,3]]
Output: [[2,4],[1,3],[2,4],[1,3]]
Explanation: There are 4 nodes in the graph.
1st node (val = 1)'s neighbors are 2nd node (val = 2) and 4th node (val = 4).
2nd node (val = 2)'s neighbors are 1st node (val = 1) and 3rd node (val = 3).
3rd node (val = 3)'s neighbors are 2nd node (val = 2) and 4th node (val = 4).
4th node (val = 4)'s neighbors are 1st node (val = 1) and 3rd node (val = 3).

Example 2:

Input: adjList = [[]]
Output: [[]]
Explanation: Note that the input contains one empty list. The graph consists of only one node with val = 1 and it does not have any neighbors.

Example 3:

Input: adjList = []
Output: []
Explanation: This an empty graph, it does not have any nodes.

Example 4:

Input: adjList = [[2],[1]]
Output: [[2],[1]]

Constraints:

  • 1 <= Node.val <= 100
  • Node.val is unique for each node.
  • Number of Nodes will not exceed 100.
  • There is no repeated edges and no self-loops in the graph.
  • The Graph is connected and all nodes can be visited starting from the given node.

给定一个无向图,要求克隆一个一样的图,也就是深拷贝。 图的问题,想到应该是DFS或者BFS,但是没太想明白,因为DFS过程中可能遇到之前已经new过的节点,不知道怎么判断比较好,查看讨论区,发现每new一个节点,把这个节点存到一个hash表中就好了,下回先查hash表,不中才new新的。 因为每个节点的label是unique的,所以hash的key可以是label,value是node*。 完整代码如下,注意第9行,new出来的节点要先加入hash,再递归DFS,否则递归DFS在hash表中找不到当前的节点,比如{0,0,0}这个样例就会出错。

class Solution {
private:
    unordered_map<int, UndirectedGraphNode*> graph;
public:
    UndirectedGraphNode* cloneGraph(UndirectedGraphNode* node)
    {
        if (node == NULL)
            return NULL;
        if (graph.find(node->label) != graph.end())
            return graph[node->label];
        UndirectedGraphNode* cur = new UndirectedGraphNode(node->label);
        graph[cur->label] = cur;
        for (auto& n : node->neighbors) {
            cur->neighbors.push_back(cloneGraph(n));
        }
        return cur;
    }
};

本代码提交AC,用时26MS。

二刷。换了代码模板,方法是一样的,Hash+DFS:

class Solution {
public:
	Node* clone(Node* node, unordered_map<int, Node*> &hash) {
		int val = node->val;
		if (hash.find(val) != hash.end())return hash[val];
		Node *cp = new Node(node->val);
		hash[val] = cp;
		for (int i = 0; i < node->neighbors.size(); ++i) {
			int child = node->neighbors[i]->val;
			if (hash.find(child) != hash.end()) {
				cp->neighbors.push_back(hash[child]);
			}
			else {
				cp->neighbors.push_back(clone(node->neighbors[i], hash));
			}
		}
		return cp;
	}

	Node* cloneGraph(Node* node) {
		if (node == NULL)return NULL;
		unordered_map<int, Node*> hash;
		return clone(node, hash);
	}
};

本代码提交AC,用时12MS。

LeetCode Design Excel Sum Formula

LeetCode Design Excel Sum Formula Your task is to design the basic function of Excel and implement the function of sum formula. Specifically, you need to implement the following functions: Excel(int H, char W): This is the constructor. The inputs represents the height and width of the Excel form. H is a positive integer, range from 1 to 26. It represents the height. W is a character range from ‘A’ to ‘Z’. It represents that the width is the number of characters from ‘A’ to W. The Excel form content is represented by a height * width 2D integer array C, it should be initialized to zero. You should assume that the first row of C starts from 1, and the first column of C starts from ‘A’.   void Set(int row, char column, int val): Change the value at C(row, column) to be val.   int Get(int row, char column): Return the value at C(row, column).   int Sum(int row, char column, List of Strings : numbers): This function calculate and set the value at C(row, column), where the value should be the sum of cells represented by numbers. This function return the sum result at C(row, column). This sum formula should exist until this cell is overlapped by another value or another sum formula. numbers is a list of strings that each string represent a cell or a range of cells. If the string represent a single cell, then it has the following format : ColRow. For example, “F7” represents the cell at (7, F). If the string represent a range of cells, then it has the following format : ColRow1:ColRow2. The range will always be a rectangle, and ColRow1 represent the position of the top-left cell, and ColRow2 represents the position of the bottom-right cell.   Example 1:

Excel(3,"C");
// construct a 3*3 2D array with all zero.
//   A B C
// 1 0 0 0
// 2 0 0 0
// 3 0 0 0
Set(1, "A", 2);
// set C(1,"A") to be 2.
//   A B C
// 1 2 0 0
// 2 0 0 0
// 3 0 0 0
Sum(3, "C", ["A1", "A1:B2"]);
// set C(3,"C") to be the sum of value at C(1,"A") and the values sum of the rectangle range whose top-left cell is C(1,"A") and bottom-right cell is C(2,"B"). Return 4.
//   A B C
// 1 2 0 0
// 2 0 0 0
// 3 0 0 4
Set(2, "B", 2);
// set C(2,"B") to be 2. Note C(3, "C") should also be changed.
//   A B C
// 1 2 0 0
// 2 0 2 0
// 3 0 0 6
Note:
  1. You could assume that there won’t be any circular sum reference. For example, A1 = sum(B1) and B1 = sum(A1).
  2. The test cases are using double-quotes to represent a character.
  3. Please remember to RESET your class variables declared in class Excel, as static/class variables are persisted across multiple test cases. Please see here for more details.

本题要求设计一个简单的Excel表格求和功能。主要实现三个接口:
  • Get(int row, char column),获取坐标为(row,column)的cell的值
  • Set(int row, char column, int val),把坐标为(row,column)的值设置为val
  • Sum(int row, char column, List of Strings : numbers),numbers表示一系列求和公式,把公式计算结果填入坐标(row,column)中,并且当公式中的格子更新时,该公式计算出来的值也要更新。
本题的难点是,如果C3=A1+B2,如果更新了B2,下次get(C3)时,得到的结果也必须是用更新过的B2的值。而且还有可能A1的值也是用某个公式计算出来的。 当时比赛的时候,有一些思路,但是逻辑不是很清晰,后来参考这个题解,逻辑很清楚。 Excel包含两个成员,二维矩阵matrix表示一个excel表格,hashmap formula的key为某个格子,value为该格子对应的求和公式。如果某个格子的值是实实在在的值,不是用公式计算出来的,则不在这个hashmap中。
  • 对于get,如果坐标不在formula中,说明该格子是实实在在的值,直接返回matrix中的值。否则需要从formula中取出求和公式进行计算。
  • 对于set,直接把matrix对应坐标设置为value就好,注意的是,因为set之后就变成了实实在在的值,所以要清空formula中该格子的公式(如果有的话)。
  • 对于sum,计算字符串公式的值,把结果填入对应的格子,然后在formula中设置该格子的公式。
问题的难点是怎样对一个公式求值。说穿了其实就是不停的递归调用get函数,因为get函数如果该cell没有在formula中,则返回实实在在的值,否则递归计算formula的值。想象一下,就是不停的对一个坐标递归求值,直到不能递归时,返回matrix中的值,然后递归累加起来。想明白了其实很简单,代码注意把公共的计算部分抽象出来。 完整代码如下: [cpp] class Excel { private: vector<vector<int>> matrix; unordered_map<int, vector<string>> formula; // 把坐标hash成一个int int id(int r, char c) { return r * 27 + c – ‘A’ + 1; } void parse(string& s, int& r, char& c) { c = s[0]; r = stoi(s.substr(1)); } int get_cell(string& s) { int r; char c; parse(s, r, c); return get(r, c); } int get_range(string& s) { size_t pos = s.find(‘:’); string s1 = s.substr(0, pos), s2 = s.substr(pos + 1); int r1, r2; char c1, c2; parse(s1, r1, c1); parse(s2, r2, c2); int ans = 0; for (int i = r1; i <= r2; ++i) { for (char c = c1; c <= c2; ++c) { ans += get(i, c); } } return ans; } int get_cells(vector<string>& strs) { int ans = 0; for (auto& s : strs) { if (s.find(‘:’) == string::npos) ans += get_cell(s); else ans += get_range(s); } return ans; } public: Excel(int H, char W) { matrix.clear(); formula.clear(); for (int i = 0; i <= H; ++i) { matrix.push_back(vector<int>(W – ‘A’ + 1, 0)); } } void set(int r, char c, int v) { matrix[r][c - 'A'] = v; formula.erase(id(r, c)); // caution } int get(int r, char c) { if (formula.find(id(r, c)) == formula.end())return matrix[r][c - 'A']; return get_cells(formula[id(r, c)]); } int sum(int r, char c, vector<string> strs) { int ans = get_cells(strs); matrix[r][c - 'A'] = ans; formula[id(r, c)] = strs; return ans; } }; [/cpp] 本代码提交AC,用时6MS。]]>

LeetCode LRU Cache

146. LRU Cache

Design and implement a data structure for Least Recently Used (LRU) cache. It should support the following operations: get and put.

get(key) – Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1.
put(key, value) – Set or insert the value if the key is not already present. When the cache reached its capacity, it should invalidate the least recently used item before inserting a new item.

The cache is initialized with a positive capacity.

Follow up:
Could you do both operations in O(1) time complexity?

Example:

LRUCache cache = new LRUCache( 2 /* capacity */ );

cache.put(1, 1);
cache.put(2, 2);
cache.get(1);       // returns 1
cache.put(3, 3);    // evicts key 2
cache.get(2);       // returns -1 (not found)
cache.put(4, 4);    // evicts key 1
cache.get(1);       // returns -1 (not found)
cache.get(3);       // returns 3
cache.get(4);       // returns 4

实现一个LUR cache。 我们知道数据的访问都有时间局部性和空间局部性,所以为了加快访问速度,把经常访问的数据放到高速缓存cache中,这样先查cache,不命中再查内存或者外存。 但是cache一般都很小,当新数据需要插入cache时,如果cache满,则需要替换掉cache中旧的元素。此时替换哪个元素就有很多策略了,可以把最久没访问的替换出去,也可以把访问频率最低的替换出去,还可以把最先进入的替换出去等等,各种策略就形成了不同的缓存替换算法。 LRU的意思是Least Recently Used (LRU),即把最近最久没访问的元素替换出去,是指时间上最久没访问的,而不是频率上访问最少的哦。 题目要求实现LRU的两个接口,分别是

  1. get(key),从LRU中获取key对应的value,如果LRU中不存在key,则返回-1
  2. put(key,value),把key和对应的value插入到LRU中,如果LRU满,则要替换掉最近最久没访问的元素

要求这两个操作都在O(1)时间内完成。 我们来分析一下应该用哪些数据结构使这两个操作都在O(1)时间内完成。首先get需要在O(1)时间内判断key是否在LRU中,所以可能需要用hash来存储<key,value>对。其次,需要缓存替换时,要在O(1)时间内把最久没访问的<key,value>删掉,同时插入新的<key,value>,删除和插入能在O(1)时间内完成的就是链表了。

综合上面的分析,我们可以用unordered_map+list来实现LRU,STL的list是一个双向链表。 参考这份代码。我们使用list<int> used来保存当前在LRU中的那些key,注意是key;用unordered_map<int, pair<int, list<int>::iterator>> cache来保存真实的缓存,cache的key就是LRU中现存的key,value是一个pair,保存了key对应的value,以及这个key在链表中的迭代器(指针)。

两个操作的实现如下:

put(key,value)。每当有新<key,value>进入时,如果key已经在cache中,则只需要更新key对应的value,同时,刚访问的元素下次很可能再次访问,所以需要把这个key提到链表的表头(为了不让这个key沉到表尾)。如果key不在cache中,但cache已满,则需要把链表尾部的key对应的数据在cache中删掉,同时也需要在链表中删掉这个尾节点;如果cache不满,则把新key插入到链表表头,同时把<key,value>插入到cache中,且key在链表中的位置是list.begin()。

get(key)。如果key不在cache中,直接返回-1。如果key在cache中,直接从cache中得到对应的value,同时根据迭代器,把key提到链表表头。 根据上述算法描述,我们需要抽象一个函数,moveToHead(list<int>::iterator &it),即把迭代器it指向的那个key移到表头,包括两个操作,把it从it位置删掉,把it的值插到表头。

完整代码如下:

class LRUCache {
private:
    unordered_map<int, pair<int, list<int>::iterator> > cache;
    list<int> used;
    int _capacity; // 把it节点**剪切**到表头
    void moveToHead(list<int>::iterator& it)
    {
        int key = *it;
        used.erase(it);
        used.push_front(key);
        cache[key].second = used.begin();
    }
public:
    LRUCache(int capacity)
        : _capacity(capacity)
    {
    }
    int get(int key)
    {
        if (cache.find(key) == cache.end())
            return -1;
        int value = cache[key].first;
        moveToHead(cache[key].second);
        return value;
    }
    void put(int key, int value)
    {
        if (cache.find(key) != cache.end()) {
            cache[key].first = value;
            moveToHead(cache[key].second);
        }
        else {
            if (used.size() == _capacity) {
                cache.erase(used.back());
                used.pop_back();
            }
            used.push_front(key);
            cache[key] = { value, used.begin() };
        }
    }
};

本代码提交AC,用时96MS。

我们还可以自己实现list,就是一个双向链表。这个双向链表包括头结点head和尾节点tail。为了插入删除的方便,我们设置了一个用不到的head和tail,所以真正的数据其实是在head和tail之间的,如下,节点1~3才是真正的数据节点

头 --> 节 --> 节 --> 节 --> 尾
节     点     点     点     节
点 <-- 1  <-- 2 <-- 3  <-- 点

这里我们抽象出了3个函数,removeFromList和moveToHead完成上一种实现中的moveToHead功能,即先把节点从原链表中删除(相当于摘下来),然后把这个节点安放到表头(上图节点1)。removeTail就是在需要替换时,删除表尾节点(上图节点3)。

完整代码如下:

class LRUCache {
private:
    struct Node {
        int key, value;
        Node *pre, *next;
        Node(int k, int v)
            : key(k)
            , value(v)
            , pre(NULL)
            , next(NULL){};
    };
    unordered_map<int, Node*> cache;
    Node *head, *tail;
    int _capacity; // 从原链表中删除节点node(摘下)
    void removeFromList(Node* node)
    {
        node->pre->next = node->next;
        node->next->pre = node->pre;
    }
    // 把节点node放到“表头”(安放)
    void moveToHead(Node* node)
    {
        node->next = head->next;
        head->next->pre = node;
        node->pre = head;
        head->next = node;
    }
    // 删除“尾节点”
    void removeTail()
    {
        Node* target = tail->pre;
        target->pre->next = target->next;
        target->next->pre = target->pre;
        delete target;
    }

public:
    LRUCache(int capacity)
        : _capacity(capacity)
    {
        head = new Node(0, 0); // 辅助头结点
        tail = new Node(0, 0); // 辅助尾节点
        head->next = tail;
        tail->pre = head;
    }
    int get(int key)
    {
        if (cache.find(key) == cache.end())
            return -1;
        int value = cache[key]->value;
        removeFromList(cache[key]);
        moveToHead(cache[key]);
        return value;
    }
    void put(int key, int value)
    {
        if (cache.find(key) != cache.end()) {
            cache[key]->value = value;
            removeFromList(cache[key]);
            moveToHead(cache[key]);
        }
        else {
            if (cache.size() == _capacity) {
                cache.erase(tail->pre->key);
                removeTail();
            }
            Node* node = new Node(key, value);
            cache[key] = node;
            moveToHead(node);
        }
    }
};

本代码提交AC,用时98MS。

LeetCode Lowest Common Ancestor of a Binary Tree

LeetCode Lowest Common Ancestor of a Binary Tree Given a binary tree, find the lowest common ancestor (LCA) of two given nodes in the tree. According to the definition of LCA on Wikipedia: “The lowest common ancestor is defined between two nodes v and w as the lowest node in T that has both v and w as descendants (where we allow a node to be a descendant of itself).”

        _______3______
       /              \
    ___5__          ___1__
   /      \        /      \
   6      _2       0       8
         /  \
         7   4
For example, the lowest common ancestor (LCA) of nodes 5 and 1 is 3. Another example is LCA of nodes 5 and 4 is 5, since a node can be a descendant of itself according to the LCA definition.
给定一棵二叉树和两个节点p,q,问p和q的最近公共祖先节点是哪个。LCA问题。 思路:要找p和q的最近公共祖先,很直观的方法是找到从根节点分别到p和q的路径,然后把其中一条路径(比如根到p)上的点hash成path1,从另一条路径(比如根到q)的底端(q)往根节点走,把走过的每个点去path1中找,如果找到,则就是他们的LCA,因为这是距离q最近、且在path1中的点。 但是怎样找到根到p和q的路径呢。这里换一种思路,定义如下新的数据结构,par表示该节点的父节点,根节点的父节点为-1。 [cpp] struct MyNode { TreeNode* node; int par; MyNode(TreeNode* n_, int p_) :node(n_), par(p_) {}; }; [/cpp] 先整体把树层次遍历一遍,成为一个保存MyNode的数组,记录每个点的父节点在数组中的下标。在遍历的同时,记录下p和q节点的下标。这样通过下标往前递归就可以找到p和q到根节点的路径了。 找到路径之后,对其中一条路径hash,另一条路径在该hash中找。代码如下: [cpp] class Solution { private: struct MyNode { TreeNode* node; int par; MyNode(TreeNode* n_, int p_) :node(n_), par(p_) {}; }; public: TreeNode* lowestCommonAncestor(TreeNode* root, TreeNode* p, TreeNode* q) { vector<MyNode> nodes; int i = 0, j = 0; queue<MyNode> tree; tree.push({ root,-1 }); while (!tree.empty()) { MyNode cur = tree.front(); tree.pop(); if (cur.node == NULL) continue; if (cur.node == p)i = nodes.size(); if (cur.node == q)j = nodes.size(); tree.push({ cur.node->left,nodes.size() }); tree.push({ cur.node->right,nodes.size() }); nodes.push_back(cur); } if (i == j)return nodes[i].node; unordered_set<int> path1; while (i != -1) { path1.insert(i); i = nodes[i].par; } while (j != -1) { if (path1.find(j) != path1.end()) { return nodes[j].node; } j = nodes[j].par; } return NULL; } }; [/cpp] 本代码提交AC,用时19MS。 讨论区还有一种递归的解法,很有意思。我们在以root为根的树中找p和q的LCA,如果root等于p或q其中之一,说明p或q就是他们的LCA。否则分别在左子树和右子树中递归的找LCA,如果发现在左右子树中找到的LCA都不为null,说明他们p和q分别位于root的两个子树,类似样例中的5和1,则root就是他们的LCA。如果某个子树返回的LCA为空,说明p和q都不在这个子树中,则先遇到谁,谁就是LCA,比如样例中的5和4,都不在3的右子树,在递归3的左子树的时候,先遇到了5,所以5是LCA。 完整代码如下: [cpp] class Solution { public: TreeNode* lowestCommonAncestor(TreeNode* root, TreeNode* p, TreeNode* q) { if (root == NULL || root == p || root == q)return root; TreeNode* left = lowestCommonAncestor(root->left, p, q); TreeNode* right = lowestCommonAncestor(root->right, p, q); if (left != NULL&&right != NULL)return root; return left != NULL ? left : right; } }; [/cpp] 本代码提交AC,用时23MS。]]>

LeetCode Longest Uncommon Subsequence II

LeetCode Longest Uncommon Subsequence II Given a list of strings, you need to find the longest uncommon subsequence among them. The longest uncommon subsequence is defined as the longest subsequence of one of these strings and this subsequence should not be any subsequence of the other strings. A subsequence is a sequence that can be derived from one sequence by deleting some characters without changing the order of the remaining elements. Trivially, any string is a subsequence of itself and an empty string is a subsequence of any string. The input will be a list of strings, and the output needs to be the length of the longest uncommon subsequence. If the longest uncommon subsequence doesn’t exist, return -1. Example 1:

Input: "aba", "cdc", "eae"
Output: 3
Note:
  1. All the given strings’ lengths will not exceed 10.
  2. The length of the given list will be in the range of [2, 50].

给定一个字符串数组,问这些字符串的最长非公共子序列的长度是多少。非公共子序列是指任意一个字符串的子序列,不是其他所有字符串的子序列。要求这样的子序列的最大长度。 首先非公共子序列不可能来自那些出现过多次的字符串,比如数组中有两个字符串都是”abcd”,那么其中一个”abcd”的任意一个子序列,必定是另一个”abcd”的子序列,不满足非公共子序列的定义。所以我们首先对所有字符串及其出现频率hash一下,只考察那些出现一次的字符串的子序列。 另一个观察是,如果存在非公共子序列,则最长的非公共子序列肯定是某个完整的字符串,这一点从LeetCode Longest Uncommon Subsequence I可知。 所以,我们首先把字符串按长度从大到小排序,排序完之后,遍历每个字符串s,如果s出现的频率大于1,直接不考虑;否则,采用贪心策略,我们在比s更长的那些字符串中判断s是否是他们的子序列,如果是,s不是非公共子序列;否则s是非公共子序列,因为后面都是比s更短的字符串,s肯定也不是他们的子序列,又我们根据字符串长度排序了。所以s的长度就是非公共子序列的最大长度。 完整代码如下: [cpp] class Solution { private: // 判断child是否是par的子序列 bool isSubSeq(const string& par, const string& child) { int i = 0, j = 0, m = par.size(), n = child.size(); while (i < m&&j < n) { if (par[i] == child[j])++j; ++i; } return j == n; } public: int findLUSlength(vector<string>& strs) { unordered_map<string, int> hash; for (const auto& s : strs)++hash[s]; sort(strs.begin(), strs.end(), [](const string& s1, const string& s2) {return s1.size() > s2.size(); }); for (int i = 0; i < strs.size(); ++i) { if (hash[strs[i]] > 1)continue; bool isSub = false; for (int j = i – 1; j >= 0; –j) { if (isSubSeq(strs[j], strs[i])) { isSub = true; break; } } if (!isSub)return strs[i].size(); } return -1; } }; [/cpp] 本代码提交AC,用时3MS。 参考:http://bookshadow.com/weblog/2017/04/03/leetcode-longest-uncommon-subsequence-ii/]]>