LeetCode Insert Delete GetRandom O(1) – Duplicates allowed Design a data structure that supports all following operations in average O(1) time. Note: Duplicate elements are allowed.
insert(val)
: Inserts an item val to the collection.remove(val)
: Removes an item val from the collection if present.getRandom
: Returns a random element from current collection of elements. The probability of each element being returned is linearly related to the number of same value the collection contains.
// Init an empty collection. RandomizedCollection collection = new RandomizedCollection(); // Inserts 1 to the collection. Returns true as the collection did not contain 1. collection.insert(1); // Inserts another 1 to the collection. Returns false as the collection contained 1. Collection now contains [1,1]. collection.insert(1); // Inserts 2 to the collection, returns true. Collection now contains [1,1,2]. collection.insert(2); // getRandom should return 1 with the probability 2/3, and returns 2 with the probability 1/3. collection.getRandom(); // Removes 1 from the collection, returns true. Collection now contains [1,2]. collection.remove(1); // getRandom should return 1 and 2 both equally likely. collection.getRandom();
这一题在LeetCode Insert Delete GetRandom O(1)的基础上,增加了有重复元素的约束。还是用数组+Hash实现,但是Hash表中不再只是存储单个元素的下标了,而是存储所有相同元素的下标的集合,采用最大堆(priority_queue)来存储相同元素的下标集合。 插入时,数值插入到数组末尾,下标插入到hash表中该数值对应的最大堆中,如果堆原来为空的话,说明这是第一次插入该数值。删除时,取出val在数组中的最大下标(priority_queue.top()即为堆中最大值),如果不是数组末尾,则和数组末尾的数值交换,同时更新原数组末尾数值的下标最大堆,最后删掉数组末尾的数。产生随机数同样好办,直接rand下标。 完整代码如下: [cpp] class RandomizedCollection { private: unordered_map<int, priority_queue<int>> hash; vector<int> nums; public: /** Initialize your data structure here. */ RandomizedCollection() { } /** Inserts a value to the collection. Returns true if the collection did not already contain the specified element. */ bool insert(int val) { hash[val].push(nums.size()); nums.push_back(val); return hash[val].size() == 1; } /** Removes a value from the collection. Returns true if the collection contained the specified element. */ bool remove(int val) { if (hash[val].empty())return false; int pos = hash[val].top(); hash[val].pop(); if (pos != nums.size() – 1) { nums[pos] = nums[nums.size() – 1]; hash[nums[pos]].pop(); hash[nums[pos]].push(pos); } nums.pop_back(); return true; } /** Get a random element from the collection. */ int getRandom() { return nums[rand() % nums.size()]; } }; [/cpp] 本代码提交AC,用时85MS。 但是为什么要用最大堆来存储相同数值的下标集合呢,而且最大堆的插入和删除效率不是O(1)的呀,难道是因为相同的数比较少?所以每个数值的下标堆很小,所以插入和删除的复杂度近似为O(1)? 那么能否用数组来存储相同数值的下标集合呢,即用unordered_map<int, vector> hash,答案是不可以。举例如下: 假设经过了6次插入之后,数组为[10,10,20,20,30,30],此时不同数值的下标集合为:
- 10:[0,1]
- 20:[2,3]
- 30:[4,5]
- 10:[0]
- 20:[2,3]
- 30:[4,1]
- 10:[]
- 20:[2,3]
- 30:[4,0]