LeetCode Count Primes

Description:

Count the number of prime numbers less than a non-negative number, n.

Credits:
Special thanks to @mithmatt for adding this problem and creating all test cases.

Hint:

  1. Let's start with an isPrime function. To determine whether a number n is prime, we need to check that it is not divisible by any number from 2 up to n - 1. The runtime complexity of isPrime would be O(n), and hence counting the primes up to n would take O(n^2). Could we do better?
  2. Since n cannot be divisible by any number greater than n / 2 (other than n itself), we can immediately cut the total iterations in half by dividing only up to n / 2. Could we still do better?
  3. Let's write down all of 12's factors:
    2 × 6 = 12
    3 × 4 = 12
    4 × 3 = 12
    6 × 2 = 12
    

    As you can see, the calculations of 4 × 3 and 6 × 2 are not necessary. Therefore, we only need to consider factors up to √n because, if n is divisible by some number p, then n = p × q, and since p ≤ q, we can derive that p ≤ √n.

    Our total runtime has now improved to O(n^1.5), which is slightly better. Is there a faster approach?

    public int countPrimes(int n) {
       int count = 0;
       for (int i = 1; i < n; i++) {
          if (isPrime(i)) count++;
       }
       return count;
    }
    
    private boolean isPrime(int num) {
       if (num <= 1) return false;
       // Loop's ending condition is i * i <= num instead of i <= sqrt(num)
       // to avoid repeatedly calling an expensive function sqrt().
       for (int i = 2; i * i <= num; i++) {
          if (num % i == 0) return false;
       }
       return true;
    }
    
  4. The Sieve of Eratosthenes is one of the most efficient ways to find all prime numbers up to n. But don't let that name scare you, I promise that the concept is surprisingly simple.
    [Animation: Sieve of Eratosthenes, algorithm steps for primes below 121. "Sieve of Eratosthenes Animation" by SKopp is licensed under CC BY 2.0.]

    We start off with a table of n numbers. Let's look at the first number, 2. We know all multiples of 2 must not be primes, so we mark them off as non-primes. Then we look at the next number, 3. Similarly, all multiples of 3 such as 3 × 2 = 6, 3 × 3 = 9, ... must not be primes, so we mark them off as well. Now we look at the next number, 4, which was already marked off. What does this tell you? Should you mark off all multiples of 4 as well?

  5. 4 is not a prime because it is divisible by 2, which means all multiples of 4 must also be divisible by 2 and were already marked off. So we can skip 4 immediately and go to the next number, 5. Now, all multiples of 5 such as 5 × 2 = 10, 5 × 3 = 15, 5 × 4 = 20, 5 × 5 = 25, ... can be marked off. There is a slight optimization here: we do not need to start from 5 × 2 = 10. Where should we start marking off?
  6. In fact, we can mark off multiples of 5 starting at 5 × 5 = 25, because 5 × 2 = 10 was already marked off as a multiple of 2, and similarly 5 × 3 = 15 was already marked off as a multiple of 3. Therefore, if the current number is p, we can always mark off multiples of p starting at p^2, then in increments of p: p^2 + p, p^2 + 2p, ... Now what should the terminating loop condition be?
  7. It is easy to say that the terminating loop condition is p < n, which is certainly correct but not efficient. Do you still remember Hint #3?
  8. Yes, the terminating loop condition can be p < √n, as all non-primes ≥ √n must have already been marked off. When the loop terminates, all the numbers in the table that are not marked are prime. The Sieve of Eratosthenes uses an extra O(n) memory and its runtime complexity is O(n log log n). For the more mathematically inclined readers, you can read more about its algorithm complexity on Wikipedia.
    public int countPrimes(int n) {
       boolean[] isPrime = new boolean[n];
       for (int i = 2; i < n; i++) {
          isPrime[i] = true;
       }
       // Loop's ending condition is i * i < n instead of i < sqrt(n)
       // to avoid repeatedly calling an expensive function sqrt().
       for (int i = 2; i * i < n; i++) {
          if (!isPrime[i]) continue;
          for (int j = i * i; j < n; j += i) {
             isPrime[j] = false;
          }
       }
       int count = 0;
       for (int i = 2; i < n; i++) {
          if (isPrime[i]) count++;
       }
       return count;
    }
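
Hints 5 and 6 can be checked with a small experiment. The sketch below is only an illustration, not part of the original hints: `firstMarker` is a hypothetical helper that deliberately starts each pass at 2 × p and records which prime reaches every number first.

```cpp
#include <vector>

// Hypothetical helper (not from the hints): for every j < n, record the first
// prime whose pass crosses j off. 0 means "never crossed off", i.e. j is
// prime or j < 2.
std::vector<int> firstMarker(int n) {
    std::vector<int> marker(n > 0 ? n : 0, 0);
    for (int p = 2; p < n; ++p) {
        if (marker[p] != 0) continue;  // p was crossed off earlier, so it is composite
        // Deliberately start at 2 * p (not p * p) to see who gets there first.
        for (long long j = 2LL * p; j < n; j += p) {
            if (marker[j] == 0) marker[j] = p;
        }
    }
    return marker;
}
```

For n = 30 the table shows 10 and 20 crossed off by 2, 15 by 3, and 25 as the first number that the pass for 5 actually contributes, which is why starting at p × p loses nothing.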

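This problem asks how many primes there are among the numbers below n. The hints already walk through the solution step by step in great detail. To summarize briefly, there are two approaches: the conventional one tests each number from 1 to n for primality one by one, and the other is a rather clever hash-table approach.

The conventional approach. First, consider how to test whether a single number n is prime. If n is composite, it can be factored as n = p × q, and since n = √n × √n, taking p to be the smaller factor gives p ≤ √n. So we only need to check whether any number from 2 to √n divides n, which takes O(n^0.5) time. Every number from 1 to n has to be tested this way, so the total time complexity is O(n^1.5).

The complete code is as follows: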

class Solution {
private:
	bool isPrime(int x) {
		if (x <= 1)return false;
		for (int i = 2; i*i <= x; ++i) {
			if (x%i == 0)return false;
		}
		return true;
	}
public:
	int countPrimes(int n) {
		int ans = 0;
		for (int i = 1; i < n; ++i) {
			if (isPrime(i))++ans;
		}
		return ans;
	}
};

This code gets TLE (Time Limit Exceeded) on submission: it cannot pass the tests when n is very large.
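
Separately from the running time, the loop condition `i*i <= x` deserves care for very large inputs: where `int` is 32 bits, the product overflows once x gets close to the maximum `int` value (for example, 46341 × 46341 no longer fits). A minimal sketch of one way around this, assuming we are free to rewrite the condition as a division; the name `isPrimeSafe` is hypothetical and not from the original code:

```cpp
// Hypothetical variant of isPrime. For positive i, the test i <= x / i is
// equivalent to i * i <= x, but it never forms the product, so it cannot
// overflow.
bool isPrimeSafe(int x) {
    if (x <= 1) return false;
    for (int i = 2; i <= x / i; ++i) {
        if (x % i == 0) return false;
    }
    return true;
}
```

This does not change the O(n^0.5) cost per test, so the overall approach would still be too slow for large n.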

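Later I read the hints, and the solution there is quite clever. If a number i is prime, then the multiples of i are certainly not prime. For example, 2 is prime, so 4=2*2, 6=2*3, 8=2*4, ... are definitely not prime. So we can build a hash table over 1~n that records whether each number is prime. Initially Hash[1~n]=true. Then, starting from 2: if Hash[i] is true, i is prime, so 2*i, 3*i, ... are not prime, i.e. Hash[2*i], Hash[3*i], ... = false. If Hash[i] is false, i is not prime, so i can be factored as i=p*q where p (or q) is prime. When p was processed earlier, all multiples of p were already set to false, and every multiple of i is also a multiple of p, so there is no need to set the multiples of i to false again.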
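
Before any optimization, the procedure in this paragraph can be sketched roughly as follows. This is only an illustration under my own naming (`countPrimesBasic` and `table` are hypothetical): it marks from 2*i and runs the outer loop all the way to n.

```cpp
#include <vector>

// Unoptimized sieve: every prime i marks 2*i, 3*i, ... as composite.
int countPrimesBasic(int n) {
    if (n < 3) return 0;                 // there are no primes below 2
    std::vector<bool> table(n, true);    // table[i] == true: i is still considered prime
    int count = 0;
    for (int i = 2; i < n; ++i) {
        if (!table[i]) continue;         // composite: a prime factor already handled its multiples
        ++count;
        for (long long j = 2LL * i; j < n; j += i) {
            table[j] = false;
        }
    }
    return count;
}
```

For example, `countPrimesBasic(10)` counts 2, 3, 5, and 7.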

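The loop terminates once i>√n. The analysis is the same as for the first approach: if some composite x below n is greater than √n, it must have a prime factor i no larger than √x, which is less than √n, so x was already set to false when i was processed.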

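In addition, for a prime i, the analysis above sets all of 2*i, 3*i, ... to false. This can be optimized too: we can start setting entries to false from i*i instead of starting from 2*i every time. For example, with i=5 the plain version sets 2*5, 3*5, 4*5, 5*5, ... to false, but 2*5, 3*5, and 4*5 were already set to false by the earlier primes 2 or 3. So each time we only need to start from i*i and set entries to false at indices increasing in steps of i. (The composites between i*i and i*i+(i-1) have already been set to false by primes smaller than i.)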
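
To see how much the i*i starting point saves, the two variants can be run side by side while counting writes. A rough sketch, where `sieveStats` and its `startAtSquare` flag are hypothetical names introduced only for this comparison:

```cpp
#include <utility>
#include <vector>

// Runs the sieve for numbers below n and returns
// {number of primes found, number of "set to false" writes performed}.
// startAtSquare selects whether marking begins at i * i or at 2 * i.
std::pair<int, long long> sieveStats(int n, bool startAtSquare) {
    if (n < 3) return {0, 0};
    std::vector<bool> mark(n, true);
    long long writes = 0;
    for (long long i = 2; i * i < n; ++i) {
        if (!mark[i]) continue;
        for (long long j = startAtSquare ? i * i : 2 * i; j < n; j += i) {
            mark[j] = false;
            ++writes;
        }
    }
    int primes = 0;
    for (int i = 2; i < n; ++i) primes += mark[i] ? 1 : 0;
    return {primes, writes};
}
```

For n = 100 both variants find 25 primes, with 102 writes when starting at i*i versus 111 when starting at 2*i. The gap is small here because the outer loop already stops at √n, but the result is identical either way.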

All in all, there are quite a few speed-up tricks in this problem. The complete code is as follows:

#include <vector>
using namespace std;

class Solution {
public:
	int countPrimes(int n) {
		vector<bool> mark(n, true);
		for (int i = 2; i*i < n; ++i) {
			if (!mark[i])continue;
			for (int j = i*i; j < n; j += i) {
				mark[j] = false;
			}
		}
		int ans = 0;
		for (int i = 2; i < n; ++i)ans += (mark[i] ? 1 : 0);
		return ans;
	}
};

This code is accepted (AC) on submission, with a running time of 29 ms.
