I used to follow a blog Crypto Crats....and once I planned to write about a Primality testing by AKS for the blog. It was supposed in two parts, first one being very simple and describe the algorithm. In second one, I planned to describe the imporant steps in the proof of correctness of the algorithm. However, Royal Sujit never wrote the second part. So thought, let me publish the first part here. If at all I write second part too, I will request the managers of the Crypto Crats to post my articles. I know this being a technical blog, many of you won't be interested. However, if at all it interests you, you can definately give me comments and tell me how this article could have been better for a layman to understand. Thanks.
Primality Testing
What is a prime number? A number for which the only divisors are 1 and itself is called a prime number. e.g: 5 is a prime number. How? 2, 3, 4 do not divide 5. Consider, 9, it is not a prime number. 3 divides 9. Is 101 a prime number? One can check whether any number from 2 to 100 divides 101? If yes, then 101 is not prime else by definition it is. It is tedious to check all numbers. One important observation we can make here, if no number less than or equal to 10 divides 101, then no number will ever divide 101? Why? 10*10 = 100 and if p divides 101 then 101/p also divides 101. So at least on of these has to less than square root of 101. So the job is simple. Now suppose the given number to check is very big like in thousands, then checking this manually will be laborious. However, we can write a small computer program and leave it to computer to check whether given number is prime or not. This problem is called primality testing. The problem of primality testing is:
Given n to determine whether n is prime or not.
What if somebody asks is n prime, where n may hundred digit number. Then this naive primality testing would take millions of years to decide whether the n is a prime or no. Do we really need to deal with such big numbers? Yes!!! RSA, the celebrated public key algorithm in cryptography starts with two primes. Many cryptography systems require large prime numbers. How does one get these numbers? The answer is simple, generate a random number and check whether it is a prime or not. But as pointed out above the naive algorithm requires time which is proportional to square root of n. We will say it has running time O (√ n). To represent n in computer we will need log n bits. Now √ n is same as 2(log n)/2. That is running time of the algorithm is exponential in size of the input.
In cryptography we will be dealing with 1024 or 2048 bit numbers. So there is need to have to faster algorithms. Number theory is of vital importance here. So in this article, I will talk about primality testing.
Before starting with primality testing of given integer n, some basics:
We say a ≡ b mod (n) when n divides a - b.
(Note: mod operation is defined for all integers. This is not restricted only to primes.)
e.g. 36 ≡ 24 mod 12. Now suppose n is prime number.
Then, Fermat's little theorem states that an-1 ≡ 1 mod n, if a is not a multiple of n.
Now suppose we could assume that the reverse is true. That is, if an-1 ≡ 1 mod n for any a not multiple of n implies n is prime. (Here, we are checking whether n is prime and not a). Then it would have been a very simple job to decide the primality of the given integer, n. Randomly choose a between 2 and n. And check whether an-1 ≡ 1 mod n or no. This can be done in polynomial time in log n.
The bad news is that, this is not true. But variant of this has been used by Miller and Rabin for the primality testing and Miller-Rabin test runs in polynomial time. It randomly generates number, a, to check whether n is prime or no. When a test with this a returns n is prime, then a is said to be a witness of n to be prime. If n is composite and randomly chosen a still says n is prime, a is said to be a liar. It can be shown that for any composite number at least half the a are not going to be liars. When n is prime all a < n are going to be witness for n to be prime. Thus when this test says n is composite then n is definitely composite as for any prime all a are going to be witness, no a can return n to be composite. However, when test says n is prime, there is some probability that n is composite due to presence of liars. Though this seems a problem, it is not a big problem. The probability of error is < ½. At least ½ elements are not liars, they will declare n to be composite. So repeating test k times we can reduce the probability of error < (½)k.
Still this is non-zero. Can we have deterministic algorithm? In 2002, three Indian scientists from IIT proved with an explicit algorithm that given a number we can check whether it is a prime or no in time which is polynomial in size of the number (i.e. log n) and not number (n). The previous algorithms like Rabin-Miller test are also polynomial in log n. The problem with those is, they are not deterministic. If an algorithm gives answer that n is prime, then there is some probability that n is not prime. These IITians have developed an algorithm which is polynomial in log n and when it says n is prime then n is prime and when it says n is composite then n is composite. In short this is deterministic polynomial time in input size. The authors of this paper are: Manindra Agrawal, Neeraj Kayal, and Nitin Saxena. All three received Gödel Prize and Fulkerson prize for their contributions in 2006. This algorithm is known as AKS primality testing.
Again one fact from number theory:
If n is prime,
(X + a)n ≡ Xn + a mod n for any a =1, 2, … , n-1. (1)
That is, consider polynomial (X + a)n and take coefficients modulo n.
The above statement states, we should get Xn + a.
This is true only if n is prime which means, if n is composite, then there exists a a for which (1) is not true. However this will lead to exponential running time as (X + a)n has n coefficients, and we have to keep track of all of them.
The remedy is, we will take mod with polynomial as well.
That is, when we say p(X) mod (Xr - 1), we mean Xr ≡ 1, Xr+1 ≡ X and so on. e.g Xr+1 + X+1 ≡ 2X+ 1 mod (Xr - 1). If we use mod (Xr - 1) operation as well, we have to keep track of only r coefficients. If r is of size polynomial in log n, then running time will also be polynomial in log n. Beauty is the algorithm works if r is suitable prime.
Thus we will be checking, (X + a)n mod (n, Xr - 1). If this does not happen to Xn + a mod (n, Xr - 1) then n is composite.
The AKS primality testing is as follows:
Input: n >1
Output: n is prime or composite.
Algorithm:
1. If n = qs for some integers q and s, then output composite.
2. Find suitable prime r which is roughly polynomial in log n.
(I will explain about this r in analysis of this algorithm)
3. If 1 < gcd (a, n) < n for some a ≤ r, output composite.
4. For a = 1 to √ r * log n, check
whether (X + a)n = Xn + a mod (n, Xr - 1).
If not then output composite.
5. Output prime.
So the algorithm is very simple and runs in polynomial time since all the steps can be performed in O ( polynomial (log n)) time and there are at most polynomial log n steps.
Though this is deterministic polynomial time algorithm, still running time is too high for practical usage. Miller-Rabin test runs in O (log n3) time. Nevertheless this is an important milestone in primality testing, a truly wonderful elegant solution for primality testing. Mathematicians were surprised how they overlooked this for such a long time. This has lead to mathematicians to search for any other important simple facts which have been overlooked and can be used to solve many important problems like factoring.
The algorithm is deterministic. We will see why this algorithm works and what I mean by a suitable prime r in the next part of this article. Though the analysis is simple, few basic things from abstract algebra will be assumed:
Group, Subgroup, Order of a group, Lagrange Theorem, Field. I will cite suitable references as well in next part.
To conclude this part:
We have seen AKS primality test. This takes O (log n12) time.
Hendrik Lenstra and Carl Pomerance showed that this algorithm can be improved to O (log n6).
---
Regards,
Royal Sujit