In my last post I discussed the RSA problem. Involved with the RSA problem is the fact that being able to easily factor the product of two large primes would allow one to easily derive the RSA private key from the corresponding public key, rendering RSA useless.

So how would one go about factoring *m*, the product of two large primes, *p* and *q*?

The most naive approach is called trial division. It begins by dividing *m* by first 2, then 3, etc., until you find a number that has no remainder when *m* is divided by it. If *m* is not divided evenly by 2 then you can skip all multiples of 2, so in fact you can skip all composite numbers. This method will eventually find the smaller factor which leads directly to the larger one no later than when you try the square root of *n*. This is far from the most efficient method.

For certain subsets of integer factoring problems, Fermat’s method can be vastly more efficient than trial division. The closer *p* is to *q*, the more efficient Fermat’s method is. Fermat’s method was first described by Peirre de Fermat in the 17th century.

Fermat’s method is based on the fact that all positive odd integers can be represented as the difference of two squares. This is easy to show using simple algebra. If we say that

*m* = *pq*

and we make the qualification that *m* is odd, we can perform some substitutions that will allow us to show that *m* is the difference of two squares. If we arbitrarily say that *p* is larger or equal to *q* then we can define

*u* = (*p* + *q*) / 2

and

*v* = (*p* – *q*) / 2

then we can say that

*m* = *u*^{2} – *v*^{2}

which can easily be shown to be true as

*u*^{2} – *v*^{2} = ((*p* + *q*) / 2)^{2} – ((*p* – *q*) / 2)^{2}

= ((*p*^{2} + 2*pq* + *q*^{2}) / 4) – ((*p*^{2} – 2*pq* + *q*^{2}) / 4) = 4*pq* / 4 = *pq* = *m*

But saying that *m* is the difference of *u*^{2} and *v*^{2} is not necessarily the same as saying that *m* is the difference of two squares, since *u* and *v* may not be integers. We did, after all, define them as fractions. However, if *u* and *v* are integers then *m* *is* the difference of two squares. It is easy to show that *u* and *v* are integers.

We started with the qualification that *m* is odd. If that’s the case then both *p* and *q* are odd since if either *p* or *q* were even then 2 would be a factor of one of them which would mean that 2 would be a factor of *m*. Since both *p* and *q* are odd then both their sum and difference must be even because and odd number plus an odd number is an even number and an odd number minus an odd number is an even number. Therefore, both their sum and difference are evenly divided by 2 which means that both *u* and *v* are integers. *m* is therefore the difference of two squares.

So what is Fermat’s method? How does it work? It’s pretty simple, in fact. It involves searching through the possible values of *u* and *v*. *u* must be greater than or equal to √*m*. *u* is equal to √*m* when *v* is 0 and *m* is a square. It is always true, even when *m* is prime, that if *u* is (*m* + 1) / 2 and *v* is (*m* – 1) / 2, which corresponds to the factors of *m* which are *m* and 1.

((*m* + 1) / 2)^{2} – ((*m* – 1) / 2)^{2} = ((*m*^{2} + 2*m* + 1) / 4) – ((*m*^{2} – 2*m* + 1) / 4) = 4*m* / 4 = m

Fermat’s method begins by first choosing a value, we’ll call *u’* with initial value √*m*. If *m* – *u’*^{2} is also a square then *u* = *u’* and *v* = *m* – *u’*^{2}. Otherwise, we increment *u’* and test it again. Eventually this method will find *u* in *u* – √*m* iterations. Having found *u* and *v*, we can easily derive *p* and *q* as

*p* = *u* + *v*

and

*q* = *u* – *v*

Why is this an effective method? It isn’t always. If *p* and *q* are far apart then Fermat’s method may be much less effective than trial division. In the worst case, when *m* is prime and *p* = *m* and *q* = 1, the number of iterations required is ((*m* + 1) / 2) – √*m*. However, when *p* and *q* are close together, the number of iterations can be very small. Let’s look at an example. Let’s say that *p* is 2153, *q* is 1933, and *m* is 4161749.

*u* = (2153 + 1933) / 2 = 2043

ceil(√*m*) = 2041

In this example, we will find that *u* is 2043 in only 3 iterations. We’ve just used Fermat’s method to factor a 22-bit semiprime in only 3 steps!

Let’s make sure everything adds up.

*v* = 2043^{2} – *m* = 12100 = 110^{2}

*p* = 2043 + 110 = 2153

*q* = 2043 – 110 = 1933

What about when *p* and *q* are further apart? Let’s try a different example. This time we’ll say that *p* is 1373347, *q* is 3, and *m* is 4120041. Notice that in this example, *m* has very similar magnitude to the previous example.

*u* = (1373347 + 3) / 2 = 686675

ceil(√*m*) = 2030

Look what a difference there is in this example. This time using Fermat’s method we would have go through 684646 iterations to find *u*! That’s much worse than trial division, which would require at most 2029 iterations even without skipping composite numbers.

Clearly Fermat’s method is not useful for all factoring problems. Also, clearly it’s rather good for a subset of them. But why?

This graph illustrates the relationship between *u* and *v* for the function *u* = f(*v*) = √(m – *v*^{2}). The slope of the curve is 1 at *v* = 0 and as *v* increases, the curve is asymptotic to the line *u* = *v*.

Fermat’s method is effective at factoring numbers with factors that are close together because in that case, v is relatively small. Because the slope of the curve of *u* and *v* is close to 1 for small values of *v*, a small change in *u* translates to a large change in *v*. Incrementing *u* by 1 when *u* is close to √*m* means skipping many possible values for *v*.

So this is great, we can use Fermat’s method to solve real factoring problems. These days the recommendation is to use a public key with 2048 bits, which means we would need to factor a 2048 bit number to obtain the private key from the public key. We’re going to examine a much simpler problem. Let’s see how well Fermat’s method does on the RSA-100 challenge, which is a challenge to factor a 100 decimal digit number – a number with only 330 bits. In this example

*p* = 40094690950920881030683735292761468389214899724061

*q* = 37975227936943673922808872755445627854565536638199

*m* = 1522605027922533360535618378132637429718068114961380688657908494580122963258952897654000350692006139

*p* and *q* are very close together right? This should be no problem for Fermat’s method. Let’s see how many iterations we would have to use.

*u* = (40094690950920881030683735292761468389214899724061 + 37975227936943673922808872755445627854565536638199) / 2

= 39034959443932277476746304024103548121890218181130

ceil(√*m*) = 39020571855401265512289573339484371018905006900195

So to factor Fermat’s method requires 14387588531011964456730684619177102985211280935 iterations to find *u*. That’s much better than trial division. In fact, it’s approximately 2712 times better. But it still takes 1.44e+45 iterations! That’s a lot of iterations. If you could check 1,000,000 values of *u* per second per CPU core, that would still take 1.44e+39 seconds. That’s 4.56e+35 CPU-years. Remember, this was only a 330-bit number. Clearly Fermat’s method is not very useful to factor 2048-bit RSA keys.

There are actually much better methods of factoring large semiprimes than Fermat’s method. The two currently fastest known methods are called the Quadratic seive and the General number field seive. As it happens, both of these methods are rooted in Fermat’s method. Look for a future post to read a hopefully simple explanation about how they work.