Mathematics, Number Theory

Improving on Fermat’s factoring method

In my last post I described Fermat’s factoring method. Fermat’s method is very simple but it’s not the most effective. However, I’ve discovered a significant improvement. It’s not going to beat the Quadratic sieve or General number field sieve, but it’s a substantial improvement none-the-less.

As far as I know, the improvement I describe here is an original discovery. I’ve done a bit of research looking at other improvements to Fermat’s method to see if I could find this method described elsewhere, but with no success. That being said, this method is rather simple so I’d be surprised if I was really the first person to find it.

The simplicity of this method is illustrated by the fact that its proof relies only on the basic rules of modular arithmethic. The rules are fairly simple. I’ll go over them here for background.

First, the most basic concept of modular arithmetic is the congruence relation. Modular congruence represents the equivalence of the members of a congruence class, modulo some integer n. The members of a congruence class are equivalent in the sense that they share the same remainder when divided by n. The relationship between two members of the same congruence class, a and b, (mod n), is represented as

ab (mod n)

Next, addition is distributive. This means that for any two integers, a and b, (mod n)

(a + b)(mod n) = (a (mod n)) + (b (mod n))

which means that the remainder of a divided by n plus the remainder of b divided by n is equal to the remainder of the sum of a and b divided by n.

Next, multiplication is distributive. This means that for any two integers, a and b, (mod n)

ab(mod n) = (a (mod n)) * (b (mod n))

which means that the remainder of a divided by n multiplied by the remainder of b divided by n is equal to the remainder of the product of a and b divided by n.

Last, common factors can be removed. This means that for any two integers, a and b, (mod n), and a third integer, c where c evenly divides a, b, and n

ab (mod n) implies that (a / c) ≡ (b / c) (mod (n / c))

Now, let’s examine the factoring problem again. Let’s assume that we want to factor an odd semiprime, m with factors p and q. Our goal is to discover p and q, knowing only m.

What can we possibly know about p and q? At first glance, it would seem like the answer to that question is “not much”. However, it turns out we can know a great deal.

Let’s try an enumerate some things we know:

  • m is odd
  • p is prime
  • q is prime
  • p is odd, because m is odd and p is a factor of m, which means that 2 cannot be a factor of either p or q
  • q is odd, because m is odd and q is a factor of m, which means that 2 cannot be a factor of either p or q
  • p + q is even, because both p and q are odd and the sum of two odd numbers is an even number
  • pq is even, because both p and q are odd and the difference of two odd numbers is an even number
  • φ(m) is even, where φ is Euler’s totient function, because φ(m) = (p – 1)(q – 1) and both p – 1 and q – 1 are even because both p and q are odd

Note that all these except for the first three, which were given, follow from the distributive modular addition rule, (mod 2).

That’s already quite a lot, isn’t it? We won’t stop there.

As I demonstrated in the post about Fermat’s method, we also know that because m is odd, m is the difference of two squares, which we’ll call u2 and v2 and that u = (p + q) / 2 and v = (pq) / 2, assuming we arbitrarily choose p to be the larger of p and q if they are not equal. Note, again, that u and v are both integers because both p + q and pq are even.

Here’s something new, though. We also know that u and v have opposite parity. By that I mean that if u is even then v must be odd and vice versa. We know this because m is the difference of p2 and q2 and m is odd. If u is even then u2 is even. If u is odd then u2 is odd. The same is true of v2 and v. If u2 and v2 are both even or both odd then their difference would be even. Therefore one of u2 and v2 must be even and the other must be odd and likewise one of u and v must be even and the other must be odd.

Remember that Fermat’s method iterates though all the possible values of u beginning with ceil(√m) until it finds the one where u2m is a square. What if we could know whether u is even or odd? If we could then we could reduce the number of iterations by half. For example, if we know that u is even, we can start with the first even number greater or equal to ceil(√m) and then increment by 2 for each iteration rather than 1.

It turns out that we can know if u is even or odd with complete certainty. The way we can do that is by using both the modular arithmetic distribution rules we already talked about and by examining the possible combinations of sums and products of congruence classes, (mod 4). There are only 4 congruence classes, (mod 4). Also because we know that p and q are both odd we only have to consider the odd ones. There are only 2 odd congruence classes, (mod 4), which are 1 and 3. p, q, and m must all be congruent with either 1 or 3, (mod 4). By first examining what the possible products of 1 and 3 are and then comparing that to the congruence class of m, (mod 4), we can know which congruence class combinations, (mod 4), that p and q could possibly be members of. Here are the possible multiplications of 1 and 3:

(1 * 1) ≡ 1 (mod 4)
(1 * 3) ≡ 3 (mod 4)
(3 * 3) ≡ 1 (mod 4)

Note that if p and q are both congruent with 1, (mod 4), or both congruent with 3, (mod 4), then m will be congruent with 1, (mod 4). Otherwise m will be congruent with 3, (mod 4), and one of p or q will be congruent with 1, (mod 4), and the other will be congruent with 3, (mod 4). Now let’s see what the possible sums could be of those congruence class combinations:

(1 + 1) ≡ 2 (mod 4)
(1 + 3) ≡ 0 (mod 4)
(3 + 3) ≡ 2 (mod 4)

From this we can conclude that if both p and q are congruent with 1, (mod 4), or both are congruent with 3, (mod 4), then their sum will be congruent with 2, (mod 4). Otherwise their sum will be congruent with 0, (mod 4). Therefore, if m is congruent with 1, (mod 4), then the sum of p and q is congruent with 2, (mod 4), and if m is congruent with 3, (mod 4), then the sum of p and q is congruent with 0, (mod 4).

Remember that u = (p + q) / 2. This means that p + q = 2u. Now, depending on whether m is congruent with 1, (mod 4), or with 3, (mod 4), we can know that one of the following is true:

2u ≡ 2 (mod 4) (if m ≡ 1 (mod 4))
2u ≡ 0 (mod 4) (if m ≡ 3 (mod 4))

Because 2 divides 2u, 0, 2, and 4, we can remove the common factor of 2 from both sides of the congruence and the modulus in both these expressions. Therefore the above is equivalent to the following:

u ≡ 1 (mod 2) (if m ≡ 1 (mod 4))
u ≡ 0 (mod 2) (if m ≡ 3 (mod 4))

A number that is congruent with 0, (mod 2), is an even number and a number that is congruent with 1, (mod 2), is an odd number. Therefore, if m is congruent with 3, (mod 4), then u is even and if m is congruent with 1, (mod 4), then u is odd.

Let’s look at an example, which will hopefully make this more clear. Let’s consider the case where

p = 349
q = 191
m = 66659
ceil(√m) = 259
u = (349 + 191) / 2 = 270

Now, because 66659 ≡ 3 (mod 4), we can know that u is even. Rather than starting our search for u at 259 and incrementing by 1 on each iteration, we can start at 260 and increment by 2. Our search for u will take only 6 iterations rather than 12. We’ve reduced the number of iterations by half.

But can we do better than a 2x improvement? We likely can! We can guarantee ourselves a 2x improvement if we use (mod 4), but we can get a better improvement with larger mod values. Let’s see how we do with (mod 12).

There are 6 odd congruence classes, (mod 12), which are 1, 3, 5, 7, 9, and 11. Let’s see what the possible multiplications of these classes are:

(1 * 1) ≡ 1 (mod 12)
(1 * 3) ≡ 3 (mod 12)
(1 * 5) ≡ 5 (mod 12)
(1 * 7) ≡ 7 (mod 12)
(1 * 9) ≡ 9 (mod 12)
(1 * 11) ≡ 11 (mod 12)
(3 * 3) ≡ 9 (mod 12)
(3 * 5) ≡ 3 (mod 12)
(3 * 7) ≡ 9 (mod 12)
(3 * 9) ≡ 3 (mod 12)
(3 * 11) ≡ 9 (mod 12)
(5 * 5) ≡ 1 (mod 12)
(5 * 7) ≡ 11 (mod 12)
(5 * 9) ≡ 9 (mod 12)
(5 * 11) ≡ 7 (mod 12)
(7 * 7) ≡ 1 (mod 12)
(7 * 9) ≡ 3 (mod 12)
(7 * 11) ≡ 5 (mod 12)
(9 * 9) ≡ 9 (mod 12)
(9 * 11) ≡ 3 (mod 12)
(11 * 11) ≡ 1 (mod 12)

This is a much larger set of possible combinations than was the set of multiplications of classes (mod 4). To make it easier to evaluate them, let’s organize them into a table by congruence class:

1 3 5 7 9 11
(1 * 1) ≡ 1 (mod 12)
(5 * 5) ≡ 1 (mod 12)
(11 * 11) ≡ 1 (mod 12)
(1 * 3) ≡ 3 (mod 12)
(3 * 5) ≡ 3 (mod 12)
(3 * 9) ≡ 3 (mod 12)
(7 * 9) ≡ 3 (mod 12)
(9 * 11) ≡ 3 (mod 12)
(1 * 5) ≡ 5 (mod 12)
(7 * 11) ≡ 5 (mod 12)
(1 * 7) ≡ 7 (mod 12)
(5 * 11) ≡ 7 (mod 12)
(1 * 9) ≡ 9 (mod 12)
(3 * 3) ≡ 9 (mod 12)
(3 * 7) ≡ 9 (mod 12)
(3 * 11) ≡ 9 (mod 12)
(5 * 9) ≡ 9 (mod 12)
(9 * 9) ≡ 9 (mod 12)
(1 * 11) ≡ 11 (mod 12)
(5 * 7) ≡ 11 (mod 12)

Now let’s see what the possible sums of these classes are, (mod 12).

1 3 5 7 9 11
(1 + 1) ≡ 2 (mod 12)
(5 + 5) ≡ 10 (mod 12)
(11 + 11) ≡ 10 (mod 12)
(1 + 3) ≡ 4 (mod 12)
(3 + 5) ≡ 8 (mod 12)
(3 + 9) ≡ 0 (mod 12)
(7 + 9) ≡ 4 (mod 12)
(9 + 11) ≡ 8 (mod 12)
(1 + 5) ≡ 6 (mod 12)
(7 + 11) ≡ 6 (mod 12)
(1 + 7) ≡ 8 (mod 12)
(5 + 11) ≡ 4 (mod 12)
(1 + 9) ≡ 10 (mod 12)
(3 + 3) ≡ 6 (mod 12)
(3 + 7) ≡ 10 (mod 12)
(3 + 11) ≡ 2 (mod 12)
(5 + 9) ≡ 2 (mod 12)
(9 + 9) ≡ 6 (mod 12)
(1 + 11) ≡ 0 (mod 12)
(5 + 7) ≡ 0 (mod 12)

From this, we can conclude that

u ≡ 1 or u ≡ 5 (mod 6) (if m ≡ 1 (mod 12))
u ≡ 0 or u ≡ 2 or u ≡ 4 (mod 6) (if m ≡ 3 (mod 12))
u ≡ 3 (mod 12) (if m ≡ 5 (mod 6))
u ≡ 2 or u ≡ 4 (mod 6) (if m ≡ 7 (mod 12))
u ≡ 1 or u ≡ 3 or u ≡ 5 (mod 6) (if m ≡ 9 (mod 12))
u ≡ 0 (mod 6) (if m ≡ 11 (mod 12))

Let’s re-examine the example we checked earlier factoring m = 66659. Because 66659 ≡ 11 (mod 12), we know that u ≡ 0 (mod 6). Now when factoring 66659, we can start checking u at 264 and we can increment by 6 on each iteration. We’ve reduced the number of iterations required to only 2. This is a 6x improvement.

Can we do any better than 6x? Not with such a small value of m. Let’s try something much larger and see how well we can do.

p = 433024223
q = 413158523
m = 178907648397902629
ceil(√m) = 422974761
u = (433024223 + 413158523) / 2 = 423091373

Let’s see how well we do with (mod 96). I won’t build a table of all the possible combinations of products and sums of the congruence classes (mod 96). In this example, 178907648397902629 ≡ 37 (mod 96). I’ll tell you what such a table would show, however. It would show that one of the following is true:

u ≡ 13 (mod 48)
u ≡ 19 (mod 48)
u ≡ 29 (mod 48)
u ≡ 35 (mod 48)

How does this help us though? Which one do we start with? If we choose one and it’s the wrong one, how do we decide it’s the wrong one?

We can’t know which of these congruences is actually true. If we were to choose one and it’s not the right one and we ignore the others, we could follow it to the highest possible value of u before we could know it was the the wrong one. We could potentially make Fermat’s method far worse than the worst case, which is obviously not an improvement.

Instead, what we can do is apply Fermat’s method four times in parallel. As soon as one of the parallel attempts finds the answer, the others can stop. How much of an improvement can we get this way over Fermat’s method? It turns out that using this approach the improvement is 1/2 the modulus divided by the number of possible congruences for u. In this case, the improvement is 12x because 48 / 4 = 12.

Can we better than 12x? We can! If we choose a modulus of 1260, there are 24 possible modulus values for u. I won’t bore you with the details of what they are. With 24 possible modulus values for u, that means that using a modulus of 1260, we can get a 39.375x improvement over Fermat’s method.

If you want to examine what the possible modulus values for u are given a modulus for m, here is a short python program that calculates them:


import sys

m = long(sys.argv[1])
a = long(sys.argv[2])
mods = {}
for i in range(1L, m, 2 if m % 2 == 0 else 1):
    for j in range(i, m, 2 if m % 2 == 0 else 1):
        mult = i * j
        mm = mult % m
        s = i + j
        if mm not in mods:
            mods[mm] = set()
        mods[mm].add((s % m) / 2)

print sorted(mods[a])

Why did I choose the modulus values of 96 and 1260? Here’s where I digress into conjecture. It appears that integers with low totient values are good candidate modulus values for this method. Here totient refers to Euler’s totient function and by “integers with low totient values”, I mean integers with a low ratio of totient to value.

For example, compare the possible modulus values for u when m is 11, 12, and 13. Note that 11 and 13 are prime and thus their totients are 10 and 12 respectively. They have high totient to value ratios. On the other hand, 12 has a totient of 4. It has a totient to value ratio that is relatively low. The range for the number of possible modulus values for u using a modulus of 11 is either 5 or 6. The range of the number of possible modulus values of u using a modulus of 13 is either 6 or 7. By contrast, the range of the number of possible modulus values of u using a modulus of 12 is between 1 and 3.

Other than my own limited observations, I have no evidence that low totient to value integers are more effective than high totient to value integers. I have no mathematical proof.

The question, then, is how useful is this factoring method? I’m not positive but it appears that this method is not going to be more effective than the quadratic seive or the general number field seive methods. The value of this method depends entirely on how quickly the number of possible modulus values for u increases with respect to the low totient modulus. This is again conjecture, but as far as I can tell, the rate of improvement of this method decreases as the modulus value increases. I’ll go into a deeper analysis of the performance of this method in a future post. For now, I think the general number field seive method is safe in its place on top.

Leave a Reply

Your email address will not be published. Required fields are marked *