Thursday, April 3, 2008

Fermat Factorization Method Revisited

Abstract:

A simple method using unknown properties of the Fermat factorization method will be described below. This method exploits patterns in numbers. It will be shown that Fermat’s method extends well beyond what has been recognized it can do.

The Method

We start with the usual equation

N=p*q

in which p and q are primes with p>q for simplicity. As it stands, this equation is not very useful if we were asked to factorize N. The idea was thus to look for a way to represent N or equivalently p and q in a form that is better suited for the factorization problem.

The starting point for the search for that representation is the well known arithmetic progressions 6*k+1 and 6*k-1. These two progressions represent all the numbers, primes or composites, except the multiples of the first two primes 2 and 3.

6*k+1=7, 13, 19,…

6*k-1=5, 11, 17,…

Within one of these two progressions, the difference between two numbers is always a multiple of 6 since we can write

(6*k1+1) - (6*k2+1) = 6*(k1-k2)

and

(6*k1-1) - (6*k2-1) = 6*(k1-k2)

We can therefore write

P=q+6*k with k an integer which can take any integer value. Using this simple fact, we can rewrite N=p*q as N = (q+6*k)*q=q2 + 6*k*q and we get the usual quadratic equation:

q2+6kq-N=0

whose well known solutions can be written as:

k± =-3 ± √( 9k2+N)

For k± to be an integer, 9*k2+N must be a perfect square. So we set it equal to

9k2+N = m2

This equation is not valid for numbers N of the form

N = (6k1-1)*(6k2+1) since the difference between p and q cannot be written as a multiple of 6. However, It is easy to transform any number of the form N=6*k-1 into a number of the form N=6*k+1 simply by multiplying it by 5 to get a new number N for which the previous equation applies.

Numbers which are multiples of 2 and 3 will not be considered here either because it is easy to extract factors of 2 and 3 and then apply the above method if the result is a number of the form N = 6k+1. This issue will not be revisited.

Since p=q+6k, we can write also k=(p-q)/6 and deduce that m=(p+q)/2 from 9k2+N = m2 with p=m+3k and q=m-3k. The equation 9k2 + N = m2 can be rewritten

N=m2-9k2

This is the well known Fermat’s difference of squares equation. This equation tells us that we could factor N if we knew (p+q) or (p-q) or at least the way these two quantities vary. It turned out that (p+q) and (p-q) variations are not random or erratic but obey a simple law with the precision of a metronome.

Before we describe these variations, we will present another element which is important for the factorization problem. It turned out that it is useful to further split the progressions 6k±1 into 3 different ones, each on the basis of the sum of digits (Sd) property of the integers. It should be noted that the Sd is periodic and has a period of 3.

The 6k+1 can be transformed into:

F(1)=7 + 18k with Sd=7

F(2)=13 + 18k with Sd=4

F(3)=19 + 18k with Sd=1

and the 6k-1 can be transformed into:

F(4)=5 + 18k with Sd=5

F(5)=11 + 18k with Sd=2

F(6)=17 + 18k with Sd=8

We now give the multiplication tables for the sum of digits M1, M2 and M3 representing all the possible multiplications of numbers of the form:

M1=(6k1+1)*(6k2 +1)

M2=(6k1-1)*(6k2-1)

M3=(6k1-1)*(6k2+1)

M1

1

4

7

1

1

4

4

7

7

7

1

4

M2

2

5

8

2

4

5

1

7

8

7

4

1

M3

1

4

7

2

2

8

5

5

5

2

8

8

8

5

2

Looking at these matrices, we see that there are 4 ways to generate a number with a sum of digits of 1, 4 or 7 but only 3 ways for a number with Sd of 2, 5 or 8.

We are now ready to show how the quantities (p+q) and (p-q) vary. We will use tables to illustrate that. These tables are of course incomplete since the process can be continued indefinitely.

Table 1 for Sd=1

p

q

N

m=(p+q)/2

k=(p-q)/6

7

13

91

10

1

7

31

217

19

4

7

49

343

28

7

7

67

469

37

10

Table 2 for Sd=7

p

q

N

m=(p+q)/2

k=(p-q)/6

7

19

133

13

2

7

37

259

22

5

7

55

385

31

8

7

91

637

49

14

Table 3 for Sd=4

p

q

N

m=(p+q)/2

k=(p-q)/6

7

25

175

16

3

7

43

301

25

6

7

61

427

34

9

7

79

553

43

12

7

97

679

52

15

Tables 4, 5 and 6 and 7, 8 and 9 will show the same pattern regardless of the sum of digits of N. The value of (p+q)/2 increases by 9 and that of k increases by 3 within a given table. This simply means that regardless of the number p and q used to produce N, the value of (p+q) and (p-q) can only take a specific value. This property is very important when we are trying to factor a number N since it severely limits the allowed values for (p+q) and (p-q).

We are now ready to give an example of factorization of N=31*373=11563.

Sd(n)=7 so N can only be the product of a 4*4, a 1*7 or a 7*1 p and q values on one hand and of a possible 8*2 or 5*5 number. The last two are usually easily eliminated.

At this point, we need to guess a starting value for m. Since (p+q)> 2√N, it is easy to see that m>√N. √N here is the integer value of √N.

Since √N=107, we get

m=m(07)+9*i>107 with m(07) is the starting value of m for the very first product for Sd=7. The value of m(07) is 13. We know that all subsequent values of m can be calculated by using m=m(07)+9*i with i given by i>(107-13)/9=10.44 but since i must be an integer, we round if off to 11.

We get for m, m=13+9*11=112 and since m=(p+q)/2=112, we get the following for the allowed values of (p+q), p+q=2*112=224.

At this point, the factor base is uniquely determined. It consists of pairs whose sum adds up to 224. For clarity, we list them below.

q(i)

p(i)

1

223

7

217

13

211

19

205

25

199

31

193

37

187

43

181

49

175

55

169

61

163

67

157

73

151

79

145

85

139

91

133

97

127

103

121

we stop when we reach the last q< √N. In practice, few more pairs can be discarded since their product is bigger than N and consequently, q cannot be a factor. In the example above, 85*139=11815 > N=11563 so there is really no need to include pairs with q>85.

At this point, we know that q, the smaller factor of N, is among the q(i) of the factor base.

For clarity, we regroup the pairs according to the sum of digits in three different groups or tables.

Table 1 Sd=1*7=7

q(i)

p(i)

m

k(i)=(p-q)/6

1

223

112

37

19

205

112

31

37

187

112

25

55

169

112

19

73

151

112

13

Table 2 Sd=4*4=7

q(i)

p(i)

m

k(i)=(p-q)/6

13

211

112

33

31

193

112

27

49

175

112

21

67

157

112

15

85

139

112

9

Table 3 Sd=7*1=7

q(i)

p(i)

m

k(i)=(p-q)/6

7

217

112

35

25

199

112

29

43

181

112

23

61

163

112

17

79

145

112

11

Note that for the same value of m, the value of k decreases by 6 when q increases by 18.

We can at this point evaluate the GCD(N, q(i)) to find the factor q but it takes a lot of work to check every q(i).

To decrease the amount of work needed to factor N by evaluating the GCD, we can do some more preparatory work and use the parameter k=(p-q)/6 introduced earlier. Let’s go back to the original equation

N+9k2=m2

We now know that, as a general rule, m increases by 9 and k increases by 3 so we can write:

N = (m+9*n)2-9*(k+3*n)2 for some unknown (m,k,n)

And if we had a pair (m,k) that is in sync, we will be done because this equation is linear in n. We can rewrite the equation as:

N = (m+9n-9n-3k)(m+9n+3k+9n)=(m-3k)(m+3k+18n)

Remember m-3k=q(i) and m+3k=p(i) and m+3k+18n=p(i)+18n=p

Every pair (q(i),p(i)) of the factor base produces an N(i)=q(i)*p(i). For one and only one of these pairs, q=q(i)= the smaller factor. For that pair we can write:

N-N(i)=qp-q(i)p(i)=q(i)p-q(i)p(i)=q(i)[p-p(i)]=q(i)*18n since p-p(i)=18n

So the previous equation can be rewritten as:

N = q(i)*[p(i)+18n]=q*[p(i)+18n]

We need to determine a starting point for the iterations, which in this method, is simply the evaluation of the GCD(N, q(i)).

We already have a good value for m, namely the one we started with and used to calculate the factor base. What we need is a way to calculate the corresponding value of the parameter k that is in sync with m.

We start with calculating k from

9k2=m2-N

For N=31*373=11563 and m=112, we get

9k2=1122-11563=981 and k=10.44. This tells us right away that this k, not being an integer, is not the right k. But it does tell us that:

this k is below that of the pair (73,151) in Table 1,

this k is between the k of the pair (85,139)=9 and that of the pair (67,157)=15 in Table 2.

this k is below that of the pair (79,145) in Table 3.

At this point, we can use the higher k values as a starting point in our GCD calculation.

We know that:

k=13 and m=112 are in sync in Table1,

k=15 and m=112 are in sync in Table 2 and

k=11 and m=112 are in sync in Table 3.

But we do not want to waste any time trying values of n that we know do not fit.

For example, the value of n=0 simply means that (p,q) ( the real factors) are one pair among the factor base and we know that when we calculate k like above and we will get a perfect square for 9k^2 and an integer value for k. The number N=10117 is such a number.

So when we calculate k like above and do not get an integer value, we can use the following equation to calculate k to make sure we discard all pairs of (p(i),q(i)) that do not even give an n=1

9*(k+3*1)2=(m+9*1)2-N

In our case we get k+3= √[1212-11563]/9 =18.49 or k=18.49-3=15.49

We know that this value is not correct but it gives us a starting point for k since its value is between two values that are in sync with m.

13 ≤ k ≤ 19

15 ≤ k ≤ 21

11 ≤ k ≤ 17

So instead of starting at k=13, 15, 11 as calculated previously, we can now start at k=19, 21, 17 within each table. We always round off k to the largest value.

We use the equation

N = (m-3k(i))(m+3k(i)+18n)

We can plug in the values of k=19, 21, 17 which are all a good starting point within every table and calculate n or simply calculate GCD(N, q(i)=m-3k(i)) to check if q(i) is a factor.

In our example,

q(1)=112-3*19=55 but the GCD will tell us 55 is not a factor,

q(2)=112-3*21=49 but the GCD will tell us 49 is not a factor,

q(3)=112-3*17=61 but the GCD will tell us 61 is not a factor.

So we increase k by 6 (as k increases by 6 within a given table) and calculate the next three GCD’s to find that:

GCD(N, 112-3*25)=GCD(N,67) which is not a factor,

GCD(N, 112-3*27)=GCD(N,31)=31 and this tells us that 31 is the small factor q. The GCD(N,112-3*23)=GCD(N,43) which is not a factor.

In fact, we did not even need to calculate n. So we know that the pair (q(i),p(i))=(31,193) is the witness pair for the real factors (q,p)=(31,373).

Once we have the factor q, we can calculate p from N or we determine n and calculate it from p=m+3k + 18n=112+3*27+18n=193+180=373. Remember N-N(i)=q(i)*18n so we can easily calculate n from it.

n=[N-N(i)]/(18*q(i))=[11593-31*193]/(18*31)=10

At this point the number N has been factored into N=31*373.

We conclude by saying that it is not known at this point how well this method will perform with large N (in the hundred of digits) but we think it is interesting to find out. Doing calculations by hand is very tedious and when N is large, it is simply out of the question. We would therefore appreciate any feedback from the professionals.

No comments: