Half a croissant, on a plate, with a sign in front of it saying '50c'
h a l f b a k e r y
Crust or bust.

idea: add, search, annotate, link, view, overview, recent, by name, random

meta: news, help, about, links, report a problem

account: browse anonymously, or get an account and write.

user:
pass:
register,


                           

Factoring Algorithm: Reciprocity

When a relevant composite number (c)=(a)*(b), the period-length of its reciprocal usually is an exact fraction of (a-1)*(b-1)
 
(0)
  [vote for,
against]

For initial background information, see my previously posted "Primality Test" Idea (linked). It was about the reciprocals of various prime numbers in various Bases. Here, we will be working with the reciprocals of various composite numbers in various Bases.

We will only be talking about odd composite numbers that are not divisible by 3 (since testing composite numbers for divisibility by 2 or by 3 is just about always done as part of the preliminaries, of a factoring algorithm). Also, we will define (c) as a composite number with two factors, (a) and (b).

My investigations of such composite numbers indicate that when the reciprocal of (c) --that is, (1/c)-- is computed in some Base such that a repeating sequence results, the length of the period is never greater than (a-1)*(b) --and then only when (c) is a perfect square. Since testing for whether or not (c) is a perfect square is also part of the preliminaries of most factoring algorithms, we can ignore that group of composites here.

Most of the time, when (1/c) yields a repeating series, the period-length is either equal to (a-1)*(b-1)/2, or it is equal to some other exact fraction of (a-1)*(b-1). For our purposes here, we shall compare that result to the reciprocal of some prime (p), as described in that "Primality Test" Idea.

The reciprocal of prime (p) may have a period-length of either (p-1) or some factor of (p-1), in which case a "Modulus Series" will always have the value of "1" at Position (p-1). The reciprocal of (c), when not a square and not divisible by 2 or by 3, can yield a Modulus Series in which the value of "1" will be found at Position (a-1)*(b-1).

Now, suppose we knew what that Position was? There is a useful technique in Algebra known as "solving simultaneous equations". Thus if we know that (a)*(b)=(c), and we also know that (a-1)*(b-1)=(P)osition, then we can use that simultaneous-equations technique to algebraically determine the values of (a) and (b), guaranteeing that we can factor (c).

In more detail, we can first rewrite (a)*(b)=(c) as (a)=(c/b), and then we can then rewrite (a-1)*(b-1)=(P) as [(c/b)-1]*(b-1)=(P). After a few more manipulations that can be written as (b^2) + (P-c-1)*(b) +c = 0, which can be plugged into the well-known Quadratic Formula. Then, after a few more manipulations, the relevant result is: (b) = {(c+1-P) ± SQR[(P-c)^2 - 2(P+c) +1]} / 2 --where "SQR" refers to computing a square root.

So, if we know the value of (P) --and we certainly know the value of (c)-- we can compute the value of (b), and later use (a)=(c/b) to compute the value of (a). Actually, because of that "±" symbol in the Quadradic Equation, it can be used to compute BOTH factors of (c) directly. ALSO, please keep in mind that that equation is Totally Generic; for ANY value of (c), if we can find an appropriate value of (P), then we can compute two factors of (c).

That Quadratic Equation is complicated enough that if we don't know the value of (P), making guesses as to its value would probably be an exercise in wasting computer-processing time. So, how difficult is it, to discover the value of (P) some other way, say by examining the Modulus Series? If (c) is a large composite number, it can be very TEDIOUS, if not particularly difficult.

But there are a couple of things that can reduce the tedium somewhat, and perhaps someone can develop these notions further, so that's why I'm posting this overall Idea here on the HalfBakery.

First, if (a) and (b) are odd numbers, then (a-1) and (b-1) must be even numbers, and therefore (a-1)*(b-1)=(P) must be a number that is divisible by 4. So we only need to compute every 4th Modulus in the Series, looking for that "1".

Next, if we assign variable (r) as equal to the integer portion of the square root of (c), then we can be sure that either (a) or (b) is smaller than (r), while the other variable is larger than (r). Note that if (a) and (b) were both APPROXIMATELY equal to (r), then it logically follows that (r-1)*(r-1) is approximately equal to (P). We can use that fact to estimate that the starting point of our search, in the Modulus Series for (1/c), should be Position (r-1)*(r-1).

Next, by inspection of various values for (a) and (b) that differ from (r), it is almost always true that one of those variables gets larger faster than the other variable gets smaller. This means that the more that (a-1)*(b-1) differs from (r-1)*(r-1), the more that (a-1)*(b-1) is larger than (r-1)*(r-1).

The preceding also means that the "direction" of our search should always be in the direction of greater Positions than the estimated starting Position of (r-1)*(r-1). --AND it means that if (r-1)*(r-1) happens to be a number that is not exactly divisible by 4, our ACTUAL starting point should be the first number that IS divisible by 4, which is larger than (r-1)*(r-1).

There are a couple more quite-important things to describe, with respect to the overall search algorithm being presented here. First, the mathematical computations relevant to specifying a Modulus Series are more efficient in Base Two than in any other Base. Consider 1/77 as an example (ignore commas used as spacers):
, , , .0000001101010011...
77|1.0000000000000000...
, , ( , , , , 128) , , , , , , , , , Modulus Series begins: 2, 4, 8, 16, 32, 64
, , , , , , , , 77
, , , , ,51->102 , , , , , , , , , 51 follows the above 64, since 128 > 77
, , , , , , , , , 77
, , , ,25->50->100 , , , , , , ,25 and 50 follow that 51
, , , , , , , , , , , 77
, , , , , , 23->46->92 , , , , , 23 and 46 follow that 50
, , , , , , , , , , , , ,77
, , , , ,15->30->60->120 , , ,15 and 30 and 60 follow that 46
, , , , , , , , , , , , , , , 77
, , , , , , , , , , , , ,43->86 , , 43 follows that 60
, , , , , , , , , , , , , , , , 77
, , , , , , , , , , , , , , , , , 9 and 18 and 36 and 72 will follow that 43 ...

There is more to that Modulus Series, of course. But I'll stop here so that we can use this example more thoroughly, in terms of employing the Modulus Series to find the factors of 77 (and who cares if we already know what those factors are!).

Anyway, first, the thing that makes Base 2 most convenient, when doing Long Division, is that we never have to muliply the divisor by anything other than 1 (the latest digit in the quotient), which means we can actually just ignore the multiplication part of that Long Division process, and simply subtract (when possible) the divisor, and ONLY the divisor.

In any other Base we would have to do extra multiplications of the divisor, which are simply unnecessary in Base 2. So, to find a Modulus Series in Base 2 we simply double the previous Modulus, check to see if it is bigger than the divisor, and if it is, subtract the divisor. Then go do another doubling, regardless of whether or not we just did a subtraction.

Now, according to our earlier analysis, we want to compute the square root of 77 --which is 8-and-an-irration, so (r)=8, and then square (r-1) to obtain 49. Then we find the first number larger than 49 that is divisible by 4, and that number is 52. So the Modulus at Position 52 is the first one we want to examine, in terms of discovering the desired value of (P).

The Modulus Series for 1/77 for Base 2, as computed above, is (starting with the First Position) 2, 4, 8, 16, 32, 64, 51, 25, 50, 23, 46, 15, 30, 60, 43, 9, 18, 36, 72, ... By inspection, the Moduli at the 4th, 8th, 12th, ... Positions are 16, 25, 15, 9, ...

Using the information that was presented in that earlier "Primality Test" Idea, regarding multiplication of Moduli and Positions, it should be clear that if we multiply that 4th-Position number by itself, 16*16 and take the Modulus with respect to 77, the result is 256 mod 77 = 25, which is the number at the 4th-plus-4th-equals-8th Position. And multiplying 16*25 lets us compute 400 mod 77 = 15, which is the Modulus at the 4th-plus-8th-equals-12th Position. And so on.

The logical thing to do next is to create a Binary Representation of Position 52, the starting point for our search, so that we can then multiply appropriate Moduli to compute the Modulus in that Position of the Series. That Representation is 110100 (32+16+4).

Since we already have 16-at-the-4th and 9-at-the-16th Positions, we can compute 9*9 and 81 mod 77 = 4, for the 16th-plus-16th-equals-32nd Position. (There is more to say about that "4", in a little bit.) Thus 16*9*4 leads to 576 mod 77 = 37 at the 52nd Position.

From there, we can multiply by 16 and compute Moduli at every 4th Position after the 52nd, as previously indicated. In this particular example, only two sets of such computations are needed, to discover that the desired "1" is in the 60th Position --(37*16) mod 77 = 53 in the 56th Position, and (53*16) mod 77 = 1 in the 60th.

As an exercise for the reader, I will let you plug (P)=60 and (c)=77 into that Quadratic Equation presented earlier, from which you should be able to compute the factors of 77. You don't have to, of course, if you already know that those factors are 7 and 11, and, if you think about them in terms of (a) and (b) and things presented earlier about those variables, then note that (7-1)*(11-1)=60, pretty much guaranteeing that (P) must equal 60.

Well, that was just an example, involving a fairly small composite number. You can be sure the Modulus computations will be a LOT more lengthy/tedious when (c) is a large composite number!

Nevertheless, there is still a variant procedure that might be useful. It is to be noted that with respect multiplying/skipping our way through the Modulus Series for 1/77, that "4" we found at the 32nd Position is one of the numbers already among the first members of the Modulus Series. That is, a "4" was at the 2nd Position (somewhat ignored because that Position was not divisible by 4), and we know that --at least with respect to reciprocals-- the Zeroth Position always has the Modulus value of "1".

Since we also know that the Modulus Series is a REPEATING series, it logically follows that that "4" in the 32nd Position must be part of a repeat of the original Modulus Series. Therefore we can deduce that there should have been a "1" at the 30th Position.

The simplest way to prove it is to figure the Binary Representation of 30, which is 11110 (16+8+4+2), and multiply the Moduli at the component Positions, 9*25*16*4=14400, and (yes!) 14400 mod 77 = 1. Furthermore, we can be quite sure that 30 is the correct period-length for 1/77 in Base 2, since the Modulus Series that we originally computed via Long Division included the first 19 Positions, more than half of 30.

Therefore, if we know that there is a "1" at Position 30, and that the Modulus Series repeats, then there must also be a "1" at Position 60, Position 90, Position 120, ... --and only the FIRST of THOSE numbers is worth trying directly as a value for (P) in the Quadratic Equation, because it is the only Position smaller than (c=77) and larger than (r-1)*(r-1).

The preceding stuff regarding Modulus 4 at Positions 32 and 2 was presented as a result of an "accidental" discovery. How can we formalize it as a generic procedure, for helping us factor (c)?

Before getting to the next thing, though, here's something to note, regarding the full list of the 30 members of the Modulus Series for 1/77:
2, 4, 8, 16, 32, 64, 51, 25, 50, 23, 46, 15, 30, 60, 43, 9, 18, 36, 72, 67, 57, 37, 74, 71, 65, 53, 29, 58, 39, 1

In the "Primality Test" Idea it was noted that you could take such a series and split it in half and add the halves, to obtain a sequence of constant values (the divisor) --IF the divisor was a prime number; it didn't work for composites. Nevertheless, the results of such splitting-and-adding, for composites, are still interesting:
11, 22, 44, 88, 99, 121, 88, 99, 121, 88, 99, 44, 88, 99, 44

Every one of those sums is divisible by 11, which is one of the factors of our example divisor 77. IN THEORY, that means that if we could find the halfway mark in such a Modulus Series, we could use that observation (via the well-known Greatest Common Divisor algorithm) to factor some arbitrary composite number (c). Except for the fact that sometimes (but, actually, not often), the period-length is an odd number!

Well, regarding thinking about the full period-length, I propose creating a Large Data Table. Modern computer processing power can easily create such a Table containing several billion items. The following discussion assumes we will be dealing with a value of (c) that is so large that even ten billion items is but a tiny tiny fraction of its magnitude.

Let us specify that each item in the Large Data Table should consist of two parts. The primary part is the Modulus at various initial Positions in the Modulus Series. That is, we want our Table to hold the first several billion Moduli of the Series (even if they constitute just a tiny fraction of the overall Series), from Positions 1, 2, 3, 4, ...

The secondary part of each data item should be the actual Position Number. When constructing the Table we want to simultaneously Sort it, based on each Modulus (primary data). So, when full, it will contain a collection of increasing but irregular Modulus values, and the associated secondary data (the Position Numbers) will be completely mixed up.

Let us suppose that our Table contains the first ten billion Modulus items. We now directly compute the Modulus at the 19,999,999,999th Position. We will then do a "binary search" (very fast on sorted data; see link) for that Modulus in the Large Data Table.

If the Modulus is in the Table, then we can immediately obtain its associated (secondary data) Position Number from the Table, and from THAT we can compute a Position that must contain a "1", just as in the example for 1/77 and Modulus "4" it was deduced that Position 30 was relevant. Then we can compute a sequence of trial values of (P), for the Quadratic Equation, and factoring (c) should fairly quickly follow.

If the Modulus at almost the 20-billionth Position is not in the Large Data Table, then we compute the Modulus at almost the 30-billionth Position, and do another binary search of the Table. Then we compute the Modulus at almost the 40-billionth Position, almost the 50-billionth Position, and so on. EVENTUALLY we should find a Modulus that is in the Table, after which see-previous-paragraph.

Of course, if "ten billion" is truly just a tiny tiny fraction of (c), then even that procedure could take a long long time to find a Modulus that is in the Large Data Table.

Even a sort of "shotgun" approach might not find it much faster. While in the example of 1/77 we found that "4" while computing Moduli at Positions that were powers of 2, we can't expect such to happen when skipping trillions of numbers at a time, for large-enough powers of 2. Or for large powers of 3, or 5, or 6, ....

A notion that I have not had much time to study remains to be presented. Here again is that Modulus Series for 1/77 in Base 2:
2, 4, 8, 16, 32, 64, 51, 25, 50, 23, 46, 15, 30, 60, 43, 9, 18, 36, 72, 67, 57, 37, 74, 71, 65, 53, 29, 58, 39, 1

By inspection, one can see that there are sequences of increasing-values of numbers, and there are sequences of decreasing-values of numbers. It may be possible to find some useful True Series, when we have a large-enough Modulus Series to play with (including being large enough to contain rather larger numbers than in this series).

If so, then some of the stuff I talked about in my "Factoring Algorithm: Parabolical" Idea (see link) may be applicable!

Vernon, Sep 12 2011

Primality Test Primality_20Test
As mentioned in the main text [Vernon, Sep 12 2011]

Binary Search http://en.wikipedia...ry_search_algorithm
As mentioned in the main text [Vernon, Sep 12 2011]

Factoring Algorithm: Parabolical Factoring_20Algorithm_3a_20Parabolical
As mentioned in the main text [Vernon, Sep 12 2011]

Current price of tea in China http://supertart.co...eainchina/index.php
[rcarty, Sep 12 2011]

Hey Vern! http://www.youtube....watch?v=HfZ0aAvvlkk
[rcarty, Sep 12 2011]


Please log in.
If you're not logged in, you can see what this page looks like, but you will not be able to add anything.



Annotation:







       [jutta], formatting that division problem for vertical alignments would be SO much easier if only we had the ability to do some stuff in a fixed font (like if the HTML "code" tag was supported)....
Vernon, Sep 12 2011
  

       I'm not certain it would be a good idea to make it easier for Vernon to submit / describe long-format ideas...
RayfordSteele, Sep 12 2011
  

       I don't either. I'm putting up with it, because I may be wrong and it's not hurting anyone, but I don't think the halfbakery is a good forum for recreational mathematics.
jutta, Sep 12 2011
  

       I kinda like recreational maths, as long as it's "inventive", and I don't think it's out of place here.   

       It would be nice, though, if the beginnings and ends of Vernon's posts could be brought closer together.
MaxwellBuchanan, Sep 12 2011
  

       I think a strict 500 word limit for idea length could help guide wayward halfbakers. Perhaps a new ground for marking for deletion - too long.
xaviergisz, Sep 12 2011
  

       On the other hand, how much does the 501st word cost?   

       One of the things I like about [Vernon]'s ideas is that I can read the first and last sentence and feel that I've cheated the world out of forty minutes.
MaxwellBuchanan, Sep 12 2011
  

       Sometimes I wonder if he's the Vern that Ernest was always referring to. [link]
rcarty, Sep 12 2011
  

       Any chance this solution can bring back my scroll button?
daseva, Sep 12 2011
  


 

back: main index

business  computer  culture  fashion  food  halfbakery  home  other  product  public  science  sport  vehicle