I'm trying to gather some statistics on prime numbers, among which is the distribution of factors for the number (prime-1)/2. I know there are general formulas for the size of factors of uniformly selected numbers, but I haven't seen anything about the distribution of factors of one less than a prime.
I've written a program to iterate through primes starting at the first prime after 2^63, and then factor the (prime - 1)/2 using trial division by all primes up to 2^32. However, this is extremely slow because that is a lot of primes (and a lot of memory) to iterate through. I store the primes as a single byte each (by storing the increment from one prime to the next). I also use a deterministic variant of the Miller-Rabin primality test for numbers up to 2^64, so I can easily detect when the remaining value (after a successful division) is prime.
I've experimented using a variant of pollard-rho and elliptic curve factorization, but it is hard to find the right balance of between trial division and switching to these more complicated methods. Also I'm not sure I've implemented them correctly, because sometimes they seem to take a very lone time to find a factor, and based on their asymptotic behavior, I'd expect them to be quite quick for such small numbers.
I have not found any information on factoring many numbers (vs just trying to factor one), but it seems like there should be some way to speed up the task by taking advantage of this.
Any suggestions, pointers to alternate approaches, or other guidance on this problem is greatly appreciated.
Edit: The way I store the primes is by storing an 8-bit offset to the next prime, with the implicit first prime being 3. Thus, in my algorithms, I have a separate check for division by 2, then I start a loop:
factorCounts = collections.Counter()
while N % 2 == 0:
factorCounts[2] += 1
N //= 2
pp = 3
for gg in smallPrimeGaps:
if pp*pp > N:
break
if N % pp == 0:
while N % pp == 0:
factorCounts[pp] += 1
N //= pp
pp += gg
Also, I used a wheel sieve to calculate the primes for trial division, and I use an algorithm based on the remainder by several primes to get the next prime after the given starting point.
I use the following for testing if a given number is prime (porting code to c++ now):
bool IsPrime(uint64_t n)
{
if(n < 341531)
return MillerRabinMulti(n, {9345883071009581737ull});
else if(n < 1050535501)
return MillerRabinMulti(n, {336781006125ull, 9639812373923155ull});
else if(n < 350269456337)
return MillerRabinMulti(n, {4230279247111683200ull, 14694767155120705706ull, 1664113952636775035ull});
else if(n < 55245642489451)
return MillerRabinMulti(n, {2ull, 141889084524735ull, 1199124725622454117, 11096072698276303650});
else if(n < 7999252175582851)
return MillerRabinMulti(n, {2ull, 4130806001517ull, 149795463772692060ull, 186635894390467037ull, 3967304179347715805ull});
else if(n < 585226005592931977)
return MillerRabinMulti(n, {2ull, 123635709730000ull, 9233062284813009ull, 43835965440333360ull, 761179012939631437ull, 1263739024124850375ull});
else
return MillerRabinMulti(n, {2ull, 325ull, 9375ull, 28178ull, 450775ull, 9780504ull, 1795265022ull});
}