Introduction
Update: I recorded myself reading 419 digits from the 52nd Mersenne prime for Say the prime.
The 52nd Mersenne prime was discovered on October 21, 2024 (Source: GIMPS). Despite intense effort, it took six years to surpass the 51st! It is the largest prime number ever found so far! A prime number is a number that is only divisible by 1 and itself. A Mersenne prime is a prime number of the form \(m = 2^p - 1\), where \(p\) itself is a prime number.
The first few Mersenne primes (source: Wikipedia) are:
3, 7, 31, 127, 8191, 131071, 524287, 2147483647
The specific form of Mersenne primes enables the use of a fast prime-testing algorithm called the Lucas-Lehmer test. Mersenne primes are closely connected to perfect numbers — numbers that are equal to the sum of their proper divisors. Perfect numbers take the form \(N = 2^{p-1}(2^p-1)\), where \(2^p - 1\) is a Mersenne prime.
Mersenne primes are rare, and the numbers grow exponentially as \(p\) increases, making them increasingly difficult to discover. The computational effort needed to verify whether a number of the form \(2^p - 1\) is prime also grows significantly, requiring sophisticated algorithms and powerful hardware, especially for large candidates. In this blog post, I will generate some Mersenne primes and explore how fast they grow.
Python setup
For this project, I use the standard Python library, SymPy for fast prime number calculations, and tabulate for formatting tables in a readable way. I use the following imports:
import time
import sympy
import tabulate
To easily print tables, I created tabuliterate
, which runs a function on each value
of an iterable and optionally applies a condition.
def tabuliterate(fun, iterable, cond):
table = []
t0 = time.perf_counter()
for i, j in enumerate(iterable):
n = fun(j)
if cond is not None and not cond(n):
continue
t = time.perf_counter()
table.append([i, j, t - t0, n])
t0 = t
return table
Checking for prime numbers
One can check if a candidate number \(n\) is prime by verifying if it is divisible by any smaller prime up to \(\sqrt{n}\). Here’s a simple python implementation of this:
def is_prime(n):
if n < 2:
return False
for i in range(2, int(n**0.5) + 1):
if n % i == 0:
return False
return True
Let’s use this to print all primes in the range (1, 50).
%%time
prime_table = tabuliterate(is_prime, range(1, 50), cond=lambda x: x)
print(
tabulate.tabulate(
prime_table,
headers=["i", "prime(i)", "t (s)", "is_prime"],
tablefmt="github"))
The tabuliterate
function runs is_prime
for each value in the range, and saves the
results if is_prime
returns True
. I then use tabulate
to print a well-formatted,
blog-ready Markdown table. To better understand performance, I use time.perf_counter
to measure the time between iterations (in column t (s)
). Additionally, I use the
Jupyter magic %%time
to print timing information for the entire cell. While these are
not substitutes for proper profiling, they provide sufficient insight for this blog,
optimization is not the main focus.
CPU times: user 91 μs, sys: 7 μs, total: 98 μs
Wall time: 101 μs
i | prime(i) | t (s) | is_prime |
---|---|---|---|
1 | 2 | 1.4666e-05 | True |
2 | 3 | 2.45901e-06 | True |
3 | 5 | 3.66601e-06 | True |
4 | 7 | 3.45899e-06 | True |
5 | 11 | 5.83302e-06 | True |
6 | 13 | 3.208e-06 | True |
7 | 17 | 4.959e-06 | True |
8 | 19 | 2.541e-06 | True |
9 | 23 | 5.58398e-06 | True |
10 | 29 | 6.916e-06 | True |
11 | 31 | 2.79202e-06 | True |
12 | 37 | 7.125e-06 | True |
13 | 41 | 5.04201e-06 | True |
14 | 43 | 3.375e-06 | True |
15 | 47 | 4.95798e-06 | True |
16 | 53 | 7.04202e-06 | True |
17 | 59 | 8.41598e-06 | True |
18 | 61 | 3.667e-06 | True |
19 | 67 | 7.208e-06 | True |
20 | 71 | 5.04201e-06 | True |
21 | 73 | 3.54199e-06 | True |
22 | 79 | 7.125e-06 | True |
23 | 83 | 5.37501e-06 | True |
24 | 89 | 6.666e-06 | True |
25 | 97 | 8.542e-06 | True |
Now let’s try a few bigger numbers, including powers of ten and some known Mersenne primes:
%%time
candidates = 1000, 10000, 100000, 1000000, 131071, 524287, 2147483647
is_prime_table = tabuliterate(is_prime, candidates)
print(tabulate.tabulate(
is_prime_table,
headers=["i", "n", "t (s)", "is_prime"],
tablefmt="github"))
CPU times: user 8.42 ms, sys: 367 μs, total: 8.79 ms
Wall time: 8.92 ms
i | n | t (s) | is_prime |
---|---|---|---|
0 | 1000 | 1.1542e-05 | False |
1 | 10000 | 4.29098e-06 | False |
2 | 100000 | 3.29202e-06 | False |
3 | 1000000 | 2.29198e-06 | False |
4 | 131071 | 4.9333e-05 | True |
5 | 524287 | 0.000114417 | True |
6 | 2147483647 | 0.00870742 | True |
Using this very basic and unoptimized prime checker, we can still test prime numbers relatively quickly. However, by the time we reach the 9th Mersenne prime, the computational effort required is no longer feasible for this method without significant optimization.
Generating Prime Numbers
Before attempting to find Mersenne primes, we first need to generate prime numbers for
the Mersenne exponents. Specifically, we need a function to generate the nth prime:
p = prime(n)
. We could use the is_prime
function and loop over all possible values
of n
, but as Mersenne primes grow very quickly, this approach will soon face
performance limitations. The is_prime
function doesn’t store previously checked
values, leading to redundant calculations.
To overcome this, we can use a cache dictionary {n: prime(n)}
to store previously
computed primes, allowing us to generate the nth prime more efficiently by avoiding
unnecessary recomputation. Here is an implementation to compute the n-th prime:
def nth_prime(n, cache):
if n in cache: return cache[n]
c, num = max(cache.keys()), cache[max(cache.keys())] + 2
while c < n:
is_prime = True
limit = int(num**0.5) + 1
for p in cache.values():
if p > limit: break
if num % p == 0:
is_prime = False
break
if is_prime: c, cache[c + 1] = c + 1, num
num += 2
return cache[n]
Now let’s generate some prime numbers:
%%time
cache = {1: 2, 2: 3, 3: 5}
nth_primes_table = tabuliterate(
lambda x: nth_prime(x, cache=cache),
(10**i for i in range(3, 7))
)
print(
tabulate.tabulate(
nth_primes_table,
headers=["i", "n", "t (s)", "prime(n)"],
tablefmt="github"))
CPU times: user 20.6 s, sys: 124 ms, total: 20.8 s
Wall time: 20.8 s
i | n | t (s) | prime(n) |
---|---|---|---|
0 | 1000 | 0.00183408 | 7907 |
1 | 10000 | 0.0355349 | 104717 |
2 | 100000 | 0.770876 | 1299653 |
3 | 1000000 | 19.9434 | 15485837 |
We are able to generate the millionth prime, but the approach becomes impractical for larger primes. Notably, the 8th Mersenne prime is already 2,147,483,647, making this method unsuitable without significant optimization for larger primes.
Finding Mersenne primes
Instead of trying to reinvent a faster and smarter prime generator and tester, here is the same test implemented with SymPy:
%%time
sympy_nth_primes_table = tabuliterate(
lambda x: sympy.prime,
(10**i for i in range(3, 7))
)
print(tabulate.tabulate(
sympy_nth_primes_table,
headers=["i", "n", "t (s)", "prime(n)"],
tablefmt="github"))
CPU times: user 47 μs, sys: 1 μs, total: 48 μs
Wall time: 57 μs
i | n | t (s) | prime(n) |
---|---|---|---|
0 | 1000 | 0.0163215 | 7919 |
1 | 10000 | 0.00630858 | 104729 |
2 | 100000 | 0.0111348 | 1299709 |
3 | 1000000 | 0.0328108 | 15485863 |
This approach is significantly faster. Now, let’s try finding some Mersenne primes!
To do this, we use sympy.isprime
as a filter to determine if numbers of the form are
prime:
%%time
N=600
mersenne_primes_table = tabuliterate(
lambda x: 2**x-1,
(sympy.prime(n) for n in range(1, N)),
sympy.isprime
)
CPU times: user 1min 14s, sys: 574 ms, total: 1min 14s
Wall time: 1min 14s
It takes a little over one minute to test the first 600 primes. However, we can optimize this further by leveraging SymPy’s `is_mersenne_prime, which is specifically designed for testing Mersenne primes:
%%time
N=600
mersenne_primes_table = tabuliterate(
lambda x: 2**x-1,
(sympy.prime(n) for n in range(1, N)),
sympy.ntheory.primetest.is_mersenne_prime
)
CPU times: user 1min 7s, sys: 94.7 ms, total: 1min 7s
Wall time: 1min 7s
i | n | p(n) | t (s) | D(m) | m(p) |
---|---|---|---|---|---|
1 | 0 | 2 | 4.0125e-05 | 1 | 3 |
2 | 1 | 3 | 9.33401e-06 | 1 | 7 |
3 | 2 | 5 | 6.666e-06 | 2 | 31 |
4 | 3 | 7 | 5.24998e-06 | 3 | 127 |
5 | 5 | 13 | 1.8209e-05 | 4 | 8191 |
6 | 6 | 17 | 0.0152827 | 6 | 131071 |
7 | 7 | 19 | 0.00496579 | 6 | 524287 |
8 | 10 | 31 | 0.0177748 | 10 | 2147483 |
9 | 17 | 61 | 0.0320199 | 19 | 2305843…3693951 |
10 | 23 | 89 | 0.02063 | 27 | 6189700…9562111 |
11 | 27 | 107 | 0.0131295 | 33 | 1622592…0288127 |
12 | 30 | 127 | 0.00890125 | 39 | 1701411…4105727 |
13 | 97 | 521 | 0.241578 | 157 | 6864797…5057151 |
14 | 110 | 607 | 0.0484024 | 183 | 5311379…1728127 |
15 | 206 | 1279 | 0.405773 | 386 | 1040793…8729087 |
16 | 327 | 2203 | 0.534715 | 664 | 1475979…7771007 |
17 | 338 | 2281 | 0.0510051 | 687 | 4460875…2836351 |
18 | 454 | 3217 | 0.548838 | 969 | 2591170…9315071 |
19 | 582 | 4253 | 0.639087 | 1281 | 1907970…0484991 |
20 | 601 | 4423 | 0.0981187 | 1332 | 2855425…8580607 |
21 | 1195 | 9689 | 3.24355 | 2917 | 4782202…5754111 |
22 | 1225 | 9941 | 0.17318 | 2993 | 3460882…9463551 |
23 | 1356 | 11213 | 0.750064 | 3376 | 2814112…6392191 |
24 | 2253 | 19937 | 5.40748 | 6002 | 4315424…8041471 |
25 | 2434 | 21701 | 1.1508 | 6533 | 4486791…1882751 |
26 | 2590 | 23209 | 0.983179 | 6987 | 4028741…9264511 |
27 | 4623 | 44497 | 13.4968 | 13395 | 8545098…1228671 |
28 | 8383 | 86243 | 27.3762 | 25962 | 5369279…3438207 |
We managed to test up to and found the first 28 Mersenne primes in about a minute!
Conclusion
Mersenne primes are fascinating mathematical objects with deep connections to number theory. Discovering them requires both computational ingenuity and efficient algorithms, as they grow exponentially large and exceedingly rare. I was amazed I could generate a 25962 digit Mersenne prime on my laptop in a minute. However, noticing that the last number took more than twice as much as the previous one, and the 52 is another 24 steps away… the resent discovery really is something.