Skip to main content.

Let n be a natural number. Denote by S_n the set

\{ (x, y) \in \N \mid 1 \le x, y \le n, \; \gcd(x, y) = 1 \}

of coprime pairs (x, y) in [1, n]^2, and let P_n = |S_n| be their number. It is well-known that \lim_{n\to\infty} \frac{P_n}{n^2} = \frac{6}{\pi^2} \approx 0.607927; this was first proven by Ernesto Cesàro.

We are interested in P_n itself. Even though we know that P_n \ge \frac{6 n^2}{\pi^2} - \varepsilon n^2 for some \varepsilon > 0, we do not know how this \varepsilon can be chosen.

First note that P_n = 2 P_n' - 1, where

P_n' = |\{ (x, y) \in \N \mid 1 \le x \le y \le n, \; \gcd(x, y) = 1 \}|,

and that P_n' = \sum_{y=1}^n \varphi(y), where \varphi is Euler's \varphi function. We want to find an explicit lower bound for P_n', which allows us to find an explicit lower bound for P_n itself.

As in the proof of Theorem 330 in Hardy and Wright, we have

\sum_{d=1}^n \varphi(d) = \frac{1}{2} \sum_{d=1}^n \mu(d) \biggl( \Bigl\lfloor \frac{n}{d} \Bigr\rfloor^2 + \Bigl\lfloor \frac{n}{d} \Bigr\rfloor \biggr),

where \mu is the Möbius function. Hardy and Wright compare that expression to \frac{1}{2} \sum_{d=1}^\infty \mu(d) \frac{n^2}{d^2} = \frac{n^2}{2 \zeta(2)} = \frac{3 n^2}{\pi^2} and show that the error is O(n \log n).

More precisely, it is

P_n' ={} & \frac{1}{2} \sum_{d=1}^n \mu(d) \biggl( \Bigl\lfloor \frac{n}{d} \Bigr\rfloor^2 + \Bigl\lfloor \frac{n}{d} \Bigr\rfloor \biggr) \\ {}={} & \frac{1}{2} \sum_{d=1}^\infty \mu(d) \frac{n^2}{d^2} - \frac{1}{2} \sum_{d=n+1}^\infty \mu(d) \frac{n^2}{d^2} \\ {}+{} & \frac{1}{2} \sum_{d=1}^n \mu(d) \biggl( \Bigl\lfloor \frac{n}{d}\Bigr\rfloor^2 + \Bigl\lfloor\frac{n}{d}\Bigr\rfloor - \frac{n^2}{d^2} \biggr).

First,

\biggl| \frac{1}{2} \sum_{d=n+1}^\infty \mu(d) \frac{n^2}{d^2} \biggr| \le{} & \frac{n^2}{2} \sum_{d=n+1}^\infty \frac{1}{d^2} \le \frac{n^2}{2} \int_n^\infty x^{-2} \; dx \\ {}={} & \frac{n^2}{2} \Bigl[ -\tfrac{1}{3} x^{-3} \Bigr]_n^\infty = \frac{1}{6 n}.

Next,

& \biggl| \frac{1}{2} \sum_{d=1}^n \mu(d) \biggl( \Bigl\lfloor\frac{n}{d}\Bigr\rfloor^2 + \Bigl\lfloor\frac{n}{d}\Bigr\rfloor - \frac{n^2}{d^2} \biggr) \biggr| \\ {}\le{} & \frac{1}{2} \sum_{d=1}^n \biggl| \frac{(n - (n \mod d))^2}{d^2} + \frac{n - (n \mod d)}{d} - \frac{n^2}{d^2} \biggr| \\ {}={} & \frac{1}{2} \sum_{d=1}^n \biggl| \frac{-2 n (n \mod d) + (n \mod d)^2}{d^2} + \frac{n - (n \mod d)}{d} \biggr| \\ {}\le{} & \frac{1}{2} \sum_{d=1}^n \Bigl( \frac{2 n d}{d^2} + \frac{d^2}{d^2} + \frac{n}{d} + \frac{d}{d} \Bigr) = \frac{1}{2} \sum_{d=1}^n \Bigl( \frac{3 n}{d} + 2 \Bigr) = \frac{3}{2} n \sum_{d=1}^n \frac{1}{d} + n.

Since

\sum_{d=1}^n \frac{1}{d} \le 1 + \int_1^n \frac{1}{x} \; dx = 1 + \log n,

we finally obtain

P_n' \ge \frac{3 n^2}{\pi^2} - \frac{1}{6 n} - \frac{3}{2} n (1 + \log n) - n = \frac{3 }{\pi^2} n^2 - \frac{8}{3} n - \frac{3}{2} n \log n.

With a simple computer program, I verified that

\frac{P_n}{n^2} \ge \frac{2 \cdot 494866 - 1}{1276^2} = \frac{989731}{1628176} \approx 0.6078771582433

for all n < 932453, and the above bound shows

P_n ={} & 2 P_n' - 1 \ge \frac{6}{\pi^2} n^2 - \frac{16}{3} n - 3 n \log n - 1 \\ {}\ge{} & \Bigl( \frac{6}{\pi^2} - \frac{16}{3 \cdot 932453} - \frac{3 \log 932453}{932453} - \frac{1}{932453^2} \Bigr) n^2 \\ {}>{} & 0.607877158257419 n^2

for n \ge 932453. The computer verification took less than one second. We therefore obtain:

Theorem.

We have P_n \ge \frac{989731}{1628176} n^2 > 0.6078771582433 n^2 for all n \in \N. Equality holds if and only if n = 1276.

The program for verification of the bound is the following C++ program:

 1 #include <iostream>
 2 #include <iomanip>
 3 #include <cmath>
 4 #include <vector>
 5 
 6 // Simple implementation of Euclid's algorithm
 7 unsigned long long gcd(unsigned long long a, unsigned long long b)
 8 {
 9     while (true)
10     {
11         if (a == 0)
12             return b;
13         b %= a;
14         if (b == 0)
15             return a;
16         a %= b;
17     }
18 }
19 
20 // Compute all primes up to a bound
21 
22 std::vector<unsigned> primes;
23 
24 void create_primes(unsigned max)
25 {
26     // TODO: replace with sieve of Eratosthenes
27     primes.push_back(2);
28     for (unsigned p = 3; p <= max; p += 2)
29     {
30         bool prime = true;
31         for (unsigned i = 0; i < primes.size(); ++i)
32             if ((p % primes[i]) == 0)
33             {
34                 prime = false;
35                 break;
36             }
37         if (prime)
38             primes.push_back(p);
39     }
40 }
41 
42 // Compute the Euler phi function (for not too large arguments)
43 unsigned phi(unsigned n)
44 {
45     if (n < 2)
46         return 1;
47     unsigned r = n;
48     bool found_divisor = false;
49     for (unsigned i = 0; i < primes.size(); ++i)
50         if ((n % primes[i]) == 0)
51         {
52             r = r / primes[i] * (primes[i] - 1);
53             found_divisor = true;
54         }
55     if (!found_divisor)
56         --r;
57     return r;
58 }
59 
60 int main()
61 {
62     // Compute probabilities up to:
63     unsigned N = 932453;
64     // unsigned N = 10000000;
65 
66     // Compute enough primes
67     create_primes(sqrt(N));
68     std::cout << "Prepared " << primes.size() << " primes\n";
69 
70     // Compute probabilities
71     unsigned long long C = 1;
72     for (unsigned n = 2; n < N; ++n)
73     {
74         C += phi(n); // number of pairs (x, y) with 1 <= x <= y <= n such that gcd(x, y) == 1
75         // Compute rational number representing probability
76         unsigned long long num = 2 * C - 1;
77         unsigned long long den = n;
78         den *= n;
79         unsigned long long g = gcd(num, den);
80         num /= g;
81         den /= g;
82         // Compute floating point approximation of probability
83         long double P = (long double)num / (long double)den;
84         // Output if probability is <= 0.6079
85         if (num <= (unsigned long long)((long double)0.6079 * den))
86             std::cout << n << ": " << C << ", " << num << "/" << den << " \\approx " << std::setprecision(20) << P << "\n";
87     }
88 }

On my laptop, it runs in less than one second. For N = 10\,000\,000, it requires less than 22 seconds.

Comments.

Felix Fontein wrote on November 28, 2012:

An alterantive approach can be found here (Proposition 3.2). There, we analyzed the probability that integers 0 \le x, y \le n are coprime (as opposed to 1 \le x, y \le n as here).