Representing integers as the sum of two squares


Almost 400 years ago, Pierre de Fermat stated that every odd prime of the form `4k+1` can be expressed as the sum of two squares:

`p = x^2 + y^2`

with integers `x` and `y`, if and only if

`p ≡ 1 mod 4`

Later on, around the year 1747, Leonhard Euler was able to prove Fermat's statement correct.

# Overview

In this post, we're presenting a recursive algorithm for finding all the possible representations, as a sum of two squares, for any given integer that can be expressed this way.

For example, if `n=12345678910111213`, we can represent it as a sum of two squares in 4 different ways:

`12345678910111213 = 1251082^2 + 111104067^2`
`12345678910111213 = 13508317^2 + 110286918^2`
`12345678910111213 = 52418877^2 + 97969078^2`
`12345678910111213 = 64959738^2 + 90143837^2`

If the prime factorization of a number is known, it can be verified if the number is expressible as a sum of two squares by checking each prime factor modulo 4. If a prime factor is congruent to `3 mod 4` , then its power must be even. Otherwise, the number cannot be represented as a sum of two squares.

In other words, a positive integer can be written as a sum of two squares, if and only if its prime decomposition contains no prime congruent to `3 mod 4` raised to an odd power.

For example, `2450 = 2 * 5^2 * 7^2`. Of the primes occurring in this decomposition, only `7` is congruent to `3 mod 4`. Its exponent in the decomposition, `2`, is even. Therefore, it must be expressible as the sum of two squares. Indeed, `2450 = 7^2 + 49^2 = 35^2 + 35^2`.

On the other hand, the prime decomposition of `3430` is `2 * 5 * 7^3`. This time, the exponent of `7` is an odd number. Therefore, `3430` cannot be written as the sum of two squares.

Based on the identity:

`a^2 * (b^2 + c^2) = (ab)^2 + (ac)^2`

we can see that every square multiplied with a prime of the form `4k+1` can also be represented as a sum of two squares. This identity will be used later on in our algorithm to, recursively, produce all the solutions.

# Algorithm

Assuming that we are given a positive integer `n` that can be represented as a sum of two squares, the first step of the algorithm is breaking this number into two parts: `n = a * b`, where `a` is the product of its prime factors congruent to `1 mod 4`, and `b` is the product of its prime factors congruent to `3 mod 4` and `2 mod 4`, each prime having an even power.

If `n` has the form `2^(2k+1) * z`, where `z` is odd, then `a` is multiplied by `2` and `b` is multiplied by `2^(2k)`.

At the end of this construction, we can see that `b` is a square. The next step is setting `b` to its square root:

`b := sqrt(b)`

If `a=1`, then `n` can be represented only as `n = b^2 + 0^2`.

If `a=2`, then `n` can be represented only as `n = b^2 + b^2`.

If `a > 2`, we need to find all the solutions to the following quadratic congruence:

`x^2 ≡ -1 mod a`

Taking square roots modulo a prime power can be done efficiently by using the Tonelli–Shanks algorithm (see Dickson's reference and Hensel's lemma).

But since `a` may not be a prime power, we need to decompose it into prime powers `p_k^(e_k)` and take the square root of `p_k^(e_k)-1` modulo `p_k^(e_k)`, where for each prime power we get back two square roots.

Then, by applying the Cartesian product to the list of pairs of square roots, we get all the Cartesian combinations.

For each Cartesian combination, we use the Chinese Remainder Theorem to combine the partial congruences into a solution to `x`.

For example, if `n = p_1^(e_1) * p_2^(e_2)`, then all the solutions to `x^2 ≡ -1 mod n` can be generated in the following way:

`A = [[sqrt(p_1^(e_1) - 1) mod p_1^(e_1), p_1^(e_1)], [p_1^(e_1) - (sqrt(p_1^(e_1) - 1) mod p_1^(e_1)), p_1^(e_1)]]`

`B = [[sqrt(p_2^(e_2) - 1) mod p_2^(e_2), p_2^(e_2)], [p_2^(e_2) - (sqrt(p_2^(e_2) - 1) mod p_2^(e_2)), p_2^(e_2)]]`

Taking the Cartesian product of `A` and `B`, we generate the following 4 combinations:

`[A_1, B_1]`
`[A_1, B_2]`
`[A_2, B_1]`
`[A_2, B_2]`

By applying the Chinese Remainder Theorem, from each Cartesian combination we get a solution to `x`.

In the next step, for each solution `s` to `x`, we use a modified version of Euclid's algorithm to compute a modified greatest common divisor of `s` and `a` that is the largest value that satisfies `s^2 <= a`, defined as:

`gcd(s, q) = (s, q)` for `s^2 <= a`

`gcd(s, q) = gcd(q, s mod q)`

This gives us the following two values:

`(s, q) := gcd(s, a)`

From which we generate a new solution, as:

`n = (b*s)^2 + (b * (q mod s))^2`

The last part of the algorithm is the recursive part, and it involves the identity described at the beginning, which uses each square `p^(2k)` in `a`, to recursively represent `a/p^(2k)` as sums of two squares, then multiplying each solution with `b * p^k`.

Algorithmically speaking:

Let `p_k^(e_k)` be the prime powers of `a`.
Let `j := e_k mod 2`. When `e_k` is even, `j=0`, otherwise `j=1`.

If `j < e_k`, recursively find solutions to `a/p_k^(e_k - j)`.

Then multiply each solution by `b * p_k^t`, where `t=(e_k-j)/2`.

As long as `j < e_k`, we increment `j` by `2` and repeat. Otherwise, we go to the next prime power of `a` and do the same steps.

In the end, all the non-negative integer solutions to `n = x^2 + y^2` are found.

# Implementations

Bellow we have three efficient implementations of the algorithm described in this post, which finds all the non-negative integer solutions to the equation `n = x^2 + y^2` for any given number `n` that can be represented as the sum of two squares, assuming that its prime factorization can be computed.

Sum of two squares - algorithm implementations:
The last two versions are implemented in the Sidef programming language. The third version uses the Tonelli-Shanks algorithm and it works for arbitrary large integers.

Extra: