Notice
For a solution in Erlang
or C / C++
, go to Trial 4 below.
Wikipedia Articles
- The definition of "integer square root" could be found here
Methods of computing square roots
- An algorithm that does "bit magic" could be found here
[ Trial 1 : Using Library Function ]
Code
isqrt(N) when erlang:is_integer(N), N >= 0 ->
erlang:trunc(math:sqrt(N)).
Problem
This implementation uses the sqrt()
function from the C library, so it does not work with arbitrarily large integers (Note that the returned result does not match the input. The correct answer should be 12345678901234567890
):
Erlang R16B03 (erts-5.10.4) [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V5.10.4 (abort with ^G)
1> erlang:trunc(math:sqrt(12345678901234567890 * 12345678901234567890)).
12345678901234567168
2>
[ Trial 2 : Using Bigint +
Only ]
Code
isqrt2(N) when erlang:is_integer(N), N >= 0 ->
isqrt2(N, 0, 3, 0).
isqrt2(N, I, _, Result) when I >= N ->
Result;
isqrt2(N, I, Times, Result) ->
isqrt2(N, I + Times, Times + 2, Result + 1).
Description
This implementation is based on the following observation:
isqrt(0) = 0 # <--- One 0
isqrt(1) = 1 # <-+
isqrt(2) = 1 # |- Three 1's
isqrt(3) = 1 # <-+
isqrt(4) = 2 # <-+
isqrt(5) = 2 # |
isqrt(6) = 2 # |- Five 2's
isqrt(7) = 2 # |
isqrt(8) = 2 # <-+
isqrt(9) = 3 # <-+
isqrt(10) = 3 # |
isqrt(11) = 3 # |
isqrt(12) = 3 # |- Seven 3's
isqrt(13) = 3 # |
isqrt(14) = 3 # |
isqrt(15) = 3 # <-+
isqrt(16) = 4 # <--- Nine 4's
...
Problem
This implementation involves only bigint additions so I expected it to run fast. However, when I fed it with 1111111111111111111111111111111111111111 * 1111111111111111111111111111111111111111
, it seems to run forever on my (very fast) machine.
[ Trial 3 : Using Binary Search with Bigint +1
, -1
and div 2
Only ]
Code
Variant 1 (My original implementation)
isqrt3(N) when erlang:is_integer(N), N >= 0 ->
isqrt3(N, 1, N).
isqrt3(_N, Low, High) when High =:= Low + 1 ->
Low;
isqrt3(N, Low, High) ->
Mid = (Low + High) div 2,
MidSqr = Mid * Mid,
if
%% This also catches N = 0 or 1
MidSqr =:= N ->
Mid;
MidSqr < N ->
isqrt3(N, Mid, High);
MidSqr > N ->
isqrt3(N, Low, Mid)
end.
Variant 2 (modified above code so that the boundaries go with Mid+1 or Mid-1 instead, with reference to the answer by Vikram Bhat)
isqrt3a(N) when erlang:is_integer(N), N >= 0 ->
isqrt3a(N, 1, N).
isqrt3a(N, Low, High) when Low >= High ->
HighSqr = High * High,
if
HighSqr > N ->
High - 1;
HighSqr =< N ->
High
end;
isqrt3a(N, Low, High) ->
Mid = (Low + High) div 2,
MidSqr = Mid * Mid,
if
%% This also catches N = 0 or 1
MidSqr =:= N ->
Mid;
MidSqr < N ->
isqrt3a(N, Mid + 1, High);
MidSqr > N ->
isqrt3a(N, Low, Mid - 1)
end.
Problem
Now it solves the 79-digit number (namely 1111111111111111111111111111111111111111 * 1111111111111111111111111111111111111111
) in lightening speed, the result is shown immediately. However, it takes 60 seconds (+- 2 seconds) on my machine to solve one million (1,000,000) 61-digit numbers (namely, from 1000000000000000000000000000000000000000000000000000000000000
to 1000000000000000000000000000000000000000000000000000001000000
). I would like to do it even faster.
[ Trial 4 : Using Newton's Method with Bigint +
and div
Only ]
Code
isqrt4(0) -> 0;
isqrt4(N) when erlang:is_integer(N), N >= 0 ->
isqrt4(N, N).
isqrt4(N, Xk) ->
Xk1 = (Xk + N div Xk) div 2,
if
Xk1 >= Xk ->
Xk;
Xk1 < Xk ->
isqrt4(N, Xk1)
end.
Code in C / C++ (for your interest)
Recursive variant
#include <stdint.h>
uint32_t isqrt_impl(
uint64_t const n,
uint64_t const xk)
{
uint64_t const xk1 = (xk + n / xk) / 2;
return (xk1 >= xk) ? xk : isqrt_impl(n, xk1);
}
uint32_t isqrt(uint64_t const n)
{
if (n == 0) return 0;
if (n == 18446744073709551615ULL) return 4294967295U;
return isqrt_impl(n, n);
}
Iterative variant
#include <stdint.h>
uint32_t isqrt_iterative(uint64_t const n)
{
uint64_t xk = n;
if (n == 0) return 0;
if (n == 18446744073709551615ULL) return 4294967295U;
do
{
uint64_t const xk1 = (xk + n / xk) / 2;
if (xk1 >= xk)
{
return xk;
}
else
{
xk = xk1;
}
} while (1);
}
Problem
The Erlang code solves one million (1,000,000) 61-digit numbers in 40 seconds (+- 1 second) on my machine, so this is faster than Trial 3. Can it go even faster?
About My Machine
Processor : 3.4 GHz Intel Core i7
Memory : 32 GB 1600 MHz DDR3
OS : Mac OS X Version 10.9.1
Related Questions
The answer by user448810 uses "Newton's Method". I'm not sure whether doing the division using "integer division" is okay or not. I'll try this later as an update. [UPDATE (2015-01-11): It is okay to do so]
The answer by math involves using a 3rd party Python package
gmpy
, which is not very favourable to me, since I'm primarily interested in solving it in Erlang with only builtin facilities.The answer by DSM seems interesting. I don't really understand what is going on, but it seems that "bit magic" is involved there, and so it's not quite suitable for me too.
Infinite Recursion in Meta Integer Square Root
- This question is for C++, and the algorithm by AraK (the questioner) looks like it's from the same idea as Trial 2 above.
//
is integer division in Python, so Newton's method onx^2 - n = 0
works. – Blender//
means in Python, and therefore I doubt whether it works, since the algorithm published in Wikipedia (and anything goes with "Newton's method") works on real number (notice the footnote on floating point). – Siu Ching Pong -Asuka Kenji-isqrt(n)
. – Blenderabs(x[k+1] - x[k]) < 1
and since we're doing integer arithmetic,x[k+1]
must be equal tox[k]
in order to fulfil the condition. Therefore, when the algorithm applies to the input 3:x0 = 3
,x1 = (3 + 3 div 3) div 2 = 2
,x2 = (2 + 3 div 2) div 2 = 1
,x3 = (1 + 3 div 1) div 2 = 2
, which causes an infinite loop. To correct this, terminate whenx[k+1] >= x[k]
(because of the reason you mentioned). – Siu Ching Pong -Asuka Kenji-x_{n_1} >= x_n
as the stopping condition. I think the Wikipedia example could be updated to account for both integer and floating point division. – Blender