Abstract: This paper mainly discusses how to calculate the square root of a small integer by using the sum of the number of series. This algorithm can accurately return the square root of an integer to any precision when the memory space permits. This algorithm has the advantages of simple logic and does not need to use a large number of databases. In addition, this algorithm is relatively efficient. On the current mainstream computers, it takes about five seconds to calculate the square root of an integer to 0.1 million valid digits and output them to a file. This article not only describes the algorithm, but also provides the complete C language code. The code is compiled under VC and GCC.

To describe the algorithm principle, we first start with the binary theorem. The binary Theorem gives the integer power of the sum of two numbers, for example (*X*+*Y*)*N*Expand as similar*Axbyc*The equality of the sum of items. The formula is as follows,

(1)

It is called the binary coefficient and can also be recorded. For details, see [1]. In particular, when x = 1, the formula (1) can be written as follows,

(2)

It is worth noting that when a is a positive integer, the above formula is true. When a is a real number or even a complex number, the above formula is also true. For details, see [2]. When a = 1/2, formula 2 changes

Above (2k-3 )!! And 2 K !! ! The symbol represents a double factorial. When n is an odd number, N !! = N × (n-2) × (n-4) × *... 1. When n is an even number, N !! = N × (n-2) × (n-4) ×... 2. When n = 0 or-1, n !! = 1. For more information, see [3]. According to the definition, the formula derived above can be expressed:

(3)

The recursive formula for the coefficient of K times of X is :,

The following formula is provided in [4] of the document, which is actually the same as (3), but (3) is simpler.

Replace "+ X" in Formula 3 with "-X". We can find that all coefficients in polynomial are negative. The formula is as follows:

(4)

The algorithm in this article is based on (4). The main method is as follows:

1. Find out*X*Base, which is an approximate value of the square root of the base, so that the base is slightly greater than the square root of X. The base can be expressed as Q/P. According to the following steps, an expression of the square root of N can be introduced.

(5)

For example:

2. If the root type is expanded according to (4) and the sum is obtained one by one, a number slightly less than 1 can be obtained, and then multiplied by base to get the final result.

The following describes the algorithms of step 1 and Step 2 respectively.

The objective of step 1 is to obtain a fractional representation of the approximate value of the square root of N, which is recorded as Q/P. To facilitate the calculation of the sum of series, Q*2*Cannot be greater than the integer type built in the computer. For 32-bit systems, Q*2*Cannot be greater than 2*32*-1. The following method is used to calculate the fractional expression of a floating point number and Gradually refine the iteration process.

1.1 at the beginning of the iteration, the square root of N is S, and S is calculated using the SQRT function of the C mathematical library, which can be accurate to 15-16 decimal digits, then calculate a lower bound low and the upper bound high of S. For code from 1.1 to 1.6, see findfraction in the C code below.

1.2 calculate the mid, take the denominator of low and high as the denominator of mid, and take the sum of the molecules of low and high as the numerator of mid. It is easy to prove that low <mid <High.

If the square of the numerator whose value is 1.3 compared to the mid value exceeds the maximum value of the built-in integer, go to Step 1.5.

1.4 compare the values of S and MID. If mid> S, replace the original values of high with the values of mid, and record them as high mid; otherwise, low mid.

1.5 to 1.2, continue the next iteration.

1.6 base temperature high

It is easy to see that Steps 1.2 to 1.5 are an iterative process, and the entire process is similar to 2 points. In the entire iteration process, the interval high-low is getting smaller and smaller, and the high is getting closer and closer to the square root of N.

Obtain the value of base = P/Q, and use (5) to obtain the value of REM.

2. the calculated value can be calculated using the formula (4). REM has this characteristic. The smaller the molecule, the larger the denominator, and the smaller the REM, the faster the series will converge. This method is used in the calculation process. RES is used to represent the sum of series. For the moment, the positive and negative values are not considered. item is used to represent the value of level I. Repeat the following steps until the item is small enough to reach the specified precision. In the following code, Rem. D indicates the denominator of REM, and REM. N indicates the molecules of REM.

2.1 item lifecycle 1/2

2.2 item = item * REM, that is, item * = REM. N, item/= REM. d

2.3 res + = item

2.4 I limit 2

2.5 item * = REM. N, item/= REM. d

2.6 item * = (2i-3), item/= 2I;

2.7 res + = item

2.8 check the item value. If the item is small enough, convert it to 2.11

2.9 I ++

2.10 to 2.5, continue to calculate the and res of the next item and item

2.11 multiply RES by-1. Here, we use the complement method, which is similar to the logarithm notation. the fractional part is supplemented, and the integer part is reduced by 1, for example, the value of-0.1235 indicates-1 + (1-0.1235) = (-1) + 0.8765.

2.12 so far, RES is equal to the sum of all except the first item in the series, that is.

2.13 res + = 1, and the integer part changes from-1 to 0

2.14 res * = base.

So far, the square root of N is calculated.

The following describes the expression of any number of precision and its four operations.

In this article, the C program uses a dynamic array to represent a fixed-point decimal point with any precision. The array element type is the built-in Integer type unsigned long of the C language, each element of the array represents a 9-digit 10-digit number. The long_float struct defines the representation of any number of Precision values, and the Data Pointer represents the address of the array. The number of Precision values is in the fixed-point format. The integer part is represented by data [0], the first element of the array data, and the fractional part is represented by other elements of the array. This expression meets the requirements, because the integer part of the square root of a small integer is exactly represented by one element. Len, a member of long_float, indicates the length of the array, and first_idx indicates the offset address of the first non-0 element in the array. In the calculation process, the item value will be smaller and smaller, and the first part of the array will show continuous 0. To reduce the calculation amount, it will only be counted from the first non-0 element. First_idx> = The len-1 is treated as any number of precision value is 0. To ensure accuracy, the length of the actually allocated array is two more units than the expected precision when creating an array. The storage sequence of any precision is the same as that of display. During multiplication and addition, the order from right to left is used for calculation. During division, the order from left to right is used for calculation.

The calculation of any precision required in this article only involves the following operations.

1. Add two random precision vertices. The lf_add function is used to add two random precision vertices of the same length.

2. Multiply any number of precision points by a single precision integer. the type of a single precision integer is defined as unsigned long. The lf_mul function is used to calculate any number of precision points with a single precision integer.

3. Divide any number of precision points by a single precision integer. The lf_div function is used to implement this function.

4. Perform the complement operation, that is, multiply-1 by the product of any precise number of points. The result is indicated by the complement code. The lf_neg function is used to implement this function.

5. the function lf_init is used to create a dynamic array and clear each element 0. other functions include gcd32. The gcd32 function is used to calculate the maximum number of two single-precision integers.

The complete source code is as follows.

# Include <stdio. h> # include <stdlib. h> # include <memory. h> # include <math. h> # define Radix 1000000000 # define log10_rad9 // log10 (Radix) # define max_word 0 xFFFF # If _ msc_ver> 1000 // the compiler is vctypedef unsigned _ int64 uint64; # else // the compiler is gcctypedef unsigned long uint64; # endif typedef unsigned long uint32; typedef struct _ frac {uint32 N; // numeratoruint32 D; // denominator} frac; /* define a decimal point Type. Valid numbers of numbers are stored in the array data []. Each array element data [I] can represent 9-digit 10-digit numbers. The length of the array is Len, so it can represent Len * 9-digit 10-digit numbers. Data [0] indicates the integer part, data [1] to data [len-1] indicates that the order in which digits in the decimal part are stored is the same as that in the display order, for example, data [0] = 2, data [1] = 222212345, data [2] = 111198765 indicates that the first few storage units of the 2.222212345111198765 array may be 0. To save the amount of computing, you can start from the first non-0 element of the array. The sequence number of the first non-zero array unit is represented by first_idx. For example, first_idx = 3 indicates data [0], data [1], data [2] is all 0 */typedef struct _ long_float {uint32 * data; int Len; int first_idx;} long_float; // function: Evaluate N1, maximum common divisor int gcd32 (uint32 N1, uint32 N2) // greatest common divisor {uint32 modnum; uint32 bignum = (N1> N2 )? N1: N2; uint32 smallnum = (N1 <N2 )? N1: N2; If (smallnum = 0) return bignum; while (1) {modnum = bignum % smallnum; If (modnum = 0) break; bignum = smallnum; smallnum = modnum;} return smallnum;} void frac_set (frac * frac, uint32 N, uint32 d) {frac-> N = N; frac-> d = D ;} void lf_init (long_float * pnum, int Len) {pnum-> DATA = (uint32 *) malloc (LEN * sizeof (uint32); memset (pnum-> data, 0, sizeof (uint32) * Len); pnum-> first_idx = 0; pnum-> Len = Len;} void lf_free (long_float * pnum) {fre E (pnum-> data); pnum-> Len = 0;} void lf_add (long_float * pTARGET, const long_float * psrc) {int I; uint32 T, C; for (C = 0, I = psrc-> len-1; I> = psrc-> first_idx; I --) {T = pTARGET-> data [I] + psrc-> data [I] + C; C = 0; If (T> = Radix) {T-= Radix; C = 1;} pTARGET-> data [I] = T;} while (C> 0 & I> = 0) {T = pTARGET-> data [I] + C; C = 0; If (T> = Radix) {T-= Radix; C = 1 ;} pTARGET-> data [I --] = T;} pTARGET-> first_idx = I + 1; if (I <0) {printf ("error in % d line \ n ", __ Line _); Return ;}} void lf_mul (long_float * pTARGET, uint32 K) {int I; uint64 T, C; For (C = 0, I = pTARGET-> len-1; I >=ptarget-> first_idx; I --) {T = (uint64) (pTARGET-> data [I]) * (uint64) K + C; pTARGET-> data [I] = (uint32) (T % Radix); C = (uint32) (T/Radix);} If (C> 0) {if (I <0) {printf ("error in % d line \ n" ,__ line _); return;} pTARGET-> data [I] = (uint32) c; pTARGET-> first_idx = I ;}// return remainder uint32 lf_div (long_float * pTARGET, Uint32 K) {int I; uint64 t; uint32 m = 0; for (I = pTARGET-> first_idx; I <pTARGET-> Len; I ++) {T = (uint64) M * (uint64) Radix + pTARGET-> data [I]; pTARGET-> data [I] = (uint32) (T/K ); M = (uint32) (T % K);} If (pTARGET-> data [pTARGET-> first_idx] = 0) pTARGET-> first_idx ++; return m ;} void lf_neg (long_float * pTARGET) {int I = pTARGET-> len-1; while (pTARGET-> data [I] = 0) I --; if (I> 0) {pTARGET-> data [I] = Radix-pTARGET-> data [I]; I --; while (I >=1) {pTARGET-> data [I] = (Radix-1)-pTARGET-> data [I]; I --;}} pTARGET-> data [0] --; pTARGET-> first_idx = 0;}/* Find a fraction X, such that X equal to f approximatively and x> F */frac findfraction (double F) {frac low, mid, high; uint32 C, root_n; double F1 = f-floor (f); root_n = (uint32) (floor (f); If (F1 <= 0.5) {c = (uint32) (1.0/F1); frac_set (& low, 1, C + 1); frac_set (& High, 1, C);} else {c = (uint32) (1.0/(1.0-f1); frac_s Et (& low, C-1, c); frac_set (& High, C, C + 1);} low. N + = root_n * low. d; high. N + = root_n * High. d; If (High. n> max_word) {frac_set (& low, root_n, 1); frac_set (& High, root_n + 1,1);} while (1) {frac_set (& Mid, low. N + high. n, low. d + high. d); If (MID. n> = max_word) break; If (double) (MID. n)/(double) (MID. d)> F) High = mid; elselow = mid;} return high;} void calc_sqrt (uint32 N, long_float * pitem, long_float * presult) {int I; uint 32 max_uint32, tgcd; frac base, REM; double F; F = SQRT (double) (n);/* assume base approximatively equal to SQRT (N) and base> SQRT (N), SQRT (n) = base * SQRT (1-REM) = base * SQRT (1-(N * base. D * base. d)/(base. N * base. n) */base = findfraction (f); frac_set (& REM, (base. N * base. n)-N * base. D * base. d, base. N * base. n); tgcd = gcd32 (REM. n, Rem. d); If (tgcd> 1) {REM. n/= tgcd; REM. d/= tgcd;} printf ("SQRT (% u) = (% u/% u)/SQ RT (1-% u/% u) \ n ", N, base. n, base. d, Rem. n, Rem. d); // item [1] = REM * 1/2 // item [I] = item [I-1] * REM * (2 * i-1) /(2 * I + 1) // result = 1-sum (item [I]) Where I = 2 up to X and item [x] <10 ^-d) pitem-> data [1] = Radix/2; // calculating item [1], Now item = 1/2lf_mul (pitem, Rem. n); // item [1] = item [1] * REM, now the item = 1/2 * remlf_div (pitem, Rem. d); lf_add (presult, pitem); // sum = sum + item [I] I = 2; max_uint32 = ~ 0; while (pitem-> first_idx <pitem-> Len) {If (REM. N * (2 * I-3) <max_uint32) {lf_mul (pitem, Rem. N * (2 * I-3); // item * = REM; lf_div (pitem, Rem. d); //} else {lf_mul (pitem, Rem. n); // item * = remlf_div (pitem, Rem. d); // lf_mul (pitem, 2 * I-3); // item * = (2i-3)/2I} lf_div (pitem, 2 * I); lf_add (presult, pitem); // sum = sum + item [I] I ++;} lf_neg (presult); // presult =-presult; presult-> data [0] + = 1; // presult = presult + 1; lf_mul (presult, base. n); // result = Result * baself_div (presult, base. d);} // calculating SQRT (n) to d decimal Digtal and print the resultvoid calc_print (uint32 N, int d) {int I, bufflen; long_float item, result; uint32 R; r = (uint32) (SQRT (n); If (R * r = N) {printf ("SQRT (% u) = % u \ n ", n, R); return;} bufflen = (D + LOG10_RAD-1)/log10_rad + 3; lf_init (& item, bufflen); lf_init (& result, bufflen); calc_sqrt (n, & item, & result); printf ("\ t = % u. ", result. data [0]); for (I = 1; I <buffLen-2; I ++) printf ("% 09u", result. data [I]); printf ("\ n"); lf_free (& item); lf_free (& result);} int main (INT argc, char * argv []) {uint32 I; uint32 base; base = 2; for (I = base; I <base + 100; I ++) {calc_print (I, 100);} return 0 ;}

All rights reserved. For more information, see the source.

References:

**[1]**"**Binomial Theorem**From Wikipedia, http://en.wikipedia.org/wiki/Binomial_theorem.

**[2]**"**Binomial series**From Wikipedia, http://en.wikipedia.org/wiki/Binomial_series

**[3]**Weisstein, Eric W., "Double factorial" from mathworld, http://mathworld.wolfram.com/DoubleFactorial.html

**[4]**"**Square Root**From Wikipedia, http://en.wikipedia.org/wiki/Square_root