I never have had the need to take the square root of a number with a microcontroller. However, every few months someone on the PIC LIST asks a) can it be done and b) can it be done fast. Well one day Andy Warren makes this claim about a square root routine that he had in his possession with the following features:

- 24 words long (It turns out it was actually 28 words)
- requires only 4 bytes of RAM - Including the 2 for the 16 bit input
- Executes in 95 cycles

BUT, the original author did not grant him permission to publicize it. Bummer. BUT, quoting Andy, "Maybe this will prod someone else to recreate it." Well I took a stab at it and came up with a routine that was:

- 28 bytes long
- requires only 4 bytes of RAM - Including the 2 for the 16 bit input
- Executes in about 145 cycles

Eric Smith and Martin Nilsson independently suggested a way to squeeze one word out of the total program memory count. However, the routine was still a dog compared to Andy's screamer. So I dropped the whole square root thing until I serendipitously came across the book "The Logic of Computer Arithmetic", by Ivan Flores. He discusses two methods one can use to find the square root of a number. His first method called the "non-restoring method" is the technique I used in the program above. However, his explanation gave me another idea. So after a half dozen attempts or so I came up with a new routine:

- 30 bytes long
- requires only 3 bytes of RAM - Including the 2 for the 16 bit input ( a savings of one byte)
- Executes in 104 cycles best case, 109 average, 122 worst case.

While it's slightly slower than the one in Andy's possession, it does save one byte of RAM. Here's the routine along with some code illustrating its use. Just for grins, I've thrown in the first square root slug I wrote (it's called SQRT_A, while the faster one is called SQRT):

list p=16C64,t=ON,c=132,n=80,st=off radix dec include "P16C64.INC" s1 equ 0x20 s2 equ 0x21 N_hi equ 0x22 N_lo equ 0x23 mask equ 0x24 xl equ 0x25 xh equ 0x26 xstep equ 0x27 ORG 0 ;Reset Vector GOTO Main ORG 4 ;Interrupt Vector Main BCF STATUS,RP0 ;Point to BANK 0 CLRF xl CLRF xh MOVLW 1 MOVWF xstep xxx MOVF xh,W MOVWF N_hi MOVF xl,W MOVWF N_lo NOP CALL sqrt NOP MOVF xstep,W ADDWF xl,F SKPNC INCF xh,F GOTO xxx ;--------------------------------------------------------------- ;sqrt ; ; The purpose of this routine is take the square root of a 16 bit ;unsigned integer. ;Inputs: N_hi - High byte of the 16 bit number to be square rooted ; N_lo - Low byte " " " ;Outputs: s1 - Temporary variable ; s2 - 8 bit result returned in here ; ;27 words, 145 cycles best case sqrt__A MOVLW 0x40 MOVWF s1 CLRF s2 L2 ADDWF s2,W SUBWF N_hi,W ;W = N_hi-s1-s2 SKPNC ;if N_hi > (s1+s2) goto L3 ; then replace N_hi with N_hi-s1-s2 BTFSS s1,6 ;If this is the first time through the loop BTFSS N_lo,0 ;or If MS bit is zero, then N < s1+s2 goto L1 L3 ;N is greater than s1+s2, so replace N_hi with N_hi-s1-s2 MOVWF N_hi RLF s1,W RLF s1,W ADDWF s2,F ;s2 = s2 | (s1<<1) L1 BTFSC s1,7 ;If s1 has rolled all the way around, we're done. return ;Next, roll N left one bit position. I.e. N = (N<<1) | (N>>15). This puts the ;MS bit into the LS bit position. RLF N_hi,W ;C = MS bit of N RLF N_lo,F RLF N_hi,F CLRC RRF s1,W ;Roll s1 right one bit: s1 = ((s1>>1) | (s1<<7)) & 0xff RRF s1,F ; BTFSS s1,7 ;If this is the last time through the loop, we need to make goto L2 ;a tiny exception. ;We only get here if this is the last time through the loop. This special exception needs ;to handle a 10 bit addition. Right now, the value of s1 that got shifted into W is zero. ;It should be 1 >> 1, ie a fraction of 1/2. Or, another way to look at it is the value of ;s1 to be subtracted from N is 0x0080. Now, if N_lo is less than 0x80, then the subtraction ;will cause a borrow that must be propagated to N_hi, i.e. N_hi must be decremented. Rather ;than subtracting 1 from N_hi now, we'll instead add 1 to s2 and do the subtraction above. BTFSS N_lo,7 MOVLW 1 goto L2 ;---------------------------------------------------------- ;sqrt ; ; The purpose of this routine is take the square root of a 16 bit ;unsigned integer. ;Inputs: N_hi - High byte of the 16 bit number to be square rooted ; N_lo - Low byte " " " ;Outputs: W register returned with the 8 bit result ; ;Memory used: ; mask ; 30 Instructions ; 104 Cycles best case, 109 average, 122 worst case sqrt MOVLW 0xc0 ;Initialize value for mask MOVWF mask MOVLW 0x40 ;Initial value for the root sq1 SUBWF N_hi,F ;Subtract the root developed so far SKPC ;If it is larger than N_hi goto sq5 ; then go restore the subtraction sq2 IORWF mask,W ;Set the current bit sq3 RLF N_lo,F ;Shift N left one position RLF N_hi,F RRF mask,F ;Shift the mask right, and pick up msb of N BTFSC mask,7 ;If msb of N is set, then unconditionally goto sq6 ;set the next bit (because subtracting the ;root is guranteed not to cause a borrow) XORWF mask,W ;Append "01" to the root developed so far SKPC ;If the lsb of mask was shifted into the carry goto sq1 ;then we're done. Otherwise, loop again. ;We are almost done. In the last iteration, we have 7 bits of the root. When ;"01" is appended to it, we will have a 9-bit number that must be subtracted ;from N. SUBWF N_hi,F ; SKPC ;If the upper 7 bits cause a borrow, then RETURN ;the appended "01" will as well: We're done. SKPNZ ;If the result of the subtraction is zero BTFSC N_lo,7 ;AND the msb of N_lo is set then the LSB of the ;root is zero. XORLW 1 ;Otherwise, it is one. RETURN ; sq6 SKPNC ;Need to unconditionally set the current bit of the root. RETURN ;However, if we're through iterating, then leave. Note, ;C is set by shifting the mask right and the LSB of root ;was set by IORWF at sq2. BCF mask,7 ;Clear the MSB of N that got shifted into the mask. XORWF mask,W ;Append "01" to the root developed so far. SUBWF N_hi,F ;Subtract goto sq2 ;Go unconditionally set the current bit. sq5 ADDWF N_hi,F ;Restore N_hi by reversing the SUBWF with a ADDWF goto sq3 ;Don't set the current bit END

Here's some square root theory. And here's more software.

This page is maintained by Scott Dattalo. You can reach me at: scott@dattalo.com. Last modified on 13DEC99.