I never have had the need to take the square root of a number with a microcontroller. However, every few months someone on the PIC LIST asks a) can it be done and b) can it be done fast. Well one day Andy Warren makes this claim about a square root routine that he had in his possession with the following features:
BUT, the original author did not grant him permission to publicize it. Bummer. BUT, quoting Andy, "Maybe this will prod someone else to recreate it." Well I took a stab at it and came up with a routine that was:
Eric Smith and Martin Nilsson independently suggested a way to squeeze one word out of the total program memory count. However, the routine was still a dog compared to Andy's screamer. So I dropped the whole square root thing until I serendipitously came across the book "The Logic of Computer Arithmetic", by Ivan Flores. He discusses two methods one can use to find the square root of a number. His first method called the "non-restoring method" is the technique I used in the program above. However, his explanation gave me another idea. So after a half dozen attempts or so I came up with a new routine:
While it's slightly slower than the one in Andy's possession, it does save one byte of RAM. Here's the routine along with some code illustrating its use. Just for grins, I've thrown in the first square root slug I wrote (it's called SQRT_A, while the faster one is called SQRT):
list p=16C64,t=ON,c=132,n=80,st=off
radix dec
include "P16C64.INC"
s1 equ 0x20
s2 equ 0x21
N_hi equ 0x22
N_lo equ 0x23
mask equ 0x24
xl equ 0x25
xh equ 0x26
xstep equ 0x27
ORG 0 ;Reset Vector
GOTO Main
ORG 4 ;Interrupt Vector
Main
BCF STATUS,RP0 ;Point to BANK 0
CLRF xl
CLRF xh
MOVLW 1
MOVWF xstep
xxx
MOVF xh,W
MOVWF N_hi
MOVF xl,W
MOVWF N_lo
NOP
CALL sqrt
NOP
MOVF xstep,W
ADDWF xl,F
SKPNC
INCF xh,F
GOTO xxx
;---------------------------------------------------------------
;sqrt
;
; The purpose of this routine is take the square root of a 16 bit
;unsigned integer.
;Inputs: N_hi - High byte of the 16 bit number to be square rooted
; N_lo - Low byte " " "
;Outputs: s1 - Temporary variable
; s2 - 8 bit result returned in here
;
;27 words, 145 cycles best case
sqrt__A
MOVLW 0x40
MOVWF s1
CLRF s2
L2 ADDWF s2,W
SUBWF N_hi,W ;W = N_hi-s1-s2
SKPNC ;if N_hi > (s1+s2)
goto L3 ; then replace N_hi with N_hi-s1-s2
BTFSS s1,6 ;If this is the first time through the loop
BTFSS N_lo,0 ;or If MS bit is zero, then N < s1+s2
goto L1
L3
;N is greater than s1+s2, so replace N_hi with N_hi-s1-s2
MOVWF N_hi
RLF s1,W
RLF s1,W
ADDWF s2,F ;s2 = s2 | (s1<<1)
L1
BTFSC s1,7 ;If s1 has rolled all the way around, we're done.
return
;Next, roll N left one bit position. I.e. N = (N<<1) | (N>>15). This puts the
;MS bit into the LS bit position.
RLF N_hi,W ;C = MS bit of N
RLF N_lo,F
RLF N_hi,F
CLRC
RRF s1,W ;Roll s1 right one bit: s1 = ((s1>>1) | (s1<<7)) & 0xff
RRF s1,F ;
BTFSS s1,7 ;If this is the last time through the loop, we need to make
goto L2 ;a tiny exception.
;We only get here if this is the last time through the loop. This special exception needs
;to handle a 10 bit addition. Right now, the value of s1 that got shifted into W is zero.
;It should be 1 >> 1, ie a fraction of 1/2. Or, another way to look at it is the value of
;s1 to be subtracted from N is 0x0080. Now, if N_lo is less than 0x80, then the subtraction
;will cause a borrow that must be propagated to N_hi, i.e. N_hi must be decremented. Rather
;than subtracting 1 from N_hi now, we'll instead add 1 to s2 and do the subtraction above.
BTFSS N_lo,7
MOVLW 1
goto L2
;----------------------------------------------------------
;sqrt
;
; The purpose of this routine is take the square root of a 16 bit
;unsigned integer.
;Inputs: N_hi - High byte of the 16 bit number to be square rooted
; N_lo - Low byte " " "
;Outputs: W register returned with the 8 bit result
;
;Memory used:
; mask
; 30 Instructions
; 104 Cycles best case, 109 average, 122 worst case
sqrt
MOVLW 0xc0 ;Initialize value for mask
MOVWF mask
MOVLW 0x40 ;Initial value for the root
sq1 SUBWF N_hi,F ;Subtract the root developed so far
SKPC ;If it is larger than N_hi
goto sq5 ; then go restore the subtraction
sq2 IORWF mask,W ;Set the current bit
sq3 RLF N_lo,F ;Shift N left one position
RLF N_hi,F
RRF mask,F ;Shift the mask right, and pick up msb of N
BTFSC mask,7 ;If msb of N is set, then unconditionally
goto sq6 ;set the next bit (because subtracting the
;root is guranteed not to cause a borrow)
XORWF mask,W ;Append "01" to the root developed so far
SKPC ;If the lsb of mask was shifted into the carry
goto sq1 ;then we're done. Otherwise, loop again.
;We are almost done. In the last iteration, we have 7 bits of the root. When
;"01" is appended to it, we will have a 9-bit number that must be subtracted
;from N.
SUBWF N_hi,F ;
SKPC ;If the upper 7 bits cause a borrow, then
RETURN ;the appended "01" will as well: We're done.
SKPNZ ;If the result of the subtraction is zero
BTFSC N_lo,7 ;AND the msb of N_lo is set then the LSB of the
;root is zero.
XORLW 1 ;Otherwise, it is one.
RETURN ;
sq6 SKPNC ;Need to unconditionally set the current bit of the root.
RETURN ;However, if we're through iterating, then leave. Note,
;C is set by shifting the mask right and the LSB of root
;was set by IORWF at sq2.
BCF mask,7 ;Clear the MSB of N that got shifted into the mask.
XORWF mask,W ;Append "01" to the root developed so far.
SUBWF N_hi,F ;Subtract
goto sq2 ;Go unconditionally set the current bit.
sq5 ADDWF N_hi,F ;Restore N_hi by reversing the SUBWF with a ADDWF
goto sq3 ;Don't set the current bit
END
Here's some square root theory. And here's more software.