Dr. Dobb's Journal - April 2008 - (Page 41) d04will_p3ds 2/19/08 11:30 AM Page 41 0 or 1 or even 0.5 in a calculation, and this is not so easy to spot when changing every use of “double” to “fixed.” The profiler showed that these constants were responsible for quite a few calls to the converting constructor, so replacing them with fixed-point constants shaved another couple of percents off the runtime. CORDIC The single biggest gain was achieved by optimizing the trigonometric and exponential functions. For the trig functions, the CORDIC method was used. (For more information on CORDIC, which is short for “COordinate Rotation DIgital Computer,” see “Implementing CORDIC Algorithms,” by Pitt Jarvis; www.ddj.com/architect/ 184408428.) CORDIC gets its name because it uses properties of rotations on a plane to calculate sines and cosines. A vector can be rotated counterclockwise by an angle θ using a rotation matrix. We only need to handle angles in the range –π/2 to π/2, since the sines and cosines of larger angles only vary in sign (if at all), and restricting it to this range limits the number of iterations required. The sine and cosine of an angle can thus be calculated simply by rotating the unit vector from the x axis by the required angle as described: The cosine and sine are the x and y coordinates of the rotated vector. Calculating arctan The inverse tangent can easily be calculated with the CORDIC rotation tables. This time, rather than rotating the unit vector, a vector with an x component of 1 and a y component of the value for which we want the arctan is rotated until it has a y value of 0. Shift and Add for Exponentials Having found out about using shifts and adds with CORDIC to implement sines and cosines, I wondered if the same could be done for exponentials and logarithms, too. For exponentials it’s rather easy, as: e x+y=e x•e y ( cos(θ) –sin(θ) sin(θ) cos(θ) ) Equivalently, if you have two angles α and β such that θ=α+β, then we can first rotate by α, and then by β, to achieve the same effect. cos(θ) (sin(θ) –sin(θ)) cos(θ) = cos(α)–sin(α) cos(β)–sin(β) (sin(α) cos(α) ) (sin(β) cos(β) ) The rotation by β can then itself be split into further rotations by smaller and smaller angles. The benefit here comes from the fact that you can then pick a set of angles αi in advance, and precompute the appropriate sines and cosines. Rather than a time-consuming and potentially inaccurate power-series calculation, the desired sines and cosines can instead be calculated by a short series of multiplications and additions. Rather than using both sines and cosines of the angles αi, the cosines can be factored out: So by choosing values of x such that ex= 2n or ex=1+2–n for some n, then exponentials can again be calculated just as a matter of shifts and adds. In each iteration, the current value of x is compared to the suggested value of y. If x is less than y, then that term is ignored, otherwise x=x–y and the result is multiplied by the appropriate value with a shift and add. By running from high values of y to smaller ones, the value of x approaches zero, and the result approaches the correct value for the exponential. For 64-bit fixed point (with 63 significant bits), this requires 63 table entries. For negative values of x, the principle is the same, but the table values are for ex=2–n or ex =1/(1–2–n). Here we can use the fact that log(2–n)=–log(2n) to avoid duplicate table entries. ( cos(θ) –sin(θ) sin(θ) cos(θ) ) = i=0 Π n cos(αi ) • i=0 Π n (tan(α ) i 1 –tan(αi ) 1 ) Logarithms Logarithms are slightly more complex to get right, though conceptually just as easy, as they are just a reverse lookup using the same tables as for calculating the exponentials. In fact, the implementation is even easier because logarithms are always positive. The key here is to find where to start the calculation, and you do that by shifting the parameter left until there is a 1 in the most significant bit. By counting the shifts, you know what initial value to use for the result. For example, in the 35.28 fixed-point system being used here, if you had to shift-left 30 times, the original value was 2 5*y for some y, so you start with a result of log(2 5) and work from there. Similarly, if you had to shift-left 42 times, then the original value was 2–7*y, so you start with a result of log(2–7). Once you have the starting result for log(x), it’s smooth sailing, as you always have a value with the MSB set to 1. For increasing values of n, check to see if the current value of x>=1+(x*2–n). If it is, then subtract (x*2–n) from the current value of x, and add log(1/(1–2–n)) to the result. Repeat until x==1 or until you’re out of bits. This uses the identity that: log(x*y)=log(x)+log(y) As written, this is fine for some fixed angle theta, but what about a general angle? It can be shown that provided that: αi = 0.5•αi–1 then any angle θ can be made by adding or subtracting each αi exactly once up to a precision determined by the number of angles n. Since cos(x)=cos(–x) and tan(x)=–tan(–x), the factored-out product of cosines can be stored as a constant multiplier once the angles have been determined, and the appropriate signs used for the tangents depending on the actual angle θ. As a final simplifying step, if you choose the angles αi such that tan(αi)=2–j for some integer j, then this greatly simplifies the multiplication, as it is now just a simple shift. April 2008 l www.ddj.com l Dr. Dobb’s Journal 41 http://www.ddj.com/architect/184408428 http://www.ddj.com/architect/184408428 http://www.ddj.com
Table of Contents Feed for the Digital Edition of Dr. Dobb's Journal - April 2008 Dr. Dobb's Journal - April 2008 Contents Hmmmm Alia Vox Developer Diaries Dr. Dobb's Excellence in Programming Award Conversations Fast String Search on Multicore Processors The Byzantine Generals Problem Optimizing Math-Intensive Applications with Fixed-Point Arithmetic Random Numbers in a Range Using Generic Programming The Agile Edge Effective Concurrency Swaine's Flames Dr. Dobb's Journal - April 2008 Dr. Dobb's Journal - April 2008 - Dr. Dobb's Journal - April 2008 (Page Cover1) Dr. Dobb's Journal - April 2008 - Dr. Dobb's Journal - April 2008 (Page Cover2) Dr. Dobb's Journal - April 2008 - Dr. Dobb's Journal - April 2008 (Page 1) Dr. Dobb's Journal - April 2008 - Dr. Dobb's Journal - April 2008 (Page 2) Dr. Dobb's Journal - April 2008 - Dr. Dobb's Journal - April 2008 (Page 3) Dr. Dobb's Journal - April 2008 - Contents (Page 4) Dr. Dobb's Journal - April 2008 - Contents (Page 5) Dr. Dobb's Journal - April 2008 - Hmmmm (Page 6) Dr. Dobb's Journal - April 2008 - Hmmmm (Page 7) Dr. Dobb's Journal - April 2008 - Hmmmm (Page 8) Dr. Dobb's Journal - April 2008 - Hmmmm (Page 9) Dr. Dobb's Journal - April 2008 - Alia Vox (Page 10) Dr. Dobb's Journal - April 2008 - Alia Vox (Page 11) Dr. Dobb's Journal - April 2008 - Developer Diaries (Page 12) Dr. Dobb's Journal - April 2008 - Developer Diaries (Page 13) Dr. Dobb's Journal - April 2008 - Developer Diaries (Page 14) Dr. Dobb's Journal - April 2008 - Developer Diaries (Page 15) Dr. Dobb's Journal - April 2008 - Dr. Dobb's Excellence in Programming Award (Page 16) Dr. Dobb's Journal - April 2008 - Dr. Dobb's Excellence in Programming Award (Page 17) Dr. Dobb's Journal - April 2008 - Conversations (Page 18) Dr. Dobb's Journal - April 2008 - Conversations (Page 19) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 20) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 21) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 22) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 23) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 24) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 25) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 26) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 27) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 28) Dr. Dobb's Journal - April 2008 - Fast String Search on Multicore Processors (Page 29) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 30) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 31) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 32) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 33) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 34) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 35) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 36) Dr. Dobb's Journal - April 2008 - The Byzantine Generals Problem (Page 37) Dr. Dobb's Journal - April 2008 - Optimizing Math-Intensive Applications with Fixed-Point Arithmetic (Page 38) Dr. Dobb's Journal - April 2008 - Optimizing Math-Intensive Applications with Fixed-Point Arithmetic (Page 39) Dr. Dobb's Journal - April 2008 - Optimizing Math-Intensive Applications with Fixed-Point Arithmetic (Page 40) Dr. Dobb's Journal - April 2008 - Optimizing Math-Intensive Applications with Fixed-Point Arithmetic (Page 41) Dr. Dobb's Journal - April 2008 - Optimizing Math-Intensive Applications with Fixed-Point Arithmetic (Page 42) Dr. Dobb's Journal - April 2008 - Optimizing Math-Intensive Applications with Fixed-Point Arithmetic (Page 43) Dr. Dobb's Journal - April 2008 - Random Numbers in a Range Using Generic Programming (Page 44) Dr. Dobb's Journal - April 2008 - Random Numbers in a Range Using Generic Programming (Page 45) Dr. Dobb's Journal - April 2008 - Random Numbers in a Range Using Generic Programming (Page 46) Dr. Dobb's Journal - April 2008 - Random Numbers in a Range Using Generic Programming (Page 47) Dr. Dobb's Journal - April 2008 - Random Numbers in a Range Using Generic Programming (Page 48) Dr. Dobb's Journal - April 2008 - The Agile Edge (Page 49) Dr. Dobb's Journal - April 2008 - The Agile Edge (Page 50) Dr. Dobb's Journal - April 2008 - The Agile Edge (Page 51) Dr. Dobb's Journal - April 2008 - Effective Concurrency (Page 52) Dr. Dobb's Journal - April 2008 - Effective Concurrency (Page 53) Dr. Dobb's Journal - April 2008 - Effective Concurrency (Page 54) Dr. Dobb's Journal - April 2008 - Effective Concurrency (Page 55) Dr. Dobb's Journal - April 2008 - Swaine's Flames (Page 56) Dr. Dobb's Journal - April 2008 - Swaine's Flames (Page Cover3) Dr. Dobb's Journal - April 2008 - Swaine's Flames (Page Cover4)
For optimal viewing of this digital publication, please enable JavaScript and then refresh the page. If you would like to try to load the digital publication without using Flash Player detection, please click here.