6
votes

I am trying to implement the cosine and sine functions in floating point (but I have no floating point hardware).

Since my processor has no floating-point hardware, nor instructions, I have already implemented algorithms for floating point multiplication, division, addition, subtraction, and square root. So those are the tools I have available to me to implement cosine and sine.

I was considering using the CORDIC method described at this site. However, I implemented division and square root using Newton's method, so I was hoping to use the most efficient method available.

Please don't tell me to just go look in a book or that "papers exist" — no kidding they exist. I am looking for the names of well-known algorithms that are known to be fast and efficient.

3
Can't you find and adapt an existing math library? The math behind them is quite complex! Writing a competitive math library might be worth a PhD (and would take years of effort). – Basile Starynkevitch
I'm not really a code junkie; I would just like the algorithm, and I can do the implementation myself. I have to rewrite the code in assembly myself and hand-schedule it as well. (It is for a custom processor.) – Veridian
I believe there are lots of books and papers on that subject. Did you go into a university library? – Basile Starynkevitch
@Basic: if you only need limited accuracy and don't care about extremely large inputs (which is often the case), a lookup table is a great approach. If you want to deliver a fully-accurate floating-point result over the entire range, it's prohibitively expensive in memory usage. – Stephen Canon
@StephenCanon Yeah, when OP mentioned 23 bits of precision, it became clear that a lookup table would be huge. That said, you've taught me something with the Argument Reduction... paper – thanks. – Basic

3 Answers

4
votes

First off, depending on your accuracy requirements, this can be considerably fussier than your earlier questions.

Now that you've been warned: you'll first want to reduce the argument modulo pi/2 (or 2pi, or pi, or pi/4) to get the input into a manageable range. This is the subtle part. For a good discussion of the issues involved, download a copy of K.C. Ng's "ARGUMENT REDUCTION FOR HUGE ARGUMENTS: Good to the Last Bit" (a simple Google search on the title will turn up a PDF). It's very readable and does a great job of describing why this is tricky.

After doing that, you only need to approximate the functions on a small range around zero, which is easily done with a polynomial approximation. A Taylor series will work, though it is inefficient. A truncated Chebyshev series is easy to compute and reasonably efficient; computing the minimax approximation is better still. This is the easy part.
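The polynomial step can be sketched like this (function name mine). For simplicity the coefficients below are just the truncated Taylor terms; a minimax fit computed with the Remez algorithm would shave another bit or two of error at the same degree, which is the point the answer is making. Evaluating in Horner form over x² keeps the operation count low: one division-free pass of multiplies and adds.

```c
#include <assert.h>
#include <math.h>

/* Degree-7 odd polynomial for sin(x) on roughly [-pi/4, pi/4].
   Coefficients are the truncated Taylor series; swap in Remez-derived
   minimax coefficients for slightly better worst-case error. */
static double sin_poly(double x)
{
    const double c3 = -1.0 / 6.0;     /* -1/3! */
    const double c5 =  1.0 / 120.0;   /*  1/5! */
    const double c7 = -1.0 / 5040.0;  /* -1/7! */
    double x2 = x * x;
    /* Horner evaluation in x^2: x + x^3*c3 + x^5*c5 + x^7*c7 */
    return x + x * x2 * (c3 + x2 * (c5 + x2 * c7));
}
```

On the reduced range the truncation error stays a little above 1e-7 at the interval edges, so this degree is roughly sufficient for single precision; add one more term (or use minimax coefficients) for margin.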

I have implemented sine and cosine exactly as described, entirely in integer, in the past (sorry, no public sources). With hand-tuned assembly, results in the neighborhood of 100 cycles per call are entirely reasonable on "typical" processors. I don't know what hardware you're dealing with (performance will mostly be gated by how quickly your hardware can produce the high part of an integer multiply).

1
votes

For various levels of precision, you can find some good approximations here:

http://www.ganssle.com/approx.htm

They have the added advantage of deterministic runtime, unlike the various "converging series" options, whose iteration count can vary wildly with the input value. This matters if you are doing anything real-time (games, motion control, etc.).

-1
votes

Since you have the basic arithmetic operations implemented, you may as well implement sine and cosine using their Taylor series expansions.
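A minimal sketch of that approach, using only the add, multiply, and divide operations the question says are available (the function name and fixed term count are mine). Each term is derived from the previous one, so no factorials or powers are computed explicitly; note it converges quickly only for small |x|, so the argument should still be range-reduced first.

```c
#include <assert.h>
#include <math.h>

/* Taylor-series sine: sum of (-1)^n x^(2n+1) / (2n+1)!.
   Each term is obtained from the previous by one multiply-divide:
   term_{n} = -term_{n-1} * x^2 / ((2n)(2n+1)). */
static double sin_taylor(double x)
{
    double term = x;  /* current term, starts at x^1/1! */
    double sum  = x;
    for (int n = 1; n <= 10; ++n) {
        term = -term * x * x / ((2.0 * n) * (2.0 * n + 1.0));
        sum += term;
    }
    return sum;
}
```

In practice you would stop once the term falls below your target precision rather than running a fixed count, which is exactly the variable-runtime drawback the other answer points out.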