<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto">As I work as a consultant for high performance computing, I came across a lot of these problems. I'll give my thoughts on it:<div><br></div><div>- The IEEE 754-2008 standard defines the roundind modes for +, -, *, / and sqrt which should give the result of those operations rounded to -infinity, +infinity, towards 0 or to the nearest floating point. The default mode being, rounding to nearest.</div><div><br></div><div>- For all other functions, I would say that "it is open bar" for the platform. The reason is that even though it is possible to write libraries such that given a double x, std::exp(x) will be the exact value rounded to the nearest floating point, these algorithms are extremely costly. Most people, don't need such an accuracy, and an "engineering tradeoff" has to be made. Think, for instance at this program:</div><div><br></div><div>=====</div><div>#include <iostream></div><div>#include <cmath></div><div><br></div><div>int main() {</div><div> double x = <a href="3.14159265358979323846">3.14159265358979323846</a>;</div><div> std::cout << std::sin(x) << std::endl;</div><div><br></div><div> return 0;</div><div>}</div><div>=====</div><div><br></div><div>and run it on different platforms. I am sure you'll get a lot of fun :-) You'll discover quickly that you can't even trust the first digit given by your program by just applying one transcendental function.</div><div>One might think that computing accurately x^n for integrers n is simple, but it is not. Check this <a href="http://perso.ens-lyon.fr/jean-michel.muller/p1-Kornerup.pdf">http://perso.ens-lyon.fr/jean-michel.muller/p1-Kornerup.pdf</a> if you are interested.</div><div><br></div><div>As a consequence:</div><div>- Forget about getting the exact same floating point from one platform to another unless you only use +, -, *, /, sqrt.</div><div>- Even if you do that, you should disable vectorization to get correct results. Vectorizing a reduction (for instance computing the sum of the elements of a vector) won't give you the same result as the same loop which is not vectorized because + is not associative. Worse, a vectorized loop might not give you the same answer from run to run because of the fact that malloc does not always align to 32 bytes which affect the loop peeling and therefore the result.</div><div>- When it come to multithreading, especially with libraries which dispatch the work at runtime such as TBB or OpenMP without a static schedule, you also don't have the same result from one run to another on the same machine.</div><div>- Worse, even without vectorization and multithreading, compilers might reorder the sum to benefit from instruction level parallelism.</div><div>So high performance computing and correct rounding are just ennemies of one another.</div><div><br></div><div>For the very few you needs the garantee of correct rounding (mainly people doing academic research on floating point arithmetic), you understand that using pow is not even an option. So they don't care about how it is handled.</div><div><br></div><div>For the rest of us, I think we should use the most efficient way to compute x^n : divide and conquer. I do believe that std::pow is completely flawed anyway (it's my opinion on most of the C++ library, but I'll try not to rant about it). There was a "float std::pow(float x, int n)" until C++11, but it dissapeared !!! (check cppreference). Now both x and n have to be converted to double before anything happens. Even the compilers who are smart enough to convert std::pow(x, 2.0) to x * x end up working with doubles instead of floats, loose time in the conversion, and kill the performance by a factor of 2 because they can load half as many numbers in the vector registers. So I strongly recommend not using it. Instead, I recommend defining:</div><div><br></div><div>template <typename T, int n></div><div>T il::ipow(T x)</div><div><br></div><div>and specializing it for small values of n (up to 10 for instance). If you like that game you can even do template meta-programming to define il::ipow (I personnaly specialized it for values up to 10 and wrote a while loop for n larger). That way, you'll be sure that when you call il::ipow<5>(x), it will compute u = x * x, and return u * u * x.</div><div><br></div><div>I think the right thing to do is to teach people about these problems, and tell them how wrong they are when they test x == y with 2 floating point, or even worse, write unit test that expect a floating point as a result without any accuracy. <span style="background-color: rgba(255, 255, 255, 0);">In have seen so much of these tests in different industries, that this is scary. </span>And when their program give 2 completely different answers from one platform to another, most of the time (my experience is always), it is because the algorithm is unstable, so both results are wrong.</div><div><br></div><div>So my two cents: If you care about performance, write your own ipow. If you think you need accuracy up to 0.5 ulp, I'll say that I am very reluctant to believe you and I'll advise you to use specialized libraries. This presentation gives a nice review of them: <a href="https://indico.cern.ch/event/313684/contributions/1687762/attachments/600507/826484/FDD-2014-CERN1.pdf">https://indico.cern.ch/event/313684/contributions/1687762/attachments/600507/826484/FDD-2014-CERN1.pdf</a></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br><br>Sent from my iPad</div></body></html>