<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Relevant to this discussion is <a href="http://bugs.llvm.org/show_bug.cgi?id=25721" class="">http://bugs.llvm.org/show_bug.cgi?id=25721</a> (-ffp-contract=fast does not work with LTO). I am working on adding function attributes for fp-contract=fast, which should fix this.<div class=""><br class=""></div><div class="">Also, now that we have backend optimization remarks, I am planning to report a missed optimization when we can’t fuse FMAs due to “fast” not being on. This will show up in the opt-viewer. Then the user can opt in either with the command-line switch or the new function attribute.</div><div class=""><br class=""></div><div class="">Adam<br class=""><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Mar 15, 2017, at 6:27 AM, Renato Golin via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="">Folks,<br class=""><br class="">I've been asking around about the state of FP contract, which<br class="">seems to be "on" but it's not really behaving like it, at least not as<br class="">I would expect:<br class=""><br class="">int foo(float a, float b, float c) { return a*b+c; }<br class=""><br class="">$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -<br class="">(...)<br class="">fmul s0, s0, s1<br class="">fadd s0, s0, s2<br class="">(...)<br class=""><br class="">$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -<br class="">(...)<br class="">fmadd s0, s0, s1, s2<br class="">(...)<br class=""><br class="">I'm not sure this works in Fortran either, but defaulting to "on" when<br class="">(I believe) the language should allow contraction and not doing it is<br class="">not a good default.<br 
class=""><br class="">I haven't worked out what would be necessary to make it work on a<br class="">case-by-case basis (what kinds of fusions does C allow?) to make sure<br class="">we don't do all or nothing, but if we don't want to start that<br class="">conversation now, then I'd recommend we just turn it all the way to 11<br class="">(like GCC) and let people turn it off if they really mean it.<br class=""><br class="">The rationale is that:<br class=""><br class="">* Contracted operations increase precision (fewer rounding steps)<br class="">* It performs equal or faster on all architectures I know (true everywhere?)<br class="">* Users already expect that (certainly, GCC users do)<br class="">* Makes us look good on benchmarks :)<br class=""><br class="">A recent SPEC2k6 comparison Linaro did for AArch64, enabling<br class="">-ffp-contract=fast took the edge off GCC in a number of cases and in<br class="">some of them made them comparable in performance. So, any reasons not<br class="">to?<br class=""><br class="">If we go with it, we need to first finish the job that Sebastian was<br class="">doing on the test-suite, then just turn it on by default. A second<br class="">stage would be to add tests/benchmarks that explicitly test FP<br class="">precision, so that we have some extra guarantee that we're doing the<br class="">right thing.<br class=""><br class="">Opinions?<br class=""><br class="">cheers,<br class="">--renato<br class="">_______________________________________________<br class="">cfe-dev mailing list<br class=""><a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a><br class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev<br class=""></div></div></blockquote></div><br class=""></div></div></body></html>