<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Folding a couple of topics back into this thread:<div class=""><br class=""></div><div class=""><email from <a href="mailto:cameron.mcinally@nyu.edu" class="">cameron.mcinally@nyu.edu</a>><br class=""><div class=""><br class=""></div><div class="">I'd like to touch on a topic mentioned in the blog post. The constrained intrinsics work has hit a roadblock on how to proceed with the constrained implementation in the backends, i.e. D55506. Reviews/ideas in this area would be greatly appreciated (attn: target code owners). </div><div class=""><br class=""></div><div class="">Thanks,</div><div class="">Cameron</div><div class=""><br class=""></div><div class=""><email from <a href="mailto:venkataramanan.kumar.llvm@gmail.com" class="">venkataramanan.kumar.llvm@gmail.com</a>></div><div class=""><br class=""></div><div class=""><div class="">I'd just like to point out a few things that I think are related to FP numerics.</div><div class="">LLVM could do some additional transformations with "sqrt" and "division" under fast math on X86, such as 1/sqrt(x) * 1/sqrt(x) to 1/x. 
These are long-latency instructions, so such transformations could be beneficial if enabled under unsafe math.</div></div><div class=""><br class=""></div><div class="">Also, are we considering doing such FP transforms on vector floating-point types?</div><div class=""><br class=""></div><div class="">regards,</div><div class="">Venkat.</div><div><br class=""><blockquote type="cite" class=""><div class="">On Apr 3, 2019, at 9:30 AM, David Greene <<a href="mailto:dag@cray.com" class="">dag@cray.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="">"Kaylor, Andrew via llvm-dev" <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> writes:<br class=""><br class=""><blockquote type="cite" class="">====================<br class=""><br class="">Masked vector FP operations<br class=""><br class="">====================<br class=""><br class="">We’ve resisted adding explicitly predicated operations other than load<br class="">and store in the past, but I think for vector FP operations we’re<br class="">going to need this in order to maintain strict FP semantics.<br class=""></blockquote><br class="">Yep, we definitely will. This is one of the reasons Simon Moll's<br class="">predication work (D57504) is so important.<br class=""><br class=""><blockquote type="cite" class="">====================<br class=""><br class="">Complex types<br class=""><br class="">====================<br class=""><br class="">There, I said it.<br class=""></blockquote><br class="">I'll echo my colleague's response.<br class=""><br class="">Oh hell yes! OH HELL YES! :)<br class=""><br class=""><blockquote type="cite" class="">====================<br class=""><br class="">Accuracy controls<br class=""><br class="">====================<br class=""><br class="">We have a fast math flag that lets us substitute approximations for<br class="">some math library functions. 
It would be nice to have a mechanism to<br class="">control the accuracy of the approximations.<br class=""></blockquote><br class="">Indeed. "Fast or not" is too coarse.<br class=""><br class=""><blockquote type="cite" class="">====================<br class=""><br class="">Per function controls<br class=""><br class="">====================<br class=""><br class="">Similarly, it would be nice to explicitly list which math library functions could be replaced.<br class=""></blockquote><br class=""><blockquote type="cite" class="">I’d also like to suggest the formation of a floating point working<br class="">group to try to get more organized about driving some of these things<br class="">(particularly the constrained intrinsics) toward completion.<br class=""></blockquote><br class="">That's a great idea.<br class=""><br class=""> -David<br class=""></div></div></blockquote></div><br class=""></div></body></html>