<html><head><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Nick, <div><br></div><div>I like the simplicity of the attribute approach. However, one problem with attributes is that they are lost when the function is inlined. I am not sure whether this disqualifies the approach for the proposed uses. </div><div><br></div><div>Thanks,</div><div>Nadav</div><div><br><div><div>On Nov 1, 2013, at 1:26 PM, Nick Lewycky <<a href="mailto:nlewycky@google.com">nlewycky@google.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div dir="ltr">FYI, see also the previous discussion about "speculatable":<div><br></div><div> <a href="http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/064426.html">http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/064426.html</a></div>
<div><br></div><div>I think such an attribute should be added.</div><div><br></div><div>In the thread which led up to that thread, I proposed using more fine-grained attributes and Michael rightly pointed out the problem with that: you'd need one for every possible form of undefined behaviour. You listed "nofptrap", "nodivtrap" and "nomemtrap", but you didn't say "nounreachabletrap". Whoops!</div>
<div><br></div><div>Nick<br><div class="gmail_extra"><br><br><div class="gmail_quote">On 31 October 2013 07:38, Pekka Jääskeläinen <span dir="ltr"><<a href="mailto:pekka.jaaskelainen@tut.fi" target="_blank">pekka.jaaskelainen@tut.fi</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hello,<br>
<br>
OpenCL C specifies that instructions should not trap (it is "discouraged"<br>
in the specs). If they do, it's vendor-specific how the hardware<br>
exceptions are handled.<br>
<br>
It might also be the case for some other (future) languages targeting "streamlined" parallel accelerators in a heterogeneous setting.<br>
At least CUDA comes to mind. What about OpenACC and the new OpenMP,<br>
does someone know offhand?<br>
<br>
It would help several optimizations if they could assume certain<br>
instructions do not trap. E.g., I was looking at the if-conversion of<br>
the loop vectorizer, and it seems to not support speculating stores,<br>
divs, etc. which could be done if we knew it's safe to speculatively<br>
execute them.<br>
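<br>
To illustrate the div case, here is a minimal sketch in plain C (not taken<br>
from the vectorizer; the function names are made up for the example). It<br>
shows what a speculated division has to look like when traps must still be<br>
respected, assuming a target where integer division by zero faults:<br>
<pre>
```c
/* Branchy source form: the division runs only when the guard holds,
   so it can never see a zero divisor. */
int guarded_div(int a, int b, int fallback) {
    int r = fallback;
    if (b != 0)
        r = a / b;
    return r;
}

/* If-converted form that is still safe on a trapping target.
   Computing a / b unconditionally could fault when b == 0, so the
   divisor is replaced by 1 on the path where the guard fails.
   (INT_MIN / -1 overflow is ignored here for brevity.) */
int if_converted_div(int a, int b, int fallback) {
    int ok = (b != 0);
    int safe_b = ok ? b : 1;    /* never zero */
    int q = a / safe_b;         /* executes on every path */
    return ok ? q : fallback;   /* pick the result afterwards */
}
```
</pre>
If the function were known not to trap, the safe_b step could be dropped<br>
and a / b computed directly before the final selection.<br>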
<br>
[In this particular if-conversion case proper predicated execution<br>
(not speculative) would require predicates to be added for all LLVM<br>
instructions so they could be squashed. I think this was discussed<br>
several years ago in the context of a generic IR-level if-conversion<br>
pass, but it seems such a pass was never realized.]<br>
<br>
Anyway, "speculative" if-conversion is just one example where knowing<br>
that traps need not be considered in the function at hand<br>
would help the optimizations. Other speculative code motion<br>
optimizations, e.g., LICM, could also benefit from it.<br>
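<br>
As a rough sketch of the LICM case (plain C, hypothetical function names,<br>
again assuming a target where integer division by zero faults): the<br>
quotient below is loop-invariant, but hoisting it unconditionally would<br>
execute the division even when the loop runs zero times.<br>
<pre>
```c
/* Original loop: a / b is loop-invariant, but it only executes
   when the loop body runs at least once (n > 0). */
int sum_plus_quotient(const int *v, int n, int a, int b) {
    int s = 0;
    for (int i = 0; i < n; ++i)
        s += v[i] + a / b;
    return s;
}

/* Hoisted form that preserves trapping behaviour: the division is
   moved out of the loop but kept behind the same n > 0 condition,
   so it never runs on a path where the original would not divide. */
int sum_plus_quotient_hoisted(const int *v, int n, int a, int b) {
    int s = 0;
    if (n > 0) {
        int q = a / b;          /* computed once */
        for (int i = 0; i < n; ++i)
            s += v[i] + q;
    }
    return s;
}
```
</pre>
With a no-trap guarantee for the function, the n > 0 guard around the<br>
hoisted division would no longer be needed for correctness.<br>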
<br>
One way would be to introduce a new function attribute. Functions (e.g.,<br>
OpenCL C or CUDA kernels) could be marked with an attribute that states<br>
that the instructions can be assumed not to trap -- it's a programmer's or<br>
the runtime's mistake if they do. The runtime should change the fp<br>
computation mode to the non-trapping one before calling such<br>
a function (this is actually stated in the OpenCL specs). If such<br>
handling is not supported by the target, then the attribute should not<br>
be added in the first place.<br>
<br>
The attribute could be called 'notrap', which would cover any trap<br>
caused by any instruction. Or that could be<br>
split, just in case the hardware is known not to support one of the<br>
features. Three could suffice: 'nofptrap' (no IEEE FP exceptions),<br>
'nodivtrap' (no divide by zero exceptions, undef value output instead),<br>
'nomemtrap' (no mem exceptions).<br>
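<br>
How an optimization might consult such attributes can be sketched as<br>
follows. This is a hypothetical model in plain C, not LLVM's API: the<br>
enum names, the bit encoding and can_speculate() are all assumptions made<br>
for illustration, and 'notrap' is modelled as just the three listed bits.<br>
<pre>
```c
/* Hypothetical encoding: one bit per proposed attribute. */
enum trap_attr {
    NO_FP_TRAP  = 1,   /* 'nofptrap'  */
    NO_DIV_TRAP = 2,   /* 'nodivtrap' */
    NO_MEM_TRAP = 4,   /* 'nomemtrap' */
    NO_TRAP     = 7    /* 'notrap', modelled as all three bits */
};

/* A few instruction kinds, enough to show the lookup. */
enum op_kind { OP_ADD, OP_FDIV, OP_SDIV, OP_LOAD };

/* Returns 1 if an instruction of kind op may be executed
   speculatively in a function carrying the given attribute bits. */
int can_speculate(enum op_kind op, unsigned attrs) {
    switch (op) {
    case OP_ADD:  return 1;                           /* never traps */
    case OP_FDIV: return (attrs & NO_FP_TRAP) != 0;
    case OP_SDIV: return (attrs & NO_DIV_TRAP) != 0;
    case OP_LOAD: return (attrs & NO_MEM_TRAP) != 0;
    default:      return 0;                           /* be conservative */
    }
}
```
</pre>
Any instruction kind without a matching attribute falls into the<br>
conservative default, which is where the fine-grained split runs out.<br>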
<br>
What do you think of the general idea? Or is there something similar<br>
already that can accomplish this?<br>
<br>
Thanks in advance,<span class=""><font color="#888888"><br>
-- <br>
Pekka<br>
_______________________________________________<br>
LLVM Developers mailing list<br>
<a href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a> <a href="http://llvm.cs.uiuc.edu/" target="_blank">http://llvm.cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>
</font></span></blockquote></div><br></div></div></div>
</blockquote></div><br></div></body></html>