<html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><br><div><div>On Nov 1, 2013, at 1:45 PM, Filip Pizlo &lt;<a href="mailto:fpizlo@apple.com">fpizlo@apple.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><div><br class="Apple-interchange-newline">On Nov 1, 2013, at 4:48 AM, Hal Finkel &lt;<a href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a>&gt; wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">----- Original Message -----<br><blockquote type="cite">Hi Nadav,<br><br>On 10/31/2013 08:53 PM, Nadav Rotem wrote:<br><blockquote type="cite">data-parallel languages which have a completely different<br>semantics. In<br>OpenCL/CUDA you would want to vectorize the outermost loop, and the<br>language guarantees that it is safe to do so.<br></blockquote><br>Yeah. This is a separate (old) discussion and not strictly related<br>to<br>the problem at hand. Better if-conversion benefits more than just OpenCL C<br>work-item loops.<br><br><br><br>[For reference, here's an email in the thread from Spring. 
This<br>discussion<br>led to the parallel loop metadata to mark the data-parallel loops:<br><br><a href="http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-January/058710.html">http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-January/058710.html</a><br><br>The current status of this work is that there's now also effectively<br>loop interchange functionality in pocl so the inner (sequential)<br>loops<br>in the OpenCL C kernels are interchanged with the implicit parallel<br>work-item (outer) loops when it's semantically legal. After this the<br>inner loop vectorizer can be used efficiently also for kernels with<br>sequential loops.]<br><br><blockquote type="cite">Function attribute is one possible solution. Another solution<br>would be to<br>use metadata. I think that we need to explore both solutions and<br>estimate<br>their effect on the rest of the compiler. Can you estimate which<br>parts of<br>the compiler would need to be changed in order to support this new<br>piece<br>of information? We need to think about what happens when we merge<br>or<br>hoist loads/stores. Will we need to review and change every single<br>memory<br>optimization in the compiler?<br></blockquote><br>The original idea was that if the function is marked notrap, it only<br>loosens the previous restrictions for the optimizations. Thus, if the<br>old code still assumes trapping semantics, it should be still safe<br>(only<br>worse optimizations might result).<br><br>Anyways, this has at least one problem that I see: functions that<br>have<br>the notrap attribute cannot be safely inlined into functions without<br>that<br>attribute. Otherwise a function which has possibly been optimized<br>with the<br>assumption of not trapping (speculating an instruction that might<br>trap),<br>might again trap due to dropping the attribute (and the runtime not<br>knowing it has to switch off the trapping behavior). 
Thus, perhaps<br>notrap should simply always imply noinline to avoid this issue.<br><br>Another way is to add 'notrap' metadata to all possibly trapping<br>instructions. This should be safe and perhaps work across inlining,<br>but it requires more maintenance code and it might not work very<br>well in practice: the runtime might want to (or be able to) switch<br>the trapping semantics of e.g. the FP hardware on a per-function basis, not<br>per instruction. If that's not the case, the code generator has<br>to support the instructions separately, injecting instructions that<br>switch on/off the trapping behavior.<br><br>The metadata approach has a benefit that there can be optimizations,<br>unrelated to the input language, that intelligently prove whether a<br>particular instruction instance can trap or not. E.g., if it's known<br>from code that the divisor of a division is never zero, one can set<br>this metadata on a single DIV instruction, perhaps helping later<br>optimizations.<br><br>IMHO, the attribute approach is easier and makes more sense in<br>this particular case where the trapping behavior is dictated<br>by the input language, but OTOH the metadata approach seems to be<br>more in line with how it has been done previously (fpmath) and might<br>open the door for separate non-language-specific optimizations.<br></blockquote><br>The large complication that you end up with in a scheme like this is maintaining control dependencies. For example:<br><br>if (z_is_never_zero()) {<br> x = y / z !notrap<br> ...<br>}<br><br>the !notrap asserts that the division won't trap, which is good, but also makes it safe to speculatively execute. That's the desired effect, but not in this instance, because it will allow hoisting outside of the current block:<br><br>x = y / z !notrap<br>if (z_is_never_zero()) {<br> ...<br>}<span class="Apple-converted-space"> </span><br><br>and that obviously won't work correctly. This seems to leave us with three options:<br><br>1. 
Add logic to all passes that might do this to prevent it (in which case, we might as well add some new (subclass data) flags instead of metadata).<span class="Apple-converted-space"> </span><br><br>2. Assert that !notrap cannot be used where its validity might be affected by control dependencies.<br><br>3. Represent the control dependencies explicitly in the metadata. Andy, Arnold (CC'd) and I have been discussing this in a slightly-different context, and briefly, this means adding all of the relevant conditional branch inputs to the metadata, and ensuring dominance before the metadata is respected. For example:<br><br> if (i1 %c = call z_is_never_zero()) {<br> %x = %y / %z !notrap !{ %c }<br> ...<br> }<br></div></blockquote><div dir="auto"><br></div><div dir="auto">Does this !{%c} reference to %c obey the same rules that other uses of a value would obey in LLVM?</div><div dir="auto"><br></div><div dir="auto">If it does, then the following transformation would be trivially valid:</div><div dir="auto"><br></div><div dir="auto"><div> if (i1 %c = call z_is_never_zero()) {<br> %x = %y / %z !notrap !{ 1 }<br> ...<br> }</div><div><br></div><div>Because we can prove that %c must have the value 1 inside the then case. But, now you have a !notrap !{1}, which means you can do:</div><div><br></div> %x = %y / %z !notrap !{ 1 }<br><div> if (i1 %c = call z_is_never_zero()) {<br> ...<br> }</div><div><br></div><div>... and the world just broke. So clearly, the !{ %c } reference cannot obey all of the same rules as other uses of a value would obey in LLVM IR. Can you describe exactly what rules such a use of %c would have? What would replaceAllUsesWith do for it? How should other phases treat it? 
What can they do to it?</div></div></div></blockquote><div><br></div><div>We have to be very clear about answering these questions *if* we actually implement the control dependent metadata, but I don’t see this as a new problem.</div><div><br></div><div>In general, metadata uses cannot be considered SSA uses. We are not going to do SSA update on metadata, ever. LLVM will not have metadata phis.</div><div><br></div><div>Optimizations that walk uses and correlate their values with control flow should also definitely not process metadata.</div><div><br></div><div>I am concerned that the compare itself would be processed by CorrelatedValuePropagation, which would call replaceAllUsesWith. Metadata uses are currently updated with RAUW, but it isn’t clear that’s the right thing. Either the RAUW interface could be extended, or, worst case, we could make these “control dependent uses” a special ValueHandle that doesn’t automatically update on RAUW.</div><div><br></div><div>-Andy</div><br><blockquote type="cite"><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><blockquote type="cite"><div style="font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br>and so if we run across this situation:<br><br> %x = %y / %z !notrap !{ %c }<br> if (i1 %c = call z_is_never_zero()) {<br> ...<br> }<span class="Apple-converted-space"> </span><br><br> we can test that the %c does not dominate %x, and so the metadata needs to be ignored. 
The complication here is that you may need to encode all conditional branch inputs along all paths from the entry to the value, and the scheme also needs to deal with maythrow functions.<br><br>Given that the common use case for this seems like it will be for some language frontend to add !notrap to *all* instances of some kind of instruction (divisions, loads, etc.), I think that adding a new flag (like the nsw flag) may be more appropriate for efficiency reasons. Even easier, add some more fine-grained function attributes (as you had suggested).<br><br>Also, I think that being able to tag a memory access as non-trapping could be a big win for C++ too, because we could tag all loads/stores that come from C++ reference types as not trapping. Because of the way that iterators are defined, I suspect this would have a lot of positive benefits in terms of LICM and other optimizations.<br><br>-Hal<br><br><blockquote type="cite"><br>BR,<br>--<br>--Pekka<br><br>_______________________________________________<br>LLVM Developers mailing list<br><a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a><span class="Apple-converted-space"> </span> <a href="http://llvm.cs.uiuc.edu/">http://llvm.cs.uiuc.edu</a><br><a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br><br></blockquote><br>--<span class="Apple-converted-space"> </span><br>Hal Finkel<br>Assistant Computational Scientist<br>Leadership Computing Facility<br>Argonne National Laboratory</div></blockquote></div><br style="font-family: Helvetica; font-size: 12px; font-style: normal; 
font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"></blockquote></div><br></body></html>