<div dir="ltr"><div class="gmail_extra">On Mon, Apr 8, 2013 at 1:34 AM, Chandler Carruth <span dir="ltr">&lt;<a href="mailto:chandlerc@google.com" target="_blank">chandlerc@google.com</a>&gt;</span> wrote:<br><div class="gmail_quote">

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="im">On Sun, Apr 7, 2013 at 9:28 PM, Cameron McInally <span dir="ltr">&lt;<a href="mailto:cameron.mcinally@nyu.edu" target="_blank">cameron.mcinally@nyu.edu</a>&gt;</span> wrote:<br>


</div><div class="gmail_quote"><div class="im"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">


<div dir="ltr"><div class="gmail_extra">More productive (IMO) is to emit explicit guards against the undefined behavior in your language, much as -fsanitize does for undefined behavior in C++. Then work to build a mode where a specific target can take advantage of target specific trapping behaviors to emit these guards more efficiently. This will allow LLVM's optimizers to continue to function in the world they were designed for, and with a set of rules that we know how to build efficient optimizers around, and your source programs can operate in a world with checked behavior rather than undefined behavior. As a useful side-effect, you can defer the target-specific optimizations until you have benchmarks (internally is fine!) and can demonstrate the performance problems (if any).</div>


</div></blockquote><div><br></div></div><div>Regrettably, this implementation does not suit my needs.</div></blockquote><div><br></div></div><div>I'm curious why you think so... I think it would very closely match your needs:</div>

</div></div></div></blockquote><div><br></div><div style>It's my current understanding that the actual division would be constant folded away and I would be left with only the guard. Later, the guard would be proved always true. I could, of course, be mistaken.</div>

<div style><br></div><div style>I checked out Clang with -fsanitize=integer-divide-by-zero this weekend and saw similar behavior in the assembly. I also noticed that the division operands are still around in the assembly. If it were possible to recreate the div instructions, I would be thrilled.</div>
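<div><br></div><div>A minimal sketch of the concern above, not code from either compiler: the hypothetical helper guarded_sdiv models a guard emitted in front of a signed division, returning an empty optional where generated code would branch to a trap handler. The INT_MIN / -1 check is not part of this discussion and is included only so the sketch itself has no undefined behavior.</div>

```cpp
#include <cassert>
#include <limits>
#include <optional>

// Hypothetical model of a frontend-emitted guard around a signed division.
// An empty optional stands in for "branch to the trap handler".
std::optional<int> guarded_sdiv(int num, int den) {
    if (den == 0) {
        return std::nullopt;  // guard fires: the division is never executed
    }
    if (den == -1 && num == std::numeric_limits<int>::min()) {
        return std::nullopt;  // overflow case, guarded to keep the sketch well defined
    }
    return num / den;         // only reached when the division is well defined
}

// With constant operands such as 6 and 0, an optimizer can prove that the
// guard always fires, so only the trap path survives and no divide remains.
void demo_guard() {
    assert(!guarded_sdiv(6, 0).has_value());
    assert(guarded_sdiv(6, 3).value() == 2);
}
```

<div>Calling demo_guard() exercises both paths. For the (6, 0) case the division is unreachable, which is why only the guard would be expected to remain after optimization.</div>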

<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra">

<div class="gmail_quote"><div class="im">

<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div> The constant folding would still occur</div>

</blockquote><div><br></div></div><div>No? At least, I don't see why.</div>

<div><br></div><div>I don't know what your frontend is doing (but my impression is that it is not Clang? If I'm wrong, let me know...) </div></div></div></div></blockquote><div><br></div><div>It is not Clang. Our compiler has proprietary frontends, optimizer, vectorizer, and more. You may have noticed that I don't often share LLVM IR on the list. Our IR has tons of proprietary changes to it. But, I digress.<br>

</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra">

<div class="gmail_quote"><div>but the common idiom is to emit nothing as a constant other than immediate values, and let LLVM's optimizers figure it out. </div></div></div></div></blockquote><div><br></div><div style>

Our pre-LLVM optimizer may be a culprit here. With the original test case I provided, our inliner will inline both foo(...) and bar(...). In main(), we end up calling CreateSDiv in the IRBuilder with two constants, i.e. (6/0).</div>
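<div><br></div><div>A toy model of why a builder call with two constant operands may leave no division behind. TinyBuilder, Value, and create_sdiv are hypothetical stand-ins invented for this sketch and are not LLVM classes; the assumption is only that a folding builder evaluates constant operands at creation time instead of emitting an instruction.</div>

```cpp
#include <cassert>
#include <limits>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not LLVM classes.
struct Value {
    std::optional<int> constant;  // set when the value is a compile-time constant
    std::string name;             // set when the value is an instruction result
    bool undefined = false;       // true when folding had no well-defined result
};

struct TinyBuilder {
    bool fold_constants = true;
    std::vector<std::string> instructions;  // the "emitted" instruction stream

    Value create_sdiv(const Value& lhs, const Value& rhs) {
        if (fold_constants && lhs.constant && rhs.constant) {
            // Both operands are constants: fold at creation time, emit nothing.
            const int l = *lhs.constant;
            const int r = *rhs.constant;
            if (r == 0 || (r == -1 && l == std::numeric_limits<int>::min())) {
                return Value{std::nullopt, "", true};  // no well-defined result
            }
            return Value{l / r, "", false};
        }
        // Otherwise record an actual division instruction.
        instructions.push_back("sdiv " + text(lhs) + ", " + text(rhs));
        return Value{std::nullopt, "%t" + std::to_string(instructions.size()), false};
    }

    static std::string text(const Value& v) {
        return v.constant ? std::to_string(*v.constant) : v.name;
    }
};

void demo_builder() {
    TinyBuilder folding;
    Value r = folding.create_sdiv(Value{6, "", false}, Value{0, "", false});
    assert(r.undefined && folding.instructions.empty());  // no sdiv was emitted

    TinyBuilder plain;
    plain.fold_constants = false;
    plain.create_sdiv(Value{6, "", false}, Value{0, "", false});
    assert(plain.instructions.size() == 1);
    assert(plain.instructions[0] == "sdiv 6, 0");  // the division survives
}
```

<div>In this model the division only survives when folding is switched off. LLVM's IRBuilder takes its folder as a template parameter and a NoFolder policy exists, but whether that fits this frontend is an assumption that would need checking.</div>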

<div style><br></div><div style>Would Clang snap these constants into temporaries? Or does Clang maintain the original source structure? Sorry for these simple questions; I have no insight into Clang at all.</div><div style>

<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra">

<div class="gmail_quote"><div>A consequence of this pattern combined with inserting guards is that (at most) the optimizer will fold directly to the trap. You should be able to control the exact structure in the FE though. </div>

</div></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra">

<div class="gmail_quote"><div class="im">

<div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div> and I would like to produce the actual division, since the instruction is non-maskable on x86.</div>


</blockquote><div><br></div></div><div>Yes, and this is the point I was driving at with the comments about a target specific optimization. It is reasonable for the x86 backend to recognize the pattern of testing for a zero divisor and trapping if it occurs, and transform that into an unconditional divide relying on the hardware trap instead. I think it is not unreasonable to have this strategy result in one of two generated code patterns in the overwhelming majority of cases:</div>

</div></div></div></blockquote><div><br></div><div>I'm going to be nitpicky over this. It's really not my intention to be unreasonable. I hope you can understand.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>1) An unconditional trap because the optimizer proved divide by zero</div></div></div></div></blockquote><div><br></div><div>This isn't desirable for me. I am afraid an unconditional trap in the assembly will just confuse a user. I suspect that a user will trigger this code and immediately report a compiler bug, believing that we've inserted a rogue trap. <br>

</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra">

<div class="gmail_quote"><div>2) A direct divide instruction which relies on the x86 trapping behavior.</div></div></div></div></blockquote><div><br></div><div style>This would be great. Would I be able to accurately recreate the actual constant divide that was folded away? Originally, I had suspected that the operands would be long gone or at least intractable. Again, this may sound unreasonable, but I would like to keep the division intact so that a user can see where their code went wrong in the assembly. Sorry in advance if I'm misunderstanding.</div>

<div style><br></div><div style>Thanks again, Chandler.</div><div style><br></div><div style>-Cameron </div></div></div></div>