<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr">Hi Stefanos,<div><br></div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">So, to justify this transformation as correct, implicitly, poison has<br>_added definedness_ to signed wrapping: specifically, that the<br>computer won't explode if SW happens. AFAIU, that is ok as far as C++ semantics<br>are concerned:<br>Since signed wrapping was UB, making it more defined is ok.</blockquote></div><div><br></div><div><div>Your understanding is correct. Since signed overflow is UB in C/C++, lowering from C to IR can make such programs more defined.</div></div><div><br></div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Instead, they have to lower it as something like:<br>if (x == INT_MAX)<br> skip or whatever</blockquote></div><div><br></div><div>Yes.</div><div>This means that the overflow check and the actual add operation can be separated. 
This requires instruction selection to carefully combine the check and add, but for optimization this is beneficial because add can still be freely moved.</div><div><br></div><div>for(i < n) {</div><div>if (x == INT_MAX)</div><div> trap</div><div>y = add nsw x + 1</div><div>use(y)</div><div>}</div><div>=></div><div><div style="color:rgb(0,0,0)">y = add nsw x + 1 // hoisted</div><div style="color:rgb(0,0,0)">for(i < n) {</div><div style="color:rgb(0,0,0)">if (x == INT_MAX)</div><div style="color:rgb(0,0,0)"> trap</div><div style="color:rgb(0,0,0)">use(y)</div><div style="color:rgb(0,0,0)">}</div></div><div style="color:rgb(0,0,0)"><br></div><div style="color:rgb(0,0,0)">Juneyoung</div></div></div></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Oct 29, 2020 at 5:56 AM Stefanos Baziotis <<a href="mailto:stefanos.baziotis@gmail.com">stefanos.baziotis@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi Juneyoung,<br><br>First of all, great job on your talk!<div><br>This is a question I guess you'd be the best person to answer but the rest of the LLVM community might want to participate.<div><br></div><div>I was thinking about a UB-related example that has been discussed by multiple people</div><div>(including you), all of them basically authors of this paper (<a href="https://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf" target="_blank">https://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf</a>):</div><div><br>-- Before opt:</div><div>for (int i = 0; i < n; ++i) {</div><div> a[i] = x + 1;</div><div>}<br></div><div><br></div><div>-- After opt (LICM):</div><div>int tmp = x + 1;</div><div><div>for (int i = 0; i < n; ++i) {</div><div> a[i] = tmp;</div><div>}</div></div><div>// Assume `tmp` is never used again.</div><div><br></div><div>The reasoning here, is let's 
make signed wrapping _deferred_ UB that will only</div><div>occur if the value is used in one of X ways (e.g. as a denominator). To that end, if</div><div>n == 0 and x == INT_MAX, UB will never occur because the value is never used.</div><div><br></div><div>But, by doing that, the first point is:</div><div>If we translate this into machine code, the signed wrapping _will_ happen, even though </div><div>the value won't be used.</div><div><br></div><div>Now, imagine that on some platform P, signed wrapping explodes the computer.</div><div>The computer _will_ explode (should explode? more on that later)</div><div>even if `n == 0`, something that would not happen in the original code.</div><div><br></div><div>So, to justify this transformation as correct, implicitly, poison has</div><div>_added definedness_ to signed wrapping: specifically, that the</div><div>computer won't explode if SW happens. AFAIU, that is ok as far as C++ semantics</div><div>are concerned:</div><div>Since signed wrapping was UB, making it more defined is ok.</div><div><br></div><div>But that definedness has now created a burden for whoever is writing a back-end</div><div>from LLVM IR to P (the SW exploding platform).</div><div>That is, now, if they see an `add &lt;nsw&gt;`, they can't lower it to a trivial signed add,<br></div><div>since if they do that and x == INT_MAX, the computer will explode and that violates</div><div>the semantics of _LLVM IR_ (since we mandated that SW doesn't explode the machine).</div><div><br></div><div>Instead, they have to lower it as something like:</div><div>if (x == INT_MAX)</div><div> skip or whatever</div><div><br></div><div>Is this line of reasoning correct? UB, undef and poison are all very subtle, so I'm trying</div><div>to wrap my head around them.<br><br>Thanks,<br>Stefanos Baziotis</div></div></div>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><br></div><font size="1">Juneyoung Lee</font><div><font size="1">Software Foundation Lab, Seoul National University</font></div></div></div>