[llvm-dev] Does poison add implicit "definedness" under the hood ?

Stefanos Baziotis via llvm-dev llvm-dev at lists.llvm.org
Thu Oct 29 03:21:03 PDT 2020


Hi Juneyoung,

I think I lost you in your last example. First, I'm not sure if this last C
code corresponds to LLVM IR code or after-LLVM-IR code. Assuming that we're
talking about LLVM IR and for my imaginary platform P:
The checking for overflow is not something that we have to do in LLVM IR,
since as outlined in my previous email, LLVM IR has added the semantics that
signed overflow does not explode / trap (because if it didn't add these
semantics, additions that exploded and which did not happen in the original
program will happen now etc.). It is the lowering of LLVM IR that has to
add it.

So, AFAIU, the "timeline" looks like this:

-- Before (LLVM IR):
for(i < n) {
  y = add nsw x + 1  <-- notice: no check, we don't need to do it,
semantics say it doesn't trap
  use(y)
}
-- After (LLVM IR):
int tmp = add nsw x + 1;  <-- still no check..
for(i < n) {
  y =  tmp;
  use(y)
}

Now, we get that after and go ahead to lower it (i.e. after-LLVM-IR,
back-end work). I'll actually try to do a better job than
my original email:

int tmp;
if (x != INT_MAX)  <-- notice the !=. This check is to _guarantee_ that the
addition won't explode / trap since if that happened, it would violate the
semantics, that we implicitly added, of LLVM IR.
  tmp = add nsw x + 1;
for(i < n) {
  y =  tmp;
  use(y)
}

So this (your example):
for(i < n) {
if (x == INT_MAX)
  trap
y = add nsw x + 1
use(y)
}
=>
y = add nsw x + 1 // hoisted
for(i < n) {
if (x == INT_MAX)
  trap
use(y)
}

does not make sense to me either in the LLVM IR level or in the
after-LLVM-IR level. For the former, it doesn't make sense since I don't
see why we have to do the check inside the loop;
the semantics mandate that it doesn't happen so we're ok, the burden is on
the back-end. For the latter (i.e. we're in LLVM IR lowering, ISel or
smth), we simply can't hoist this add outside without a check
since if x == INT_MAX, the platform P will explode _immediately_, no matter
what the value of `n` (and of course, if `n == 0`, the original code
wouldn't explode).

Do I miss something ?
Guess: I think that what I miss is this: The semantics we added is not that
signed wrapping won't explode. Rather: It does not explode _if it is not
used in one of the X ways_. That would justify
your example transformation in the LLVM IR level.

I hope this was not confusing, thanks for your time!

- Stefanos


Στις Πέμ, 29 Οκτ 2020 στις 4:29 π.μ., ο/η Juneyoung Lee <
juneyoung.lee at sf.snu.ac.kr> έγραψε:

> Hi Stefanos,
>
> So, to justify this transformation as correct, implicitly, poison has
>> _added definedness_ to signed wrapping: specifically, that the
>> computer won't explode if SW happens. AFAIU, that is ok as far as C++
>> semantics
>> are concerned:
>> Since signed wrapping was UB, making it more defined is ok.
>
>
> Your understanding is correct. Since signed overflow is UB in C/C++,
> lowering from C to IR can make such programs more defined.
>
> Instead, they have to lower it as something like:
>> if (x == INT_MAX)
>>   skip or whatever
>
>
> Yes.
> This means that the overflow check and the actual add operation can be
> separated. This requires instruction selection to carefully combine the
> check and add, but for optimization this is beneficial because add can
> still be freely moved.
>
> for(i < n) {
> if (x == INT_MAX)
>   trap
> y = add nsw x + 1
> use(y)
> }
> =>
> y = add nsw x + 1 // hoisted
> for(i < n) {
> if (x == INT_MAX)
>   trap
> use(y)
> }
>
> Juneyoung
>
> On Thu, Oct 29, 2020 at 5:56 AM Stefanos Baziotis <
> stefanos.baziotis at gmail.com> wrote:
>
>> Hi Juneyoung,
>>
>> First of all, great job on your talk!
>>
>> This is a question I guess you'd be the best person to answer but the
>> rest of the LLVM community might want to participate.
>>
>> I was thinking about a UB-related example that has been discussed by
>> multiple people
>> (including you), all of them basically authors of this paper (
>> https://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf):
>>
>> -- Before opt:
>> for (int i = 0; i < n; ++i) {
>>   a[i] = x + 1;
>> }
>>
>> -- After opt (LICM):
>> int tmp = x + 1;
>> for (int i = 0; i < n; ++i) {
>>   a[i] = tmp;
>> }
>> // Assume `tmp` is never used again.
>>
>> The reasoning here, is let's make signed wrapping _deferred_ UB that will
>> only
>> occur if the value is used in one of X ways (e.g. as a denominator). To
>> that end, if
>> n == 0 and x == INT_MAX, UB will never occur because the value is never
>> used.
>>
>> But, by doing that, the first point is:
>> If we translate this into machine code, the signed wrapping _will_
>> happen, no matter
>> the value won't be used.
>>
>> Now, imagine that on some platform P, signed wrapping explodes the
>> computer.
>> The computer _will_ explode (should explode ? more on that later)
>> even if `n == 0`, something that would not happen in the original code.
>>
>> So, to justify this transformation as correct, implicitly, poison has
>> _added definedness_ to signed wrapping: specifically, that the
>> computer won't explode if SW happens. AFAIU, that is ok as far as C++
>> semantics
>> are concerned:
>> Since signed wrapping was UB, making it more defined is ok.
>>
>> But that definedness now has created a burden to whoever is writing a
>> back-end
>> from LLVM IR to P (the SW exploding platform).
>> That is, now, if they see a `add <nsw>`, they can't lower it to a trivial
>> signed add,
>> since if they do that and x == INT_MAX, the computer will explode and
>> that violates
>> the semantics of _LLVM IR_ (since we mandated that SW doesn't explode the
>> machine).
>>
>> Instead, they have to lower it as something like:
>> if (x == INT_MAX)
>>   skip or whatever
>>
>> Is this whole thinking correct ? UB, undef and poison all are very subtle
>> so I'm trying
>> to wrap my head around.
>>
>> Thanks,
>> Stefanos Baziotis
>>
>
>
> --
>
> Juneyoung Lee
> Software Foundation Lab, Seoul National University
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201029/d7c9a900/attachment.html>


More information about the llvm-dev mailing list