[llvm-dev] Is this undefined behavior optimization legal?

Kevin Choi via llvm-dev llvm-dev at lists.llvm.org
Mon Oct 3 15:04:43 PDT 2016


> This assumption is what causes it to remove the 'and' operation.

CMIIW, this assumption appears to be flawed. Initialization values are
escaping side-effects and removing them is making a correct program
incorrect.
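
To make the quoted scenario concrete, here is a minimal C sketch of the two
lowerings on a packed 32-bit register. It is only an illustration under
stated assumptions: it assumes the 32-bit conversion leaves a full 32-bit
value in the register (as the v_cvt_u32_f32 in the quoted assembly suggests),
and the function names are hypothetical, not anything from SelectionDAG.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read byte lane i (0..3) of a packed <4 x i8>. */
static uint8_t lane(uint32_t vec, int i) {
    assert(i >= 0 && i < 4);
    return (uint8_t)(vec >> (8 * i));
}

/* Lowering with the mask: v_and_b32 followed by v_or_b32. */
static uint32_t insert_lane0_masked(uint32_t vec, uint32_t cvt) {
    return vec | (cvt & 0x000000ffu);
}

/* Lowering after the mask has been removed: v_or_b32 only. */
static uint32_t insert_lane0_unmasked(uint32_t vec, uint32_t cvt) {
    return vec | cvt;
}
```

With a zero-initialized vector and a converted value of 300 (0x12c), the
masked version yields 0x0000002c, while the unmasked version yields
0x0000012c, so lane 1 reads 1 instead of 0. For an in-range value such as
200 the two versions agree, which is the case the optimization relies on.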

-Kevin

On Mon, Oct 3, 2016 at 2:27 PM, Mehdi Amini via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

>
> > On Oct 3, 2016, at 1:51 PM, Tom Stellard via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> >
> > Hi,
> >
> > I've found a test case where SelectionDAG is doing an undefined behavior
> > optimization, and I need help determining whether or not this is legal.
> >
> > Here is the example IR:
> >
> > define void @test(<4 x i8> addrspace(1)* %out, float %a) {
> >  %uint8 = fptoui float %a to i8
> >  %vec = insertelement <4 x i8> <i8 0, i8 0, i8 0, i8 0>, i8 %uint8, i32 0
> >  store <4 x i8> %vec, <4 x i8> addrspace(1)* %out
> >  ret void
> > }
> >
> > Since %vec is a 32-bit vector, a common way to implement this function
> > on a target with 32-bit registers would be to zero initialize a 32-bit
> > register to hold the initial vector and then 'mask' and 'or' the inserted
> > value with the initial vector.  In AMDGPU assembly it would look something
> > like:
> >
> > v_mov_b32 v0, 0
> > v_cvt_u32_f32_e32 v1, s0
> > v_and_b32 v1, v1, 0x000000ff
> > v_or_b32 v0, v0, v1
> >
> > The optimization the SelectionDAG does for us in this function, though,
> > ends up removing the mask operation.  Which gives us:
> >
> > v_mov_b32 v0, 0
> > v_cvt_u32_f32_e32 v1, s0
> > v_or_b32 v0, v0, v1
> >
> > The reason the SelectionDAG is doing this is because it knows that the
> > result of %uint8 = fptoui float %a to i8 is undefined when the result uses
> > more than 8-bits.  So, it assumes that the result will only set the low
> > 8-bits, because anything else would be undefined behavior and the program
> > would be broken.  This assumption is what causes it to remove the 'and'
> > operation.
> >
> > So effectively, what has happened here, is that by inserting the result
> > of an operation with undefined behavior into one lane of a vector, we have
> > overwritten all the other lanes of the vector.
> >
> > Is this optimization legal?  To me it seems wrong that undefined behavior
> > in one lane of a vector could affect another lane.
>
> Doesn’t undefined behavior in a program mean that the whole program is
> undefined? I’m not sure why you think there should be a limit on what the
> optimizer can do specifically on the vector lanes when we don’t usually
> put any limit on it.
>
> There might be a question about your fptoui conversion here though: is it
> guaranteed to write zero to the upper bits of the 32-bit register?
> In the IR it produces an i8 value and inserts it into a vector. It isn’t
> clear to me which combine / transformation knows that the fptoui will zero
> the upper part of the register.
>
> Mehdi
>
>
>
> > However, given that LLVM IR is SSA and we are technically creating a new
> > vector and not modifying the old one, then maybe it's OK.  I'm just not
> > sure.
> >
> > Appreciate any insight people may have.
> >
> > Thanks,
> > Tom
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>