[llvm-dev] Is this undefined behavior optimization legal?
Tom Stellard via llvm-dev
llvm-dev at lists.llvm.org
Mon Oct 3 14:13:29 PDT 2016
On Mon, Oct 03, 2016 at 03:58:01PM -0500, Hal Finkel wrote:
> ----- Original Message -----
> > From: "Tom Stellard via llvm-dev" <llvm-dev at lists.llvm.org>
> > To: llvm-dev at lists.llvm.org
> > Sent: Monday, October 3, 2016 3:51:40 PM
> > Subject: [llvm-dev] Is this undefined behavior optimization legal?
> >
> > Hi,
> >
> > I've found a test case where SelectionDAG is doing an undefined behavior
> > optimization, and I need help determining whether or not this is legal.
> >
> > Here is the example IR:
> >
> > define void @test(<4 x i8> addrspace(1)* %out, float %a) {
> > %uint8 = fptoui float %a to i8
> > %vec = insertelement <4 x i8> <i8 0, i8 0, i8 0, i8 0>, i8 %uint8, i32 0
> > store <4 x i8> %vec, <4 x i8> addrspace(1)* %out
> > ret void
> > }
> >
> > Since %vec is a 32-bit vector, a common way to implement this function
> > on a target with 32-bit registers would be to zero-initialize a 32-bit
> > register to hold the initial vector and then 'mask' and 'or' the
> > inserted value with the initial vector. In AMDGPU assembly it would
> > look something like:
> >
> > v_mov_b32 v0, 0
> > v_cvt_u32_f32_e32 v1, s0
> > v_and_b32 v1, v1, 0x000000ff
> > v_or_b32 v0, v0, v1
> >
> > The optimization that SelectionDAG does for us in this function, though,
> > ends up removing the mask operation, which gives us:
> >
> > v_mov_b32 v0, 0
> > v_cvt_u32_f32_e32 v1, s0
> > v_or_b32 v0, v0, v1
> >
> > The reason SelectionDAG does this is that it knows that the result of
> > %uint8 = fptoui float %a to i8 is undefined when the result uses more
> > than 8 bits. So, it assumes that the result will only set the low
> > 8 bits, because anything else would be undefined behavior and the
> > program would be broken. This assumption is what causes it to remove
> > the 'and' operation.
> >
> > So effectively, what has happened here is that by inserting the result
> > of an operation with undefined behavior into one lane of a vector, we
> > have overwritten all the other lanes of the vector.
> >
> > Is this optimization legal? To me it seems wrong that undefined behavior
> > in one lane of a vector could affect another lane. However, given that
> > LLVM IR is SSA and we are technically creating a new vector and not
> > modifying the old one, then maybe it's OK. I'm just not sure.
> >
> > Appreciate any insight people may have.
>
> So, to be clear, for values of %a that are not undefined behavior (i.e. that really do produce an integer that can be represented in the i8), the code does indeed store <4 x i8> <i8 %uint8, i8 0, i8 0, i8 0> into *%out? If so, this seems legal to me.
>
That is correct. When there is no undefined behavior, the high 24 bits
(representing lanes 1, 2, 3) of the stored value are always 0.
-Tom
> -Hal
>
> >
> > Thanks,
> > Tom
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
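
For illustration only, here is a rough Python model of the two instruction
sequences discussed in the thread. It is a sketch, not the actual
SelectionDAG or hardware behavior: the helper names (cvt_u32_f32,
insert_lane0_masked, insert_lane0_unmasked, lanes) are hypothetical, and
the conversion is assumed to truncate toward zero and clamp to the 32-bit
unsigned range. NaN inputs are not modeled.

```python
U32_MAX = 0xFFFFFFFF


def cvt_u32_f32(x: float) -> int:
    """Assumed model of a float -> u32 conversion: truncate, then clamp."""
    return max(0, min(int(x), U32_MAX))


def insert_lane0_masked(a: float) -> int:
    """Sequence with the 'and': mov, cvt, and 0xff, or."""
    v0 = 0
    v1 = cvt_u32_f32(a)
    v1 &= 0x000000FF
    return (v0 | v1) & U32_MAX


def insert_lane0_unmasked(a: float) -> int:
    """Sequence after the mask has been removed: mov, cvt, or."""
    v0 = 0
    v1 = cvt_u32_f32(a)
    return (v0 | v1) & U32_MAX


def lanes(reg: int) -> list:
    """Split a 32-bit register into four i8 lanes, lane 0 in the low byte."""
    return [(reg >> (8 * i)) & 0xFF for i in range(4)]


if __name__ == "__main__":
    for a in (200.0, 300.0):
        print(a, lanes(insert_lane0_masked(a)), lanes(insert_lane0_unmasked(a)))
```

Under these assumptions, an in-range input such as 200.0 gives the same
register contents for both sequences, with lanes 1, 2 and 3 equal to 0.
For an out-of-range input such as 300.0 (0x12c), the unmasked sequence
leaves a nonzero value in lane 1, which is the effect described above.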