[PATCH] D64726: AMDGPU/GlobalISel: Fix not constraining result reg of copies to VCC

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 16 06:58:47 PDT 2019


arsenm added a comment.

In D64726#1587496 <https://reviews.llvm.org/D64726#1587496>, @nhaehnle wrote:

> In D64726#1587457 <https://reviews.llvm.org/D64726#1587457>, @arsenm wrote:
>
> > In D64726#1587233 <https://reviews.llvm.org/D64726#1587233>, @nhaehnle wrote:
> >
> > > This seems incorrect, doesn't it? The truncation disappeared.... (e.g., what if $sgpr0 is 0x10)
> >
> >
> > My current understanding of G_TRUNC is it's a no-op, and supposed to always be legal. This is supposed to be the legalized MIR, so theoretically this was generated by something that knew the original argument was zeroext from i1
>
>
> Where do you get this from? And if this is the case, what is the real `trunc` instruction lowered to? This smells extremely fishy to me.
>
> Case in point; take this IR:
>
>   define amdgpu_ps float @foo(i32 %a) {
>     %cc = trunc i32 %a to i1
>     %r = zext i1 %cc to i32
>     %r.f = bitcast i32 %r to float
>     ret float %r.f
>   }
>
>
> which becomes, before legalization:
>
>   bb.1 (%ir-block.0):
>     liveins: $vgpr0
>     %0:_(s32) = COPY $vgpr0
>     %1:_(s1) = G_TRUNC %0:_(s32)
>     %2:_(s32) = G_ZEXT %1:_(s1)
>     $vgpr0 = COPY %2:_(s32)
>     SI_RETURN_TO_EPILOG $vgpr0
>
>
> So it is very plain here that we cannot assume the high bits of the input to be zero.
>
> Now //maybe// an exception can be made here due to something that happens earlier in legalization, but it does seem highly suspicious and it would be good to have a test all the way from IR to understand why this is happening -- having different semantics for the same opcode at different stages of the compilation is not cool.


After legalization, you get:

  %0:vgpr(s32) = COPY $vgpr0
  %5:sgpr(s32) = G_CONSTANT i32 1
  %6:vgpr(s32) = COPY %0(s32)
  %7:vgpr(s32) = COPY %5(s32)
  %4:vgpr(s32) = G_AND %6, %7
  $vgpr0 = COPY %4(s32)
  SI_RETURN_TO_EPILOG $vgpr0

The G_ZEXT (legalized here to a G_AND with 1) is what guarantees zeros in the high bits, not the G_TRUNC. G_TRUNC is a legalization artifact: it makes the sizes match so the MIR stays valid at every point during legalization. This is what the section about G_ANYEXT and G_TRUNC in D62423 <https://reviews.llvm.org/D62423> is getting at.
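
To make the bit-level argument concrete, here is a rough Python model of the example (not LLVM code; the function names are hypothetical and only mirror the opcodes informally). It assumes 32-bit unsigned values and checks that trunc-to-s1 followed by zext-to-s32 gives the same result as the G_AND with 1 in the legalized MIR, including for the 0x10 input raised earlier in the thread.

```python
MASK32 = 0xFFFFFFFF  # assumption: values are modeled as unsigned 32-bit integers


def trunc(x: int, bits: int) -> int:
    """Informal model of G_TRUNC: keep only the low `bits` bits."""
    return x & ((1 << bits) - 1)


def zext(x: int, from_bits: int) -> int:
    """Informal model of G_ZEXT: the source has `from_bits` meaningful bits,
    and every bit above them is zero in the result."""
    return x & ((1 << from_bits) - 1)


def ir_semantics(x: int) -> int:
    """trunc i32 -> i1, then zext i1 -> i32, as in the IR function @foo."""
    return zext(trunc(x & MASK32, 1), 1)


def legalized(x: int) -> int:
    """The legalized form shown above: a single G_AND with the constant 1."""
    return (x & MASK32) & 1


def trunc_dropped_without_mask(x: int) -> int:
    """What would go wrong if the truncation vanished and nothing masked
    the value afterwards: the high bits leak through."""
    return x & MASK32


if __name__ == "__main__":
    for value in (0x0, 0x1, 0x10, 0x11, 0xFFFFFFFF):
        assert ir_semantics(value) == legalized(value)
    # 0x10 has bit 0 clear, so the correct result is 0, not 0x10.
    assert legalized(0x10) == 0
    assert trunc_dropped_without_mask(0x10) == 0x10
```

In this model the masking is done entirely on the zext side, which is the sense in which the G_TRUNC itself can be treated as a no-op as long as the extension that consumes it clears the high bits.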


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64726/new/

https://reviews.llvm.org/D64726




