[PATCH] D64726: AMDGPU/GlobalISel: Fix not constraining result reg of copies to VCC
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 16 06:58:47 PDT 2019
arsenm added a comment.
In D64726#1587496 <https://reviews.llvm.org/D64726#1587496>, @nhaehnle wrote:
> In D64726#1587457 <https://reviews.llvm.org/D64726#1587457>, @arsenm wrote:
>
> > In D64726#1587233 <https://reviews.llvm.org/D64726#1587233>, @nhaehnle wrote:
> >
> > > This seems incorrect, doesn't it? The truncation disappeared.... (e.g., what if $sgpr0 is 0x10)
> >
> >
> > My current understanding is that G_TRUNC is a no-op and is supposed to always be legal. This is supposed to be the legalized MIR, so theoretically it was generated by something that knew the original argument was zeroext from i1.
>
>
> Where do you get this from? And if this is the case, what is the real `trunc` instruction lowered to? This smells extremely fishy to me.
>
> Case in point; take this IR:
>
> define amdgpu_ps float @foo(i32 %a) {
> %cc = trunc i32 %a to i1
> %r = zext i1 %cc to i32
> %r.f = bitcast i32 %r to float
> ret float %r.f
> }
>
>
> which becomes, before legalization:
>
> bb.1 (%ir-block.0):
> liveins: $vgpr0
> %0:_(s32) = COPY $vgpr0
> %1:_(s1) = G_TRUNC %0:_(s32)
> %2:_(s32) = G_ZEXT %1:_(s1)
> $vgpr0 = COPY %2:_(s32)
> SI_RETURN_TO_EPILOG $vgpr0
>
>
> So it is very plain here that we cannot assume the high bits of the input to be zero.
>
> Now //maybe// an exception can be made here due to something that happens earlier in legalization, but it does seem highly suspicious and it would be good to have a test all the way from IR to understand why this is happening -- having different semantics for the same opcode at different stages of the compilation is not cool.

After legalization, you get:

%0:vgpr(s32) = COPY $vgpr0
%5:sgpr(s32) = G_CONSTANT i32 1
%6:vgpr(s32) = COPY %0(s32)
%7:vgpr(s32) = COPY %5(s32)
%4:vgpr(s32) = G_AND %6, %7
$vgpr0 = COPY %4(s32)
SI_RETURN_TO_EPILOG $vgpr0

The G_ZEXT here is what has the semantics of producing 0 in the high bits, not the G_TRUNC. G_TRUNC is a legalization artifact that makes the sizes match, so the MIR stays valid at every point during legalization. This is the point that the section about G_ANYEXT and G_TRUNC in D62423 <https://reviews.llvm.org/D62423> is getting at.
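
As a rough illustration of that reading (not part of the patch), the trunc/zext pair can be modeled on plain 32-bit integers in Python. The function names below are hypothetical and are not LLVM APIs; the sketch assumes the s1 value lives in a 32-bit register whose high bits are don't-care until something extends it.

```python
# Illustrative model only; these helper names are hypothetical, not LLVM APIs.

def trunc_s32_to_s1(reg: int) -> int:
    # Modeled as a no-op on the 32-bit register: only bit 0 is meaningful
    # afterwards, and the high bits are left as whatever they were.
    return reg & 0xFFFFFFFF


def zext_s1_to_s32(reg: int) -> int:
    # The extension is what defines the high bits: keep bit 0, zero the rest.
    return reg & 1


def legalized(reg: int) -> int:
    # Mirrors the legalized MIR above: G_AND of the input with G_CONSTANT i32 1.
    return (reg & 0xFFFFFFFF) & 1


# The trunc + zext pair and the single AND agree on every sampled input.
for value in (0x0, 0x1, 0x10, 0x11, 0xFFFFFFFF):
    assert zext_s1_to_s32(trunc_s32_to_s1(value)) == legalized(value) == (value & 1)
```

For the 0x10 input raised earlier in the thread, both the trunc/zext pair and the legalized AND produce 0 in this model, because the masking happens at the extension rather than at the truncation.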
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D64726/new/
https://reviews.llvm.org/D64726