[PATCH] D61289: [globalisel] Add G_SEXT_INREG

Daniel Sanders via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 3 10:32:07 PDT 2019


dsanders added a comment.

Sorry for the slow reply, I've been split between quite a few tasks recently.

@rovka: I think I resolved the last issue you had which was that the verifier didn't ensure the immediate had been checked. Could you confirm that?

In D61289#1562830 <https://reviews.llvm.org/D61289#1562830>, @arsenm wrote:

> I'm running into a related problem attempting to implement narrowScalar for G_SEXT. AMDGPU is allergic to 64-bit shifts, so I want to implement
>
>   %1:_(s32) = G_TRUNC %0
>   %2:_(s64) = G_SEXT %1
>
>
> As
>
>   %1:_(s32) = G_TRUNC %0
>   %2:_(s32) = G_ASHR %1, 31
>   %3:_(s64) = G_MERGE_VALUES %1, %2
>  
>
>
> Since the 64-bit shift is possible, the artifact combiner produces the undesirable 64-bit shift combination. Worse, this combination ends up infinitely looping if the required shift itself requires legalization.
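
For concreteness, the decomposition being asked for here can be checked arithmetically. This is a quick Python model of the MIR semantics, not LLVM code; 32/64-bit values are modeled as unsigned integers masked to width:

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def ashr32(x, amt):
    """Arithmetic shift right of a 32-bit value; result as unsigned 32-bit."""
    signed = x - (1 << 32) if x & (1 << 31) else x
    return (signed >> amt) & MASK32

def sext_32_to_64(x):
    """Reference: sign-extend the low 32 bits of x to 64 bits."""
    lo = x & MASK32
    return lo | (0xffffffff00000000 if lo & (1 << 31) else 0)

for val in (0, 1, 0x7fffffff, 0x80000000, 0xdeadbeef):
    lo = val & MASK32            # %1:_(s32) = G_TRUNC %0
    hi = ashr32(lo, 31)          # %2:_(s32) = G_ASHR %1, 31
    merged = (hi << 32) | lo     # %3:_(s64) = G_MERGE_VALUES %1, %2
    assert merged == sext_32_to_64(val)
print("ok")
```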


With this patch, that input will turn into:

> %1:_(s64) = G_ANYEXT %0 // Skipped if %0 is s64
>  %2:_(s64) = G_SEXT_INREG %1, 32
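
As a sanity check on what the G_SEXT_INREG computes here (again a quick Python model of the semantics, not LLVM code): on s64 it behaves like the shl/ashr pair by 64 - 32, and G_ANYEXT followed by G_SEXT_INREG reproduces the original G_SEXT of the s32 value:

```python
MASK64 = (1 << 64) - 1

def to_signed64(x):
    return x - (1 << 64) if x & (1 << 63) else x

def sext_inreg64(x, bits):
    """Model of G_SEXT_INREG on s64: sign-extend the low `bits` bits."""
    shamt = 64 - bits
    return (to_signed64((x << shamt) & MASK64) >> shamt) & MASK64

for val32 in (0, 5, 0x7fffffff, 0x80000000, 0xfffffffe):
    anyext = val32                       # %1 = G_ANYEXT %0 (high bits undefined; zero here)
    result = sext_inreg64(anyext, 32)    # %2 = G_SEXT_INREG %1, 32
    # Reference: what a direct 64-bit G_SEXT of the s32 value would produce.
    expected = val32 | (0xffffffff00000000 if val32 & (1 << 31) else 0)
    assert result == expected
print("ok")
```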

If you then narrowScalar the G_SEXT_INREG (and I've just noticed my last update lost this part of the patch, I'll fix that in a moment), you'd get:

> %1:_(s64) = G_ANYEXT %0 // Skipped if %0 is s64
>  %3:_(s32), %4:_(s32) = G_UNMERGE_VALUES %1:_(s64)
>  %5:_(s32) = G_ASHR %3, 31
>  %2:_(s64) = G_MERGE_VALUES %3:_(s32), %5:_(s32)
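
The narrowScalar'd sequence can be checked the same way (Python model, not LLVM code): the result's low half is the low half of the input, and the high half is the arithmetic shift of that low half by 31:

```python
MASK32 = (1 << 32) - 1

def ashr32(x, amt):
    """Arithmetic shift right of a 32-bit value; result as unsigned 32-bit."""
    signed = x - (1 << 32) if x & (1 << 31) else x
    return (signed >> amt) & MASK32

def sext_inreg64_ref(x, bits):
    """Reference sext_inreg on s64."""
    lo = x & ((1 << bits) - 1)
    return lo | (((1 << 64) - (1 << bits)) if lo & (1 << (bits - 1)) else 0)

for val in (0, 0x1234, 0x00000000deadbeef, 0xaaaaaaaa55555555):
    lo, hi = val & MASK32, (val >> 32) & MASK32  # %3, %4 = G_UNMERGE_VALUES %1
    new_hi = ashr32(lo, 31)                      # %5 = G_ASHR %3, 31
    merged = (new_hi << 32) | lo                 # %2 = G_MERGE_VALUES %3, %5
    assert merged == sext_inreg64_ref(val, 32)
print("ok")
```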

Assuming we have a full set of artifact combines, we'd then get one of the following. For %0 is s64:

> %3:_(s32), %4:_(s32) = G_UNMERGE_VALUES %0:_(s64)
>  %5:_(s32) = G_ASHR %3, 31
>  %2:_(s64) = G_MERGE_VALUES %3:_(s32), %5:_(s32)

for %0 is <s64 and >s32:

> %1:_(s64) = G_ANYEXT %0 // Skipped if %0 is s64
>  %3:_(s32), %4:_(s32) = G_UNMERGE_VALUES %1:_(s64)
>  %5:_(s32) = G_ASHR %3, 31
>  %2:_(s64) = G_MERGE_VALUES %3:_(s32), %5:_(s32)

for %0 is s32:

> %5:_(s32) = G_ASHR %0, 31 // %0 is s32
>  %2:_(s64) = G_MERGE_VALUES %0:_(s32), %5:_(s32)

or this for %0 <s32:

> %1:_(s32) = G_ANYEXT %0 // %0 is <s32
>  %5:_(s32) = G_ASHR %1, 31
>  %2:_(s64) = G_MERGE_VALUES %1:_(s32), %5:_(s32)

which all look like they'd do what you want.

In D61289#1563253 <https://reviews.llvm.org/D61289#1563253>, @arsenm wrote:

> I think it would be useful to have generic UBFE/SBFE instructions, which could take the place of sext_inreg. However, those would allow variable offset/width, so it wouldn't be quite the same.


That's a good point: G_SEXT_INREG %0, 16 is really just SBFE %0, 0, 16. The snag is that, as you hint at, allowing variable offset and width would mean representing those operands with G_CONSTANT, which would require us to support context-sensitive legality rules to cover targets that have sign-extend instructions but lack a full signed bit-field extract. We could potentially fix that by somehow supporting instructions that accept both MO.isReg() and MO.isImm() operands (currently the opcode statically permits one or the other, but not both). Another option would be a static SBFE that only allows immediates and a dynamic one that only allows registers; in the worst case that approach would need four opcodes, though I'm not sure we'd really need them all. Do any targets allow both the position and the width to be specified as registers?
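
To make the equivalence concrete, here's a quick Python model (not LLVM code; `sbfe` is a hypothetical helper standing in for a generic SBFE with constant offset/width):

```python
MASK64 = (1 << 64) - 1

def sbfe(x, offset, width):
    """Signed bit-field extract: take `width` bits at `offset`, sign-extend to 64."""
    field = (x >> offset) & ((1 << width) - 1)
    if field & (1 << (width - 1)):
        field |= (1 << 64) - (1 << width)
    return field

def sext_inreg64(x, bits):
    """Independent model of G_SEXT_INREG on s64 via a shl/ashr pair."""
    shamt = 64 - bits
    v = (x << shamt) & MASK64
    signed = v - (1 << 64) if v & (1 << 63) else v
    return (signed >> shamt) & MASK64

# G_SEXT_INREG %0, 16 behaves like SBFE %0, 0, 16.
for val in (0, 0x8000, 0x7fff, 0x12345678, 0xffffffffffff8001):
    assert sext_inreg64(val, 16) == sbfe(val, 0, 16)
print("ok")
```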


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61289/new/

https://reviews.llvm.org/D61289
