[llvm] [AArch64][GlobalISel] Expand abs.v4i8 to v4i16 and abs.v2s16 to v2s32 (PR #81231)

Dhruv Chawla via llvm-commits llvm-commits at lists.llvm.org
Sun Feb 18 21:48:55 PST 2024


dc03-work wrote:

@davemgreen I tried adding legalization support for `G_ANYEXT` by widening <4 x i8> to <8 x i8>, and that did work. However, it now fails further along in the pipeline, in RegBankSelect, because of `G_EXTRACT_VECTOR_ELT`. This is the MIR that gets fed into RegBankSelect:

```llvm
# *** IR Dump Before RegBankSelect (regbankselect) ***:
# Machine code for function abs_v4i8: IsSSA, TracksLiveness, Legalized
Function Live Ins: $d0

bb.1.entry:
  liveins: $d0
  %1:_(<4 x s16>) = COPY $d0
  %41:_(s64) = G_CONSTANT i64 0
  %21:_(s16) = G_EXTRACT_VECTOR_ELT %1:_(<4 x s16>), %41:_(s64)
  %42:_(s64) = G_CONSTANT i64 1
  %22:_(s16) = G_EXTRACT_VECTOR_ELT %1:_(<4 x s16>), %42:_(s64)
  %43:_(s64) = G_CONSTANT i64 2
  %23:_(s16) = G_EXTRACT_VECTOR_ELT %1:_(<4 x s16>), %43:_(s64)
  %44:_(s64) = G_CONSTANT i64 3
  %24:_(s16) = G_EXTRACT_VECTOR_ELT %1:_(<4 x s16>), %44:_(s64)
  %4:_(s8) = G_TRUNC %21:_(s16)
  %5:_(s8) = G_TRUNC %22:_(s16)
  %6:_(s8) = G_TRUNC %23:_(s16)
  %7:_(s8) = G_TRUNC %24:_(s16)
  %8:_(s8) = G_IMPLICIT_DEF
  %9:_(<8 x s8>) = G_BUILD_VECTOR %4:_(s8), %5:_(s8), %6:_(s8), %7:_(s8), %8:_(s8), %8:_(s8), %8:_(s8), %8:_(s8)
  %10:_(<8 x s8>) = G_ABS %9:_
  %19:_(<4 x s8>), %20:_(<4 x s8>) = G_UNMERGE_VALUES %10:_(<8 x s8>)
  %45:_(s64) = G_CONSTANT i64 0
  %25:_(s8) = G_EXTRACT_VECTOR_ELT %19:_(<4 x s8>), %45:_(s64)
  %46:_(s64) = G_CONSTANT i64 1
  %26:_(s8) = G_EXTRACT_VECTOR_ELT %19:_(<4 x s8>), %46:_(s64)
  %47:_(s64) = G_CONSTANT i64 2
  %27:_(s8) = G_EXTRACT_VECTOR_ELT %19:_(<4 x s8>), %47:_(s64)
  %48:_(s64) = G_CONSTANT i64 3
  %28:_(s8) = G_EXTRACT_VECTOR_ELT %19:_(<4 x s8>), %48:_(s64)
  %29:_(<8 x s8>) = G_BUILD_VECTOR %25:_(s8), %26:_(s8), %27:_(s8), %28:_(s8), %8:_(s8), %8:_(s8), %8:_(s8), %8:_(s8)
  %30:_(<8 x s16>) = G_ANYEXT %29:_(<8 x s8>)
  %39:_(<4 x s16>), %40:_(<4 x s16>) = G_UNMERGE_VALUES %30:_(<8 x s16>)
  $d0 = COPY %39:_(<4 x s16>)
  RET_ReallyLR implicit $d0

# End machine code for function abs_v4i8.
```

Interestingly, widening <2 x i16> to <4 x i16> works just fine, although the resulting codegen is a lot worse than what SDAG produces.
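
For reference, a reduced input along these lines should lead to a dump like the one above. This is only a sketch: the function name `abs_v4i8` and the `entry` block label are taken from the dump, but the exact test in the PR may differ, and the value names and the `i1 false` poison flag are assumptions.

```llvm
; Hypothetical reproducer, reconstructed from the dump above.
; Assumed invocation (the triple and exact flags may differ from the PR's test):
;   llc -mtriple=aarch64 -global-isel -print-before=regbankselect abs.ll
define <4 x i8> @abs_v4i8(<4 x i8> %a) {
entry:
  ; The second operand says whether INT_MIN yields poison; false is assumed here.
  %res = call <4 x i8> @llvm.abs.v4i8(<4 x i8> %a, i1 false)
  ret <4 x i8> %res
}

declare <4 x i8> @llvm.abs.v4i8(<4 x i8>, i1)
```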

https://github.com/llvm/llvm-project/pull/81231
