[llvm] [AArch64][GlobalISel] Expand abs.v4i8 to v4i16 and abs.v2s16 to v2s32 (PR #81231)
Dhruv Chawla via llvm-commits
llvm-commits at lists.llvm.org
Sun Feb 18 21:48:55 PST 2024
dc03-work wrote:
@davemgreen I tried adding legalization support for `G_ANYEXT` by widening <4 x i8> to <8 x i8>, and that did work. However, it now fails further along in the pipeline, in RegBankSelect, because of `G_EXTRACT_VECTOR_ELT`. This is the MIR that gets fed into RegBankSelect:
```llvm
# *** IR Dump Before RegBankSelect (regbankselect) ***:
# Machine code for function abs_v4i8: IsSSA, TracksLiveness, Legalized
Function Live Ins: $d0
bb.1.entry:
liveins: $d0
%1:_(<4 x s16>) = COPY $d0
%41:_(s64) = G_CONSTANT i64 0
%21:_(s16) = G_EXTRACT_VECTOR_ELT %1:_(<4 x s16>), %41:_(s64)
%42:_(s64) = G_CONSTANT i64 1
%22:_(s16) = G_EXTRACT_VECTOR_ELT %1:_(<4 x s16>), %42:_(s64)
%43:_(s64) = G_CONSTANT i64 2
%23:_(s16) = G_EXTRACT_VECTOR_ELT %1:_(<4 x s16>), %43:_(s64)
%44:_(s64) = G_CONSTANT i64 3
%24:_(s16) = G_EXTRACT_VECTOR_ELT %1:_(<4 x s16>), %44:_(s64)
%4:_(s8) = G_TRUNC %21:_(s16)
%5:_(s8) = G_TRUNC %22:_(s16)
%6:_(s8) = G_TRUNC %23:_(s16)
%7:_(s8) = G_TRUNC %24:_(s16)
%8:_(s8) = G_IMPLICIT_DEF
%9:_(<8 x s8>) = G_BUILD_VECTOR %4:_(s8), %5:_(s8), %6:_(s8), %7:_(s8), %8:_(s8), %8:_(s8), %8:_(s8), %8:_(s8)
%10:_(<8 x s8>) = G_ABS %9:_
%19:_(<4 x s8>), %20:_(<4 x s8>) = G_UNMERGE_VALUES %10:_(<8 x s8>)
%45:_(s64) = G_CONSTANT i64 0
%25:_(s8) = G_EXTRACT_VECTOR_ELT %19:_(<4 x s8>), %45:_(s64)
%46:_(s64) = G_CONSTANT i64 1
%26:_(s8) = G_EXTRACT_VECTOR_ELT %19:_(<4 x s8>), %46:_(s64)
%47:_(s64) = G_CONSTANT i64 2
%27:_(s8) = G_EXTRACT_VECTOR_ELT %19:_(<4 x s8>), %47:_(s64)
%48:_(s64) = G_CONSTANT i64 3
%28:_(s8) = G_EXTRACT_VECTOR_ELT %19:_(<4 x s8>), %48:_(s64)
%29:_(<8 x s8>) = G_BUILD_VECTOR %25:_(s8), %26:_(s8), %27:_(s8), %28:_(s8), %8:_(s8), %8:_(s8), %8:_(s8), %8:_(s8)
%30:_(<8 x s16>) = G_ANYEXT %29:_(<8 x s8>)
%39:_(<4 x s16>), %40:_(<4 x s16>) = G_UNMERGE_VALUES %30:_(<8 x s16>)
$d0 = COPY %39:_(<4 x s16>)
RET_ReallyLR implicit $d0
# End machine code for function abs_v4i8.
```
Interestingly, widening <2 x i16> to <4 x i16> works just fine, although the resulting codegen is a lot worse than what SelectionDAG produces.
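For reference, a minimal input that should produce a dump of this shape is sketched below. It is reconstructed from the function name `abs_v4i8` in the dump rather than copied from the test file, so the exact signature and the `i1 false` (is_int_min_poison) operand are assumptions:

```llvm
; Assumed reproducer: the function name matches the dump above, the rest is a guess.
declare <4 x i8> @llvm.abs.v4i8(<4 x i8>, i1)

define <4 x i8> @abs_v4i8(<4 x i8> %a) {
entry:
  ; The second operand is the is_int_min_poison flag; false is assumed here.
  %res = call <4 x i8> @llvm.abs.v4i8(<4 x i8> %a, i1 false)
  ret <4 x i8> %res
}
```

Running something like `llc -mtriple=aarch64 -global-isel -print-before=regbankselect` on this file should print the MIR just before RegBankSelect, assuming the legalizer change from this PR is applied.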
https://github.com/llvm/llvm-project/pull/81231