[PATCH] D130692: [AArch64][SVE] Expand gather index to 32 bits instead of 64 bits

Wed Aug 31 10:00:31 PDT 2022

paulwalker-arm added inline comments.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:296
                              bool Signed) {
-  if (N->getOpcode() != ISD::BUILD_VECTOR)
-    return false;
----------------
I'm probably just being paranoid but with this code now sitting after the call to `getScalarSizeInBits()` can you add something like `assert(N->getValueType(0).isVector() && "Expected a vector!");` here? just in case. 

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:301
+  if (N->getOpcode() == ISD::ZERO_EXTEND)
+    return !Signed;
+  if (N->getOpcode() == ISD::SIGN_EXTEND)
----------------
It's not sufficient to only consider the result type of the extend.  For example, take
```
isVectorShrinkable(zext_to_i64(i48_node, 32, false)
```
With the current implementation you'll return true but:
```
zext_to_i64(trunc_to_i32(i48_node)) != zext_to_i64(i48_node)
```
For the sext & zext cases I think you also need N's operand to be `<= NewEltSize`

================
Comment at: llvm/test/CodeGen/AArch64/sve-fixed-length-masked-gather.ll:1036-1041
+; CHECK-NEXT:    ptrue p1.s, vl32
 ; CHECK-NEXT:    ld1d { z0.d }, p0/z, [x0]
-; CHECK-NEXT:    ld1sw { z1.d }, p0/z, [x1]
+; CHECK-NEXT:    ld1w { z1.s }, p1/z, [x1]
 ; CHECK-NEXT:    fcmeq p1.d, p0/z, z0.d, #0.0
-; CHECK-NEXT:    ld1d { z0.d }, p1/z, [x2, z1.d, lsl #3]
+; CHECK-NEXT:    sunpklo z0.d, z1.s
+; CHECK-NEXT:    ld1d { z0.d }, p1/z, [x2, z0.d, lsl #3]
----------------
I've had a look and believe the issue is that for fixed length vectors we don't want to shrink `Index` when the main datatype of the gather/scatter is a vector of 64bit values because the data elements must line up with the offset elements and when that is not the case operation legalisation will "fix" it by explicitly extending which ever is the smaller type.

This is fixable within `findMoreOptimalIndexType` by just bailing about of such types just before the `// Can indices be trivially shrunk?` block.

The reason this is a fixed-length only problem is because for scalable vectors we use an "unpacked" format where each element of a `nxv2i32` vector sits within the bottom half of each element of a `nxv2i64` vector so element of differing sizes remain aligned to each other.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130692/new/

https://reviews.llvm.org/D130692