[PATCH] D117900: [AArch64][SVE] Fold gather/scatter with 32bits when possible
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 31 06:57:32 PST 2022
sdesmalen added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16155
+// their recommended values.
+static bool FindMoreOptimalIndexType(const MaskedGatherScatterSDNode *N,
+ SDValue &BasePtr, SDValue &Index,
----------------
nit: functions should start with lower case, i.e. `findMoreOptimalIndexType`.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16159
+ SelectionDAG &DAG) {
+
+ // Only consider element types that are pointer sized as smaller types can
----------------
nit: redundant newline.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16174
+ Stride =
+ dyn_cast<ConstantSDNode>(StepVector.getOperand(0))->getSExtValue();
+ Offset = DAG.getNode(ISD::MUL, DL, MVT::i64, Offset, N->getScale());
----------------
s/dyn_cast/cast/
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16180-16187
+ // Earlier returns because there is no fold change.
+ if (Stride == 0)
+ return false;
+
+ // Ensure all element can be represented as a signed i32 value.
+ if (Stride < std::numeric_limits<int32_t>::min() ||
+ Stride > std::numeric_limits<int32_t>::max())
----------------
nit: maybe just merge these conditions together:
if (!Stride || Stride < std::numeric_limits<int32_t>::min() ||
Stride > std::numeric_limits<int32_t>::max())
return false?
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16225
+ // the use of an Index that's more legalisation friendly.
+ if (FindMoreOptimalIndexType(MGS, BasePtr, Index, IndexType, DAG)) {
+ if (auto *MGT = dyn_cast<MaskedGatherSDNode>(MGS)) {
----------------
nit: you can remove one more level of indentation, by writing it like this:
if (!FindMoreOptimalIndexType(MGS, BasePtr, Index, IndexType, DAG))
return SDValue();
if (auto *MGT = ...) {
...
return ...
} else {
...
return ...
}
llvm_unreachable("Unhandled case");
================
Comment at: llvm/test/CodeGen/AArch64/sve-gather-scatter-addr-opts.ll:3
+; RUN: llc < %s -mtriple=aarch64-linux-unknown | FileCheck %s
+
+; As scatter_f16_index_offset_8 but with a variable stride that we cannot prove
----------------
nit: can you group the positive test cases together (`scatter_i8_index_offset_minimum` and `scatter_i8_index_offset_maximum`), followed by the negative test cases?
================
Comment at: llvm/test/CodeGen/AArch64/sve-gather-scatter-addr-opts.ll:4
+
+; As scatter_f16_index_offset_8 but with a variable stride that we cannot prove
+; will not wrap when shrunk to be i32 based.
----------------
This test does not exist in this patch?
================
Comment at: llvm/test/CodeGen/AArch64/sve-gather-scatter-addr-opts.ll:37
+; Ensure we use a "vscale x 4" wide scatter for the maximum supported offset.
+define void @scatter_i8_index_offset_maximum(i8* %base, i64 %offset, <vscale x 4 x i1> %pg, <vscale x 4 x i8> %data) #0 {
+; CHECK-LABEL: scatter_i8_index_offset_maximum:
----------------
both positive tests use `i8` as types, can you also use a different type (e.g. i16/i32), so that we can check the scaling happens correctly? (`sxtw #1` or `sxtw #2`)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D117900/new/
https://reviews.llvm.org/D117900
More information about the llvm-commits
mailing list