[PATCH] D133491: [AArch64] Try to fold shuffle (tbl2, tbl2) to tbl4.
Tim Northover via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 15 01:42:52 PDT 2022
t.p.northover added a comment.
At first glance this seems like a hyper-specific optimization. I take it there's some reasonably common idiom that motivates us even bothering?
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10702-10703
+ SDValue Mask2 = Tbl2->getOperand(3);
+ // Make sure the tbl2 mask only selects values in the first 8 lanes (i.e. the
+ // last 8 lanes all have an index of -1).
+ auto IsLowerExtractMask = [](SDValue Mask) {
----------------
Why do we care about this? It looks like we've already checked that lanes being filled by this check are discarded by the shuffle.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10707
+ return false;
+ for (unsigned I = 8; I < 16; I++) {
+ auto *C = dyn_cast<ConstantSDNode>(Mask->getOperand(I));
----------------
Won't this overflow if it's a `tbl2` producing an `<8 x i8>`?
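To make the concern concrete, here is a minimal standalone sketch of a guarded version of the check. It is not the patch's code: it models the mask's operands with `std::vector<std::optional<int>>` instead of `SDValue`, and the names `MaskLane` and `isLowerExtractMask` are hypothetical. The point is only that the operand count has to be verified before lanes 8..15 are read.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Stand-in for one mask operand: std::nullopt models a non-constant operand,
// otherwise the value is the constant lane index. (Assumption: the real code
// would inspect SDValue operands of a BUILD_VECTOR instead.)
using MaskLane = std::optional<int>;

// Hypothetical model of the lambda under review: returns true only for a
// 16-lane mask whose upper 8 lanes are all the constant -1.
bool isLowerExtractMask(const std::vector<MaskLane> &Mask) {
  // A tbl2 producing <8 x i8> has only 8 mask lanes, so bail out before
  // indexing lanes 8..15.
  if (Mask.size() != 16)
    return false;
  for (std::size_t I = 8; I < 16; ++I) {
    // Reject non-constant lanes and constants other than -1.
    if (!Mask[I] || *Mask[I] != -1)
      return false;
  }
  return true;
}
```

With this shape, an 8-lane mask is rejected up front rather than being read out of bounds.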
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10716
+ return SDValue();
+ SmallVector<SDValue, 16> TBLMaskParts(16, Mask1->getOperand(0));
+ for (unsigned I = 0; I < 8; I++) {
----------------
Maybe default fill with `SDValue()`? We just overwrite all of them immediately afterwards anyway so that'd signal early that the reader doesn't have to care about this line.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10719
+ TBLMaskParts[I] = Mask1->getOperand(I);
+ auto *C = cast<ConstantSDNode>(Mask2->getOperand(I));
+ TBLMaskParts[I + 8] = DAG.getConstant(C->getSExtValue() + 32, dl, MVT::i32);
----------------
Have we checked anywhere that the lower 8 operands are actually constant?
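As an illustration of the last two comments, the following standalone sketch builds the combined mask with an explicit constness check and an empty default fill. It is a model under stated assumptions, not the patch itself: operands are `std::optional<int>` rather than `SDValue`, `buildTbl4Mask` is a hypothetical name, and returning `std::nullopt` stands in for the combine returning `SDValue()`. The `+ 32` offset is taken from the quoted diff.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// std::nullopt models a non-constant mask operand (assumption for this sketch).
using MaskLane = std::optional<int>;

// Hypothetical helper: returns the 16-lane tbl4 mask, or std::nullopt to
// signal that the fold should be abandoned.
std::optional<std::vector<MaskLane>>
buildTbl4Mask(const std::vector<MaskLane> &Mask1,
              const std::vector<MaskLane> &Mask2) {
  if (Mask1.size() < 8 || Mask2.size() < 8)
    return std::nullopt;
  // Default-fill with the empty value; every lane is overwritten below, so
  // the initial contents carry no meaning.
  std::vector<MaskLane> Parts(16);
  for (std::size_t I = 0; I < 8; ++I) {
    // Check that the lane is constant before doing arithmetic on it, rather
    // than assuming it (the analogue of dyn_cast<> versus cast<>).
    if (!Mask2[I])
      return std::nullopt;
    Parts[I] = Mask1[I];
    // Indices into the second tbl2's tables are shifted by 32, as in the diff.
    Parts[I + 8] = *Mask2[I] + 32;
  }
  return Parts;
}
```

If any of the lower 8 lanes of the second mask is not a constant, the helper gives up instead of asserting.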
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D133491/new/
https://reviews.llvm.org/D133491