[PATCH] D97858: [AArch64][SVE] Fold vector ZExt/SExt into gather loads where possible
Joe Ellis via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 8 02:17:44 PST 2021
joechrisellis added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:14385
+ Opc == AArch64ISD::GLD1S_SXTW_SCALED_MERGE_ZERO ||
+ Opc == AArch64ISD::GLD1S_IMM_MERGE_ZERO) &&
+ "Unexpected opcode");
----------------
david-arm wrote:
> Are these opcodes in a range, i.e. could you assert something like:
>
> assert((Opc >= Op1 && Opc <= Op2) || (Opc >= Op3 && Opc <= Op4))
>
> It might make the assert look a bit nicer perhaps?
Much nicer -- thank you!
I've split it into two ranges -- one for unsigned gather loads, and one for signed gather loads.
```
assert(((Opc >= AArch64ISD::GLD1_MERGE_ZERO && // unsigned gather loads
Opc <= AArch64ISD::GLD1_IMM_MERGE_ZERO) ||
(Opc >= AArch64ISD::GLD1S_MERGE_ZERO && // signed gather loads
Opc <= AArch64ISD::GLD1S_IMM_MERGE_ZERO)) &&
"Invalid opcode.");```
These ranges are actually adjacent in the enum at the moment but I guess that isn't always guaranteed to be the case.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:14396
+ Opc == AArch64ISD::GLD1_UXTW_SCALED_MERGE_ZERO;
+ const bool Extended = SExt || ZExt;
+
----------------
david-arm wrote:
> In fact, you could move the above assert here instead and have something like:
>
> assert(Scaled || Signed || SExt || ZExt)
>
> maybe?
I am not sure that this will work because it is possible to have an unscaled, unsigned, and unextended opcode -- e.g. `AArch64ISD::GLD1_MERGE_ZERO`. Although this point is definitely valid if we only consider the optimisation patterns that we're performing at the moment.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:15827
+ case AArch64ISD::GLD1S_SXTW_SCALED_MERGE_ZERO:
+ case AArch64ISD::GLD1S_IMM_MERGE_ZERO:
+ return performGLD1Combine(N, DAG);
----------------
david-arm wrote:
> I wonder if you maybe want to make the combine function simpler for now in terms of asserts and checking by only passing in the opcodes that you actually intend to combine? If later on you want to add more then it's easy enough to update. This is just a suggestion though.
I am happy to leave it as-is for now, as long as this isn't a blocker? I think it's nice to be able to add new patterns in for the different ISD nodes without too much wrangling. Just my opinion though --override if you want! 😄
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D97858/new/
https://reviews.llvm.org/D97858
More information about the llvm-commits
mailing list