[llvm] [AArch64][SVE] Select non-temporal instructions for unpredicated loads/stores with the nontemporal flag (PR #171261)
Yuta Mukai via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 11 05:57:31 PST 2025
================
@@ -3061,29 +3061,22 @@ let Predicates = [HasSVE_or_SME] in {
multiclass unpred_store<PatFrag Store, ValueType Ty, Instruction RegRegInst,
Instruction RegImmInst, Instruction PTrue,
- ComplexPattern AddrCP> {
- let AddedComplexity = 1 in {
+ ComplexPattern AddrCP, int AddedComplexity = 0> {
+ let AddedComplexity = !add(1, AddedComplexity) in {
def _reg : Pat<(Store Ty:$val, (AddrCP GPR64sp:$base, GPR64:$offset)),
(RegRegInst ZPR:$val, (PTrue 31), GPR64sp:$base, GPR64:$offset)>;
}
- let AddedComplexity = 2 in {
+ let AddedComplexity = !add(2, AddedComplexity) in {
def _imm : Pat<(Store Ty:$val, (am_sve_indexed_s4 GPR64sp:$base, simm4s1:$offset)),
(RegImmInst ZPR:$val, (PTrue 31), GPR64sp:$base, simm4s1:$offset)>;
}
----------------
ytmukai wrote:
Just to confirm, do you mean I should refactor the existing `unpred_load`/`unpred_store` first, and then apply this change? I agree this would be ideal if possible.
To check whether that is feasible, I removed the `AddedComplexity` wrappers and reordered the `_imm` and `_reg` patterns in a tree without this patch, then compared the results:
```diff
multiclass unpred_store<PatFrag Store, ValueType Ty, Instruction RegRegInst,
Instruction RegImmInst, Instruction PTrue,
ComplexPattern AddrCP> {
- let AddedComplexity = 1 in {
- def _reg : Pat<(Store Ty:$val, (AddrCP GPR64sp:$base, GPR64:$offset)),
- (RegRegInst ZPR:$val, (PTrue 31), GPR64sp:$base, GPR64:$offset)>;
- }
- let AddedComplexity = 2 in {
- def _imm : Pat<(Store Ty:$val, (am_sve_indexed_s4 GPR64sp:$base, simm4s1:$offset)),
- (RegImmInst ZPR:$val, (PTrue 31), GPR64sp:$base, simm4s1:$offset)>;
- }
-
+ def _imm : Pat<(Store Ty:$val, (am_sve_indexed_s4 GPR64sp:$base, simm4s1:$offset)),
+ (RegImmInst ZPR:$val, (PTrue 31), GPR64sp:$base, simm4s1:$offset)>;
+ def _reg : Pat<(Store Ty:$val, (AddrCP GPR64sp:$base, GPR64:$offset)),
+ (RegRegInst ZPR:$val, (PTrue 31), GPR64sp:$base, GPR64:$offset)>;
def : Pat<(Store Ty:$val, GPR64:$base),
(RegImmInst ZPR:$val, (PTrue 31), GPR64:$base, (i64 0))>;
}
```
All tests in `llvm/test/CodeGen/AArch64` passed. However, when I compared the generated `AArch64GenDAGISel.inc` files, their structure and pattern order differed, which makes it difficult to demonstrate at a glance that the behavior is identical.
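One fallback would be to compare what the two builds emit rather than the generated tables. Below is a minimal sketch, assuming each build's `llc` output for the same inputs and flags has already been written to two directories. The directory layout and the names `digest_tree` and `compare_trees` are hypothetical, not part of any LLVM tooling.

```python
import hashlib
from pathlib import Path


def digest_tree(root):
    """Map each file path (relative to root) to a SHA-256 of its contents."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(before, after):
    """Return (only_before, only_after, changed) as sorted lists of relative paths.

    `before` and `after` are assumed to hold the outputs of the two builds,
    for example one .s file per test input (hypothetical layout).
    """
    a, b = digest_tree(before), digest_tree(after)
    only_before = sorted(set(a) - set(b))
    only_after = sorted(set(b) - set(a))
    changed = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_before, only_after, changed
```

Three empty lists would mean the two builds produced byte-identical output for every input covered. That only covers the inputs tested, so it is weaker than showing that the matcher tables themselves are equivalent.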
Do you have any suggestions for a good way to verify this?
https://github.com/llvm/llvm-project/pull/171261