[llvm] [AArch64][SVE] Select non-temporal instructions for unpredicated loads/stores with the nontemporal flag (PR #171261)

Yuta Mukai via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 11 05:57:31 PST 2025


================
@@ -3061,29 +3061,22 @@ let Predicates = [HasSVE_or_SME] in {
 
   multiclass unpred_store<PatFrag Store, ValueType Ty, Instruction RegRegInst,
                           Instruction RegImmInst, Instruction PTrue,
-                          ComplexPattern AddrCP> {
-    let AddedComplexity = 1 in {
+                          ComplexPattern AddrCP, int AddedComplexity = 0>  {
+    let AddedComplexity = !add(1, AddedComplexity) in {
       def _reg : Pat<(Store Ty:$val, (AddrCP GPR64sp:$base, GPR64:$offset)),
                      (RegRegInst ZPR:$val, (PTrue 31), GPR64sp:$base, GPR64:$offset)>;
     }
-    let AddedComplexity = 2 in {
+    let AddedComplexity = !add(2, AddedComplexity) in {
       def _imm : Pat<(Store Ty:$val, (am_sve_indexed_s4 GPR64sp:$base, simm4s1:$offset)),
                      (RegImmInst ZPR:$val, (PTrue 31), GPR64sp:$base, simm4s1:$offset)>;
     }
----------------
ytmukai wrote:

Just to confirm, do you mean I should refactor the existing `unpred_load`/`unpred_store` first, and then apply this change? I agree this would be ideal if possible.

To test this, I made the following modification in an environment without this patch and compared the results:

```diff
   multiclass unpred_store<PatFrag Store, ValueType Ty, Instruction RegRegInst,
                           Instruction RegImmInst, Instruction PTrue,
                           ComplexPattern AddrCP> {
-    let AddedComplexity = 1 in {
-      def _reg : Pat<(Store Ty:$val, (AddrCP GPR64sp:$base, GPR64:$offset)),
-                     (RegRegInst ZPR:$val, (PTrue 31), GPR64sp:$base, GPR64:$offset)>;
-    }
-    let AddedComplexity = 2 in {
-      def _imm : Pat<(Store Ty:$val, (am_sve_indexed_s4 GPR64sp:$base, simm4s1:$offset)),
-                     (RegImmInst ZPR:$val, (PTrue 31), GPR64sp:$base, simm4s1:$offset)>;
-    }
-
+    def _imm : Pat<(Store Ty:$val, (am_sve_indexed_s4 GPR64sp:$base, simm4s1:$offset)),
+                   (RegImmInst ZPR:$val, (PTrue 31), GPR64sp:$base, simm4s1:$offset)>;
+    def _reg : Pat<(Store Ty:$val, (AddrCP GPR64sp:$base, GPR64:$offset)),
+                   (RegRegInst ZPR:$val, (PTrue 31), GPR64sp:$base, GPR64:$offset)>;
     def : Pat<(Store Ty:$val, GPR64:$base),
               (RegImmInst ZPR:$val, (PTrue 31), GPR64:$base, (i64 0))>;
   }
```
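For context on why `_imm` is moved ahead of `_reg` above: as far as I understand, the ISel emitter tries higher-complexity patterns first and falls back to definition order when complexities tie. The Python sketch below is a simplified model under that assumption, not the actual TableGen logic, which also takes other keys into account. The `Pattern` class, the `_base` name for the unnamed fallback pattern, and the base complexity values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str
    base_complexity: int   # hypothetical value; the real one is derived from the pattern
    added_complexity: int  # the AddedComplexity applied via `let`
    source_order: int      # position of the def in the .td file


def match_order(patterns):
    """Return pattern names in the order a simplified matcher would try them."""
    ranked = sorted(
        patterns,
        # Higher total complexity first; earlier definition wins ties.
        key=lambda p: (-(p.base_complexity + p.added_complexity), p.source_order),
    )
    return [p.name for p in ranked]


# Original multiclass: priority comes from AddedComplexity.
# Equal base complexities are an assumption made for this sketch.
before = [
    Pattern("_reg", 10, 1, 0),
    Pattern("_imm", 10, 2, 1),
    Pattern("_base", 10, 0, 2),
]

# Modified multiclass: no AddedComplexity, priority comes from source order.
after = [
    Pattern("_imm", 10, 0, 0),
    Pattern("_reg", 10, 0, 1),
    Pattern("_base", 10, 0, 2),
]

print(match_order(before))
print(match_order(after))
```

In this model both variants yield the same order, but only because the base complexities are assumed equal. If they differ in the real patterns, source order alone would not reproduce the original priority.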

All tests in `llvm/test/CodeGen/AArch64` passed. However, when I compared the generated `AArch64GenDAGISel.inc` files, their structure and ordering differed, which makes it difficult to show at a glance that the behavior is identical.
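One rough first-pass check I can think of is an order-insensitive comparison of the two generated files. The sketch below is only an illustration: it assumes each table line begins with an offset comment of the form `/* 123*/`, and the helper names `normalize` and `multiset_diff` are hypothetical. It cannot prove identical behavior, because entries that encode sizes or jump distances would still differ after reordering.

```python
import re
from collections import Counter

# Leading table-offset comment such as "/*   123*/ " (assumed format).
_INDEX_COMMENT = re.compile(r"^\s*/\*\s*\d+\s*\*/\s*")


def normalize(text):
    """Strip offset comments and blank lines; return a multiset of lines."""
    lines = []
    for line in text.splitlines():
        line = _INDEX_COMMENT.sub("", line).strip()
        if line:
            lines.append(line)
    return Counter(lines)


def multiset_diff(old_text, new_text):
    """Return (lines only in old, lines only in new), ignoring order."""
    old, new = normalize(old_text), normalize(new_text)
    return old - new, new - old


if __name__ == "__main__":
    old_text = "/*     0*/ OPC_A,\n/*     2*/ OPC_B,\n"
    new_text = "/*     0*/ OPC_B,\n/*     2*/ OPC_A,\n"
    only_old, only_new = multiset_diff(old_text, new_text)
    print(sorted(only_old.elements()), sorted(only_new.elements()))
```

An empty result on both sides would only mean the same table entries are present; it says nothing about whether they are tried in an equivalent order.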

Do you have any suggestions for a good way to verify this?

https://github.com/llvm/llvm-project/pull/171261

