[clang] [llvm] [mlir] [AArch64][SME] Improve codegen for aarch64.sme.cnts* when not in streaming mode (PR #154761)
Benjamin Maxwell via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 5 04:49:54 PDT 2025
================
@@ -822,16 +822,18 @@ struct OuterProductWideningOpConversion
}
};
-/// Lower `arm_sme.streaming_vl` to SME CNTS intrinsics.
+/// Lower `arm_sme.streaming_vl` to SME CNTSD intrinsic.
///
/// Example:
///
/// %0 = arm_sme.streaming_vl <half>
///
/// is converted to:
///
-/// %cnt = "arm_sme.intr.cntsh"() : () -> i64
-/// %0 = arith.index_cast %cnt : i64 to index
+/// %cnt = "arm_sme.intr.cntsd"() : () -> i64
+/// %0 = arith.constant 4 : i64
+/// %1 = arith.muli %cnt, %0 : i64
+/// %2 = arith.index_cast %1 : i64 to index
----------------
MacDue wrote:
```suggestion
/// %scale = arith.constant 4 : index
/// %cntIndex = arith.index_cast %cnt : i64 to index
/// %0 = arith.muli %cntIndex, %scale : index
```
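For reference, the constant 4 in the example follows from CNTSD counting 64-bit doublewords in the streaming vector, so one doubleword holds four halfwords. A minimal sketch of that arithmetic is below; the helper names `scaleFromCntsd` and `streamingVLInElements` are hypothetical and are not part of this patch or of MLIR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): how many elements of width
// `elementBits` fit in one 64-bit doubleword, i.e. the multiplier applied
// to the CNTSD result.
constexpr int64_t scaleFromCntsd(unsigned elementBits) {
  return 64 / elementBits;
}

// Hypothetical helper: streaming vector length in elements, given the value
// that CNTSD would return (the number of 64-bit doublewords in the vector).
inline int64_t streamingVLInElements(int64_t cntsd, unsigned elementBits) {
  // Only the byte, half, word, and double element widths are assumed here.
  assert(elementBits == 8 || elementBits == 16 || elementBits == 32 ||
         elementBits == 64);
  return cntsd * scaleFromCntsd(elementBits);
}

// <half> is 16 bits wide, which gives the scale of 4 used in the example.
static_assert(scaleFromCntsd(16) == 4, "half scale");
static_assert(scaleFromCntsd(64) == 1, "double scale");
```

With a 128-bit streaming vector length, CNTSD would return 2, so `streamingVLInElements(2, 16)` gives 8 halfwords.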
https://github.com/llvm/llvm-project/pull/154761