[llvm] [RISCV] Fix missing scaling by LMUL in cost model (PR #73342)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 26 05:28:10 PST 2024
================
@@ -80,12 +84,44 @@ entry:
}
define <4 x i64> @ctpop_v4i64(ptr %a) {
-; CHECK-LABEL: define <4 x i64> @ctpop_v4i64
-; CHECK-SAME: (ptr [[A:%.*]]) #[[ATTR0]] {
-; CHECK-NEXT: entry:
-; CHECK-NEXT: [[TMP0:%.*]] = load <4 x i64>, ptr [[A]], align 32
-; CHECK-NEXT: [[TMP1:%.*]] = call <4 x i64> @llvm.ctpop.v4i64(<4 x i64> [[TMP0]])
-; CHECK-NEXT: ret <4 x i64> [[TMP1]]
+; RV32-LABEL: define <4 x i64> @ctpop_v4i64
+; RV32-SAME: (ptr [[A:%.*]]) #[[ATTR0]] {
+; RV32-NEXT: entry:
+; RV32-NEXT: [[TMP0:%.*]] = load <4 x i64>, ptr [[A]], align 32
+; RV32-NEXT: [[TMP1:%.*]] = call <4 x i64> @llvm.ctpop.v4i64(<4 x i64> [[TMP0]])
+; RV32-NEXT: ret <4 x i64> [[TMP1]]
+;
+; RV64-LABEL: define <4 x i64> @ctpop_v4i64
+; RV64-SAME: (ptr [[A:%.*]]) #[[ATTR0]] {
+; RV64-NEXT: entry:
+; RV64-NEXT: [[TMP0:%.*]] = load <4 x i64>, ptr [[A]], align 32
+; RV64-NEXT: [[VECEXT:%.*]] = extractelement <4 x i64> [[TMP0]], i32 0
+; RV64-NEXT: [[TMP1:%.*]] = call i64 @llvm.ctpop.i64(i64 [[VECEXT]])
+; RV64-NEXT: [[VECINS:%.*]] = insertelement <4 x i64> undef, i64 [[TMP1]], i64 0
+; RV64-NEXT: [[VECEXT_1:%.*]] = extractelement <4 x i64> [[TMP0]], i32 1
+; RV64-NEXT: [[TMP2:%.*]] = call i64 @llvm.ctpop.i64(i64 [[VECEXT_1]])
+; RV64-NEXT: [[VECINS_1:%.*]] = insertelement <4 x i64> [[VECINS]], i64 [[TMP2]], i64 1
+; RV64-NEXT: [[VECEXT_2:%.*]] = extractelement <4 x i64> [[TMP0]], i32 2
+; RV64-NEXT: [[TMP3:%.*]] = call i64 @llvm.ctpop.i64(i64 [[VECEXT_2]])
+; RV64-NEXT: [[VECINS_2:%.*]] = insertelement <4 x i64> [[VECINS_1]], i64 [[TMP3]], i64 2
+; RV64-NEXT: [[VECEXT_3:%.*]] = extractelement <4 x i64> [[TMP0]], i32 3
+; RV64-NEXT: [[TMP4:%.*]] = call i64 @llvm.ctpop.i64(i64 [[VECEXT_3]])
+; RV64-NEXT: [[VECINS_3:%.*]] = insertelement <4 x i64> [[VECINS_2]], i64 [[TMP4]], i64 3
+; RV64-NEXT: ret <4 x i64> [[VECINS_3]]
+;
+; ZVBB-LABEL: define <4 x i64> @ctpop_v4i64
+; ZVBB-SAME: (ptr [[A:%.*]]) #[[ATTR0]] {
+; ZVBB-NEXT: entry:
+; ZVBB-NEXT: [[TMP0:%.*]] = load <4 x i64>, ptr [[A]], align 32
+; ZVBB-NEXT: [[TMP1:%.*]] = call <4 x i64> @llvm.ctpop.v4i64(<4 x i64> [[TMP0]])
+; ZVBB-NEXT: ret <4 x i64> [[TMP1]]
+;
+; ZVBB64-LABEL: define <4 x i64> @ctpop_v4i64
+; ZVBB64-SAME: (ptr [[A:%.*]]) #[[ATTR0]] {
+; ZVBB64-NEXT: entry:
+; ZVBB64-NEXT: [[TMP0:%.*]] = load <4 x i64>, ptr [[A]], align 32
+; ZVBB64-NEXT: [[TMP1:%.*]] = call <4 x i64> @llvm.ctpop.v4i64(<4 x i64> [[TMP0]])
+; ZVBB64-NEXT: ret <4 x i64> [[TMP1]]
----------------
lukel97 wrote:
Nit: aren't these check lines the same? Can we merge them?
https://github.com/llvm/llvm-project/pull/73342