[clang] [llvm] [LV] Support generating masks for switch terminators. (PR #99808)

via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 1 00:19:01 PDT 2024


================
@@ -6,9 +6,43 @@ define void @switch_default_to_latch_common_dest(ptr %start, ptr %end) {
 ; IC1-LABEL: define void @switch_default_to_latch_common_dest(
 ; IC1-SAME: ptr [[START:%.*]], ptr [[END:%.*]]) #[[ATTR0:[0-9]+]] {
 ; IC1-NEXT:  [[ENTRY:.*]]:
+; IC1-NEXT:    [[START2:%.*]] = ptrtoint ptr [[START]] to i64
+; IC1-NEXT:    [[END1:%.*]] = ptrtoint ptr [[END]] to i64
+; IC1-NEXT:    [[TMP0:%.*]] = add i64 [[END1]], -8
+; IC1-NEXT:    [[TMP1:%.*]] = sub i64 [[TMP0]], [[START2]]
+; IC1-NEXT:    [[TMP2:%.*]] = lshr i64 [[TMP1]], 3
+; IC1-NEXT:    [[TMP3:%.*]] = add nuw nsw i64 [[TMP2]], 1
+; IC1-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[TMP3]], 4
+; IC1-NEXT:    br i1 [[MIN_ITERS_CHECK]], label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
+; IC1:       [[VECTOR_PH]]:
+; IC1-NEXT:    [[N_MOD_VF:%.*]] = urem i64 [[TMP3]], 4
+; IC1-NEXT:    [[N_VEC:%.*]] = sub i64 [[TMP3]], [[N_MOD_VF]]
+; IC1-NEXT:    [[TMP4:%.*]] = mul i64 [[N_VEC]], 8
+; IC1-NEXT:    [[IND_END:%.*]] = getelementptr i8, ptr [[START]], i64 [[TMP4]]
+; IC1-NEXT:    br label %[[VECTOR_BODY:.*]]
+; IC1:       [[VECTOR_BODY]]:
+; IC1-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; IC1-NEXT:    [[OFFSET_IDX:%.*]] = mul i64 [[INDEX]], 8
+; IC1-NEXT:    [[TMP5:%.*]] = add i64 [[OFFSET_IDX]], 0
+; IC1-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8, ptr [[START]], i64 [[TMP5]]
+; IC1-NEXT:    [[TMP6:%.*]] = getelementptr i64, ptr [[NEXT_GEP]], i32 0
+; IC1-NEXT:    [[WIDE_LOAD:%.*]] = load <4 x i64>, ptr [[TMP6]], align 1
+; IC1-NEXT:    [[TMP7:%.*]] = icmp eq <4 x i64> [[WIDE_LOAD]], <i64 -12, i64 -12, i64 -12, i64 -12>
+; IC1-NEXT:    [[TMP8:%.*]] = icmp eq <4 x i64> [[WIDE_LOAD]], <i64 13, i64 13, i64 13, i64 13>
+; IC1-NEXT:    [[TMP9:%.*]] = or <4 x i1> [[TMP7]], [[TMP8]]
+; IC1-NEXT:    [[TMP10:%.*]] = or <4 x i1> [[TMP9]], [[TMP9]]
----------------
ayalz wrote:

Good catch!

This stems from having parallel {pred, succ} edges in the CFG for multiple switch cases that share a common succ destination, coupled with caching a single edge mask per {pred, succ} pair of VPBB's.

The in mask of a VPBB is created by ORing the edge masks from its predecessors - suffice to visit each predecessor once.

https://github.com/llvm/llvm-project/pull/99808


More information about the llvm-commits mailing list