[llvm] [X86] Reduce znver3/4 LoopMicroOpBufferSize to practical loop unrolling values (PR #91340)
via llvm-commits
llvm-commits at lists.llvm.org
Tue May 7 07:32:38 PDT 2024
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-llvm-transforms
Author: Simon Pilgrim (RKSimon)
<details>
<summary>Changes</summary>
The znver3/4 scheduler models have previously associated the LoopMicroOpBufferSize with the maximum size of their op caches, and when this led to quadratic complexity issues this were reduced to a value of 512 uops, based mainly on compilation time and not its effectiveness on runtime performance.
>From a runtime performance POV, a large LoopMicroOpBufferSize leads to a higher number of loop unrolls, meaning the cpu has to rely on the frontend decode rate (4 ins/cy max) for much longer to fill the op cache before looping begins and we make use of the faster op cache rate (8/9 ops/cy).
This patch proposes we instead cap the size of the LoopMicroOpBufferSize based off the maximum rate from the op cache (znver3 = 8op/cy, znver4 = 9op/cy) and the branch misprediction penalty from the opcache (~12cy) as a estimate of the useful number of ops we can unroll a loop by before mispredictions are likely to cause stalls. This isn't a perfect metric, but does try to be closer to the spirit of how we use LoopMicroOpBufferSize in the compiler vs the size of a similar naming buffer in the cpu.
---
Patch is 83.40 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/91340.diff
3 Files Affected:
- (modified) llvm/lib/Target/X86/X86ScheduleZnver3.td (+4-7)
- (modified) llvm/lib/Target/X86/X86ScheduleZnver4.td (+5-11)
- (modified) llvm/test/Transforms/LoopUnroll/X86/znver3.ll (+35-947)
``````````diff
diff --git a/llvm/lib/Target/X86/X86ScheduleZnver3.td b/llvm/lib/Target/X86/X86ScheduleZnver3.td
index 2e87d5262818c..cbf1de8408798 100644
--- a/llvm/lib/Target/X86/X86ScheduleZnver3.td
+++ b/llvm/lib/Target/X86/X86ScheduleZnver3.td
@@ -33,13 +33,10 @@ def Znver3Model : SchedMachineModel {
// The op cache is organized as an associative cache with 64 sets and 8 ways.
// At each set-way intersection is an entry containing up to 8 macro ops.
// The maximum capacity of the op cache is 4K ops.
- // Agner, 22.5 µop cache
- // The size of the µop cache is big enough for holding most critical loops.
- // FIXME: PR50584: MachineScheduler/PostRAScheduler have quadradic complexity,
- // with large values here the compilation of certain loops
- // ends up taking way too long.
- // let LoopMicroOpBufferSize = 4096;
- let LoopMicroOpBufferSize = 512;
+ // Assuming a maximum dispatch of 8 ops/cy and a mispredict cost of 12cy from
+ // the op-cache, we limit the loop buffer to 8*12 = 96 to avoid loop unrolling
+ // leading to excessive filling of the op-cache from frontend.
+ let LoopMicroOpBufferSize = 96;
// AMD SOG 19h, 2.6.2 L1 Data Cache
// The L1 data cache has a 4- or 5- cycle integer load-to-use latency.
// AMD SOG 19h, 2.12 L1 Data Cache
diff --git a/llvm/lib/Target/X86/X86ScheduleZnver4.td b/llvm/lib/Target/X86/X86ScheduleZnver4.td
index dac4d8422582a..7107dbc63e279 100644
--- a/llvm/lib/Target/X86/X86ScheduleZnver4.td
+++ b/llvm/lib/Target/X86/X86ScheduleZnver4.td
@@ -28,17 +28,11 @@ def Znver4Model : SchedMachineModel {
// AMD SOG 19h, 2.9.1 Op Cache
// The op cache is organized as an associative cache with 64 sets and 8 ways.
// At each set-way intersection is an entry containing up to 8 macro ops.
- // The maximum capacity of the op cache is 4K ops.
- // Agner, 22.5 µop cache
- // The size of the µop cache is big enough for holding most critical loops.
- // FIXME: PR50584: MachineScheduler/PostRAScheduler have quadradic complexity,
- // with large values here the compilation of certain loops
- // ends up taking way too long.
- // Ideally for znver4, we should have 6.75K. However we don't add that
- // considerting the impact compile time and prefer using default values
- // instead.
- // Retaining minimal value to influence unrolling as we did for znver3.
- let LoopMicroOpBufferSize = 512;
+ // The maximum capacity of the op cache is 6.75K ops.
+ // Assuming a maximum dispatch of 9 ops/cy and a mispredict cost of 12cy from
+ // the op-cache, we limit the loop buffer to 9*12 = 108 to avoid loop
+ // unrolling leading to excessive filling of the op-cache from frontend.
+ let LoopMicroOpBufferSize = 108;
// AMD SOG 19h, 2.6.2 L1 Data Cache
// The L1 data cache has a 4- or 5- cycle integer load-to-use latency.
// AMD SOG 19h, 2.12 L1 Data Cache
diff --git a/llvm/test/Transforms/LoopUnroll/X86/znver3.ll b/llvm/test/Transforms/LoopUnroll/X86/znver3.ll
index 30389062a0967..b1f1d7d814e6c 100644
--- a/llvm/test/Transforms/LoopUnroll/X86/znver3.ll
+++ b/llvm/test/Transforms/LoopUnroll/X86/znver3.ll
@@ -73,456 +73,8 @@ define i32 @test(ptr %ary) "target-cpu"="znver3" {
; CHECK-NEXT: [[INDVARS_IV_NEXT_14:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 15
; CHECK-NEXT: [[ARRAYIDX_15:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_14]]
; CHECK-NEXT: [[VAL_15:%.*]] = load i32, ptr [[ARRAYIDX_15]], align 4
-; CHECK-NEXT: [[SUM_NEXT_15:%.*]] = add nsw i32 [[VAL_15]], [[SUM_NEXT_14]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_15:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 16
-; CHECK-NEXT: [[ARRAYIDX_16:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_15]]
-; CHECK-NEXT: [[VAL_16:%.*]] = load i32, ptr [[ARRAYIDX_16]], align 4
-; CHECK-NEXT: [[SUM_NEXT_16:%.*]] = add nsw i32 [[VAL_16]], [[SUM_NEXT_15]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_16:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 17
-; CHECK-NEXT: [[ARRAYIDX_17:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_16]]
-; CHECK-NEXT: [[VAL_17:%.*]] = load i32, ptr [[ARRAYIDX_17]], align 4
-; CHECK-NEXT: [[SUM_NEXT_17:%.*]] = add nsw i32 [[VAL_17]], [[SUM_NEXT_16]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_17:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 18
-; CHECK-NEXT: [[ARRAYIDX_18:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_17]]
-; CHECK-NEXT: [[VAL_18:%.*]] = load i32, ptr [[ARRAYIDX_18]], align 4
-; CHECK-NEXT: [[SUM_NEXT_18:%.*]] = add nsw i32 [[VAL_18]], [[SUM_NEXT_17]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_18:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 19
-; CHECK-NEXT: [[ARRAYIDX_19:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_18]]
-; CHECK-NEXT: [[VAL_19:%.*]] = load i32, ptr [[ARRAYIDX_19]], align 4
-; CHECK-NEXT: [[SUM_NEXT_19:%.*]] = add nsw i32 [[VAL_19]], [[SUM_NEXT_18]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_19:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 20
-; CHECK-NEXT: [[ARRAYIDX_20:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_19]]
-; CHECK-NEXT: [[VAL_20:%.*]] = load i32, ptr [[ARRAYIDX_20]], align 4
-; CHECK-NEXT: [[SUM_NEXT_20:%.*]] = add nsw i32 [[VAL_20]], [[SUM_NEXT_19]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_20:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 21
-; CHECK-NEXT: [[ARRAYIDX_21:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_20]]
-; CHECK-NEXT: [[VAL_21:%.*]] = load i32, ptr [[ARRAYIDX_21]], align 4
-; CHECK-NEXT: [[SUM_NEXT_21:%.*]] = add nsw i32 [[VAL_21]], [[SUM_NEXT_20]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_21:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 22
-; CHECK-NEXT: [[ARRAYIDX_22:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_21]]
-; CHECK-NEXT: [[VAL_22:%.*]] = load i32, ptr [[ARRAYIDX_22]], align 4
-; CHECK-NEXT: [[SUM_NEXT_22:%.*]] = add nsw i32 [[VAL_22]], [[SUM_NEXT_21]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_22:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 23
-; CHECK-NEXT: [[ARRAYIDX_23:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_22]]
-; CHECK-NEXT: [[VAL_23:%.*]] = load i32, ptr [[ARRAYIDX_23]], align 4
-; CHECK-NEXT: [[SUM_NEXT_23:%.*]] = add nsw i32 [[VAL_23]], [[SUM_NEXT_22]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_23:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 24
-; CHECK-NEXT: [[ARRAYIDX_24:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_23]]
-; CHECK-NEXT: [[VAL_24:%.*]] = load i32, ptr [[ARRAYIDX_24]], align 4
-; CHECK-NEXT: [[SUM_NEXT_24:%.*]] = add nsw i32 [[VAL_24]], [[SUM_NEXT_23]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_24:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 25
-; CHECK-NEXT: [[ARRAYIDX_25:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_24]]
-; CHECK-NEXT: [[VAL_25:%.*]] = load i32, ptr [[ARRAYIDX_25]], align 4
-; CHECK-NEXT: [[SUM_NEXT_25:%.*]] = add nsw i32 [[VAL_25]], [[SUM_NEXT_24]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_25:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 26
-; CHECK-NEXT: [[ARRAYIDX_26:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_25]]
-; CHECK-NEXT: [[VAL_26:%.*]] = load i32, ptr [[ARRAYIDX_26]], align 4
-; CHECK-NEXT: [[SUM_NEXT_26:%.*]] = add nsw i32 [[VAL_26]], [[SUM_NEXT_25]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_26:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 27
-; CHECK-NEXT: [[ARRAYIDX_27:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_26]]
-; CHECK-NEXT: [[VAL_27:%.*]] = load i32, ptr [[ARRAYIDX_27]], align 4
-; CHECK-NEXT: [[SUM_NEXT_27:%.*]] = add nsw i32 [[VAL_27]], [[SUM_NEXT_26]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_27:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 28
-; CHECK-NEXT: [[ARRAYIDX_28:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_27]]
-; CHECK-NEXT: [[VAL_28:%.*]] = load i32, ptr [[ARRAYIDX_28]], align 4
-; CHECK-NEXT: [[SUM_NEXT_28:%.*]] = add nsw i32 [[VAL_28]], [[SUM_NEXT_27]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_28:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 29
-; CHECK-NEXT: [[ARRAYIDX_29:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_28]]
-; CHECK-NEXT: [[VAL_29:%.*]] = load i32, ptr [[ARRAYIDX_29]], align 4
-; CHECK-NEXT: [[SUM_NEXT_29:%.*]] = add nsw i32 [[VAL_29]], [[SUM_NEXT_28]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_29:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 30
-; CHECK-NEXT: [[ARRAYIDX_30:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_29]]
-; CHECK-NEXT: [[VAL_30:%.*]] = load i32, ptr [[ARRAYIDX_30]], align 4
-; CHECK-NEXT: [[SUM_NEXT_30:%.*]] = add nsw i32 [[VAL_30]], [[SUM_NEXT_29]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_30:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 31
-; CHECK-NEXT: [[ARRAYIDX_31:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_30]]
-; CHECK-NEXT: [[VAL_31:%.*]] = load i32, ptr [[ARRAYIDX_31]], align 4
-; CHECK-NEXT: [[SUM_NEXT_31:%.*]] = add nsw i32 [[VAL_31]], [[SUM_NEXT_30]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_31:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 32
-; CHECK-NEXT: [[ARRAYIDX_32:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_31]]
-; CHECK-NEXT: [[VAL_32:%.*]] = load i32, ptr [[ARRAYIDX_32]], align 4
-; CHECK-NEXT: [[SUM_NEXT_32:%.*]] = add nsw i32 [[VAL_32]], [[SUM_NEXT_31]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_32:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 33
-; CHECK-NEXT: [[ARRAYIDX_33:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_32]]
-; CHECK-NEXT: [[VAL_33:%.*]] = load i32, ptr [[ARRAYIDX_33]], align 4
-; CHECK-NEXT: [[SUM_NEXT_33:%.*]] = add nsw i32 [[VAL_33]], [[SUM_NEXT_32]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_33:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 34
-; CHECK-NEXT: [[ARRAYIDX_34:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_33]]
-; CHECK-NEXT: [[VAL_34:%.*]] = load i32, ptr [[ARRAYIDX_34]], align 4
-; CHECK-NEXT: [[SUM_NEXT_34:%.*]] = add nsw i32 [[VAL_34]], [[SUM_NEXT_33]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_34:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 35
-; CHECK-NEXT: [[ARRAYIDX_35:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_34]]
-; CHECK-NEXT: [[VAL_35:%.*]] = load i32, ptr [[ARRAYIDX_35]], align 4
-; CHECK-NEXT: [[SUM_NEXT_35:%.*]] = add nsw i32 [[VAL_35]], [[SUM_NEXT_34]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_35:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 36
-; CHECK-NEXT: [[ARRAYIDX_36:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_35]]
-; CHECK-NEXT: [[VAL_36:%.*]] = load i32, ptr [[ARRAYIDX_36]], align 4
-; CHECK-NEXT: [[SUM_NEXT_36:%.*]] = add nsw i32 [[VAL_36]], [[SUM_NEXT_35]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_36:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 37
-; CHECK-NEXT: [[ARRAYIDX_37:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_36]]
-; CHECK-NEXT: [[VAL_37:%.*]] = load i32, ptr [[ARRAYIDX_37]], align 4
-; CHECK-NEXT: [[SUM_NEXT_37:%.*]] = add nsw i32 [[VAL_37]], [[SUM_NEXT_36]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_37:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 38
-; CHECK-NEXT: [[ARRAYIDX_38:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_37]]
-; CHECK-NEXT: [[VAL_38:%.*]] = load i32, ptr [[ARRAYIDX_38]], align 4
-; CHECK-NEXT: [[SUM_NEXT_38:%.*]] = add nsw i32 [[VAL_38]], [[SUM_NEXT_37]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_38:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 39
-; CHECK-NEXT: [[ARRAYIDX_39:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_38]]
-; CHECK-NEXT: [[VAL_39:%.*]] = load i32, ptr [[ARRAYIDX_39]], align 4
-; CHECK-NEXT: [[SUM_NEXT_39:%.*]] = add nsw i32 [[VAL_39]], [[SUM_NEXT_38]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_39:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 40
-; CHECK-NEXT: [[ARRAYIDX_40:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_39]]
-; CHECK-NEXT: [[VAL_40:%.*]] = load i32, ptr [[ARRAYIDX_40]], align 4
-; CHECK-NEXT: [[SUM_NEXT_40:%.*]] = add nsw i32 [[VAL_40]], [[SUM_NEXT_39]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_40:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 41
-; CHECK-NEXT: [[ARRAYIDX_41:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_40]]
-; CHECK-NEXT: [[VAL_41:%.*]] = load i32, ptr [[ARRAYIDX_41]], align 4
-; CHECK-NEXT: [[SUM_NEXT_41:%.*]] = add nsw i32 [[VAL_41]], [[SUM_NEXT_40]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_41:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 42
-; CHECK-NEXT: [[ARRAYIDX_42:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_41]]
-; CHECK-NEXT: [[VAL_42:%.*]] = load i32, ptr [[ARRAYIDX_42]], align 4
-; CHECK-NEXT: [[SUM_NEXT_42:%.*]] = add nsw i32 [[VAL_42]], [[SUM_NEXT_41]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_42:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 43
-; CHECK-NEXT: [[ARRAYIDX_43:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_42]]
-; CHECK-NEXT: [[VAL_43:%.*]] = load i32, ptr [[ARRAYIDX_43]], align 4
-; CHECK-NEXT: [[SUM_NEXT_43:%.*]] = add nsw i32 [[VAL_43]], [[SUM_NEXT_42]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_43:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 44
-; CHECK-NEXT: [[ARRAYIDX_44:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_43]]
-; CHECK-NEXT: [[VAL_44:%.*]] = load i32, ptr [[ARRAYIDX_44]], align 4
-; CHECK-NEXT: [[SUM_NEXT_44:%.*]] = add nsw i32 [[VAL_44]], [[SUM_NEXT_43]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_44:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 45
-; CHECK-NEXT: [[ARRAYIDX_45:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_44]]
-; CHECK-NEXT: [[VAL_45:%.*]] = load i32, ptr [[ARRAYIDX_45]], align 4
-; CHECK-NEXT: [[SUM_NEXT_45:%.*]] = add nsw i32 [[VAL_45]], [[SUM_NEXT_44]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_45:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 46
-; CHECK-NEXT: [[ARRAYIDX_46:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_45]]
-; CHECK-NEXT: [[VAL_46:%.*]] = load i32, ptr [[ARRAYIDX_46]], align 4
-; CHECK-NEXT: [[SUM_NEXT_46:%.*]] = add nsw i32 [[VAL_46]], [[SUM_NEXT_45]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_46:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 47
-; CHECK-NEXT: [[ARRAYIDX_47:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_46]]
-; CHECK-NEXT: [[VAL_47:%.*]] = load i32, ptr [[ARRAYIDX_47]], align 4
-; CHECK-NEXT: [[SUM_NEXT_47:%.*]] = add nsw i32 [[VAL_47]], [[SUM_NEXT_46]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_47:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 48
-; CHECK-NEXT: [[ARRAYIDX_48:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_47]]
-; CHECK-NEXT: [[VAL_48:%.*]] = load i32, ptr [[ARRAYIDX_48]], align 4
-; CHECK-NEXT: [[SUM_NEXT_48:%.*]] = add nsw i32 [[VAL_48]], [[SUM_NEXT_47]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_48:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 49
-; CHECK-NEXT: [[ARRAYIDX_49:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_48]]
-; CHECK-NEXT: [[VAL_49:%.*]] = load i32, ptr [[ARRAYIDX_49]], align 4
-; CHECK-NEXT: [[SUM_NEXT_49:%.*]] = add nsw i32 [[VAL_49]], [[SUM_NEXT_48]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_49:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 50
-; CHECK-NEXT: [[ARRAYIDX_50:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_49]]
-; CHECK-NEXT: [[VAL_50:%.*]] = load i32, ptr [[ARRAYIDX_50]], align 4
-; CHECK-NEXT: [[SUM_NEXT_50:%.*]] = add nsw i32 [[VAL_50]], [[SUM_NEXT_49]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_50:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 51
-; CHECK-NEXT: [[ARRAYIDX_51:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_50]]
-; CHECK-NEXT: [[VAL_51:%.*]] = load i32, ptr [[ARRAYIDX_51]], align 4
-; CHECK-NEXT: [[SUM_NEXT_51:%.*]] = add nsw i32 [[VAL_51]], [[SUM_NEXT_50]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_51:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 52
-; CHECK-NEXT: [[ARRAYIDX_52:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_51]]
-; CHECK-NEXT: [[VAL_52:%.*]] = load i32, ptr [[ARRAYIDX_52]], align 4
-; CHECK-NEXT: [[SUM_NEXT_52:%.*]] = add nsw i32 [[VAL_52]], [[SUM_NEXT_51]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_52:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 53
-; CHECK-NEXT: [[ARRAYIDX_53:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_52]]
-; CHECK-NEXT: [[VAL_53:%.*]] = load i32, ptr [[ARRAYIDX_53]], align 4
-; CHECK-NEXT: [[SUM_NEXT_53:%.*]] = add nsw i32 [[VAL_53]], [[SUM_NEXT_52]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_53:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 54
-; CHECK-NEXT: [[ARRAYIDX_54:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_53]]
-; CHECK-NEXT: [[VAL_54:%.*]] = load i32, ptr [[ARRAYIDX_54]], align 4
-; CHECK-NEXT: [[SUM_NEXT_54:%.*]] = add nsw i32 [[VAL_54]], [[SUM_NEXT_53]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_54:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 55
-; CHECK-NEXT: [[ARRAYIDX_55:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_54]]
-; CHECK-NEXT: [[VAL_55:%.*]] = load i32, ptr [[ARRAYIDX_55]], align 4
-; CHECK-NEXT: [[SUM_NEXT_55:%.*]] = add nsw i32 [[VAL_55]], [[SUM_NEXT_54]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_55:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 56
-; CHECK-NEXT: [[ARRAYIDX_56:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_55]]
-; CHECK-NEXT: [[VAL_56:%.*]] = load i32, ptr [[ARRAYIDX_56]], align 4
-; CHECK-NEXT: [[SUM_NEXT_56:%.*]] = add nsw i32 [[VAL_56]], [[SUM_NEXT_55]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_56:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 57
-; CHECK-NEXT: [[ARRAYIDX_57:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_56]]
-; CHECK-NEXT: [[VAL_57:%.*]] = load i32, ptr [[ARRAYIDX_57]], align 4
-; CHECK-NEXT: [[SUM_NEXT_57:%.*]] = add nsw i32 [[VAL_57]], [[SUM_NEXT_56]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_57:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 58
-; CHECK-NEXT: [[ARRAYIDX_58:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_57]]
-; CHECK-NEXT: [[VAL_58:%.*]] = load i32, ptr [[ARRAYIDX_58]], align 4
-; CHECK-NEXT: [[SUM_NEXT_58:%.*]] = add nsw i32 [[VAL_58]], [[SUM_NEXT_57]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_58:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 59
-; CHECK-NEXT: [[ARRAYIDX_59:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_58]]
-; CHECK-NEXT: [[VAL_59:%.*]] = load i32, ptr [[ARRAYIDX_59]], align 4
-; CHECK-NEXT: [[SUM_NEXT_59:%.*]] = add nsw i32 [[VAL_59]], [[SUM_NEXT_58]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_59:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 60
-; CHECK-NEXT: [[ARRAYIDX_60:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_59]]
-; CHECK-NEXT: [[VAL_60:%.*]] = load i32, ptr [[ARRAYIDX_60]], align 4
-; CHECK-NEXT: [[SUM_NEXT_60:%.*]] = add nsw i32 [[VAL_60]], [[SUM_NEXT_59]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_60:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 61
-; CHECK-NEXT: [[ARRAYIDX_61:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_60]]
-; CHECK-NEXT: [[VAL_61:%.*]] = load i32, ptr [[ARRAYIDX_61]], align 4
-; CHECK-NEXT: [[SUM_NEXT_61:%.*]] = add nsw i32 [[VAL_61]], [[SUM_NEXT_60]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_61:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 62
-; CHECK-NEXT: [[ARRAYIDX_62:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_61]]
-; CHECK-NEXT: [[VAL_62:%.*]] = load i32, ptr [[ARRAYIDX_62]], align 4
-; CHECK-NEXT: [[SUM_NEXT_62:%.*]] = add nsw i32 [[VAL_62]], [[SUM_NEXT_61]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_62:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 63
-; CHECK-NEXT: [[ARRAYIDX_63:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_62]]
-; CHECK-NEXT: [[VAL_63:%.*]] = load i32, ptr [[ARRAYIDX_63]], align 4
-; CHECK-NEXT: [[SUM_NEXT_63:%.*]] = add nsw i32 [[VAL_63]], [[SUM_NEXT_62]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_63:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 64
-; CHECK-NEXT: [[ARRAYIDX_64:%.*]] = getelementptr inbounds i32, ptr [[ARY]], i64 [[INDVARS_IV_NEXT_63]]
-; CHECK-NEXT: [[VAL_64:%.*]] = load i32, ptr [[ARRAYIDX_64]], align 4
-; CHECK-NEXT: [[SUM_NEXT_64:%.*]] = add nsw i32 [[VAL_64]], [[SUM_NEXT_63]]
-; CHECK-NEXT: [[INDVARS_IV_NEXT_64:%.*]] = add nuw nsw i64 [[INDVARS_IV...
[truncated]
``````````
</details>
https://github.com/llvm/llvm-project/pull/91340
More information about the llvm-commits
mailing list