[llvm] cb90e53 - [IR] `IRBuilderBase::CreateAdd()`: short-circuit `x + 0` --> `x`

Roman Lebedev via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 27 13:26:34 PDT 2021


On Wed, Oct 27, 2021 at 11:06 PM Sanjay Patel <spatel at rotateright.com> wrote:
>
> +1 to revert; the failure example seems viable.
True, the change was already reverted for another reason in
42712698fddba427a56bd9c749310cd9d8900c3b.

Roman

> Hopefully, there's some other way to avoid the problem in D109368.
> The existing folding in the builder goes back a very long time and IIRC was there to prevent some inefficiency that was noticeable in overall compile-time. If we can't see that difference currently, we should remove those folds. We don't want to recreate instsimplify in the IRBuilder.
>
> On Wed, Oct 27, 2021 at 2:55 PM Nikita Popov <nikita.ppv at gmail.com> wrote:
>>
>> On Wed, Oct 27, 2021 at 8:35 PM Roman Lebedev via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>>>
>>>
>>> Author: Roman Lebedev
>>> Date: 2021-10-27T21:34:38+03:00
>>> New Revision: cb90e5356ac1594e95fed8e208d6e0e9b6a87db1
>>>
>>> URL: https://github.com/llvm/llvm-project/commit/cb90e5356ac1594e95fed8e208d6e0e9b6a87db1
>>> DIFF: https://github.com/llvm/llvm-project/commit/cb90e5356ac1594e95fed8e208d6e0e9b6a87db1.diff
>>>
>>> LOG: [IR] `IRBuilderBase::CreateAdd()`: short-circuit `x + 0` --> `x`
>>>
>>> There's precedent for that in `CreateOr()`/`CreateAnd()`.
>>>
>>> The motivation here is to avoid bloating the run-time check's IR
>>> in `SCEVExpander::generateOverflowCheck()`.
>>>
>>> Refs. https://reviews.llvm.org/D109368#3089809
>>
>>
>> I'm not sure that it is appropriate to introduce these kinds of folds in IRBuilder. One thing in particular this breaks in a really dangerous way is the semi-common pattern of doing something like this:
>>
>> Value *X = Builder.CreateAdd(A, B);
>> if (auto *BO = dyn_cast<BinaryOperator>(X))
>>   BO->setHasNoUnsignedWrap(NUW);
>>
>> This can now fail in a very subtle way, because CreateAdd might fold the result to either A or B, which might be a BinaryOperator, and then the flag ends up getting set on the wrong instruction. And it will only happen in degenerate cases (one operand zero and the other a binary operator), so you're unlikely to catch it in simple tests.
>>
>> Here is an example of this kind of usage: https://github.com/llvm/llvm-project/blob/cb90e5356ac1594e95fed8e208d6e0e9b6a87db1/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp#L1276-L1280  From a very cursory look, I think the described issue may affect this usage depending on worklist order (i.e. if a 0 << shamt hasn't been folded yet).
>>
>> I realize that what you're doing here is not entirely new, but it was previously limited to just CreateAnd/CreateOr, which at least don't take attributes or metadata, so it was less problematic in practice. Arguably that fold shouldn't exist either though. I think it would be good to revert and go through a review on these changes, as they change IRBuilder semantics/guarantees in a significant way.
>>
>> Regards,
>> Nikita
>>
>>>
>>> Added:
>>>
>>>
>>> Modified:
>>>     clang/test/CodeGen/catch-nullptr-and-nonzero-offset.c
>>>     clang/test/CodeGen/complex-convert.c
>>>     clang/test/CodeGen/matrix-type-operators.c
>>>     clang/test/CodeGen/volatile-1.c
>>>     clang/test/CodeGenCXX/microsoft-abi-virtual-inheritance.cpp
>>>     clang/test/CodeGenCXX/virtual-base-cast.cpp
>>>     clang/test/CodeGenCXX/volatile-1.cpp
>>>     clang/test/CodeGenOpenCLCXX/addrspace-operators.clcpp
>>>     llvm/include/llvm/IR/IRBuilder.h
>>>     llvm/test/Instrumentation/AddressSanitizer/fake-stack.ll
>>>     llvm/test/Instrumentation/AddressSanitizer/stack-poisoning-and-lifetime-be.ll
>>>     llvm/test/Instrumentation/AddressSanitizer/stack-poisoning-and-lifetime.ll
>>>     llvm/test/Instrumentation/HWAddressSanitizer/basic.ll
>>>     llvm/test/Instrumentation/MemorySanitizer/Mips/vararg-mips64.ll
>>>     llvm/test/Instrumentation/MemorySanitizer/Mips/vararg-mips64el.ll
>>>     llvm/test/Instrumentation/MemorySanitizer/PowerPC/vararg-ppc64.ll
>>>     llvm/test/Instrumentation/MemorySanitizer/PowerPC/vararg-ppc64le.ll
>>>     llvm/test/Transforms/LoopDistribute/scev-inserted-runtime-check.ll
>>>     llvm/test/Transforms/LoopIdiom/X86/arithmetic-right-shift-until-zero.ll
>>>     llvm/test/Transforms/LoopIdiom/X86/left-shift-until-zero.ll
>>>     llvm/test/Transforms/LoopIdiom/X86/logical-right-shift-until-zero.ll
>>>     llvm/test/Transforms/LoopVectorize/AArch64/induction-trunc.ll
>>>     llvm/test/Transforms/LoopVectorize/AArch64/scalarize-store-with-predication.ll
>>>     llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll
>>>     llvm/test/Transforms/LoopVectorize/ARM/tail-folding-scalar-epilogue-fallback.ll
>>>     llvm/test/Transforms/LoopVectorize/X86/cost-model-assert.ll
>>>     llvm/test/Transforms/LoopVectorize/first-order-recurrence-complex.ll
>>>     llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll
>>>     llvm/test/Transforms/LoopVectorize/if-pred-stores.ll
>>>     llvm/test/Transforms/LoopVectorize/pointer-induction.ll
>>>     llvm/test/Transforms/LoopVectorize/pr30654-phiscev-sext-trunc.ll
>>>     llvm/test/Transforms/LoopVectorize/pr45679-fold-tail-by-masking.ll
>>>     llvm/test/Transforms/LoopVectorize/runtime-check-small-clamped-bounds.ll
>>>     llvm/test/Transforms/LoopVectorize/select-cmp-predicated.ll
>>>     llvm/test/Transforms/LoopVectorize/tail-folding-vectorization-factor-1.ll
>>>     llvm/test/Transforms/LoopVectorize/unroll_nonlatch.ll
>>>     llvm/test/Transforms/LoopVectorize/use-scalar-epilogue-if-tp-fails.ll
>>>     llvm/test/Transforms/LoopVersioning/wrapping-pointer-versioning.ll
>>>     llvm/unittests/IR/PatternMatch.cpp
>>>
>>> Removed:
>>>
>>>
>>>
>>> ################################################################################
>>> diff  --git a/clang/test/CodeGen/catch-nullptr-and-nonzero-offset.c b/clang/test/CodeGen/catch-nullptr-and-nonzero-offset.c
>>> index 22acc1e865329..08eea57dcba80 100644
>>> --- a/clang/test/CodeGen/catch-nullptr-and-nonzero-offset.c
>>> +++ b/clang/test/CodeGen/catch-nullptr-and-nonzero-offset.c
>>> @@ -83,16 +83,15 @@ char *var_zero(char *base) {
>>>    // CHECK-NEXT:                          %[[BASE_RELOADED:.*]] = load i8*, i8** %[[BASE_ADDR]], align 8
>>>    // CHECK-NEXT:                          %[[ADD_PTR:.*]] = getelementptr inbounds i8, i8* %[[BASE_RELOADED]], i64 0
>>>    // CHECK-SANITIZE-C-NEXT:               %[[BASE_RELOADED_INT:.*]] = ptrtoint i8* %[[BASE_RELOADED]] to i64, !nosanitize
>>> -  // CHECK-SANITIZE-C-NEXT:               %[[COMPUTED_GEP:.*]] = add i64 %[[BASE_RELOADED_INT]], 0, !nosanitize
>>>    // CHECK-SANITIZE-C-NEXT:               %[[BASE_IS_NOT_NULLPTR:.*]] = icmp ne i8* %[[BASE_RELOADED]], null, !nosanitize
>>> -  // CHECK-SANITIZE-C-NEXT:               %[[COMPUTED_GEP_IS_NOT_NULL:.*]] = icmp ne i64 %[[COMPUTED_GEP]], 0, !nosanitize
>>> +  // CHECK-SANITIZE-C-NEXT:               %[[COMPUTED_GEP_IS_NOT_NULL:.*]] = icmp ne i64 %[[BASE_RELOADED_INT]], 0, !nosanitize
>>>    // CHECK-SANITIZE-C-NEXT:             %[[BOTH_POINTERS_ARE_NULL_OR_BOTH_ARE_NONNULL:.*]] = and i1 %[[BASE_IS_NOT_NULLPTR]], %[[COMPUTED_GEP_IS_NOT_NULL]], !nosanitize
>>> -  // CHECK-SANITIZE-C-NEXT:               %[[COMPUTED_GEP_IS_UGE_BASE:.*]] = icmp uge i64 %[[COMPUTED_GEP]], %[[BASE_RELOADED_INT]], !nosanitize
>>> +  // CHECK-SANITIZE-C-NEXT:               %[[COMPUTED_GEP_IS_UGE_BASE:.*]] = icmp uge i64 %[[BASE_RELOADED_INT]], %[[BASE_RELOADED_INT]], !nosanitize
>>>    // CHECK-SANITIZE-C-NEXT:               %[[GEP_IS_OKAY:.*]] = and i1 %[[BOTH_POINTERS_ARE_NULL_OR_BOTH_ARE_NONNULL]], %[[COMPUTED_GEP_IS_UGE_BASE]], !nosanitize
>>>    // CHECK-SANITIZE-C-NEXT:               br i1 %[[GEP_IS_OKAY]], label %[[CONT:.*]], label %[[HANDLER_POINTER_OVERFLOW:[^,]+]],{{.*}} !nosanitize
>>>    // CHECK-SANITIZE-C:                  [[HANDLER_POINTER_OVERFLOW]]:
>>> -  // CHECK-SANITIZE-NORECOVER-C-NEXT:     call void @__ubsan_handle_pointer_overflow_abort(i8* bitcast ({ {{{.*}}} }* @[[LINE_200]] to i8*), i64 %[[BASE_RELOADED_INT]], i64 %[[COMPUTED_GEP]])
>>> -  // CHECK-SANITIZE-RECOVER-C-NEXT:       call void @__ubsan_handle_pointer_overflow(i8* bitcast ({ {{{.*}}} }* @[[LINE_200]] to i8*), i64 %[[BASE_RELOADED_INT]], i64 %[[COMPUTED_GEP]])
>>> +  // CHECK-SANITIZE-NORECOVER-C-NEXT:     call void @__ubsan_handle_pointer_overflow_abort(i8* bitcast ({ {{{.*}}} }* @[[LINE_200]] to i8*), i64 %[[BASE_RELOADED_INT]], i64 %[[BASE_RELOADED_INT]])
>>> +  // CHECK-SANITIZE-RECOVER-C-NEXT:       call void @__ubsan_handle_pointer_overflow(i8* bitcast ({ {{{.*}}} }* @[[LINE_200]] to i8*), i64 %[[BASE_RELOADED_INT]], i64 %[[BASE_RELOADED_INT]])
>>>    // CHECK-SANITIZE-TRAP-C-NEXT:          call void @llvm.ubsantrap(i8 19){{.*}}, !nosanitize
>>>    // CHECK-SANITIZE-UNREACHABLE-C-NEXT:   unreachable, !nosanitize
>>>    // CHECK-SANITIZE-C:                  [[CONT]]:
>>> @@ -170,18 +169,17 @@ char *nullptr_var(unsigned long offset) {
>>>    // CHECK-SANITIZE-NEXT:               %[[COMPUTED_OFFSET_AGGREGATE:.*]] = call { i64, i1 } @llvm.smul.with.overflow.i64(i64 1, i64 %[[OFFSET_RELOADED]]), !nosanitize
>>>    // CHECK-SANITIZE-NEXT:               %[[COMPUTED_OFFSET_OVERFLOWED:.*]] = extractvalue { i64, i1 } %[[COMPUTED_OFFSET_AGGREGATE]], 1, !nosanitize
>>>    // CHECK-SANITIZE-NEXT:               %[[COMPUTED_OFFSET:.*]] = extractvalue { i64, i1 } %[[COMPUTED_OFFSET_AGGREGATE]], 0, !nosanitize
>>> -  // CHECK-SANITIZE-NEXT:               %[[COMPUTED_GEP:.*]] = add i64 %[[COMPUTED_OFFSET]], 0, !nosanitize
>>> -  // CHECK-SANITIZE-NEXT:               %[[COMPUTED_GEP_IS_NOT_NULL:.*]] = icmp ne i64 %[[COMPUTED_GEP]], 0, !nosanitize
>>> -  // CHECK-SANITIZE-CPP-NEXT:           %[[BOTH_POINTERS_ARE_NULL_OR_BOTH_ARE_NONNULL:.*]] = icmp eq i1 false, %[[COMPUTED_GEP_IS_NOT_NULL]], !nosanitize
>>> +  // CHECK-SANITIZE-NEXT:               %[[COMPUTED_OFFSET_IS_NOT_NULL:.*]] = icmp ne i64 %[[COMPUTED_OFFSET]], 0, !nosanitize
>>> +  // CHECK-SANITIZE-CPP-NEXT:           %[[BOTH_POINTERS_ARE_NULL_OR_BOTH_ARE_NONNULL:.*]] = icmp eq i1 false, %[[COMPUTED_OFFSET_IS_NOT_NULL]], !nosanitize
>>>    // CHECK-SANITIZE-NEXT:               %[[COMPUTED_OFFSET_DID_NOT_OVERFLOW:.*]] = xor i1 %[[COMPUTED_OFFSET_OVERFLOWED]], true, !nosanitize
>>> -  // CHECK-SANITIZE-NEXT:               %[[COMPUTED_GEP_IS_UGE_BASE:.*]] = icmp uge i64 %[[COMPUTED_GEP]], 0, !nosanitize
>>> -  // CHECK-SANITIZE-NEXT:               %[[GEP_DID_NOT_OVERFLOW:.*]] = and i1 %[[COMPUTED_GEP_IS_UGE_BASE]], %[[COMPUTED_OFFSET_DID_NOT_OVERFLOW]], !nosanitize
>>> +  // CHECK-SANITIZE-NEXT:               %[[COMPUTED_OFFSET_IS_UGE_BASE:.*]] = icmp uge i64 %[[COMPUTED_OFFSET]], 0, !nosanitize
>>> +  // CHECK-SANITIZE-NEXT:               %[[GEP_DID_NOT_OVERFLOW:.*]] = and i1 %[[COMPUTED_OFFSET_IS_UGE_BASE]], %[[COMPUTED_OFFSET_DID_NOT_OVERFLOW]], !nosanitize
>>>    // CHECK-SANITIZE-CPP-NEXT:           %[[GEP_IS_OKAY:.*]] = and i1 %[[BOTH_POINTERS_ARE_NULL_OR_BOTH_ARE_NONNULL]], %[[GEP_DID_NOT_OVERFLOW]], !nosanitize
>>>    // CHECK-SANITIZE-C-NEXT:             br i1 false, label %[[CONT:.*]], label %[[HANDLER_POINTER_OVERFLOW:[^,]+]],{{.*}} !nosanitize
>>>    // CHECK-SANITIZE-CPP-NEXT:           br i1 %[[GEP_IS_OKAY]], label %[[CONT:.*]], label %[[HANDLER_POINTER_OVERFLOW:[^,]+]],{{.*}} !nosanitize
>>>    // CHECK-SANITIZE:                  [[HANDLER_POINTER_OVERFLOW]]:
>>> -  // CHECK-SANITIZE-NORECOVER-NEXT:     call void @__ubsan_handle_pointer_overflow_abort(i8* bitcast ({ {{{.*}}} }* @[[LINE_500]] to i8*), i64 0, i64 %[[COMPUTED_GEP]])
>>> -  // CHECK-SANITIZE-RECOVER-NEXT:       call void @__ubsan_handle_pointer_overflow(i8* bitcast ({ {{{.*}}} }* @[[LINE_500]] to i8*), i64 0, i64 %[[COMPUTED_GEP]])
>>> +  // CHECK-SANITIZE-NORECOVER-NEXT:     call void @__ubsan_handle_pointer_overflow_abort(i8* bitcast ({ {{{.*}}} }* @[[LINE_500]] to i8*), i64 0, i64 %[[COMPUTED_OFFSET]])
>>> +  // CHECK-SANITIZE-RECOVER-NEXT:       call void @__ubsan_handle_pointer_overflow(i8* bitcast ({ {{{.*}}} }* @[[LINE_500]] to i8*), i64 0, i64 %[[COMPUTED_OFFSET]])
>>>    // CHECK-SANITIZE-TRAP-NEXT:          call void @llvm.ubsantrap(i8 19){{.*}}, !nosanitize
>>>    // CHECK-SANITIZE-UNREACHABLE-NEXT:   unreachable, !nosanitize
>>>    // CHECK-SANITIZE:                  [[CONT]]:
>>>
>>> diff  --git a/clang/test/CodeGen/complex-convert.c b/clang/test/CodeGen/complex-convert.c
>>> index 14ba5c2ae8e2d..f295cc86f172e 100644
>>> --- a/clang/test/CodeGen/complex-convert.c
>>> +++ b/clang/test/CodeGen/complex-convert.c
>>> @@ -251,9 +251,8 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR100:[A-Za-z0-9.]+]] = sext i[[CHSIZE]] %[[VAR97]] to i[[ARSIZE]]
>>>    // CHECK-NEXT: %[[VAR101:[A-Za-z0-9.]+]] = sext i[[CHSIZE]] %[[VAR99]] to i[[ARSIZE]]
>>>    // CHECK-NEXT: %[[VAR102:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR95]], %[[VAR100]]
>>> -  // CHECK-NEXT: %[[VAR103:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR101]], 0
>>>    // CHECK-NEXT: %[[VAR104:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR102]] to i[[CHSIZE]]
>>> -  // CHECK-NEXT: %[[VAR105:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR103]] to i[[CHSIZE]]
>>> +  // CHECK-NEXT: %[[VAR105:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR101]] to i[[CHSIZE]]
>>>    // CHECK-NEXT: %[[VAR106:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CSC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR107:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CSC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[CHSIZE]] %[[VAR104]], i[[CHSIZE]]* %[[VAR106]]
>>> @@ -269,9 +268,8 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR114:[A-Za-z0-9.]+]] = zext i[[CHSIZE]] %[[VAR111]] to i[[ARSIZE]]
>>>    // CHECK-NEXT: %[[VAR115:[A-Za-z0-9.]+]] = zext i[[CHSIZE]] %[[VAR113]] to i[[ARSIZE]]
>>>    // CHECK-NEXT: %[[VAR116:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR109]], %[[VAR114]]
>>> -  // CHECK-NEXT: %[[VAR117:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR115]], 0
>>>    // CHECK-NEXT: %[[VAR118:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR116]] to i[[CHSIZE]]
>>> -  // CHECK-NEXT: %[[VAR119:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR117]] to i[[CHSIZE]]
>>> +  // CHECK-NEXT: %[[VAR119:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR115]] to i[[CHSIZE]]
>>>    // CHECK-NEXT: %[[VAR120:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CUC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR121:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CUC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[CHSIZE]] %[[VAR118]], i[[CHSIZE]]* %[[VAR120]]
>>> @@ -285,11 +283,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR126:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: %[[VAR127:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[VAR126]]
>>>    // CHECK-NEXT: %[[VAR128:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR123]], %[[VAR125]]
>>> -  // CHECK-NEXT: %[[VAR129:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR127]], 0
>>>    // CHECK-NEXT: %[[VAR130:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR131:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR128]], i[[LLSIZE]]* %[[VAR130]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR129]], i[[LLSIZE]]* %[[VAR131]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR127]], i[[LLSIZE]]* %[[VAR131]]
>>>
>>>    cull1 = sc + cull;
>>>    // CHECK-NEXT: %[[VAR132:[A-Za-z0-9.]+]] = load i[[CHSIZE]], i[[CHSIZE]]* %[[SCADDR]], align [[CHALIGN]]
>>> @@ -299,11 +296,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR136:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: %[[VAR137:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[VAR136]]
>>>    // CHECK-NEXT: %[[VAR138:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR133]], %[[VAR135]]
>>> -  // CHECK-NEXT: %[[VAR139:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR137]], 0
>>>    // CHECK-NEXT: %[[VAR140:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR141:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR138]], i[[LLSIZE]]* %[[VAR140]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR139]], i[[LLSIZE]]* %[[VAR141]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR137]], i[[LLSIZE]]* %[[VAR141]]
>>>
>>>    csc1 = uc + csc;
>>>    // CHECK-NEXT: %[[VAR142:[A-Za-z0-9.]+]] = load i[[CHSIZE]], i[[CHSIZE]]* %[[UCADDR]], align [[CHALIGN]]
>>> @@ -315,9 +311,8 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR148:[A-Za-z0-9.]+]] = sext i[[CHSIZE]] %[[VAR145]] to i[[ARSIZE]]
>>>    // CHECK-NEXT: %[[VAR149:[A-Za-z0-9.]+]] = sext i[[CHSIZE]] %[[VAR147]] to i[[ARSIZE]]
>>>    // CHECK-NEXT: %[[VAR150:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR143]], %[[VAR148]]
>>> -  // CHECK-NEXT: %[[VAR151:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR149]], 0
>>>    // CHECK-NEXT: %[[VAR152:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR150]] to i[[CHSIZE]]
>>> -  // CHECK-NEXT: %[[VAR153:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR151]] to i[[CHSIZE]]
>>> +  // CHECK-NEXT: %[[VAR153:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR149]] to i[[CHSIZE]]
>>>    // CHECK-NEXT: %[[VAR154:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CSC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR155:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CSC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[CHSIZE]] %[[VAR152]], i[[CHSIZE]]* %[[VAR154]]
>>> @@ -333,9 +328,8 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR162:[A-Za-z0-9.]+]] = zext i[[CHSIZE]] %[[VAR159]] to i[[ARSIZE]]
>>>    // CHECK-NEXT: %[[VAR163:[A-Za-z0-9.]+]] = zext i[[CHSIZE]] %[[VAR161]] to i[[ARSIZE]]
>>>    // CHECK-NEXT: %[[VAR164:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR157]], %[[VAR162]]
>>> -  // CHECK-NEXT: %[[VAR165:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR163]], 0
>>>    // CHECK-NEXT: %[[VAR166:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR164]] to i[[CHSIZE]]
>>> -  // CHECK-NEXT: %[[VAR167:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR165]] to i[[CHSIZE]]
>>> +  // CHECK-NEXT: %[[VAR167:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR163]] to i[[CHSIZE]]
>>>    // CHECK-NEXT: %[[VAR168:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CUC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR169:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CUC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[CHSIZE]] %[[VAR166]], i[[CHSIZE]]* %[[VAR168]]
>>> @@ -349,11 +343,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR174:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: %[[VAR175:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[VAR174]]
>>>    // CHECK-NEXT: %[[VAR176:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR171]], %[[VAR173]]
>>> -  // CHECK-NEXT: %[[VAR177:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR175]], 0
>>>    // CHECK-NEXT: %[[VAR178:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR179:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR176]], i[[LLSIZE]]* %[[VAR178]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR177]], i[[LLSIZE]]* %[[VAR179]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR175]], i[[LLSIZE]]* %[[VAR179]]
>>>
>>>    cull1 = uc + cull;
>>>    // CHECK-NEXT: %[[VAR180:[A-Za-z0-9.]+]] = load i[[CHSIZE]], i[[CHSIZE]]* %[[UCADDR]], align [[CHALIGN]]
>>> @@ -363,11 +356,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR184:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: %[[VAR185:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[VAR184]]
>>>    // CHECK-NEXT: %[[VAR186:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR181]], %[[VAR183]]
>>> -  // CHECK-NEXT: %[[VAR187:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR185]], 0
>>>    // CHECK-NEXT: %[[VAR188:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR189:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR186]], i[[LLSIZE]]* %[[VAR188]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR187]], i[[LLSIZE]]* %[[VAR189]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR185]], i[[LLSIZE]]* %[[VAR189]]
>>>
>>>    csll1 = sll + csc;
>>>    // CHECK-NEXT: %[[VAR190:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[SLLADDR]], align [[LLALIGN]]
>>> @@ -378,11 +370,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR195:[A-Za-z0-9.]+]] = sext i[[CHSIZE]] %[[VAR192]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR196:[A-Za-z0-9.]+]] = sext i[[CHSIZE]] %[[VAR194]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR197:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR190]], %[[VAR195]]
>>> -  // CHECK-NEXT: %[[VAR198:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR196]], 0
>>>    // CHECK-NEXT: %[[VAR199:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR200:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR197]], i[[LLSIZE]]* %[[VAR199]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR198]], i[[LLSIZE]]* %[[VAR200]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR196]], i[[LLSIZE]]* %[[VAR200]]
>>>
>>>    csll1 = sll + cuc;
>>>    // CHECK-NEXT: %[[VAR201:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[SLLADDR]], align [[LLALIGN]]
>>> @@ -393,11 +384,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR206:[A-Za-z0-9.]+]] = zext i[[CHSIZE]] %[[VAR203]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR207:[A-Za-z0-9.]+]] = zext i[[CHSIZE]] %[[VAR205]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR208:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR201]], %[[VAR206]]
>>> -  // CHECK-NEXT: %[[VAR209:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR207]], 0
>>>    // CHECK-NEXT: %[[VAR210:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR211:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR208]], i[[LLSIZE]]* %[[VAR210]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR209]], i[[LLSIZE]]* %[[VAR211]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR207]], i[[LLSIZE]]* %[[VAR211]]
>>>
>>>    csll1 = sll + csll;
>>>    // CHECK-NEXT: %[[VAR212:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[SLLADDR]], align [[LLALIGN]]
>>> @@ -406,11 +396,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR215:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: %[[VAR216:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[VAR215]]
>>>    // CHECK-NEXT: %[[VAR217:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR212]], %[[VAR214]]
>>> -  // CHECK-NEXT: %[[VAR218:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR216]], 0
>>>    // CHECK-NEXT: %[[VAR219:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR220:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR217]], i[[LLSIZE]]* %[[VAR219]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR218]], i[[LLSIZE]]* %[[VAR220]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR216]], i[[LLSIZE]]* %[[VAR220]]
>>>
>>>    csll1 = sll + cull;
>>>    // CHECK-NEXT: %[[VAR221:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[SLLADDR]], align [[LLALIGN]]
>>> @@ -419,11 +408,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR224:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: %[[VAR225:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[VAR224]]
>>>    // CHECK-NEXT: %[[VAR226:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR221]], %[[VAR223]]
>>> -  // CHECK-NEXT: %[[VAR227:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR225]], 0
>>>    // CHECK-NEXT: %[[VAR228:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR229:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR226]], i[[LLSIZE]]* %[[VAR228]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR227]], i[[LLSIZE]]* %[[VAR229]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR225]], i[[LLSIZE]]* %[[VAR229]]
>>>
>>>    csll1 = ull + csc;
>>>    // CHECK-NEXT: %[[VAR230:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[ULLADDR]], align [[LLALIGN]]
>>> @@ -434,11 +422,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR235:[A-Za-z0-9.]+]] = sext i[[CHSIZE]] %[[VAR232]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR236:[A-Za-z0-9.]+]] = sext i[[CHSIZE]] %[[VAR234]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR237:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR230]], %[[VAR235]]
>>> -  // CHECK-NEXT: %[[VAR238:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR236]], 0
>>>    // CHECK-NEXT: %[[VAR239:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR240:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR237]], i[[LLSIZE]]* %[[VAR239]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR238]], i[[LLSIZE]]* %[[VAR240]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR236]], i[[LLSIZE]]* %[[VAR240]]
>>>
>>>    cull1 = ull + cuc;
>>>    // CHECK-NEXT: %[[VAR241:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[ULLADDR]], align [[LLALIGN]]
>>> @@ -449,11 +436,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR246:[A-Za-z0-9.]+]] = zext i[[CHSIZE]] %[[VAR243]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR247:[A-Za-z0-9.]+]] = zext i[[CHSIZE]] %[[VAR245]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR248:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR241]], %[[VAR246]]
>>> -  // CHECK-NEXT: %[[VAR249:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR247]], 0
>>>    // CHECK-NEXT: %[[VAR250:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR251:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR248]], i[[LLSIZE]]* %[[VAR250]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR249]], i[[LLSIZE]]* %[[VAR251]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR247]], i[[LLSIZE]]* %[[VAR251]]
>>>
>>>    csll1 = ull + csll;
>>>    // CHECK-NEXT: %[[VAR252:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[ULLADDR]], align [[LLALIGN]]
>>> @@ -462,11 +448,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR255:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: %[[VAR256:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[VAR255]]
>>>    // CHECK-NEXT: %[[VAR257:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR252]], %[[VAR254]]
>>> -  // CHECK-NEXT: %[[VAR258:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR256]], 0
>>>    // CHECK-NEXT: %[[VAR259:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR260:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR257]], i[[LLSIZE]]* %[[VAR259]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR258]], i[[LLSIZE]]* %[[VAR260]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR256]], i[[LLSIZE]]* %[[VAR260]]
>>>
>>>    cull1 = ull + cull;
>>>    // CHECK-NEXT: %[[VAR261:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[ULLADDR]], align [[LLALIGN]]
>>> @@ -475,11 +460,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR264:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: %[[VAR265:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[VAR264]]
>>>    // CHECK-NEXT: %[[VAR266:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR261]], %[[VAR263]]
>>> -  // CHECK-NEXT: %[[VAR267:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR265]], 0
>>>    // CHECK-NEXT: %[[VAR268:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR269:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR266]], i[[LLSIZE]]* %[[VAR268]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR267]], i[[LLSIZE]]* %[[VAR269]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR265]], i[[LLSIZE]]* %[[VAR269]]
>>>
>>>    csc1 = csc + sc;
>>>    // CHECK-NEXT: %[[VAR270:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CSC]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>> @@ -491,9 +475,8 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR276:[A-Za-z0-9.]+]] = load i[[CHSIZE]], i[[CHSIZE]]* %[[SCADDR]], align [[CHALIGN]]
>>>    // CHECK-NEXT: %[[VAR277:[A-Za-z0-9.]+]] = sext i[[CHSIZE]] %[[VAR276]] to i[[ARSIZE]]
>>>    // CHECK-NEXT: %[[VAR278:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR274]], %[[VAR277]]
>>> -  // CHECK-NEXT: %[[VAR279:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR275]], 0
>>>    // CHECK-NEXT: %[[VAR280:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR278]] to i[[CHSIZE]]
>>> -  // CHECK-NEXT: %[[VAR281:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR279]] to i[[CHSIZE]]
>>> +  // CHECK-NEXT: %[[VAR281:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR275]] to i[[CHSIZE]]
>>>    // CHECK-NEXT: %[[VAR282:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CSC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR283:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CSC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[CHSIZE]] %[[VAR280]], i[[CHSIZE]]* %[[VAR282]]
>>> @@ -509,9 +492,8 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR290:[A-Za-z0-9.]+]] = load i[[CHSIZE]], i[[CHSIZE]]* %[[UCADDR]], align [[CHALIGN]]
>>>    // CHECK-NEXT: %[[VAR291:[A-Za-z0-9.]+]] = zext i[[CHSIZE]] %[[VAR290]] to i[[ARSIZE]]
>>>    // CHECK-NEXT: %[[VAR292:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR288]], %[[VAR291]]
>>> -  // CHECK-NEXT: %[[VAR293:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR289]], 0
>>>    // CHECK-NEXT: %[[VAR294:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR292]] to i[[CHSIZE]]
>>> -  // CHECK-NEXT: %[[VAR295:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR293]] to i[[CHSIZE]]
>>> +  // CHECK-NEXT: %[[VAR295:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR289]] to i[[CHSIZE]]
>>>    // CHECK-NEXT: %[[VAR296:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CSC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR297:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CSC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[CHSIZE]] %[[VAR294]], i[[CHSIZE]]* %[[VAR296]]
>>> @@ -526,11 +508,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR303:[A-Za-z0-9.]+]] = sext i[[CHSIZE]] %[[VAR301]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR304:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[SLLADDR]], align [[LLALIGN]]
>>>    // CHECK-NEXT: %[[VAR305:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR302]], %[[VAR304]]
>>> -  // CHECK-NEXT: %[[VAR306:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR303]], 0
>>>    // CHECK-NEXT: %[[VAR307:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR308:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR305]], i[[LLSIZE]]* %[[VAR307]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR306]], i[[LLSIZE]]* %[[VAR308]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR303]], i[[LLSIZE]]* %[[VAR308]]
>>>
>>>    csll1 = csc + ull;
>>>    // CHECK-NEXT: %[[VAR309:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CSC]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>> @@ -541,11 +522,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR314:[A-Za-z0-9.]+]] = sext i[[CHSIZE]] %[[VAR312]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR315:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[ULLADDR]], align [[LLALIGN]]
>>>    // CHECK-NEXT: %[[VAR316:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR313]], %[[VAR315]]
>>> -  // CHECK-NEXT: %[[VAR317:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR314]], 0
>>>    // CHECK-NEXT: %[[VAR318:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR319:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR316]], i[[LLSIZE]]* %[[VAR318]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR317]], i[[LLSIZE]]* %[[VAR319]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR314]], i[[LLSIZE]]* %[[VAR319]]
>>>
>>>    csc1 = cuc + sc;
>>>    // CHECK-NEXT: %[[VAR320:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CUC]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>> @@ -557,9 +537,8 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR326:[A-Za-z0-9.]+]] = load i[[CHSIZE]], i[[CHSIZE]]* %[[SCADDR]], align [[CHALIGN]]
>>>    // CHECK-NEXT: %[[VAR327:[A-Za-z0-9.]+]] = sext i[[CHSIZE]] %[[VAR326]] to i[[ARSIZE]]
>>>    // CHECK-NEXT: %[[VAR328:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR324]], %[[VAR327]]
>>> -  // CHECK-NEXT: %[[VAR329:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR325]], 0
>>>    // CHECK-NEXT: %[[VAR330:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR328]] to i[[CHSIZE]]
>>> -  // CHECK-NEXT: %[[VAR331:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR329]] to i[[CHSIZE]]
>>> +  // CHECK-NEXT: %[[VAR331:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR325]] to i[[CHSIZE]]
>>>    // CHECK-NEXT: %[[VAR332:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CSC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR333:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CSC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[CHSIZE]] %[[VAR330]], i[[CHSIZE]]* %[[VAR332]]
>>> @@ -575,9 +554,8 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR340:[A-Za-z0-9.]+]] = load i[[CHSIZE]], i[[CHSIZE]]* %[[UCADDR]], align [[CHALIGN]]
>>>    // CHECK-NEXT: %[[VAR341:[A-Za-z0-9.]+]] = zext i[[CHSIZE]] %[[VAR340]] to i[[ARSIZE]]
>>>    // CHECK-NEXT: %[[VAR342:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR338]], %[[VAR341]]
>>> -  // CHECK-NEXT: %[[VAR343:[A-Za-z0-9.]+]] = add i[[ARSIZE]] %[[VAR339]], 0
>>>    // CHECK-NEXT: %[[VAR344:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR342]] to i[[CHSIZE]]
>>> -  // CHECK-NEXT: %[[VAR345:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR343]] to i[[CHSIZE]]
>>> +  // CHECK-NEXT: %[[VAR345:[A-Za-z0-9.]+]] = trunc i[[ARSIZE]] %[[VAR339]] to i[[CHSIZE]]
>>>    // CHECK-NEXT: %[[VAR346:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CUC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR347:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CUC1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[CHSIZE]] %[[VAR344]], i[[CHSIZE]]* %[[VAR346]]
>>> @@ -592,11 +570,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR353:[A-Za-z0-9.]+]] = zext i[[CHSIZE]] %[[VAR351]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR354:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[SLLADDR]], align [[LLALIGN]]
>>>    // CHECK-NEXT: %[[VAR355:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR352]], %[[VAR354]]
>>> -  // CHECK-NEXT: %[[VAR356:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR353]], 0
>>>    // CHECK-NEXT: %[[VAR357:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR358:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR355]], i[[LLSIZE]]* %[[VAR357]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR356]], i[[LLSIZE]]* %[[VAR358]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR353]], i[[LLSIZE]]* %[[VAR358]]
>>>
>>>    cull1 = cuc + ull;
>>>    // CHECK-NEXT: %[[VAR357:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[CHSIZE]], i[[CHSIZE]] }, { i[[CHSIZE]], i[[CHSIZE]] }* %[[CUC]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>> @@ -607,11 +584,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR362:[A-Za-z0-9.]+]] = zext i[[CHSIZE]] %[[VAR360]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR363:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[ULLADDR]], align [[LLALIGN]]
>>>    // CHECK-NEXT: %[[VAR364:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR361]], %[[VAR363]]
>>> -  // CHECK-NEXT: %[[VAR365:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR362]], 0
>>>    // CHECK-NEXT: %[[VAR366:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR367:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR364]], i[[LLSIZE]]* %[[VAR366]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR365]], i[[LLSIZE]]* %[[VAR367]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR362]], i[[LLSIZE]]* %[[VAR367]]
>>>
>>>    csll1 = csll + sc;
>>>    // CHECK-NEXT: %[[VAR368:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>> @@ -621,11 +597,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR372:[A-Za-z0-9.]+]] = load i[[CHSIZE]], i[[CHSIZE]]* %[[SCADDR]], align [[CHALIGN]]
>>>    // CHECK-NEXT: %[[VAR373:[A-Za-z0-9.]+]] = sext i[[CHSIZE]] %[[VAR372]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR374:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR369]], %[[VAR373]]
>>> -  // CHECK-NEXT: %[[VAR375:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR371]], 0
>>>    // CHECK-NEXT: %[[VAR376:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR377:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR374]], i[[LLSIZE]]* %[[VAR376]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR375]], i[[LLSIZE]]* %[[VAR377]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR371]], i[[LLSIZE]]* %[[VAR377]]
>>>
>>>    csll1 = csll + uc;
>>>    // CHECK-NEXT: %[[VAR378:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>> @@ -635,11 +610,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR382:[A-Za-z0-9.]+]] = load i[[CHSIZE]], i[[CHSIZE]]* %[[UCADDR]], align [[CHALIGN]]
>>>    // CHECK-NEXT: %[[VAR383:[A-Za-z0-9.]+]] = zext i[[CHSIZE]] %[[VAR382]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR384:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR379]], %[[VAR383]]
>>> -  // CHECK-NEXT: %[[VAR385:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR381]], 0
>>>    // CHECK-NEXT: %[[VAR386:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR387:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR384]], i[[LLSIZE]]* %[[VAR386]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR385]], i[[LLSIZE]]* %[[VAR387]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR381]], i[[LLSIZE]]* %[[VAR387]]
>>>
>>>    csll1 = csll + sll;
>>>    // CHECK-NEXT: %[[VAR388:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>> @@ -648,11 +622,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR391:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[VAR390]]
>>>    // CHECK-NEXT: %[[VAR392:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[SLLADDR]], align [[LLALIGN]]
>>>    // CHECK-NEXT: %[[VAR393:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR389]], %[[VAR392]]
>>> -  // CHECK-NEXT: %[[VAR394:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR391]], 0
>>>    // CHECK-NEXT: %[[VAR395:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR396:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR393]], i[[LLSIZE]]* %[[VAR395]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR394]], i[[LLSIZE]]* %[[VAR396]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR391]], i[[LLSIZE]]* %[[VAR396]]
>>>
>>>    csll1 = csll + ull;
>>>    // CHECK-NEXT: %[[VAR397:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>> @@ -661,11 +634,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR400:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[VAR399]]
>>>    // CHECK-NEXT: %[[VAR401:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[ULLADDR]], align [[LLALIGN]]
>>>    // CHECK-NEXT: %[[VAR402:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR398]], %[[VAR401]]
>>> -  // CHECK-NEXT: %[[VAR403:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR400]], 0
>>>    // CHECK-NEXT: %[[VAR404:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR405:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR402]], i[[LLSIZE]]* %[[VAR404]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR403]], i[[LLSIZE]]* %[[VAR405]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR400]], i[[LLSIZE]]* %[[VAR405]]
>>>
>>>    csll1 = cull + sc;
>>>    // CHECK-NEXT: %[[VAR406:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>> @@ -675,11 +647,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR410:[A-Za-z0-9.]+]] = load i[[CHSIZE]], i[[CHSIZE]]* %[[SCADDR]], align [[CHALIGN]]
>>>    // CHECK-NEXT: %[[VAR411:[A-Za-z0-9.]+]] = sext i[[CHSIZE]] %[[VAR410]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR412:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR407]], %[[VAR411]]
>>> -  // CHECK-NEXT: %[[VAR413:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR409]], 0
>>>    // CHECK-NEXT: %[[VAR414:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR415:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR412]], i[[LLSIZE]]* %[[VAR414]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR413]], i[[LLSIZE]]* %[[VAR415]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR409]], i[[LLSIZE]]* %[[VAR415]]
>>>
>>>    cull1 = cull + uc;
>>>    // CHECK-NEXT: %[[VAR416:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>> @@ -689,11 +660,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR420:[A-Za-z0-9.]+]] = load i[[CHSIZE]], i[[CHSIZE]]* %[[UCADDR]], align [[CHALIGN]]
>>>    // CHECK-NEXT: %[[VAR421:[A-Za-z0-9.]+]] = zext i[[CHSIZE]] %[[VAR420]] to i[[LLSIZE]]
>>>    // CHECK-NEXT: %[[VAR422:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR417]], %[[VAR421]]
>>> -  // CHECK-NEXT: %[[VAR423:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR419]], 0
>>>    // CHECK-NEXT: %[[VAR424:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR425:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR422]], i[[LLSIZE]]* %[[VAR424]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR423]], i[[LLSIZE]]* %[[VAR425]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR419]], i[[LLSIZE]]* %[[VAR425]]
>>>
>>>    csll1 = cull + sll;
>>>    // CHECK-NEXT: %[[VAR426:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>> @@ -702,11 +672,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR429:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[VAR428]]
>>>    // CHECK-NEXT: %[[VAR430:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[SLLADDR]], align [[LLALIGN]]
>>>    // CHECK-NEXT: %[[VAR431:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR427]], %[[VAR430]]
>>> -  // CHECK-NEXT: %[[VAR432:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR429]], 0
>>>    // CHECK-NEXT: %[[VAR433:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR434:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CSLL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR431]], i[[LLSIZE]]* %[[VAR433]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR432]], i[[LLSIZE]]* %[[VAR434]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR429]], i[[LLSIZE]]* %[[VAR434]]
>>>
>>>    cull1 = cull + ull;
>>>    // CHECK-NEXT: %[[VAR435:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>> @@ -715,11 +684,10 @@ void foo(signed char sc, unsigned char uc, signed long long sll,
>>>    // CHECK-NEXT: %[[VAR438:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[VAR437]]
>>>    // CHECK-NEXT: %[[VAR439:[A-Za-z0-9.]+]] = load i[[LLSIZE]], i[[LLSIZE]]* %[[ULLADDR]], align [[LLALIGN]]
>>>    // CHECK-NEXT: %[[VAR440:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR436]], %[[VAR439]]
>>> -  // CHECK-NEXT: %[[VAR441:[A-Za-z0-9.]+]] = add i[[LLSIZE]] %[[VAR438]], 0
>>>    // CHECK-NEXT: %[[VAR442:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 0
>>>    // CHECK-NEXT: %[[VAR443:[A-Za-z0-9.]+]] = getelementptr inbounds { i[[LLSIZE]], i[[LLSIZE]] }, { i[[LLSIZE]], i[[LLSIZE]] }* %[[CULL1]], i{{[0-9]+}} 0, i{{[0-9]+}} 1
>>>    // CHECK-NEXT: store i[[LLSIZE]] %[[VAR440]], i[[LLSIZE]]* %[[VAR442]]
>>> -  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR441]], i[[LLSIZE]]* %[[VAR443]]
>>> +  // CHECK-NEXT: store i[[LLSIZE]] %[[VAR438]], i[[LLSIZE]]* %[[VAR443]]
>>>  }
>>>
>>>  // This code used to cause a crash; test that it no longer does so.
>>>
>>> diff  --git a/clang/test/CodeGen/matrix-type-operators.c b/clang/test/CodeGen/matrix-type-operators.c
>>> index 8f6bc7e87a1ba..b108e09905504 100644
>>> --- a/clang/test/CodeGen/matrix-type-operators.c
>>> +++ b/clang/test/CodeGen/matrix-type-operators.c
>>> @@ -1039,11 +1039,10 @@ void insert_extract(dx5x5_t a, fx3x3_t b, unsigned long j, short k) {
>>>    // CHECK:         [[K:%.*]] = load i16, i16* %k.addr, align 2
>>>    // CHECK-NEXT:    [[K_EXT:%.*]] = sext i16 [[K]] to i64
>>>    // CHECK-NEXT:    [[IDX1:%.*]] = mul i64 [[K_EXT]], 3
>>> -  // CHECK-NEXT:    [[IDX2:%.*]] = add i64 [[IDX1]], 0
>>> -  // OPT-NEXT:      [[CMP:%.*]] = icmp ult i64 [[IDX2]], 9
>>> +  // OPT-NEXT:      [[CMP:%.*]] = icmp ult i64 [[IDX1]], 9
>>>    // OPT-NEXT:      call void @llvm.assume(i1 [[CMP]])
>>>    // CHECK-NEXT:    [[MAT:%.*]] = load <9 x float>, <9 x float>* [[MAT_ADDR:%.*]], align 4
>>> -  // CHECK-NEXT:    [[MATEXT:%.*]] = extractelement <9 x float> [[MAT]], i64 [[IDX2]]
>>> +  // CHECK-NEXT:    [[MATEXT:%.*]] = extractelement <9 x float> [[MAT]], i64 [[IDX1]]
>>>    // CHECK-NEXT:    [[J:%.*]] = load i64, i64* %j.addr, align 8
>>>    // CHECK-NEXT:    [[IDX3:%.*]] = mul i64 [[J]], 3
>>>    // CHECK-NEXT:    [[IDX4:%.*]] = add i64 [[IDX3]], 2
>>>
>>> diff  --git a/clang/test/CodeGen/volatile-1.c b/clang/test/CodeGen/volatile-1.c
>>> index a0c7093363d70..2e5983f8a8999 100644
>>> --- a/clang/test/CodeGen/volatile-1.c
>>> +++ b/clang/test/CodeGen/volatile-1.c
>>> @@ -201,7 +201,7 @@ void test() {
>>>    __real (i = j);
>>>    // CHECK-NEXT: load volatile
>>>    __imag i;
>>> -
>>> +
>>>    // ============================================================
>>>    // FIXME: Test cases we get wrong.
>>>
>>> @@ -219,7 +219,7 @@ void test() {
>>>    // CHECK-NEXT: call void @llvm.memcpy{{.*}}, i1 true
>>>    ((a=a),a);
>>>
>>> -  // Not a use.  gcc gets this wrong, it doesn't emit the copy!
>>> +  // Not a use.  gcc gets this wrong, it doesn't emit the copy!
>>>    // (void)(a=a);
>>>
>>>    // Not a use.  gcc got this wrong in 4.2 and omitted the side effects
>>> @@ -278,7 +278,7 @@ void test() {
>>>    // A use.
>>>    // CHECK-NEXT: load volatile
>>>    // CHECK-NEXT: add
>>> -  i + 0;
>>> +  i + 1;
>>>    // A use.
>>>    // CHECK-NEXT: load volatile
>>>    // CHECK-NEXT: store volatile
>>> @@ -290,7 +290,7 @@ void test() {
>>>    // CHECK-NEXT: load volatile
>>>    // CHECK-NEXT: store volatile
>>>    // CHECK-NEXT: add
>>> -  (i=j) + 0;
>>> +  (i=j) + 1;
>>>
>>>  #ifdef __cplusplus
>>>    (i,j)=k;
>>> @@ -320,7 +320,6 @@ int test2() {
>>>    // CHECK-NEXT: load volatile i32, i32*
>>>    // CHECK-NEXT: load volatile i32, i32*
>>>    // CHECK-NEXT: add i32
>>> -  // CHECK-NEXT: add i32
>>>    // CHECK-NEXT: store volatile i32
>>>    // CHECK-NEXT: ret i32
>>>    return i += ci;
>>>
>>> diff  --git a/clang/test/CodeGenCXX/microsoft-abi-virtual-inheritance.cpp b/clang/test/CodeGenCXX/microsoft-abi-virtual-inheritance.cpp
>>> index 06194e4a30431..99ae065427d63 100644
>>> --- a/clang/test/CodeGenCXX/microsoft-abi-virtual-inheritance.cpp
>>> +++ b/clang/test/CodeGenCXX/microsoft-abi-virtual-inheritance.cpp
>>> @@ -39,10 +39,9 @@ B::B() {
>>>    // CHECK:   %[[THIS_i8:.*]] = bitcast %struct.B* %[[THIS]] to i8*
>>>    // CHECK:   %[[VBPTR:.*]] = getelementptr inbounds i8, i8* %[[THIS_i8]], i32 0
>>>    // ...
>>> -  // CHECK:   %[[VBASE_OFFSET:.*]] = add nsw i32 %{{.*}}, 0
>>> -  // CHECK:   %[[VTORDISP_VAL:.*]] = sub i32 %[[VBASE_OFFSET]], 8
>>> +  // CHECK:   %[[VTORDISP_VAL:.*]] = sub i32 %{{.*}}, 8
>>>    // CHECK:   %[[THIS_i8:.*]] = bitcast %struct.B* %[[THIS]] to i8*
>>> -  // CHECK:   %[[VBASE_i8:.*]] = getelementptr inbounds i8, i8* %[[THIS_i8]], i32 %[[VBASE_OFFSET]]
>>> +  // CHECK:   %[[VBASE_i8:.*]] = getelementptr inbounds i8, i8* %[[THIS_i8]], i32 %{{.*}}
>>>    // CHECK:   %[[VTORDISP_i8:.*]] = getelementptr i8, i8* %[[VBASE_i8]], i32 -4
>>>    // CHECK:   %[[VTORDISP_PTR:.*]] = bitcast i8* %[[VTORDISP_i8]] to i32*
>>>    // CHECK:   store i32 %[[VTORDISP_VAL]], i32* %[[VTORDISP_PTR]]
>>> @@ -74,10 +73,9 @@ B::~B() {
>>>    // CHECK:   %[[THIS_i8:.*]] = bitcast %struct.B* %[[THIS]] to i8*
>>>    // CHECK:   %[[VBPTR:.*]] = getelementptr inbounds i8, i8* %[[THIS_i8]], i32 0
>>>    // ...
>>> -  // CHECK:   %[[VBASE_OFFSET:.*]] = add nsw i32 %{{.*}}, 0
>>> -  // CHECK:   %[[VTORDISP_VAL:.*]] = sub i32 %[[VBASE_OFFSET]], 8
>>> +  // CHECK:   %[[VTORDISP_VAL:.*]] = sub i32 %{{.*}}, 8
>>>    // CHECK:   %[[THIS_i8:.*]] = bitcast %struct.B* %[[THIS]] to i8*
>>> -  // CHECK:   %[[VBASE_i8:.*]] = getelementptr inbounds i8, i8* %[[THIS_i8]], i32 %[[VBASE_OFFSET]]
>>> +  // CHECK:   %[[VBASE_i8:.*]] = getelementptr inbounds i8, i8* %[[THIS_i8]], i32 %{{.*}}
>>>    // CHECK:   %[[VTORDISP_i8:.*]] = getelementptr i8, i8* %[[VBASE_i8]], i32 -4
>>>    // CHECK:   %[[VTORDISP_PTR:.*]] = bitcast i8* %[[VTORDISP_i8]] to i32*
>>>    // CHECK:   store i32 %[[VTORDISP_VAL]], i32* %[[VTORDISP_PTR]]
>>> @@ -138,9 +136,8 @@ void B::foo() {
>>>  // CHECK: %[[VBTABLE:.*]] = load i32*, i32** %[[VBPTR8]]
>>>  // CHECK: %[[VBENTRY:.*]] = getelementptr inbounds i32, i32* %[[VBTABLE]], i32 1
>>>  // CHECK: %[[VBOFFSET32:.*]] = load i32, i32* %[[VBENTRY]]
>>> -// CHECK: %[[VBOFFSET:.*]] = add nsw i32 %[[VBOFFSET32]], 0
>>>  // CHECK: %[[THIS8:.*]] = bitcast %struct.B* %[[THIS]] to i8*
>>> -// CHECK: %[[VBASE_i8:.*]] = getelementptr inbounds i8, i8* %[[THIS8]], i32 %[[VBOFFSET]]
>>> +// CHECK: %[[VBASE_i8:.*]] = getelementptr inbounds i8, i8* %[[THIS8]], i32 %[[VBOFFSET32]]
>>>  // CHECK: %[[VBASE:.*]] = bitcast i8* %[[VBASE_i8]] to %struct.VBase*
>>>  // CHECK: %[[FIELD:.*]] = getelementptr inbounds %struct.VBase, %struct.VBase* %[[VBASE]], i32 0, i32 1
>>>  // CHECK: store i32 42, i32* %[[FIELD]], align 4
>>> @@ -162,8 +159,7 @@ void call_vbase_bar(B *obj) {
>>>  // CHECK: %[[VBTABLE:.*]] = load i32*, i32** %[[VBPTR8]]
>>>  // CHECK: %[[VBENTRY:.*]] = getelementptr inbounds i32, i32* %[[VBTABLE]], i32 1
>>>  // CHECK: %[[VBOFFSET32:.*]] = load i32, i32* %[[VBENTRY]]
>>> -// CHECK: %[[VBOFFSET:.*]] = add nsw i32 %[[VBOFFSET32]], 0
>>> -// CHECK: %[[VBASE:.*]] = getelementptr inbounds i8, i8* %[[OBJ_i8]], i32 %[[VBOFFSET]]
>>> +// CHECK: %[[VBASE:.*]] = getelementptr inbounds i8, i8* %[[OBJ_i8]], i32 %[[VBOFFSET32]]
>>>  //
>>>  // CHECK: %[[OBJ_i8:.*]] = bitcast %struct.B* %[[OBJ]] to i8*
>>>  // CHECK: %[[VBPTR:.*]] = getelementptr inbounds i8, i8* %[[OBJ_i8]], i32 0
>>> @@ -171,8 +167,7 @@ void call_vbase_bar(B *obj) {
>>>  // CHECK: %[[VBTABLE:.*]] = load i32*, i32** %[[VBPTR8]]
>>>  // CHECK: %[[VBENTRY:.*]] = getelementptr inbounds i32, i32* %[[VBTABLE]], i32 1
>>>  // CHECK: %[[VBOFFSET32:.*]] = load i32, i32* %[[VBENTRY]]
>>> -// CHECK: %[[VBOFFSET:.*]] = add nsw i32 %[[VBOFFSET32]], 0
>>> -// CHECK: %[[VBASE_i8:.*]] = getelementptr inbounds i8, i8* %[[OBJ_i8]], i32 %[[VBOFFSET]]
>>> +// CHECK: %[[VBASE_i8:.*]] = getelementptr inbounds i8, i8* %[[OBJ_i8]], i32 %[[VBOFFSET32]]
>>>  // CHECK: %[[VFPTR:.*]] = bitcast i8* %[[VBASE_i8]] to void (i8*)***
>>>  // CHECK: %[[VFTABLE:.*]] = load void (i8*)**, void (i8*)*** %[[VFPTR]]
>>>  // CHECK: %[[VFUN:.*]] = getelementptr inbounds void (i8*)*, void (i8*)** %[[VFTABLE]], i64 2
>>> @@ -194,8 +189,7 @@ void delete_B(B *obj) {
>>>  // CHECK: %[[VBTABLE:.*]] = load i32*, i32** %[[VBPTR8]]
>>>  // CHECK: %[[VBENTRY:.*]] = getelementptr inbounds i32, i32* %[[VBTABLE]], i32 1
>>>  // CHECK: %[[VBOFFSET32:.*]] = load i32, i32* %[[VBENTRY]]
>>> -// CHECK: %[[VBOFFSET:.*]] = add nsw i32 %[[VBOFFSET32]], 0
>>> -// CHECK: %[[VBASE_i8:.*]] = getelementptr inbounds i8, i8* %[[OBJ_i8]], i32 %[[VBOFFSET]]
>>> +// CHECK: %[[VBASE_i8:.*]] = getelementptr inbounds i8, i8* %[[OBJ_i8]], i32 %[[VBOFFSET32]]
>>>  // CHECK: %[[VBASE:.*]] = bitcast i8* %[[VBASE_i8]] to %struct.B*
>>>  //
>>>  // CHECK: %[[OBJ_i8:.*]] = bitcast %struct.B* %[[OBJ]] to i8*
>>> @@ -204,8 +198,7 @@ void delete_B(B *obj) {
>>>  // CHECK: %[[VBTABLE:.*]] = load i32*, i32** %[[VBPTR8]]
>>>  // CHECK: %[[VBENTRY:.*]] = getelementptr inbounds i32, i32* %[[VBTABLE]], i32 1
>>>  // CHECK: %[[VBOFFSET32:.*]] = load i32, i32* %[[VBENTRY]]
>>> -// CHECK: %[[VBOFFSET:.*]] = add nsw i32 %[[VBOFFSET32]], 0
>>> -// CHECK: %[[VBASE_i8:.*]] = getelementptr inbounds i8, i8* %[[OBJ_i8]], i32 %[[VBOFFSET]]
>>> +// CHECK: %[[VBASE_i8:.*]] = getelementptr inbounds i8, i8* %[[OBJ_i8]], i32 %[[VBOFFSET32]]
>>>  // CHECK: %[[VFPTR:.*]] = bitcast i8* %[[VBASE_i8]] to i8* (%struct.B*, i32)***
>>>  // CHECK: %[[VFTABLE:.*]] = load i8* (%struct.B*, i32)**, i8* (%struct.B*, i32)*** %[[VFPTR]]
>>>  // CHECK: %[[VFUN:.*]] = getelementptr inbounds i8* (%struct.B*, i32)*, i8* (%struct.B*, i32)** %[[VFTABLE]], i64 0
>>>
>>> diff  --git a/clang/test/CodeGenCXX/virtual-base-cast.cpp b/clang/test/CodeGenCXX/virtual-base-cast.cpp
>>> index 9cde81ee05de9..466b0933ec12c 100644
>>> --- a/clang/test/CodeGenCXX/virtual-base-cast.cpp
>>> +++ b/clang/test/CodeGenCXX/virtual-base-cast.cpp
>>> @@ -24,7 +24,6 @@ A* a() { return x; }
>>>  // MSVC:   %[[vbtable:.*]] = load i32*, i32** %[[vbptr]]
>>>  // MSVC:   %[[entry:.*]] = getelementptr inbounds i32, i32* {{.*}}, i32 1
>>>  // MSVC:   %[[offset:.*]] = load i32, i32* %[[entry]]
>>> -// MSVC:   add nsw i32 %[[offset]], 0
>>>  // MSVC: }
>>>
>>>  B* b() { return x; }
>>> @@ -41,7 +40,6 @@ B* b() { return x; }
>>>  // MSVC:   %[[vbtable:.*]] = load i32*, i32** %[[vbptr]]
>>>  // MSVC:   %[[entry:.*]] = getelementptr inbounds i32, i32* {{.*}}, i32 2
>>>  // MSVC:   %[[offset:.*]] = load i32, i32* %[[entry]]
>>> -// MSVC:   add nsw i32 %[[offset]], 0
>>>  // MSVC: }
>>>
>>>
>>> @@ -60,7 +58,6 @@ BB* c() { return x; }
>>>  // MSVC:   %[[vbtable:.*]] = load i32*, i32** %[[vbptr]]
>>>  // MSVC:   %[[entry:.*]] = getelementptr inbounds i32, i32* {{.*}}, i32 4
>>>  // MSVC:   %[[offset:.*]] = load i32, i32* %[[entry]]
>>> -// MSVC:   add nsw i32 %[[offset]], 0
>>>  // MSVC: }
>>>
>>>  // Put the vbptr at a non-zero offset inside a non-virtual base.
>>>
>>> diff  --git a/clang/test/CodeGenCXX/volatile-1.cpp b/clang/test/CodeGenCXX/volatile-1.cpp
>>> index 525e828da3934..db2155cc5fe8b 100644
>>> --- a/clang/test/CodeGenCXX/volatile-1.cpp
>>> +++ b/clang/test/CodeGenCXX/volatile-1.cpp
>>> @@ -248,7 +248,7 @@ void test() {
>>>    // CHECK-NEXT: store volatile
>>>
>>>    __imag i;
>>> -
>>> +
>>>    // ============================================================
>>>    // FIXME: Test cases we get wrong.
>>>
>>> @@ -264,7 +264,7 @@ void test() {
>>>    // CHECK-NEXT: call {{.*}}void
>>>    ((a=a),a);
>>>
>>> -  // Not a use.  gcc gets this wrong, it doesn't emit the copy!
>>> +  // Not a use.  gcc gets this wrong, it doesn't emit the copy!
>>>    // CHECK-NEXT: call {{.*}}void
>>>    (void)(a=a);
>>>
>>> @@ -331,7 +331,7 @@ void test() {
>>>    // CHECK-NEXT: store volatile
>>>
>>>    // A use.
>>> -  i + 0;
>>> +  i + 1;
>>>    // CHECK-NEXT: load volatile
>>>    // CHECK-NEXT: add
>>>
>>> @@ -345,7 +345,7 @@ void test() {
>>>
>>>    // A use.  gcc treats this as not a use, that's probably a bug due to tree
>>>    // folding ignoring volatile.
>>> -  (i=j) + 0;
>>> +  (i=j) + 1;
>>>    // CHECK-NEXT: load volatile
>>>    // CHECK-NEXT: store volatile
>>>    // CHECK-NEXT: load volatile
>>>
>>> diff  --git a/clang/test/CodeGenOpenCLCXX/addrspace-operators.clcpp b/clang/test/CodeGenOpenCLCXX/addrspace-operators.clcpp
>>> index bd3832635d9b1..b15a12463d292 100644
>>> --- a/clang/test/CodeGenOpenCLCXX/addrspace-operators.clcpp
>>> +++ b/clang/test/CodeGenOpenCLCXX/addrspace-operators.clcpp
>>> @@ -34,7 +34,7 @@ void bar() {
>>>    //CHECK: store i32 %or, i32 addrspace(1)* @globI
>>>    globI |= b;
>>>    //CHECK: store i32 %add, i32 addrspace(1)* @globI
>>> -  globI += a;
>>> +  globI += b;
>>>    //CHECK: [[GVIV1:%[0-9]+]] = load volatile i32, i32 addrspace(1)* @globVI
>>>    //CHECK: [[AND:%[a-z0-9]+]] = and i32 [[GVIV1]], 1
>>>    //CHECK: store volatile i32 [[AND]], i32 addrspace(1)* @globVI
>>>
>>> diff  --git a/llvm/include/llvm/IR/IRBuilder.h b/llvm/include/llvm/IR/IRBuilder.h
>>> index d2a2f947adac3..5bbeb5727adc1 100644
>>> --- a/llvm/include/llvm/IR/IRBuilder.h
>>> +++ b/llvm/include/llvm/IR/IRBuilder.h
>>> @@ -1213,6 +1213,9 @@ class IRBuilderBase {
>>>                     bool HasNUW = false, bool HasNSW = false) {
>>>      if (!isa<Constant>(RHS) && isa<Constant>(LHS))
>>>        std::swap(LHS, RHS);
>>> +    if (auto RCI = dyn_cast<ConstantInt>(RHS))
>>> +      if (RCI->isZero())
>>> +        return LHS; // LHS + 0 -> LHS
>>>      if (auto *LC = dyn_cast<Constant>(LHS))
>>>        if (auto *RC = dyn_cast<Constant>(RHS))
>>>          return Insert(Folder.CreateAdd(LC, RC, HasNUW, HasNSW), Name);
>>>
>>> diff  --git a/llvm/test/Instrumentation/AddressSanitizer/fake-stack.ll b/llvm/test/Instrumentation/AddressSanitizer/fake-stack.ll
>>> index b64d67bdc4bc9..6272ac92109e4 100644
>>> --- a/llvm/test/Instrumentation/AddressSanitizer/fake-stack.ll
>>> +++ b/llvm/test/Instrumentation/AddressSanitizer/fake-stack.ll
>>> @@ -33,14 +33,12 @@ define void @Simple() uwtable sanitize_address {
>>>  ; NEVER-NEXT:    store i64 ptrtoint (void ()* @Simple to i64), i64* [[TMP7]], align 8
>>>  ; NEVER-NEXT:    [[TMP8:%.*]] = lshr i64 [[TMP0]], 3
>>>  ; NEVER-NEXT:    [[TMP9:%.*]] = add i64 [[TMP8]], 2147450880
>>> -; NEVER-NEXT:    [[TMP10:%.*]] = add i64 [[TMP9]], 0
>>> -; NEVER-NEXT:    [[TMP11:%.*]] = inttoptr i64 [[TMP10]] to i64*
>>> -; NEVER-NEXT:    store i64 -868083113472691727, i64* [[TMP11]], align 1
>>> +; NEVER-NEXT:    [[TMP10:%.*]] = inttoptr i64 [[TMP9]] to i64*
>>> +; NEVER-NEXT:    store i64 -868083113472691727, i64* [[TMP10]], align 1
>>>  ; NEVER-NEXT:    call void @Foo(i8* [[TMP2]])
>>>  ; NEVER-NEXT:    store i64 1172321806, i64* [[TMP3]], align 8
>>> -; NEVER-NEXT:    [[TMP12:%.*]] = add i64 [[TMP9]], 0
>>> -; NEVER-NEXT:    [[TMP13:%.*]] = inttoptr i64 [[TMP12]] to i64*
>>> -; NEVER-NEXT:    store i64 0, i64* [[TMP13]], align 1
>>> +; NEVER-NEXT:    [[TMP11:%.*]] = inttoptr i64 [[TMP9]] to i64*
>>> +; NEVER-NEXT:    store i64 0, i64* [[TMP11]], align 1
>>>  ; NEVER-NEXT:    ret void
>>>  ;
>>>  ; RUNTIME-LABEL: @Simple(
>>> @@ -75,29 +73,26 @@ define void @Simple() uwtable sanitize_address {
>>>  ; RUNTIME-NEXT:    store i64 ptrtoint (void ()* @Simple to i64), i64* [[TMP17]], align 8
>>>  ; RUNTIME-NEXT:    [[TMP18:%.*]] = lshr i64 [[TMP10]], 3
>>>  ; RUNTIME-NEXT:    [[TMP19:%.*]] = add i64 [[TMP18]], 2147450880
>>> -; RUNTIME-NEXT:    [[TMP20:%.*]] = add i64 [[TMP19]], 0
>>> -; RUNTIME-NEXT:    [[TMP21:%.*]] = inttoptr i64 [[TMP20]] to i64*
>>> -; RUNTIME-NEXT:    store i64 -868083113472691727, i64* [[TMP21]], align 1
>>> +; RUNTIME-NEXT:    [[TMP20:%.*]] = inttoptr i64 [[TMP19]] to i64*
>>> +; RUNTIME-NEXT:    store i64 -868083113472691727, i64* [[TMP20]], align 1
>>>  ; RUNTIME-NEXT:    call void @Foo(i8* [[TMP12]])
>>>  ; RUNTIME-NEXT:    store i64 1172321806, i64* [[TMP13]], align 8
>>> -; RUNTIME-NEXT:    [[TMP22:%.*]] = icmp ne i64 [[TMP5]], 0
>>> -; RUNTIME-NEXT:    br i1 [[TMP22]], label [[TMP23:%.*]], label [[TMP30:%.*]]
>>> -; RUNTIME:       23:
>>> -; RUNTIME-NEXT:    [[TMP24:%.*]] = add i64 [[TMP19]], 0
>>> +; RUNTIME-NEXT:    [[TMP21:%.*]] = icmp ne i64 [[TMP5]], 0
>>> +; RUNTIME-NEXT:    br i1 [[TMP21]], label [[TMP22:%.*]], label [[TMP28:%.*]]
>>> +; RUNTIME:       22:
>>> +; RUNTIME-NEXT:    [[TMP23:%.*]] = inttoptr i64 [[TMP19]] to i64*
>>> +; RUNTIME-NEXT:    store i64 -723401728380766731, i64* [[TMP23]], align 1
>>> +; RUNTIME-NEXT:    [[TMP24:%.*]] = add i64 [[TMP5]], 56
>>>  ; RUNTIME-NEXT:    [[TMP25:%.*]] = inttoptr i64 [[TMP24]] to i64*
>>> -; RUNTIME-NEXT:    store i64 -723401728380766731, i64* [[TMP25]], align 1
>>> -; RUNTIME-NEXT:    [[TMP26:%.*]] = add i64 [[TMP5]], 56
>>> -; RUNTIME-NEXT:    [[TMP27:%.*]] = inttoptr i64 [[TMP26]] to i64*
>>> -; RUNTIME-NEXT:    [[TMP28:%.*]] = load i64, i64* [[TMP27]], align 8
>>> -; RUNTIME-NEXT:    [[TMP29:%.*]] = inttoptr i64 [[TMP28]] to i8*
>>> -; RUNTIME-NEXT:    store i8 0, i8* [[TMP29]], align 1
>>> -; RUNTIME-NEXT:    br label [[TMP33:%.*]]
>>> +; RUNTIME-NEXT:    [[TMP26:%.*]] = load i64, i64* [[TMP25]], align 8
>>> +; RUNTIME-NEXT:    [[TMP27:%.*]] = inttoptr i64 [[TMP26]] to i8*
>>> +; RUNTIME-NEXT:    store i8 0, i8* [[TMP27]], align 1
>>> +; RUNTIME-NEXT:    br label [[TMP30:%.*]]
>>> +; RUNTIME:       28:
>>> +; RUNTIME-NEXT:    [[TMP29:%.*]] = inttoptr i64 [[TMP19]] to i64*
>>> +; RUNTIME-NEXT:    store i64 0, i64* [[TMP29]], align 1
>>> +; RUNTIME-NEXT:    br label [[TMP30]]
>>>  ; RUNTIME:       30:
>>> -; RUNTIME-NEXT:    [[TMP31:%.*]] = add i64 [[TMP19]], 0
>>> -; RUNTIME-NEXT:    [[TMP32:%.*]] = inttoptr i64 [[TMP31]] to i64*
>>> -; RUNTIME-NEXT:    store i64 0, i64* [[TMP32]], align 1
>>> -; RUNTIME-NEXT:    br label [[TMP33]]
>>> -; RUNTIME:       33:
>>>  ; RUNTIME-NEXT:    ret void
>>>  ;
>>>  ; ALWAYS-LABEL: @Simple(
>>> @@ -125,29 +120,26 @@ define void @Simple() uwtable sanitize_address {
>>>  ; ALWAYS-NEXT:    store i64 ptrtoint (void ()* @Simple to i64), i64* [[TMP12]], align 8
>>>  ; ALWAYS-NEXT:    [[TMP13:%.*]] = lshr i64 [[TMP5]], 3
>>>  ; ALWAYS-NEXT:    [[TMP14:%.*]] = add i64 [[TMP13]], 2147450880
>>> -; ALWAYS-NEXT:    [[TMP15:%.*]] = add i64 [[TMP14]], 0
>>> -; ALWAYS-NEXT:    [[TMP16:%.*]] = inttoptr i64 [[TMP15]] to i64*
>>> -; ALWAYS-NEXT:    store i64 -868083113472691727, i64* [[TMP16]], align 1
>>> +; ALWAYS-NEXT:    [[TMP15:%.*]] = inttoptr i64 [[TMP14]] to i64*
>>> +; ALWAYS-NEXT:    store i64 -868083113472691727, i64* [[TMP15]], align 1
>>>  ; ALWAYS-NEXT:    call void @Foo(i8* [[TMP7]])
>>>  ; ALWAYS-NEXT:    store i64 1172321806, i64* [[TMP8]], align 8
>>> -; ALWAYS-NEXT:    [[TMP17:%.*]] = icmp ne i64 [[TMP0]], 0
>>> -; ALWAYS-NEXT:    br i1 [[TMP17]], label [[TMP18:%.*]], label [[TMP25:%.*]]
>>> -; ALWAYS:       18:
>>> -; ALWAYS-NEXT:    [[TMP19:%.*]] = add i64 [[TMP14]], 0
>>> +; ALWAYS-NEXT:    [[TMP16:%.*]] = icmp ne i64 [[TMP0]], 0
>>> +; ALWAYS-NEXT:    br i1 [[TMP16]], label [[TMP17:%.*]], label [[TMP23:%.*]]
>>> +; ALWAYS:       17:
>>> +; ALWAYS-NEXT:    [[TMP18:%.*]] = inttoptr i64 [[TMP14]] to i64*
>>> +; ALWAYS-NEXT:    store i64 -723401728380766731, i64* [[TMP18]], align 1
>>> +; ALWAYS-NEXT:    [[TMP19:%.*]] = add i64 [[TMP0]], 56
>>>  ; ALWAYS-NEXT:    [[TMP20:%.*]] = inttoptr i64 [[TMP19]] to i64*
>>> -; ALWAYS-NEXT:    store i64 -723401728380766731, i64* [[TMP20]], align 1
>>> -; ALWAYS-NEXT:    [[TMP21:%.*]] = add i64 [[TMP0]], 56
>>> -; ALWAYS-NEXT:    [[TMP22:%.*]] = inttoptr i64 [[TMP21]] to i64*
>>> -; ALWAYS-NEXT:    [[TMP23:%.*]] = load i64, i64* [[TMP22]], align 8
>>> -; ALWAYS-NEXT:    [[TMP24:%.*]] = inttoptr i64 [[TMP23]] to i8*
>>> -; ALWAYS-NEXT:    store i8 0, i8* [[TMP24]], align 1
>>> -; ALWAYS-NEXT:    br label [[TMP28:%.*]]
>>> +; ALWAYS-NEXT:    [[TMP21:%.*]] = load i64, i64* [[TMP20]], align 8
>>> +; ALWAYS-NEXT:    [[TMP22:%.*]] = inttoptr i64 [[TMP21]] to i8*
>>> +; ALWAYS-NEXT:    store i8 0, i8* [[TMP22]], align 1
>>> +; ALWAYS-NEXT:    br label [[TMP25:%.*]]
>>> +; ALWAYS:       23:
>>> +; ALWAYS-NEXT:    [[TMP24:%.*]] = inttoptr i64 [[TMP14]] to i64*
>>> +; ALWAYS-NEXT:    store i64 0, i64* [[TMP24]], align 1
>>> +; ALWAYS-NEXT:    br label [[TMP25]]
>>>  ; ALWAYS:       25:
>>> -; ALWAYS-NEXT:    [[TMP26:%.*]] = add i64 [[TMP14]], 0
>>> -; ALWAYS-NEXT:    [[TMP27:%.*]] = inttoptr i64 [[TMP26]] to i64*
>>> -; ALWAYS-NEXT:    store i64 0, i64* [[TMP27]], align 1
>>> -; ALWAYS-NEXT:    br label [[TMP28]]
>>> -; ALWAYS:       28:
>>>  ; ALWAYS-NEXT:    ret void
>>>  ;
>>>  entry:
>>> @@ -173,39 +165,37 @@ define void @Huge() uwtable sanitize_address {
>>>  ; CHECK-NEXT:    store i64 ptrtoint (void ()* @Huge to i64), i64* [[TMP7]], align 8
>>>  ; CHECK-NEXT:    [[TMP8:%.*]] = lshr i64 [[TMP0]], 3
>>>  ; CHECK-NEXT:    [[TMP9:%.*]] = add i64 [[TMP8]], 2147450880
>>> -; CHECK-NEXT:    [[TMP10:%.*]] = add i64 [[TMP9]], 0
>>> -; CHECK-NEXT:    [[TMP11:%.*]] = inttoptr i64 [[TMP10]] to i32*
>>> -; CHECK-NEXT:    store i32 -235802127, i32* [[TMP11]], align 1
>>> -; CHECK-NEXT:    [[TMP12:%.*]] = add i64 [[TMP9]], 12504
>>> -; CHECK-NEXT:    [[TMP13:%.*]] = inttoptr i64 [[TMP12]] to i64*
>>> -; CHECK-NEXT:    store i64 -868082074056920077, i64* [[TMP13]], align 1
>>> -; CHECK-NEXT:    [[TMP14:%.*]] = add i64 [[TMP9]], 12512
>>> -; CHECK-NEXT:    [[TMP15:%.*]] = inttoptr i64 [[TMP14]] to i64*
>>> -; CHECK-NEXT:    store i64 -868082074056920077, i64* [[TMP15]], align 1
>>> -; CHECK-NEXT:    [[TMP16:%.*]] = add i64 [[TMP9]], 12520
>>> -; CHECK-NEXT:    [[TMP17:%.*]] = inttoptr i64 [[TMP16]] to i64*
>>> -; CHECK-NEXT:    store i64 -868082074056920077, i64* [[TMP17]], align 1
>>> -; CHECK-NEXT:    [[TMP18:%.*]] = add i64 [[TMP9]], 12528
>>> -; CHECK-NEXT:    [[TMP19:%.*]] = inttoptr i64 [[TMP18]] to i64*
>>> -; CHECK-NEXT:    store i64 -868082074056920077, i64* [[TMP19]], align 1
>>> +; CHECK-NEXT:    [[TMP10:%.*]] = inttoptr i64 [[TMP9]] to i32*
>>> +; CHECK-NEXT:    store i32 -235802127, i32* [[TMP10]], align 1
>>> +; CHECK-NEXT:    [[TMP11:%.*]] = add i64 [[TMP9]], 12504
>>> +; CHECK-NEXT:    [[TMP12:%.*]] = inttoptr i64 [[TMP11]] to i64*
>>> +; CHECK-NEXT:    store i64 -868082074056920077, i64* [[TMP12]], align 1
>>> +; CHECK-NEXT:    [[TMP13:%.*]] = add i64 [[TMP9]], 12512
>>> +; CHECK-NEXT:    [[TMP14:%.*]] = inttoptr i64 [[TMP13]] to i64*
>>> +; CHECK-NEXT:    store i64 -868082074056920077, i64* [[TMP14]], align 1
>>> +; CHECK-NEXT:    [[TMP15:%.*]] = add i64 [[TMP9]], 12520
>>> +; CHECK-NEXT:    [[TMP16:%.*]] = inttoptr i64 [[TMP15]] to i64*
>>> +; CHECK-NEXT:    store i64 -868082074056920077, i64* [[TMP16]], align 1
>>> +; CHECK-NEXT:    [[TMP17:%.*]] = add i64 [[TMP9]], 12528
>>> +; CHECK-NEXT:    [[TMP18:%.*]] = inttoptr i64 [[TMP17]] to i64*
>>> +; CHECK-NEXT:    store i64 -868082074056920077, i64* [[TMP18]], align 1
>>>  ; CHECK-NEXT:    [[XX:%.*]] = getelementptr inbounds [100000 x i8], [100000 x i8]* [[TMP2]], i64 0, i64 0
>>>  ; CHECK-NEXT:    call void @Foo(i8* [[XX]])
>>>  ; CHECK-NEXT:    store i64 1172321806, i64* [[TMP3]], align 8
>>> -; CHECK-NEXT:    [[TMP20:%.*]] = add i64 [[TMP9]], 0
>>> -; CHECK-NEXT:    [[TMP21:%.*]] = inttoptr i64 [[TMP20]] to i32*
>>> -; CHECK-NEXT:    store i32 0, i32* [[TMP21]], align 1
>>> -; CHECK-NEXT:    [[TMP22:%.*]] = add i64 [[TMP9]], 12504
>>> +; CHECK-NEXT:    [[TMP19:%.*]] = inttoptr i64 [[TMP9]] to i32*
>>> +; CHECK-NEXT:    store i32 0, i32* [[TMP19]], align 1
>>> +; CHECK-NEXT:    [[TMP20:%.*]] = add i64 [[TMP9]], 12504
>>> +; CHECK-NEXT:    [[TMP21:%.*]] = inttoptr i64 [[TMP20]] to i64*
>>> +; CHECK-NEXT:    store i64 0, i64* [[TMP21]], align 1
>>> +; CHECK-NEXT:    [[TMP22:%.*]] = add i64 [[TMP9]], 12512
>>>  ; CHECK-NEXT:    [[TMP23:%.*]] = inttoptr i64 [[TMP22]] to i64*
>>>  ; CHECK-NEXT:    store i64 0, i64* [[TMP23]], align 1
>>> -; CHECK-NEXT:    [[TMP24:%.*]] = add i64 [[TMP9]], 12512
>>> +; CHECK-NEXT:    [[TMP24:%.*]] = add i64 [[TMP9]], 12520
>>>  ; CHECK-NEXT:    [[TMP25:%.*]] = inttoptr i64 [[TMP24]] to i64*
>>>  ; CHECK-NEXT:    store i64 0, i64* [[TMP25]], align 1
>>> -; CHECK-NEXT:    [[TMP26:%.*]] = add i64 [[TMP9]], 12520
>>> +; CHECK-NEXT:    [[TMP26:%.*]] = add i64 [[TMP9]], 12528
>>>  ; CHECK-NEXT:    [[TMP27:%.*]] = inttoptr i64 [[TMP26]] to i64*
>>>  ; CHECK-NEXT:    store i64 0, i64* [[TMP27]], align 1
>>> -; CHECK-NEXT:    [[TMP28:%.*]] = add i64 [[TMP9]], 12528
>>> -; CHECK-NEXT:    [[TMP29:%.*]] = inttoptr i64 [[TMP28]] to i64*
>>> -; CHECK-NEXT:    store i64 0, i64* [[TMP29]], align 1
>>>  ; CHECK-NEXT:    ret void
>>>  ;
>>>  entry:
>>>
>>> diff  --git a/llvm/test/Instrumentation/AddressSanitizer/stack-poisoning-and-lifetime-be.ll b/llvm/test/Instrumentation/AddressSanitizer/stack-poisoning-and-lifetime-be.ll
>>> index 91b0f53d4c6d4..d1423a868ca8b 100644
>>> --- a/llvm/test/Instrumentation/AddressSanitizer/stack-poisoning-and-lifetime-be.ll
>>> +++ b/llvm/test/Instrumentation/AddressSanitizer/stack-poisoning-and-lifetime-be.ll
>>> @@ -23,8 +23,7 @@ entry:
>>>    ; CHECK: [[SHADOW_BASE:%[0-9]+]] = add i64 %{{[0-9]+}}, 17592186044416
>>>
>>>    ; F1F1F1F1
>>> -  ; ENTRY-NEXT: [[OFFSET:%[0-9]+]] = add i64 [[SHADOW_BASE]], 0
>>> -  ; ENTRY-NEXT: [[PTR:%[0-9]+]] = inttoptr i64 [[OFFSET]] to [[TYPE:i32]]*
>>> +  ; ENTRY-NEXT: [[PTR:%[0-9]+]] = inttoptr i64 [[SHADOW_BASE]] to [[TYPE:i32]]*
>>>    ; ENTRY-NEXT: store [[TYPE]] -235802127, [[TYPE]]* [[PTR]], align 1
>>>
>>>    ; 02F2F2F2F2F2F2F2
>>> @@ -53,8 +52,7 @@ entry:
>>>    ; ENTRY-NEXT: store [[TYPE]] -13, [[TYPE]]* [[PTR]], align 1
>>>
>>>    ; F1F1F1F1
>>> -  ; ENTRY-UAS-NEXT: [[OFFSET:%[0-9]+]] = add i64 [[SHADOW_BASE]], 0
>>> -  ; ENTRY-UAS-NEXT: [[PTR:%[0-9]+]] = inttoptr i64 [[OFFSET]] to [[TYPE:i32]]*
>>> +  ; ENTRY-UAS-NEXT: [[PTR:%[0-9]+]] = inttoptr i64 [[SHADOW_BASE]] to [[TYPE:i32]]*
>>>    ; ENTRY-UAS-NEXT: store [[TYPE]] -235802127, [[TYPE]]* [[PTR]], align 1
>>>
>>>    ; F8F8F8...
>>> @@ -161,16 +159,14 @@ entry:
>>>
>>>    ; CHECK: {{^[0-9]+}}:
>>>
>>> -  ; CHECK-NEXT: [[OFFSET:%[0-9]+]] = add i64 [[SHADOW_BASE]], 0
>>> -  ; CHECK-NEXT: call void @__asan_set_shadow_f5(i64 [[OFFSET]], i64 128)
>>> +  ; CHECK-NEXT: call void @__asan_set_shadow_f5(i64 [[SHADOW_BASE]], i64 128)
>>>
>>>    ; CHECK-NOT: add i64 [[SHADOW_BASE]]
>>>
>>>    ; CHECK: {{^[0-9]+}}:
>>>
>>>    ; 00000000
>>> -  ; EXIT-NEXT: [[OFFSET:%[0-9]+]] = add i64 [[SHADOW_BASE]], 0
>>> -  ; EXIT-NEXT: [[PTR:%[0-9]+]] = inttoptr i64 [[OFFSET]] to [[TYPE:i32]]*
>>> +  ; EXIT-NEXT: [[PTR:%[0-9]+]] = inttoptr i64 [[SHADOW_BASE]] to [[TYPE:i32]]*
>>>    ; EXIT-NEXT: store [[TYPE]] 0, [[TYPE]]* [[PTR]], align 1
>>>
>>>    ; 0000000000000000
>>> @@ -199,8 +195,7 @@ entry:
>>>    ; EXIT-NEXT: store [[TYPE]] 0, [[TYPE]]* [[PTR]], align 1
>>>
>>>    ; 0000...
>>> -  ; EXIT-UAS-NEXT: [[OFFSET:%[0-9]+]] = add i64 [[SHADOW_BASE]], 0
>>> -  ; EXIT-UAS-NEXT: call void @__asan_set_shadow_00(i64 [[OFFSET]], i64 116)
>>> +  ; EXIT-UAS-NEXT: call void @__asan_set_shadow_00(i64 [[SHADOW_BASE]], i64 116)
>>>
>>>    ; CHECK-NOT: add i64 [[SHADOW_BASE]]
>>>
>>>
>>> diff  --git a/llvm/test/Instrumentation/AddressSanitizer/stack-poisoning-and-lifetime.ll b/llvm/test/Instrumentation/AddressSanitizer/stack-poisoning-and-lifetime.ll
>>> index 6604f5f924dae..a970554e469cb 100644
>>> --- a/llvm/test/Instrumentation/AddressSanitizer/stack-poisoning-and-lifetime.ll
>>> +++ b/llvm/test/Instrumentation/AddressSanitizer/stack-poisoning-and-lifetime.ll
>>> @@ -23,8 +23,7 @@ entry:
>>>    ; CHECK: [[SHADOW_BASE:%[0-9]+]] = add i64 %{{[0-9]+}}, 2147450880
>>>
>>>    ; F1F1F1F1
>>> -  ; ENTRY-NEXT: [[OFFSET:%[0-9]+]] = add i64 [[SHADOW_BASE]], 0
>>> -  ; ENTRY-NEXT: [[PTR:%[0-9]+]] = inttoptr i64 [[OFFSET]] to [[TYPE:i32]]*
>>> +  ; ENTRY-NEXT: [[PTR:%[0-9]+]] = inttoptr i64 [[SHADOW_BASE]] to [[TYPE:i32]]*
>>>    ; ENTRY-NEXT: store [[TYPE]] -235802127, [[TYPE]]* [[PTR]], align 1
>>>
>>>    ; 02F2F2F2F2F2F2F2
>>> @@ -53,8 +52,7 @@ entry:
>>>    ; ENTRY-NEXT: store [[TYPE]] -13, [[TYPE]]* [[PTR]], align 1
>>>
>>>    ; F1F1F1F1
>>> -  ; ENTRY-UAS-NEXT: [[OFFSET:%[0-9]+]] = add i64 [[SHADOW_BASE]], 0
>>> -  ; ENTRY-UAS-NEXT: [[PTR:%[0-9]+]] = inttoptr i64 [[OFFSET]] to [[TYPE:i32]]*
>>> +  ; ENTRY-UAS-NEXT: [[PTR:%[0-9]+]] = inttoptr i64 [[SHADOW_BASE]] to [[TYPE:i32]]*
>>>    ; ENTRY-UAS-NEXT: store [[TYPE]] -235802127, [[TYPE]]* [[PTR]], align 1
>>>
>>>    ; F8F8F8...
>>> @@ -161,16 +159,14 @@ entry:
>>>
>>>    ; CHECK: {{^[0-9]+}}:
>>>
>>> -  ; CHECK-NEXT: [[OFFSET:%[0-9]+]] = add i64 [[SHADOW_BASE]], 0
>>> -  ; CHECK-NEXT: call void @__asan_set_shadow_f5(i64 [[OFFSET]], i64 128)
>>> +  ; CHECK-NEXT: call void @__asan_set_shadow_f5(i64 [[SHADOW_BASE]], i64 128)
>>>
>>>    ; CHECK-NOT: add i64 [[SHADOW_BASE]]
>>>
>>>    ; CHECK: {{^[0-9]+}}:
>>>
>>>    ; 00000000
>>> -  ; EXIT-NEXT: [[OFFSET:%[0-9]+]] = add i64 [[SHADOW_BASE]], 0
>>> -  ; EXIT-NEXT: [[PTR:%[0-9]+]] = inttoptr i64 [[OFFSET]] to [[TYPE:i32]]*
>>> +  ; EXIT-NEXT: [[PTR:%[0-9]+]] = inttoptr i64 [[SHADOW_BASE]] to [[TYPE:i32]]*
>>>    ; EXIT-NEXT: store [[TYPE]] 0, [[TYPE]]* [[PTR]], align 1
>>>
>>>    ; 0000000000000000
>>> @@ -199,8 +195,7 @@ entry:
>>>    ; EXIT-NEXT: store [[TYPE]] 0, [[TYPE]]* [[PTR]], align 1
>>>
>>>    ; 0000...
>>> -  ; EXIT-UAS-NEXT: [[OFFSET:%[0-9]+]] = add i64 [[SHADOW_BASE]], 0
>>> -  ; EXIT-UAS-NEXT: call void @__asan_set_shadow_00(i64 [[OFFSET]], i64 116)
>>> +  ; EXIT-UAS-NEXT: call void @__asan_set_shadow_00(i64 [[SHADOW_BASE]], i64 116)
>>>
>>>    ; CHECK-NOT: add i64 [[SHADOW_BASE]]
>>>
>>>
>>> diff  --git a/llvm/test/Instrumentation/HWAddressSanitizer/basic.ll b/llvm/test/Instrumentation/HWAddressSanitizer/basic.ll
>>> index d4e4e11865630..473a69783e17e 100644
>>> --- a/llvm/test/Instrumentation/HWAddressSanitizer/basic.ll
>>> +++ b/llvm/test/Instrumentation/HWAddressSanitizer/basic.ll
>>> @@ -35,8 +35,7 @@ define i8 @test_load8(i8* %a) sanitize_hwaddress {
>>>  ; RECOVER: [[SHORT]]:
>>>  ; RECOVER: %[[LOWBITS:[^ ]*]] = and i64 %[[A]], 15
>>>  ; RECOVER: %[[LOWBITS_I8:[^ ]*]] = trunc i64 %[[LOWBITS]] to i8
>>> -; RECOVER: %[[LAST:[^ ]*]] = add i8 %[[LOWBITS_I8]], 0
>>> -; RECOVER: %[[OOB:[^ ]*]] = icmp uge i8 %[[LAST]], %[[MEMTAG]]
>>> +; RECOVER: %[[OOB:[^ ]*]] = icmp uge i8 %[[LOWBITS_I8]], %[[MEMTAG]]
>>>  ; RECOVER: br i1 %[[OOB]], label %[[FAIL]], label %[[INBOUNDS:[0-9]*]], !prof {{.*}}
>>>
>>>  ; RECOVER: [[INBOUNDS]]:
>>>
>>> diff  --git a/llvm/test/Instrumentation/MemorySanitizer/Mips/vararg-mips64.ll b/llvm/test/Instrumentation/MemorySanitizer/Mips/vararg-mips64.ll
>>> index c87cce471a849..a27311f95bfde 100644
>>> --- a/llvm/test/Instrumentation/MemorySanitizer/Mips/vararg-mips64.ll
>>> +++ b/llvm/test/Instrumentation/MemorySanitizer/Mips/vararg-mips64.ll
>>> @@ -18,10 +18,9 @@ define i32 @foo(i32 %guard, ...) {
>>>
>>>  ; CHECK-LABEL: @foo
>>>  ; CHECK: [[A:%.*]] = load {{.*}} @__msan_va_arg_overflow_size_tls
>>> -; CHECK: [[B:%.*]] = add i64 [[A]], 0
>>> -; CHECK: [[C:%.*]] = alloca {{.*}} [[B]]
>>> +; CHECK: [[C:%.*]] = alloca {{.*}} [[A]]
>>>
>>> -; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[C]], i8* align 8 bitcast ({{.*}} @__msan_va_arg_tls to i8*), i64 [[B]], i1 false)
>>> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[C]], i8* align 8 bitcast ({{.*}} @__msan_va_arg_tls to i8*), i64 [[A]], i1 false)
>>>
>>>  declare void @llvm.lifetime.start.p0i8(i64, i8* nocapture) #1
>>>  declare void @llvm.va_start(i8*) #2
>>>
>>> diff  --git a/llvm/test/Instrumentation/MemorySanitizer/Mips/vararg-mips64el.ll b/llvm/test/Instrumentation/MemorySanitizer/Mips/vararg-mips64el.ll
>>> index 31efc7aa5eab4..f7dd2aef525ed 100644
>>> --- a/llvm/test/Instrumentation/MemorySanitizer/Mips/vararg-mips64el.ll
>>> +++ b/llvm/test/Instrumentation/MemorySanitizer/Mips/vararg-mips64el.ll
>>> @@ -18,10 +18,9 @@ define i32 @foo(i32 %guard, ...) {
>>>
>>>  ; CHECK-LABEL: @foo
>>>  ; CHECK: [[A:%.*]] = load {{.*}} @__msan_va_arg_overflow_size_tls
>>> -; CHECK: [[B:%.*]] = add i64 [[A]], 0
>>> -; CHECK: [[C:%.*]] = alloca {{.*}} [[B]]
>>> +; CHECK: [[C:%.*]] = alloca {{.*}} [[A]]
>>>
>>> -; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[C]], i8* align 8 bitcast ({{.*}} @__msan_va_arg_tls to i8*), i64 [[B]], i1 false)
>>> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[C]], i8* align 8 bitcast ({{.*}} @__msan_va_arg_tls to i8*), i64 [[A]], i1 false)
>>>
>>>  declare void @llvm.lifetime.start.p0i8(i64, i8* nocapture) #1
>>>  declare void @llvm.va_start(i8*) #2
>>>
>>> diff  --git a/llvm/test/Instrumentation/MemorySanitizer/PowerPC/vararg-ppc64.ll b/llvm/test/Instrumentation/MemorySanitizer/PowerPC/vararg-ppc64.ll
>>> index 2dc2bde7b835e..b45a4fc32352a 100644
>>> --- a/llvm/test/Instrumentation/MemorySanitizer/PowerPC/vararg-ppc64.ll
>>> +++ b/llvm/test/Instrumentation/MemorySanitizer/PowerPC/vararg-ppc64.ll
>>> @@ -18,10 +18,9 @@ define i32 @foo(i32 %guard, ...) {
>>>
>>>  ; CHECK-LABEL: @foo
>>>  ; CHECK: [[A:%.*]] = load {{.*}} @__msan_va_arg_overflow_size_tls
>>> -; CHECK: [[B:%.*]] = add i64 [[A]], 0
>>> -; CHECK: [[C:%.*]] = alloca {{.*}} [[B]]
>>> +; CHECK: [[C:%.*]] = alloca {{.*}} [[A]]
>>>
>>> -; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[C]], i8* align 8 bitcast ({{.*}} @__msan_va_arg_tls to i8*), i64 [[B]], i1 false)
>>> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[C]], i8* align 8 bitcast ({{.*}} @__msan_va_arg_tls to i8*), i64 [[A]], i1 false)
>>>
>>>  declare void @llvm.lifetime.start.p0i8(i64, i8* nocapture) #1
>>>  declare void @llvm.va_start(i8*) #2
>>>
>>> diff  --git a/llvm/test/Instrumentation/MemorySanitizer/PowerPC/vararg-ppc64le.ll b/llvm/test/Instrumentation/MemorySanitizer/PowerPC/vararg-ppc64le.ll
>>> index 4db0785e8dd2e..b7515e040ab17 100644
>>> --- a/llvm/test/Instrumentation/MemorySanitizer/PowerPC/vararg-ppc64le.ll
>>> +++ b/llvm/test/Instrumentation/MemorySanitizer/PowerPC/vararg-ppc64le.ll
>>> @@ -18,10 +18,9 @@ define i32 @foo(i32 %guard, ...) {
>>>
>>>  ; CHECK-LABEL: @foo
>>>  ; CHECK: [[A:%.*]] = load {{.*}} @__msan_va_arg_overflow_size_tls
>>> -; CHECK: [[B:%.*]] = add i64 [[A]], 0
>>> -; CHECK: [[C:%.*]] = alloca {{.*}} [[B]]
>>> +; CHECK: [[C:%.*]] = alloca {{.*}} [[A]]
>>>
>>> -; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[C]], i8* align 8 bitcast ({{.*}} @__msan_va_arg_tls to i8*), i64 [[B]], i1 false)
>>> +; CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 [[C]], i8* align 8 bitcast ({{.*}} @__msan_va_arg_tls to i8*), i64 [[A]], i1 false)
>>>
>>>  declare void @llvm.lifetime.start.p0i8(i64, i8* nocapture) #1
>>>  declare void @llvm.va_start(i8*) #2
>>>
>>> diff  --git a/llvm/test/Transforms/LoopDistribute/scev-inserted-runtime-check.ll b/llvm/test/Transforms/LoopDistribute/scev-inserted-runtime-check.ll
>>> index 077d604ced395..301f24813255b 100644
>>> --- a/llvm/test/Transforms/LoopDistribute/scev-inserted-runtime-check.ll
>>> +++ b/llvm/test/Transforms/LoopDistribute/scev-inserted-runtime-check.ll
>>> @@ -17,24 +17,23 @@ define void @f(i32* noalias %a, i32* noalias %b, i32* noalias %c, i32* noalias %
>>>  ; CHECK-NEXT:    [[MUL1:%.*]] = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 2, i32 [[TMP1]])
>>>  ; CHECK-NEXT:    [[MUL_RESULT:%.*]] = extractvalue { i32, i1 } [[MUL1]], 0
>>>  ; CHECK-NEXT:    [[MUL_OVERFLOW:%.*]] = extractvalue { i32, i1 } [[MUL1]], 1
>>> -; CHECK-NEXT:    [[TMP2:%.*]] = add i32 [[MUL_RESULT]], 0
>>> -; CHECK-NEXT:    [[TMP3:%.*]] = sub i32 0, [[MUL_RESULT]]
>>> -; CHECK-NEXT:    [[TMP4:%.*]] = icmp ugt i32 [[TMP3]], 0
>>> -; CHECK-NEXT:    [[TMP5:%.*]] = icmp ult i32 [[TMP2]], 0
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = icmp ugt i64 [[TMP0]], 4294967295
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = or i1 [[TMP5]], [[TMP6]]
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = or i1 [[TMP7]], [[MUL_OVERFLOW]]
>>> +; CHECK-NEXT:    [[TMP2:%.*]] = sub i32 0, [[MUL_RESULT]]
>>> +; CHECK-NEXT:    [[TMP3:%.*]] = icmp ugt i32 [[TMP2]], 0
>>> +; CHECK-NEXT:    [[TMP4:%.*]] = icmp ult i32 [[MUL_RESULT]], 0
>>> +; CHECK-NEXT:    [[TMP5:%.*]] = icmp ugt i64 [[TMP0]], 4294967295
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = or i1 [[TMP4]], [[TMP5]]
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = or i1 [[TMP6]], [[MUL_OVERFLOW]]
>>>  ; CHECK-NEXT:    [[MUL2:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 8, i64 [[TMP0]])
>>>  ; CHECK-NEXT:    [[MUL_RESULT3:%.*]] = extractvalue { i64, i1 } [[MUL2]], 0
>>>  ; CHECK-NEXT:    [[MUL_OVERFLOW4:%.*]] = extractvalue { i64, i1 } [[MUL2]], 1
>>> -; CHECK-NEXT:    [[TMP9:%.*]] = sub i64 0, [[MUL_RESULT3]]
>>> -; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr i8, i8* [[A5]], i64 [[MUL_RESULT3]]
>>> -; CHECK-NEXT:    [[TMP11:%.*]] = getelementptr i8, i8* [[A5]], i64 [[TMP9]]
>>> -; CHECK-NEXT:    [[TMP12:%.*]] = icmp ugt i8* [[TMP11]], [[A5]]
>>> -; CHECK-NEXT:    [[TMP13:%.*]] = icmp ult i8* [[TMP10]], [[A5]]
>>> -; CHECK-NEXT:    [[TMP14:%.*]] = or i1 [[TMP13]], [[MUL_OVERFLOW4]]
>>> -; CHECK-NEXT:    [[TMP15:%.*]] = or i1 [[TMP8]], [[TMP14]]
>>> -; CHECK-NEXT:    br i1 [[TMP15]], label [[FOR_BODY_PH_LVER_ORIG:%.*]], label [[FOR_BODY_PH_LDIST1:%.*]]
>>> +; CHECK-NEXT:    [[TMP8:%.*]] = sub i64 0, [[MUL_RESULT3]]
>>> +; CHECK-NEXT:    [[TMP9:%.*]] = getelementptr i8, i8* [[A5]], i64 [[MUL_RESULT3]]
>>> +; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr i8, i8* [[A5]], i64 [[TMP8]]
>>> +; CHECK-NEXT:    [[TMP11:%.*]] = icmp ugt i8* [[TMP10]], [[A5]]
>>> +; CHECK-NEXT:    [[TMP12:%.*]] = icmp ult i8* [[TMP9]], [[A5]]
>>> +; CHECK-NEXT:    [[TMP13:%.*]] = or i1 [[TMP12]], [[MUL_OVERFLOW4]]
>>> +; CHECK-NEXT:    [[TMP14:%.*]] = or i1 [[TMP7]], [[TMP13]]
>>> +; CHECK-NEXT:    br i1 [[TMP14]], label [[FOR_BODY_PH_LVER_ORIG:%.*]], label [[FOR_BODY_PH_LDIST1:%.*]]
>>>  ; CHECK:       for.body.ph.lver.orig:
>>>  ; CHECK-NEXT:    br label [[FOR_BODY_LVER_ORIG:%.*]]
>>>  ; CHECK:       for.body.lver.orig:
>>> @@ -163,24 +162,23 @@ define void @f_with_offset(i32* noalias %b, i32* noalias %c, i32* noalias %d, i3
>>>  ; CHECK-NEXT:    [[MUL1:%.*]] = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 2, i32 [[TMP1]])
>>>  ; CHECK-NEXT:    [[MUL_RESULT:%.*]] = extractvalue { i32, i1 } [[MUL1]], 0
>>>  ; CHECK-NEXT:    [[MUL_OVERFLOW:%.*]] = extractvalue { i32, i1 } [[MUL1]], 1
>>> -; CHECK-NEXT:    [[TMP2:%.*]] = add i32 [[MUL_RESULT]], 0
>>> -; CHECK-NEXT:    [[TMP3:%.*]] = sub i32 0, [[MUL_RESULT]]
>>> -; CHECK-NEXT:    [[TMP4:%.*]] = icmp ugt i32 [[TMP3]], 0
>>> -; CHECK-NEXT:    [[TMP5:%.*]] = icmp ult i32 [[TMP2]], 0
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = icmp ugt i64 [[TMP0]], 4294967295
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = or i1 [[TMP5]], [[TMP6]]
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = or i1 [[TMP7]], [[MUL_OVERFLOW]]
>>> +; CHECK-NEXT:    [[TMP2:%.*]] = sub i32 0, [[MUL_RESULT]]
>>> +; CHECK-NEXT:    [[TMP3:%.*]] = icmp ugt i32 [[TMP2]], 0
>>> +; CHECK-NEXT:    [[TMP4:%.*]] = icmp ult i32 [[MUL_RESULT]], 0
>>> +; CHECK-NEXT:    [[TMP5:%.*]] = icmp ugt i64 [[TMP0]], 4294967295
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = or i1 [[TMP4]], [[TMP5]]
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = or i1 [[TMP6]], [[MUL_OVERFLOW]]
>>>  ; CHECK-NEXT:    [[MUL2:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 8, i64 [[TMP0]])
>>>  ; CHECK-NEXT:    [[MUL_RESULT3:%.*]] = extractvalue { i64, i1 } [[MUL2]], 0
>>>  ; CHECK-NEXT:    [[MUL_OVERFLOW4:%.*]] = extractvalue { i64, i1 } [[MUL2]], 1
>>> -; CHECK-NEXT:    [[TMP9:%.*]] = sub i64 0, [[MUL_RESULT3]]
>>> -; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr i8, i8* bitcast (i32* getelementptr inbounds ([8192 x i32], [8192 x i32]* @global_a, i64 0, i64 42) to i8*), i64 [[MUL_RESULT3]]
>>> -; CHECK-NEXT:    [[TMP11:%.*]] = getelementptr i8, i8* bitcast (i32* getelementptr inbounds ([8192 x i32], [8192 x i32]* @global_a, i64 0, i64 42) to i8*), i64 [[TMP9]]
>>> -; CHECK-NEXT:    [[TMP12:%.*]] = icmp ugt i8* [[TMP11]], bitcast (i32* getelementptr inbounds ([8192 x i32], [8192 x i32]* @global_a, i64 0, i64 42) to i8*)
>>> -; CHECK-NEXT:    [[TMP13:%.*]] = icmp ult i8* [[TMP10]], bitcast (i32* getelementptr inbounds ([8192 x i32], [8192 x i32]* @global_a, i64 0, i64 42) to i8*)
>>> -; CHECK-NEXT:    [[TMP14:%.*]] = or i1 [[TMP13]], [[MUL_OVERFLOW4]]
>>> -; CHECK-NEXT:    [[TMP15:%.*]] = or i1 [[TMP8]], [[TMP14]]
>>> -; CHECK-NEXT:    br i1 [[TMP15]], label [[FOR_BODY_PH_LVER_ORIG:%.*]], label [[FOR_BODY_PH_LDIST1:%.*]]
>>> +; CHECK-NEXT:    [[TMP8:%.*]] = sub i64 0, [[MUL_RESULT3]]
>>> +; CHECK-NEXT:    [[TMP9:%.*]] = getelementptr i8, i8* bitcast (i32* getelementptr inbounds ([8192 x i32], [8192 x i32]* @global_a, i64 0, i64 42) to i8*), i64 [[MUL_RESULT3]]
>>> +; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr i8, i8* bitcast (i32* getelementptr inbounds ([8192 x i32], [8192 x i32]* @global_a, i64 0, i64 42) to i8*), i64 [[TMP8]]
>>> +; CHECK-NEXT:    [[TMP11:%.*]] = icmp ugt i8* [[TMP10]], bitcast (i32* getelementptr inbounds ([8192 x i32], [8192 x i32]* @global_a, i64 0, i64 42) to i8*)
>>> +; CHECK-NEXT:    [[TMP12:%.*]] = icmp ult i8* [[TMP9]], bitcast (i32* getelementptr inbounds ([8192 x i32], [8192 x i32]* @global_a, i64 0, i64 42) to i8*)
>>> +; CHECK-NEXT:    [[TMP13:%.*]] = or i1 [[TMP12]], [[MUL_OVERFLOW4]]
>>> +; CHECK-NEXT:    [[TMP14:%.*]] = or i1 [[TMP7]], [[TMP13]]
>>> +; CHECK-NEXT:    br i1 [[TMP14]], label [[FOR_BODY_PH_LVER_ORIG:%.*]], label [[FOR_BODY_PH_LDIST1:%.*]]
>>>  ; CHECK:       for.body.ph.lver.orig:
>>>  ; CHECK-NEXT:    br label [[FOR_BODY_LVER_ORIG:%.*]]
>>>  ; CHECK:       for.body.lver.orig:
>>>
>>> diff  --git a/llvm/test/Transforms/LoopIdiom/X86/arithmetic-right-shift-until-zero.ll b/llvm/test/Transforms/LoopIdiom/X86/arithmetic-right-shift-until-zero.ll
>>> index 9079773ca1a8d..d42743ea622b7 100644
>>> --- a/llvm/test/Transforms/LoopIdiom/X86/arithmetic-right-shift-until-zero.ll
>>> +++ b/llvm/test/Transforms/LoopIdiom/X86/arithmetic-right-shift-until-zero.ll
>>> @@ -327,8 +327,7 @@ define i8 @p6(i8 %val, i8 %start) mustprogress {
>>>  ; CHECK-NEXT:  entry:
>>>  ; CHECK-NEXT:    [[VAL_NUMLEADINGZEROS:%.*]] = call i8 @llvm.ctlz.i8(i8 [[VAL:%.*]], i1 false)
>>>  ; CHECK-NEXT:    [[VAL_NUMACTIVEBITS:%.*]] = sub nuw nsw i8 8, [[VAL_NUMLEADINGZEROS]]
>>> -; CHECK-NEXT:    [[VAL_NUMACTIVEBITS_OFFSET:%.*]] = add nuw nsw i8 [[VAL_NUMACTIVEBITS]], 0
>>> -; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i8 @llvm.smax.i8(i8 [[VAL_NUMACTIVEBITS_OFFSET]], i8 [[START:%.*]])
>>> +; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i8 @llvm.smax.i8(i8 [[VAL_NUMACTIVEBITS]], i8 [[START:%.*]])
>>>  ; CHECK-NEXT:    [[LOOP_BACKEDGETAKENCOUNT:%.*]] = sub nuw nsw i8 [[IV_FINAL]], [[START]]
>>>  ; CHECK-NEXT:    [[LOOP_TRIPCOUNT:%.*]] = add nuw nsw i8 [[LOOP_BACKEDGETAKENCOUNT]], 1
>>>  ; CHECK-NEXT:    br label [[LOOP:%.*]]
>>> @@ -1263,8 +1262,7 @@ define i1 @t24_nooffset_i1(i1 %val, i1 %start) mustprogress {
>>>  ; CHECK-NEXT:  entry:
>>>  ; CHECK-NEXT:    [[VAL_NUMLEADINGZEROS:%.*]] = call i1 @llvm.ctlz.i1(i1 [[VAL:%.*]], i1 false)
>>>  ; CHECK-NEXT:    [[VAL_NUMACTIVEBITS:%.*]] = sub nuw nsw i1 true, [[VAL_NUMLEADINGZEROS]]
>>> -; CHECK-NEXT:    [[VAL_NUMACTIVEBITS_OFFSET:%.*]] = add nuw nsw i1 [[VAL_NUMACTIVEBITS]], false
>>> -; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i1 @llvm.smax.i1(i1 [[VAL_NUMACTIVEBITS_OFFSET]], i1 [[START:%.*]])
>>> +; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i1 @llvm.smax.i1(i1 [[VAL_NUMACTIVEBITS]], i1 [[START:%.*]])
>>>  ; CHECK-NEXT:    [[LOOP_BACKEDGETAKENCOUNT:%.*]] = sub nuw nsw i1 [[IV_FINAL]], [[START]]
>>>  ; CHECK-NEXT:    [[LOOP_TRIPCOUNT:%.*]] = add nuw nsw i1 [[LOOP_BACKEDGETAKENCOUNT]], true
>>>  ; CHECK-NEXT:    br label [[LOOP:%.*]]
>>> @@ -1313,8 +1311,7 @@ define i2 @t25_nooffset_i2(i2 %val, i2 %start) mustprogress {
>>>  ; CHECK-NEXT:  entry:
>>>  ; CHECK-NEXT:    [[VAL_NUMLEADINGZEROS:%.*]] = call i2 @llvm.ctlz.i2(i2 [[VAL:%.*]], i1 false)
>>>  ; CHECK-NEXT:    [[VAL_NUMACTIVEBITS:%.*]] = sub nuw i2 -2, [[VAL_NUMLEADINGZEROS]]
>>> -; CHECK-NEXT:    [[VAL_NUMACTIVEBITS_OFFSET:%.*]] = add nuw nsw i2 [[VAL_NUMACTIVEBITS]], 0
>>> -; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i2 @llvm.smax.i2(i2 [[VAL_NUMACTIVEBITS_OFFSET]], i2 [[START:%.*]])
>>> +; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i2 @llvm.smax.i2(i2 [[VAL_NUMACTIVEBITS]], i2 [[START:%.*]])
>>>  ; CHECK-NEXT:    [[LOOP_BACKEDGETAKENCOUNT:%.*]] = sub nuw nsw i2 [[IV_FINAL]], [[START]]
>>>  ; CHECK-NEXT:    [[LOOP_TRIPCOUNT:%.*]] = add nuw i2 [[LOOP_BACKEDGETAKENCOUNT]], 1
>>>  ; CHECK-NEXT:    br label [[LOOP:%.*]]
>>> @@ -1363,8 +1360,7 @@ define i3 @t26_nooffset_i3(i3 %val, i3 %start) mustprogress {
>>>  ; CHECK-NEXT:  entry:
>>>  ; CHECK-NEXT:    [[VAL_NUMLEADINGZEROS:%.*]] = call i3 @llvm.ctlz.i3(i3 [[VAL:%.*]], i1 false)
>>>  ; CHECK-NEXT:    [[VAL_NUMACTIVEBITS:%.*]] = sub nuw nsw i3 3, [[VAL_NUMLEADINGZEROS]]
>>> -; CHECK-NEXT:    [[VAL_NUMACTIVEBITS_OFFSET:%.*]] = add nuw nsw i3 [[VAL_NUMACTIVEBITS]], 0
>>> -; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i3 @llvm.smax.i3(i3 [[VAL_NUMACTIVEBITS_OFFSET]], i3 [[START:%.*]])
>>> +; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i3 @llvm.smax.i3(i3 [[VAL_NUMACTIVEBITS]], i3 [[START:%.*]])
>>>  ; CHECK-NEXT:    [[LOOP_BACKEDGETAKENCOUNT:%.*]] = sub nuw nsw i3 [[IV_FINAL]], [[START]]
>>>  ; CHECK-NEXT:    [[LOOP_TRIPCOUNT:%.*]] = add nuw nsw i3 [[LOOP_BACKEDGETAKENCOUNT]], 1
>>>  ; CHECK-NEXT:    br label [[LOOP:%.*]]
>>>
>>> diff  --git a/llvm/test/Transforms/LoopIdiom/X86/left-shift-until-zero.ll b/llvm/test/Transforms/LoopIdiom/X86/left-shift-until-zero.ll
>>> index d32fd50f29921..7b156442b59c5 100644
>>> --- a/llvm/test/Transforms/LoopIdiom/X86/left-shift-until-zero.ll
>>> +++ b/llvm/test/Transforms/LoopIdiom/X86/left-shift-until-zero.ll
>>> @@ -327,8 +327,7 @@ define i8 @p6(i8 %val, i8 %start) {
>>>  ; CHECK-NEXT:  entry:
>>>  ; CHECK-NEXT:    [[VAL_NUMLEADINGZEROS:%.*]] = call i8 @llvm.cttz.i8(i8 [[VAL:%.*]], i1 false)
>>>  ; CHECK-NEXT:    [[VAL_NUMACTIVEBITS:%.*]] = sub nuw nsw i8 8, [[VAL_NUMLEADINGZEROS]]
>>> -; CHECK-NEXT:    [[VAL_NUMACTIVEBITS_OFFSET:%.*]] = add nuw nsw i8 [[VAL_NUMACTIVEBITS]], 0
>>> -; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i8 @llvm.smax.i8(i8 [[VAL_NUMACTIVEBITS_OFFSET]], i8 [[START:%.*]])
>>> +; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i8 @llvm.smax.i8(i8 [[VAL_NUMACTIVEBITS]], i8 [[START:%.*]])
>>>  ; CHECK-NEXT:    [[LOOP_BACKEDGETAKENCOUNT:%.*]] = sub nuw nsw i8 [[IV_FINAL]], [[START]]
>>>  ; CHECK-NEXT:    [[LOOP_TRIPCOUNT:%.*]] = add nuw nsw i8 [[LOOP_BACKEDGETAKENCOUNT]], 1
>>>  ; CHECK-NEXT:    br label [[LOOP:%.*]]
>>> @@ -1263,8 +1262,7 @@ define i1 @t24_nooffset_i1(i1 %val, i1 %start) {
>>>  ; CHECK-NEXT:  entry:
>>>  ; CHECK-NEXT:    [[VAL_NUMLEADINGZEROS:%.*]] = call i1 @llvm.cttz.i1(i1 [[VAL:%.*]], i1 false)
>>>  ; CHECK-NEXT:    [[VAL_NUMACTIVEBITS:%.*]] = sub nuw nsw i1 true, [[VAL_NUMLEADINGZEROS]]
>>> -; CHECK-NEXT:    [[VAL_NUMACTIVEBITS_OFFSET:%.*]] = add nuw nsw i1 [[VAL_NUMACTIVEBITS]], false
>>> -; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i1 @llvm.smax.i1(i1 [[VAL_NUMACTIVEBITS_OFFSET]], i1 [[START:%.*]])
>>> +; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i1 @llvm.smax.i1(i1 [[VAL_NUMACTIVEBITS]], i1 [[START:%.*]])
>>>  ; CHECK-NEXT:    [[LOOP_BACKEDGETAKENCOUNT:%.*]] = sub nuw nsw i1 [[IV_FINAL]], [[START]]
>>>  ; CHECK-NEXT:    [[LOOP_TRIPCOUNT:%.*]] = add nuw nsw i1 [[LOOP_BACKEDGETAKENCOUNT]], true
>>>  ; CHECK-NEXT:    br label [[LOOP:%.*]]
>>> @@ -1313,8 +1311,7 @@ define i2 @t25_nooffset_i2(i2 %val, i2 %start) {
>>>  ; CHECK-NEXT:  entry:
>>>  ; CHECK-NEXT:    [[VAL_NUMLEADINGZEROS:%.*]] = call i2 @llvm.cttz.i2(i2 [[VAL:%.*]], i1 false)
>>>  ; CHECK-NEXT:    [[VAL_NUMACTIVEBITS:%.*]] = sub nuw i2 -2, [[VAL_NUMLEADINGZEROS]]
>>> -; CHECK-NEXT:    [[VAL_NUMACTIVEBITS_OFFSET:%.*]] = add nuw nsw i2 [[VAL_NUMACTIVEBITS]], 0
>>> -; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i2 @llvm.smax.i2(i2 [[VAL_NUMACTIVEBITS_OFFSET]], i2 [[START:%.*]])
>>> +; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i2 @llvm.smax.i2(i2 [[VAL_NUMACTIVEBITS]], i2 [[START:%.*]])
>>>  ; CHECK-NEXT:    [[LOOP_BACKEDGETAKENCOUNT:%.*]] = sub nuw nsw i2 [[IV_FINAL]], [[START]]
>>>  ; CHECK-NEXT:    [[LOOP_TRIPCOUNT:%.*]] = add nuw i2 [[LOOP_BACKEDGETAKENCOUNT]], 1
>>>  ; CHECK-NEXT:    br label [[LOOP:%.*]]
>>> @@ -1363,8 +1360,7 @@ define i3 @t26_nooffset_i3(i3 %val, i3 %start) {
>>>  ; CHECK-NEXT:  entry:
>>>  ; CHECK-NEXT:    [[VAL_NUMLEADINGZEROS:%.*]] = call i3 @llvm.cttz.i3(i3 [[VAL:%.*]], i1 false)
>>>  ; CHECK-NEXT:    [[VAL_NUMACTIVEBITS:%.*]] = sub nuw nsw i3 3, [[VAL_NUMLEADINGZEROS]]
>>> -; CHECK-NEXT:    [[VAL_NUMACTIVEBITS_OFFSET:%.*]] = add nuw nsw i3 [[VAL_NUMACTIVEBITS]], 0
>>> -; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i3 @llvm.smax.i3(i3 [[VAL_NUMACTIVEBITS_OFFSET]], i3 [[START:%.*]])
>>> +; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i3 @llvm.smax.i3(i3 [[VAL_NUMACTIVEBITS]], i3 [[START:%.*]])
>>>  ; CHECK-NEXT:    [[LOOP_BACKEDGETAKENCOUNT:%.*]] = sub nuw nsw i3 [[IV_FINAL]], [[START]]
>>>  ; CHECK-NEXT:    [[LOOP_TRIPCOUNT:%.*]] = add nuw nsw i3 [[LOOP_BACKEDGETAKENCOUNT]], 1
>>>  ; CHECK-NEXT:    br label [[LOOP:%.*]]
>>>
>>> diff  --git a/llvm/test/Transforms/LoopIdiom/X86/logical-right-shift-until-zero.ll b/llvm/test/Transforms/LoopIdiom/X86/logical-right-shift-until-zero.ll
>>> index a851ed5e38562..0c3de8754048f 100644
>>> --- a/llvm/test/Transforms/LoopIdiom/X86/logical-right-shift-until-zero.ll
>>> +++ b/llvm/test/Transforms/LoopIdiom/X86/logical-right-shift-until-zero.ll
>>> @@ -327,8 +327,7 @@ define i8 @p6(i8 %val, i8 %start) {
>>>  ; CHECK-NEXT:  entry:
>>>  ; CHECK-NEXT:    [[VAL_NUMLEADINGZEROS:%.*]] = call i8 @llvm.ctlz.i8(i8 [[VAL:%.*]], i1 false)
>>>  ; CHECK-NEXT:    [[VAL_NUMACTIVEBITS:%.*]] = sub nuw nsw i8 8, [[VAL_NUMLEADINGZEROS]]
>>> -; CHECK-NEXT:    [[VAL_NUMACTIVEBITS_OFFSET:%.*]] = add nuw nsw i8 [[VAL_NUMACTIVEBITS]], 0
>>> -; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i8 @llvm.smax.i8(i8 [[VAL_NUMACTIVEBITS_OFFSET]], i8 [[START:%.*]])
>>> +; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i8 @llvm.smax.i8(i8 [[VAL_NUMACTIVEBITS]], i8 [[START:%.*]])
>>>  ; CHECK-NEXT:    [[LOOP_BACKEDGETAKENCOUNT:%.*]] = sub nuw nsw i8 [[IV_FINAL]], [[START]]
>>>  ; CHECK-NEXT:    [[LOOP_TRIPCOUNT:%.*]] = add nuw nsw i8 [[LOOP_BACKEDGETAKENCOUNT]], 1
>>>  ; CHECK-NEXT:    br label [[LOOP:%.*]]
>>> @@ -1263,8 +1262,7 @@ define i1 @t24_nooffset_i1(i1 %val, i1 %start) {
>>>  ; CHECK-NEXT:  entry:
>>>  ; CHECK-NEXT:    [[VAL_NUMLEADINGZEROS:%.*]] = call i1 @llvm.ctlz.i1(i1 [[VAL:%.*]], i1 false)
>>>  ; CHECK-NEXT:    [[VAL_NUMACTIVEBITS:%.*]] = sub nuw nsw i1 true, [[VAL_NUMLEADINGZEROS]]
>>> -; CHECK-NEXT:    [[VAL_NUMACTIVEBITS_OFFSET:%.*]] = add nuw nsw i1 [[VAL_NUMACTIVEBITS]], false
>>> -; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i1 @llvm.smax.i1(i1 [[VAL_NUMACTIVEBITS_OFFSET]], i1 [[START:%.*]])
>>> +; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i1 @llvm.smax.i1(i1 [[VAL_NUMACTIVEBITS]], i1 [[START:%.*]])
>>>  ; CHECK-NEXT:    [[LOOP_BACKEDGETAKENCOUNT:%.*]] = sub nuw nsw i1 [[IV_FINAL]], [[START]]
>>>  ; CHECK-NEXT:    [[LOOP_TRIPCOUNT:%.*]] = add nuw nsw i1 [[LOOP_BACKEDGETAKENCOUNT]], true
>>>  ; CHECK-NEXT:    br label [[LOOP:%.*]]
>>> @@ -1313,8 +1311,7 @@ define i2 @t25_nooffset_i2(i2 %val, i2 %start) {
>>>  ; CHECK-NEXT:  entry:
>>>  ; CHECK-NEXT:    [[VAL_NUMLEADINGZEROS:%.*]] = call i2 @llvm.ctlz.i2(i2 [[VAL:%.*]], i1 false)
>>>  ; CHECK-NEXT:    [[VAL_NUMACTIVEBITS:%.*]] = sub nuw i2 -2, [[VAL_NUMLEADINGZEROS]]
>>> -; CHECK-NEXT:    [[VAL_NUMACTIVEBITS_OFFSET:%.*]] = add nuw nsw i2 [[VAL_NUMACTIVEBITS]], 0
>>> -; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i2 @llvm.smax.i2(i2 [[VAL_NUMACTIVEBITS_OFFSET]], i2 [[START:%.*]])
>>> +; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i2 @llvm.smax.i2(i2 [[VAL_NUMACTIVEBITS]], i2 [[START:%.*]])
>>>  ; CHECK-NEXT:    [[LOOP_BACKEDGETAKENCOUNT:%.*]] = sub nuw nsw i2 [[IV_FINAL]], [[START]]
>>>  ; CHECK-NEXT:    [[LOOP_TRIPCOUNT:%.*]] = add nuw i2 [[LOOP_BACKEDGETAKENCOUNT]], 1
>>>  ; CHECK-NEXT:    br label [[LOOP:%.*]]
>>> @@ -1363,8 +1360,7 @@ define i3 @t26_nooffset_i3(i3 %val, i3 %start) {
>>>  ; CHECK-NEXT:  entry:
>>>  ; CHECK-NEXT:    [[VAL_NUMLEADINGZEROS:%.*]] = call i3 @llvm.ctlz.i3(i3 [[VAL:%.*]], i1 false)
>>>  ; CHECK-NEXT:    [[VAL_NUMACTIVEBITS:%.*]] = sub nuw nsw i3 3, [[VAL_NUMLEADINGZEROS]]
>>> -; CHECK-NEXT:    [[VAL_NUMACTIVEBITS_OFFSET:%.*]] = add nuw nsw i3 [[VAL_NUMACTIVEBITS]], 0
>>> -; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i3 @llvm.smax.i3(i3 [[VAL_NUMACTIVEBITS_OFFSET]], i3 [[START:%.*]])
>>> +; CHECK-NEXT:    [[IV_FINAL:%.*]] = call i3 @llvm.smax.i3(i3 [[VAL_NUMACTIVEBITS]], i3 [[START:%.*]])
>>>  ; CHECK-NEXT:    [[LOOP_BACKEDGETAKENCOUNT:%.*]] = sub nuw nsw i3 [[IV_FINAL]], [[START]]
>>>  ; CHECK-NEXT:    [[LOOP_TRIPCOUNT:%.*]] = add nuw nsw i3 [[LOOP_BACKEDGETAKENCOUNT]], 1
>>>  ; CHECK-NEXT:    br label [[LOOP:%.*]]
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/AArch64/induction-trunc.ll b/llvm/test/Transforms/LoopVectorize/AArch64/induction-trunc.ll
>>> index 2bee64107d7c9..b5094c112ce05 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/AArch64/induction-trunc.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/AArch64/induction-trunc.ll
>>> @@ -7,9 +7,8 @@ target triple = "aarch64--linux-gnu"
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, %vector.ph ], [ [[INDEX_NEXT:%.*]], %vector.body ]
>>>  ; CHECK-NEXT:    [[OFFSET_IDX:%.*]] = mul i64 [[INDEX]], 5
>>> -; CHECK-NEXT:    [[INDUCTION:%.*]] = add i64 [[OFFSET_IDX]], 0
>>>  ; CHECK-NEXT:    [[INDUCTION1:%.*]] = add i64 [[OFFSET_IDX]], 5
>>> -; CHECK-NEXT:    [[TMP4:%.*]] = trunc i64 [[INDUCTION]] to i32
>>> +; CHECK-NEXT:    [[TMP4:%.*]] = trunc i64 [[OFFSET_IDX]] to i32
>>>  ; CHECK-NEXT:    [[TMP5:%.*]] = trunc i64 [[INDUCTION1]] to i32
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
>>>  ; CHECK:         br i1 {{.*}}, label %middle.block, label %vector.body
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/AArch64/scalarize-store-with-predication.ll b/llvm/test/Transforms/LoopVectorize/AArch64/scalarize-store-with-predication.ll
>>> index cf0dbb30d0d37..2b97659a8a92b 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/AArch64/scalarize-store-with-predication.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/AArch64/scalarize-store-with-predication.ll
>>> @@ -22,11 +22,11 @@ define void @foo(i32* %data1, i32* %data2) {
>>>  ; CHECK-NEXT:    store i32 {{%.*}}, i32* {{%.*}}
>>>  ; CHECK-NEXT:    br label %pred.store.continue
>>>  ; CHECK:       pred.store.continue:
>>> -; CHECK-NEXT:    br i1 {{%.*}}, label %pred.store.if2, label %pred.store.continue3
>>> -; CHECK:       pred.store.if2:
>>> +; CHECK-NEXT:    br i1 {{%.*}}, label %pred.store.if1, label %pred.store.continue2
>>> +; CHECK:       pred.store.if1:
>>>  ; CHECK-NEXT:    store i32 {{%.*}}, i32* {{%.*}}
>>> -; CHECK-NEXT:    br label %pred.store.continue3
>>> -; CHECK:       pred.store.continue3:
>>> +; CHECK-NEXT:    br label %pred.store.continue2
>>> +; CHECK:       pred.store.continue2:
>>>
>>>  entry:
>>>    br label %while.body
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll b/llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll
>>> index 76a596d896149..7c2f64e10b5d0 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll
>>> @@ -42,16 +42,14 @@ define void @pointer_induction_used_as_vector(i8** noalias %start.1, i8* noalias
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>>  ; CHECK-NEXT:    [[TMP4:%.*]] = add i64 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[TMP5:%.*]] = add i64 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8*, i8** [[START_1]], i64 [[TMP5]]
>>> +; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8*, i8** [[START_1]], i64 [[INDEX]]
>>>  ; CHECK-NEXT:    [[TMP6:%.*]] = call <vscale x 2 x i64> @llvm.experimental.stepvector.nxv2i64()
>>>  ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x i64> poison, i64 [[INDEX]], i32 0
>>>  ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <vscale x 2 x i64> [[DOTSPLATINSERT]], <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer
>>>  ; CHECK-NEXT:    [[TMP7:%.*]] = add <vscale x 2 x i64> [[TMP6]], shufflevector (<vscale x 2 x i64> insertelement (<vscale x 2 x i64> poison, i64 0, i32 0), <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer)
>>>  ; CHECK-NEXT:    [[TMP8:%.*]] = add <vscale x 2 x i64> [[DOTSPLAT]], [[TMP7]]
>>>  ; CHECK-NEXT:    [[NEXT_GEP4:%.*]] = getelementptr i8, i8* [[START_2]], <vscale x 2 x i64> [[TMP8]]
>>> -; CHECK-NEXT:    [[TMP9:%.*]] = add i64 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[NEXT_GEP5:%.*]] = getelementptr i8, i8* [[START_2]], i64 [[TMP9]]
>>> +; CHECK-NEXT:    [[NEXT_GEP5:%.*]] = getelementptr i8, i8* [[START_2]], i64 [[INDEX]]
>>>  ; CHECK-NEXT:    [[TMP10:%.*]] = add i64 [[INDEX]], 1
>>>  ; CHECK-NEXT:    [[NEXT_GEP6:%.*]] = getelementptr i8, i8* [[START_2]], i64 [[TMP10]]
>>>  ; CHECK-NEXT:    [[TMP11:%.*]] = getelementptr inbounds i8, <vscale x 2 x i8*> [[NEXT_GEP4]], i64 1
>>> @@ -129,8 +127,7 @@ define void @pointer_induction(i8* noalias %start, i64 %N) {
>>>  ; CHECK-NEXT:    [[TMP6:%.*]] = add <vscale x 2 x i64> [[TMP5]], shufflevector (<vscale x 2 x i64> insertelement (<vscale x 2 x i64> poison, i64 0, i32 0), <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer)
>>>  ; CHECK-NEXT:    [[TMP7:%.*]] = add <vscale x 2 x i64> [[DOTSPLAT]], [[TMP6]]
>>>  ; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8, i8* [[START]], <vscale x 2 x i64> [[TMP7]]
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = add i64 [[INDEX1]], 0
>>> -; CHECK-NEXT:    [[NEXT_GEP3:%.*]] = getelementptr i8, i8* [[START]], i64 [[TMP8]]
>>> +; CHECK-NEXT:    [[NEXT_GEP3:%.*]] = getelementptr i8, i8* [[START]], i64 [[INDEX1]]
>>>  ; CHECK-NEXT:    [[TMP9:%.*]] = add i64 [[INDEX1]], 1
>>>  ; CHECK-NEXT:    [[NEXT_GEP4:%.*]] = getelementptr i8, i8* [[START]], i64 [[TMP9]]
>>>  ; CHECK-NEXT:    [[TMP10:%.*]] = add i64 [[INDEX1]], 0
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/ARM/tail-folding-scalar-epilogue-fallback.ll b/llvm/test/Transforms/LoopVectorize/ARM/tail-folding-scalar-epilogue-fallback.ll
>>> index 08325ea1047e9..1e271b4c79bfc 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/ARM/tail-folding-scalar-epilogue-fallback.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/ARM/tail-folding-scalar-epilogue-fallback.ll
>>> @@ -25,18 +25,17 @@ define void @outside_user_blocks_tail_folding(i8* nocapture readonly %ptr, i32 %
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>>  ; CHECK-NEXT:    [[OFFSET_IDX:%.*]] = sub i32 [[SIZE]], [[INDEX]]
>>>  ; CHECK-NEXT:    [[TMP0:%.*]] = add i32 [[OFFSET_IDX]], 0
>>> -; CHECK-NEXT:    [[TMP1:%.*]] = add i32 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8, i8* [[PTR]], i32 [[TMP1]]
>>> -; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds i8, i8* [[NEXT_GEP]], i32 1
>>> -; CHECK-NEXT:    [[TMP3:%.*]] = getelementptr inbounds i8, i8* [[TMP2]], i32 0
>>> -; CHECK-NEXT:    [[TMP4:%.*]] = bitcast i8* [[TMP3]] to <16 x i8>*
>>> -; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <16 x i8>, <16 x i8>* [[TMP4]], align 1
>>> -; CHECK-NEXT:    [[TMP5:%.*]] = getelementptr i8, i8* [[NEXT_GEP]], i32 0
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = bitcast i8* [[TMP5]] to <16 x i8>*
>>> -; CHECK-NEXT:    store <16 x i8> [[WIDE_LOAD]], <16 x i8>* [[TMP6]], align 1
>>> +; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8, i8* [[PTR]], i32 [[INDEX]]
>>> +; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i8, i8* [[NEXT_GEP]], i32 1
>>> +; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds i8, i8* [[TMP1]], i32 0
>>> +; CHECK-NEXT:    [[TMP3:%.*]] = bitcast i8* [[TMP2]] to <16 x i8>*
>>> +; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <16 x i8>, <16 x i8>* [[TMP3]], align 1
>>> +; CHECK-NEXT:    [[TMP4:%.*]] = getelementptr i8, i8* [[NEXT_GEP]], i32 0
>>> +; CHECK-NEXT:    [[TMP5:%.*]] = bitcast i8* [[TMP4]] to <16 x i8>*
>>> +; CHECK-NEXT:    store <16 x i8> [[WIDE_LOAD]], <16 x i8>* [[TMP5]], align 1
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 16
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> -; CHECK-NEXT:    br i1 [[TMP7]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !0
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> +; CHECK-NEXT:    br i1 [[TMP6]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i32 [[SIZE]], [[N_VEC]]
>>>  ; CHECK-NEXT:    br i1 [[CMP_N]], label [[END:%.*]], label [[SCALAR_PH]]
>>> @@ -49,10 +48,10 @@ define void @outside_user_blocks_tail_folding(i8* nocapture readonly %ptr, i32 %
>>>  ; CHECK-NEXT:    [[BUFF:%.*]] = phi i8* [ [[INCDEC_PTR:%.*]], [[BODY]] ], [ [[BC_RESUME_VAL1]], [[SCALAR_PH]] ]
>>>  ; CHECK-NEXT:    [[INCDEC_PTR]] = getelementptr inbounds i8, i8* [[BUFF]], i32 1
>>>  ; CHECK-NEXT:    [[DEC]] = add nsw i32 [[DEC66]], -1
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = load i8, i8* [[INCDEC_PTR]], align 1
>>> -; CHECK-NEXT:    store i8 [[TMP8]], i8* [[BUFF]], align 1
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = load i8, i8* [[INCDEC_PTR]], align 1
>>> +; CHECK-NEXT:    store i8 [[TMP7]], i8* [[BUFF]], align 1
>>>  ; CHECK-NEXT:    [[TOBOOL11:%.*]] = icmp eq i32 [[DEC]], 0
>>> -; CHECK-NEXT:    br i1 [[TOBOOL11]], label [[END]], label [[BODY]], !llvm.loop !2
>>> +; CHECK-NEXT:    br i1 [[TOBOOL11]], label [[END]], label [[BODY]], !llvm.loop [[LOOP2:![0-9]+]]
>>>  ; CHECK:       end:
>>>  ; CHECK-NEXT:    [[INCDEC_PTR_LCSSA:%.*]] = phi i8* [ [[INCDEC_PTR]], [[BODY]] ], [ [[IND_END2]], [[MIDDLE_BLOCK]] ]
>>>  ; CHECK-NEXT:    store i8* [[INCDEC_PTR_LCSSA]], i8** [[POS]], align 4
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/X86/cost-model-assert.ll b/llvm/test/Transforms/LoopVectorize/X86/cost-model-assert.ll
>>> index 20980c520a3cc..223b6e134f747 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/X86/cost-model-assert.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/X86/cost-model-assert.ll
>>> @@ -34,53 +34,52 @@ define void @cff_index_load_offsets(i1 %cond, i8 %x, i8* %p) #0 {
>>>  ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>> -; CHECK-NEXT:    [[TMP4:%.*]] = add i64 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[TMP5:%.*]] = mul i64 [[TMP4]], 4
>>> -; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8, i8* null, i64 [[TMP5]]
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = add i64 [[INDEX]], 4
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = mul i64 [[TMP6]], 4
>>> -; CHECK-NEXT:    [[NEXT_GEP1:%.*]] = getelementptr i8, i8* null, i64 [[TMP7]]
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = zext <4 x i8> [[BROADCAST_SPLAT]] to <4 x i32>
>>> -; CHECK-NEXT:    [[TMP9:%.*]] = zext <4 x i8> [[BROADCAST_SPLAT3]] to <4 x i32>
>>> +; CHECK-NEXT:    [[TMP4:%.*]] = mul i64 [[INDEX]], 4
>>> +; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8, i8* null, i64 [[TMP4]]
>>> +; CHECK-NEXT:    [[TMP5:%.*]] = add i64 [[INDEX]], 4
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = mul i64 [[TMP5]], 4
>>> +; CHECK-NEXT:    [[NEXT_GEP1:%.*]] = getelementptr i8, i8* null, i64 [[TMP6]]
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = zext <4 x i8> [[BROADCAST_SPLAT]] to <4 x i32>
>>> +; CHECK-NEXT:    [[TMP8:%.*]] = zext <4 x i8> [[BROADCAST_SPLAT3]] to <4 x i32>
>>> +; CHECK-NEXT:    [[TMP9:%.*]] = shl nuw <4 x i32> [[TMP7]], <i32 24, i32 24, i32 24, i32 24>
>>>  ; CHECK-NEXT:    [[TMP10:%.*]] = shl nuw <4 x i32> [[TMP8]], <i32 24, i32 24, i32 24, i32 24>
>>> -; CHECK-NEXT:    [[TMP11:%.*]] = shl nuw <4 x i32> [[TMP9]], <i32 24, i32 24, i32 24, i32 24>
>>> -; CHECK-NEXT:    [[TMP12:%.*]] = load i8, i8* [[P:%.*]], align 1, !tbaa [[TBAA1:![0-9]+]]
>>> -; CHECK-NEXT:    [[BROADCAST_SPLATINSERT4:%.*]] = insertelement <4 x i8> poison, i8 [[TMP12]], i32 0
>>> +; CHECK-NEXT:    [[TMP11:%.*]] = load i8, i8* [[P:%.*]], align 1, !tbaa [[TBAA1:![0-9]+]]
>>> +; CHECK-NEXT:    [[BROADCAST_SPLATINSERT4:%.*]] = insertelement <4 x i8> poison, i8 [[TMP11]], i32 0
>>>  ; CHECK-NEXT:    [[BROADCAST_SPLAT5:%.*]] = shufflevector <4 x i8> [[BROADCAST_SPLATINSERT4]], <4 x i8> poison, <4 x i32> zeroinitializer
>>> -; CHECK-NEXT:    [[TMP13:%.*]] = load i8, i8* [[P]], align 1, !tbaa [[TBAA1]]
>>> -; CHECK-NEXT:    [[BROADCAST_SPLATINSERT6:%.*]] = insertelement <4 x i8> poison, i8 [[TMP13]], i32 0
>>> +; CHECK-NEXT:    [[TMP12:%.*]] = load i8, i8* [[P]], align 1, !tbaa [[TBAA1]]
>>> +; CHECK-NEXT:    [[BROADCAST_SPLATINSERT6:%.*]] = insertelement <4 x i8> poison, i8 [[TMP12]], i32 0
>>>  ; CHECK-NEXT:    [[BROADCAST_SPLAT7:%.*]] = shufflevector <4 x i8> [[BROADCAST_SPLATINSERT6]], <4 x i8> poison, <4 x i32> zeroinitializer
>>> -; CHECK-NEXT:    [[TMP14:%.*]] = zext <4 x i8> [[BROADCAST_SPLAT5]] to <4 x i32>
>>> -; CHECK-NEXT:    [[TMP15:%.*]] = zext <4 x i8> [[BROADCAST_SPLAT7]] to <4 x i32>
>>> +; CHECK-NEXT:    [[TMP13:%.*]] = zext <4 x i8> [[BROADCAST_SPLAT5]] to <4 x i32>
>>> +; CHECK-NEXT:    [[TMP14:%.*]] = zext <4 x i8> [[BROADCAST_SPLAT7]] to <4 x i32>
>>> +; CHECK-NEXT:    [[TMP15:%.*]] = shl nuw nsw <4 x i32> [[TMP13]], <i32 16, i32 16, i32 16, i32 16>
>>>  ; CHECK-NEXT:    [[TMP16:%.*]] = shl nuw nsw <4 x i32> [[TMP14]], <i32 16, i32 16, i32 16, i32 16>
>>> -; CHECK-NEXT:    [[TMP17:%.*]] = shl nuw nsw <4 x i32> [[TMP15]], <i32 16, i32 16, i32 16, i32 16>
>>> +; CHECK-NEXT:    [[TMP17:%.*]] = or <4 x i32> [[TMP15]], [[TMP9]]
>>>  ; CHECK-NEXT:    [[TMP18:%.*]] = or <4 x i32> [[TMP16]], [[TMP10]]
>>> -; CHECK-NEXT:    [[TMP19:%.*]] = or <4 x i32> [[TMP17]], [[TMP11]]
>>> +; CHECK-NEXT:    [[TMP19:%.*]] = load i8, i8* undef, align 1, !tbaa [[TBAA1]]
>>>  ; CHECK-NEXT:    [[TMP20:%.*]] = load i8, i8* undef, align 1, !tbaa [[TBAA1]]
>>> -; CHECK-NEXT:    [[TMP21:%.*]] = load i8, i8* undef, align 1, !tbaa [[TBAA1]]
>>> +; CHECK-NEXT:    [[TMP21:%.*]] = or <4 x i32> [[TMP17]], zeroinitializer
>>>  ; CHECK-NEXT:    [[TMP22:%.*]] = or <4 x i32> [[TMP18]], zeroinitializer
>>> -; CHECK-NEXT:    [[TMP23:%.*]] = or <4 x i32> [[TMP19]], zeroinitializer
>>> +; CHECK-NEXT:    [[TMP23:%.*]] = or <4 x i32> [[TMP21]], zeroinitializer
>>>  ; CHECK-NEXT:    [[TMP24:%.*]] = or <4 x i32> [[TMP22]], zeroinitializer
>>> -; CHECK-NEXT:    [[TMP25:%.*]] = or <4 x i32> [[TMP23]], zeroinitializer
>>> -; CHECK-NEXT:    [[TMP26:%.*]] = extractelement <4 x i32> [[TMP24]], i32 0
>>> -; CHECK-NEXT:    store i32 [[TMP26]], i32* undef, align 4, !tbaa [[TBAA4:![0-9]+]]
>>> -; CHECK-NEXT:    [[TMP27:%.*]] = extractelement <4 x i32> [[TMP24]], i32 1
>>> +; CHECK-NEXT:    [[TMP25:%.*]] = extractelement <4 x i32> [[TMP23]], i32 0
>>> +; CHECK-NEXT:    store i32 [[TMP25]], i32* undef, align 4, !tbaa [[TBAA4:![0-9]+]]
>>> +; CHECK-NEXT:    [[TMP26:%.*]] = extractelement <4 x i32> [[TMP23]], i32 1
>>> +; CHECK-NEXT:    store i32 [[TMP26]], i32* undef, align 4, !tbaa [[TBAA4]]
>>> +; CHECK-NEXT:    [[TMP27:%.*]] = extractelement <4 x i32> [[TMP23]], i32 2
>>>  ; CHECK-NEXT:    store i32 [[TMP27]], i32* undef, align 4, !tbaa [[TBAA4]]
>>> -; CHECK-NEXT:    [[TMP28:%.*]] = extractelement <4 x i32> [[TMP24]], i32 2
>>> +; CHECK-NEXT:    [[TMP28:%.*]] = extractelement <4 x i32> [[TMP23]], i32 3
>>>  ; CHECK-NEXT:    store i32 [[TMP28]], i32* undef, align 4, !tbaa [[TBAA4]]
>>> -; CHECK-NEXT:    [[TMP29:%.*]] = extractelement <4 x i32> [[TMP24]], i32 3
>>> +; CHECK-NEXT:    [[TMP29:%.*]] = extractelement <4 x i32> [[TMP24]], i32 0
>>>  ; CHECK-NEXT:    store i32 [[TMP29]], i32* undef, align 4, !tbaa [[TBAA4]]
>>> -; CHECK-NEXT:    [[TMP30:%.*]] = extractelement <4 x i32> [[TMP25]], i32 0
>>> +; CHECK-NEXT:    [[TMP30:%.*]] = extractelement <4 x i32> [[TMP24]], i32 1
>>>  ; CHECK-NEXT:    store i32 [[TMP30]], i32* undef, align 4, !tbaa [[TBAA4]]
>>> -; CHECK-NEXT:    [[TMP31:%.*]] = extractelement <4 x i32> [[TMP25]], i32 1
>>> +; CHECK-NEXT:    [[TMP31:%.*]] = extractelement <4 x i32> [[TMP24]], i32 2
>>>  ; CHECK-NEXT:    store i32 [[TMP31]], i32* undef, align 4, !tbaa [[TBAA4]]
>>> -; CHECK-NEXT:    [[TMP32:%.*]] = extractelement <4 x i32> [[TMP25]], i32 2
>>> +; CHECK-NEXT:    [[TMP32:%.*]] = extractelement <4 x i32> [[TMP24]], i32 3
>>>  ; CHECK-NEXT:    store i32 [[TMP32]], i32* undef, align 4, !tbaa [[TBAA4]]
>>> -; CHECK-NEXT:    [[TMP33:%.*]] = extractelement <4 x i32> [[TMP25]], i32 3
>>> -; CHECK-NEXT:    store i32 [[TMP33]], i32* undef, align 4, !tbaa [[TBAA4]]
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 8
>>> -; CHECK-NEXT:    [[TMP34:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
>>> -; CHECK-NEXT:    br i1 [[TMP34]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]
>>> +; CHECK-NEXT:    [[TMP33:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
>>> +; CHECK-NEXT:    br i1 [[TMP33]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[TMP2]], [[N_VEC]]
>>>  ; CHECK-NEXT:    br i1 [[CMP_N]], label [[SW_EPILOG:%.*]], label [[SCALAR_PH]]
>>> @@ -91,11 +90,11 @@ define void @cff_index_load_offsets(i1 %cond, i8 %x, i8* %p) #0 {
>>>  ; CHECK-NEXT:    [[P_359:%.*]] = phi i8* [ [[ADD_PTR86:%.*]], [[FOR_BODY68]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ]
>>>  ; CHECK-NEXT:    [[CONV70:%.*]] = zext i8 [[X]] to i32
>>>  ; CHECK-NEXT:    [[SHL71:%.*]] = shl nuw i32 [[CONV70]], 24
>>> -; CHECK-NEXT:    [[TMP35:%.*]] = load i8, i8* [[P]], align 1, !tbaa [[TBAA1]]
>>> -; CHECK-NEXT:    [[CONV73:%.*]] = zext i8 [[TMP35]] to i32
>>> +; CHECK-NEXT:    [[TMP34:%.*]] = load i8, i8* [[P]], align 1, !tbaa [[TBAA1]]
>>> +; CHECK-NEXT:    [[CONV73:%.*]] = zext i8 [[TMP34]] to i32
>>>  ; CHECK-NEXT:    [[SHL74:%.*]] = shl nuw nsw i32 [[CONV73]], 16
>>>  ; CHECK-NEXT:    [[OR75:%.*]] = or i32 [[SHL74]], [[SHL71]]
>>> -; CHECK-NEXT:    [[TMP36:%.*]] = load i8, i8* undef, align 1, !tbaa [[TBAA1]]
>>> +; CHECK-NEXT:    [[TMP35:%.*]] = load i8, i8* undef, align 1, !tbaa [[TBAA1]]
>>>  ; CHECK-NEXT:    [[SHL78:%.*]] = shl nuw nsw i32 undef, 8
>>>  ; CHECK-NEXT:    [[OR79:%.*]] = or i32 [[OR75]], [[SHL78]]
>>>  ; CHECK-NEXT:    [[CONV81:%.*]] = zext i8 undef to i32
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/first-order-recurrence-complex.ll b/llvm/test/Transforms/LoopVectorize/first-order-recurrence-complex.ll
>>> index c4a17acdb0428..a12680ab49884 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/first-order-recurrence-complex.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/first-order-recurrence-complex.ll
>>> @@ -645,37 +645,36 @@ define void @sink_dominance(i32* %ptr, i32 %N) {
>>>  ; CHECK:       vector.scevcheck:
>>>  ; CHECK-NEXT:    [[UMAX:%.*]] = call i32 @llvm.umax.i32(i32 [[N]], i32 1)
>>>  ; CHECK-NEXT:    [[TMP0:%.*]] = add i32 [[UMAX]], -1
>>> -; CHECK-NEXT:    [[TMP1:%.*]] = add i32 [[TMP0]], 0
>>> -; CHECK-NEXT:    [[TMP2:%.*]] = sub i32 0, [[TMP0]]
>>> -; CHECK-NEXT:    [[TMP3:%.*]] = icmp sgt i32 [[TMP2]], 0
>>> -; CHECK-NEXT:    [[TMP4:%.*]] = icmp slt i32 [[TMP1]], 0
>>> -; CHECK-NEXT:    br i1 [[TMP4]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
>>> +; CHECK-NEXT:    [[TMP1:%.*]] = sub i32 0, [[TMP0]]
>>> +; CHECK-NEXT:    [[TMP2:%.*]] = icmp sgt i32 [[TMP1]], 0
>>> +; CHECK-NEXT:    [[TMP3:%.*]] = icmp slt i32 [[TMP0]], 0
>>> +; CHECK-NEXT:    br i1 [[TMP3]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
>>>  ; CHECK:       vector.ph:
>>>  ; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i32 [[UMAX1]], 4
>>>  ; CHECK-NEXT:    [[N_VEC:%.*]] = sub i32 [[UMAX1]], [[N_MOD_VF]]
>>>  ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>> -; CHECK-NEXT:    [[VECTOR_RECUR:%.*]] = phi <4 x i64> [ <i64 poison, i64 poison, i64 poison, i64 0>, [[VECTOR_PH]] ], [ [[TMP9:%.*]], [[VECTOR_BODY]] ]
>>> -; CHECK-NEXT:    [[TMP5:%.*]] = add i32 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[PTR:%.*]], i32 [[TMP5]]
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = getelementptr inbounds i32, i32* [[TMP6]], i32 0
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = bitcast i32* [[TMP7]] to <4 x i32>*
>>> -; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <4 x i32>, <4 x i32>* [[TMP8]], align 4
>>> -; CHECK-NEXT:    [[TMP9]] = zext <4 x i32> [[WIDE_LOAD]] to <4 x i64>
>>> -; CHECK-NEXT:    [[TMP10:%.*]] = shufflevector <4 x i64> [[VECTOR_RECUR]], <4 x i64> [[TMP9]], <4 x i32> <i32 3, i32 4, i32 5, i32 6>
>>> -; CHECK-NEXT:    [[TMP11:%.*]] = trunc <4 x i64> [[TMP10]] to <4 x i32>
>>> -; CHECK-NEXT:    [[TMP12:%.*]] = icmp slt <4 x i32> [[TMP11]], <i32 213, i32 213, i32 213, i32 213>
>>> -; CHECK-NEXT:    [[TMP13:%.*]] = select <4 x i1> [[TMP12]], <4 x i32> [[TMP11]], <4 x i32> <i32 22, i32 22, i32 22, i32 22>
>>> -; CHECK-NEXT:    [[TMP14:%.*]] = bitcast i32* [[TMP7]] to <4 x i32>*
>>> -; CHECK-NEXT:    store <4 x i32> [[TMP13]], <4 x i32>* [[TMP14]], align 4
>>> +; CHECK-NEXT:    [[VECTOR_RECUR:%.*]] = phi <4 x i64> [ <i64 poison, i64 poison, i64 poison, i64 0>, [[VECTOR_PH]] ], [ [[TMP8:%.*]], [[VECTOR_BODY]] ]
>>> +; CHECK-NEXT:    [[TMP4:%.*]] = add i32 [[INDEX]], 0
>>> +; CHECK-NEXT:    [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[PTR:%.*]], i32 [[TMP4]]
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TMP5]], i32 0
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = bitcast i32* [[TMP6]] to <4 x i32>*
>>> +; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <4 x i32>, <4 x i32>* [[TMP7]], align 4
>>> +; CHECK-NEXT:    [[TMP8]] = zext <4 x i32> [[WIDE_LOAD]] to <4 x i64>
>>> +; CHECK-NEXT:    [[TMP9:%.*]] = shufflevector <4 x i64> [[VECTOR_RECUR]], <4 x i64> [[TMP8]], <4 x i32> <i32 3, i32 4, i32 5, i32 6>
>>> +; CHECK-NEXT:    [[TMP10:%.*]] = trunc <4 x i64> [[TMP9]] to <4 x i32>
>>> +; CHECK-NEXT:    [[TMP11:%.*]] = icmp slt <4 x i32> [[TMP10]], <i32 213, i32 213, i32 213, i32 213>
>>> +; CHECK-NEXT:    [[TMP12:%.*]] = select <4 x i1> [[TMP11]], <4 x i32> [[TMP10]], <4 x i32> <i32 22, i32 22, i32 22, i32 22>
>>> +; CHECK-NEXT:    [[TMP13:%.*]] = bitcast i32* [[TMP6]] to <4 x i32>*
>>> +; CHECK-NEXT:    store <4 x i32> [[TMP12]], <4 x i32>* [[TMP13]], align 4
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4
>>> -; CHECK-NEXT:    [[TMP15:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> -; CHECK-NEXT:    br i1 [[TMP15]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP12:![0-9]+]]
>>> +; CHECK-NEXT:    [[TMP14:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> +; CHECK-NEXT:    br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP12:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i32 [[UMAX1]], [[N_VEC]]
>>> -; CHECK-NEXT:    [[VECTOR_RECUR_EXTRACT:%.*]] = extractelement <4 x i64> [[TMP9]], i32 3
>>> -; CHECK-NEXT:    [[VECTOR_RECUR_EXTRACT_FOR_PHI:%.*]] = extractelement <4 x i64> [[TMP9]], i32 2
>>> +; CHECK-NEXT:    [[VECTOR_RECUR_EXTRACT:%.*]] = extractelement <4 x i64> [[TMP8]], i32 3
>>> +; CHECK-NEXT:    [[VECTOR_RECUR_EXTRACT_FOR_PHI:%.*]] = extractelement <4 x i64> [[TMP8]], i32 2
>>>  ; CHECK-NEXT:    br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
>>>  ; CHECK:       scalar.ph:
>>>  ; CHECK-NEXT:    [[SCALAR_RECUR_INIT:%.*]] = phi i64 [ 0, [[VECTOR_SCEVCHECK]] ], [ 0, [[ENTRY:%.*]] ], [ [[VECTOR_RECUR_EXTRACT]], [[MIDDLE_BLOCK]] ]
>>> @@ -732,39 +731,38 @@ define void @sink_dominance_2(i32* %ptr, i32 %N) {
>>>  ; CHECK:       vector.scevcheck:
>>>  ; CHECK-NEXT:    [[UMAX:%.*]] = call i32 @llvm.umax.i32(i32 [[N]], i32 1)
>>>  ; CHECK-NEXT:    [[TMP0:%.*]] = add i32 [[UMAX]], -1
>>> -; CHECK-NEXT:    [[TMP1:%.*]] = add i32 [[TMP0]], 0
>>> -; CHECK-NEXT:    [[TMP2:%.*]] = sub i32 0, [[TMP0]]
>>> -; CHECK-NEXT:    [[TMP3:%.*]] = icmp sgt i32 [[TMP2]], 0
>>> -; CHECK-NEXT:    [[TMP4:%.*]] = icmp slt i32 [[TMP1]], 0
>>> -; CHECK-NEXT:    br i1 [[TMP4]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
>>> +; CHECK-NEXT:    [[TMP1:%.*]] = sub i32 0, [[TMP0]]
>>> +; CHECK-NEXT:    [[TMP2:%.*]] = icmp sgt i32 [[TMP1]], 0
>>> +; CHECK-NEXT:    [[TMP3:%.*]] = icmp slt i32 [[TMP0]], 0
>>> +; CHECK-NEXT:    br i1 [[TMP3]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
>>>  ; CHECK:       vector.ph:
>>>  ; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i32 [[UMAX1]], 4
>>>  ; CHECK-NEXT:    [[N_VEC:%.*]] = sub i32 [[UMAX1]], [[N_MOD_VF]]
>>>  ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>> -; CHECK-NEXT:    [[VECTOR_RECUR:%.*]] = phi <4 x i64> [ <i64 poison, i64 poison, i64 poison, i64 0>, [[VECTOR_PH]] ], [ [[TMP9:%.*]], [[VECTOR_BODY]] ]
>>> -; CHECK-NEXT:    [[TMP5:%.*]] = add i32 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[PTR:%.*]], i32 [[TMP5]]
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = getelementptr inbounds i32, i32* [[TMP6]], i32 0
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = bitcast i32* [[TMP7]] to <4 x i32>*
>>> -; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <4 x i32>, <4 x i32>* [[TMP8]], align 4
>>> -; CHECK-NEXT:    [[TMP9]] = zext <4 x i32> [[WIDE_LOAD]] to <4 x i64>
>>> -; CHECK-NEXT:    [[TMP10:%.*]] = shufflevector <4 x i64> [[VECTOR_RECUR]], <4 x i64> [[TMP9]], <4 x i32> <i32 3, i32 4, i32 5, i32 6>
>>> -; CHECK-NEXT:    [[TMP11:%.*]] = trunc <4 x i64> [[TMP10]] to <4 x i32>
>>> -; CHECK-NEXT:    [[TMP12:%.*]] = add <4 x i32> [[TMP11]], <i32 2, i32 2, i32 2, i32 2>
>>> -; CHECK-NEXT:    [[TMP13:%.*]] = mul <4 x i32> [[TMP12]], <i32 99, i32 99, i32 99, i32 99>
>>> -; CHECK-NEXT:    [[TMP14:%.*]] = icmp slt <4 x i32> [[TMP11]], <i32 213, i32 213, i32 213, i32 213>
>>> -; CHECK-NEXT:    [[TMP15:%.*]] = select <4 x i1> [[TMP14]], <4 x i32> [[TMP11]], <4 x i32> [[TMP13]]
>>> -; CHECK-NEXT:    [[TMP16:%.*]] = bitcast i32* [[TMP7]] to <4 x i32>*
>>> -; CHECK-NEXT:    store <4 x i32> [[TMP15]], <4 x i32>* [[TMP16]], align 4
>>> +; CHECK-NEXT:    [[VECTOR_RECUR:%.*]] = phi <4 x i64> [ <i64 poison, i64 poison, i64 poison, i64 0>, [[VECTOR_PH]] ], [ [[TMP8:%.*]], [[VECTOR_BODY]] ]
>>> +; CHECK-NEXT:    [[TMP4:%.*]] = add i32 [[INDEX]], 0
>>> +; CHECK-NEXT:    [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[PTR:%.*]], i32 [[TMP4]]
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TMP5]], i32 0
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = bitcast i32* [[TMP6]] to <4 x i32>*
>>> +; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <4 x i32>, <4 x i32>* [[TMP7]], align 4
>>> +; CHECK-NEXT:    [[TMP8]] = zext <4 x i32> [[WIDE_LOAD]] to <4 x i64>
>>> +; CHECK-NEXT:    [[TMP9:%.*]] = shufflevector <4 x i64> [[VECTOR_RECUR]], <4 x i64> [[TMP8]], <4 x i32> <i32 3, i32 4, i32 5, i32 6>
>>> +; CHECK-NEXT:    [[TMP10:%.*]] = trunc <4 x i64> [[TMP9]] to <4 x i32>
>>> +; CHECK-NEXT:    [[TMP11:%.*]] = add <4 x i32> [[TMP10]], <i32 2, i32 2, i32 2, i32 2>
>>> +; CHECK-NEXT:    [[TMP12:%.*]] = mul <4 x i32> [[TMP11]], <i32 99, i32 99, i32 99, i32 99>
>>> +; CHECK-NEXT:    [[TMP13:%.*]] = icmp slt <4 x i32> [[TMP10]], <i32 213, i32 213, i32 213, i32 213>
>>> +; CHECK-NEXT:    [[TMP14:%.*]] = select <4 x i1> [[TMP13]], <4 x i32> [[TMP10]], <4 x i32> [[TMP12]]
>>> +; CHECK-NEXT:    [[TMP15:%.*]] = bitcast i32* [[TMP6]] to <4 x i32>*
>>> +; CHECK-NEXT:    store <4 x i32> [[TMP14]], <4 x i32>* [[TMP15]], align 4
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4
>>> -; CHECK-NEXT:    [[TMP17:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> -; CHECK-NEXT:    br i1 [[TMP17]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP14:![0-9]+]]
>>> +; CHECK-NEXT:    [[TMP16:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> +; CHECK-NEXT:    br i1 [[TMP16]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP14:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i32 [[UMAX1]], [[N_VEC]]
>>> -; CHECK-NEXT:    [[VECTOR_RECUR_EXTRACT:%.*]] = extractelement <4 x i64> [[TMP9]], i32 3
>>> -; CHECK-NEXT:    [[VECTOR_RECUR_EXTRACT_FOR_PHI:%.*]] = extractelement <4 x i64> [[TMP9]], i32 2
>>> +; CHECK-NEXT:    [[VECTOR_RECUR_EXTRACT:%.*]] = extractelement <4 x i64> [[TMP8]], i32 3
>>> +; CHECK-NEXT:    [[VECTOR_RECUR_EXTRACT_FOR_PHI:%.*]] = extractelement <4 x i64> [[TMP8]], i32 2
>>>  ; CHECK-NEXT:    br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
>>>  ; CHECK:       scalar.ph:
>>>  ; CHECK-NEXT:    [[SCALAR_RECUR_INIT:%.*]] = phi i64 [ 0, [[VECTOR_SCEVCHECK]] ], [ 0, [[ENTRY:%.*]] ], [ [[VECTOR_RECUR_EXTRACT]], [[MIDDLE_BLOCK]] ]
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll b/llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll
>>> index c65f62cef65ef..7f2f18ea185c8 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll
>>> @@ -378,10 +378,9 @@ for.end:
>>>  ; Check the case when unrolled but not vectorized.
>>>  ; UNROLL-NO-VF-LABEL: extract_second_last_iteration
>>>  ; UNROLL-NO-VF: vector.body:
>>> -; UNROLL-NO-VF:   %induction = add i32 %index, 0
>>> -; UNROLL-NO-VF:   %induction1 = add i32 %index, 1
>>> -; UNROLL-NO-VF:   %[[L1:.+]] = add i32 %induction, %x
>>> -; UNROLL-NO-VF:   %[[L2:.+]] = add i32 %induction1, %x
>>> +; UNROLL-NO-VF:   %induction = add i32 %index, 1
>>> +; UNROLL-NO-VF:   %[[L1:.+]] = add i32 %index, %x
>>> +; UNROLL-NO-VF:   %[[L2:.+]] = add i32 %induction, %x
>>>  ; UNROLL-NO-VF:   %index.next = add nuw i32 %index, 2
>>>  ; UNROLL-NO-VF:   icmp eq i32 %index.next, 96
>>>  ; UNROLL-NO-VF: for.end:
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/if-pred-stores.ll b/llvm/test/Transforms/LoopVectorize/if-pred-stores.ll
>>> index 0b84468dfa6e1..30bb3f7bb5116 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/if-pred-stores.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/if-pred-stores.ll
>>> @@ -11,11 +11,10 @@ define i32 @test(i32* nocapture %f) #0 {
>>>  ; UNROLL-NEXT:  entry:
>>>  ; UNROLL-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; UNROLL:       vector.body:
>>> -; UNROLL-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE3:%.*]] ]
>>> -; UNROLL-NEXT:    [[INDUCTION:%.*]] = add i64 [[INDEX]], 0
>>> -; UNROLL-NEXT:    [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1
>>> -; UNROLL-NEXT:    [[TMP0:%.*]] = getelementptr inbounds i32, i32* [[F:%.*]], i64 [[INDUCTION]]
>>> -; UNROLL-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i32, i32* [[F]], i64 [[INDUCTION1]]
>>> +; UNROLL-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE2:%.*]] ]
>>> +; UNROLL-NEXT:    [[INDUCTION:%.*]] = add i64 [[INDEX]], 1
>>> +; UNROLL-NEXT:    [[TMP0:%.*]] = getelementptr inbounds i32, i32* [[F:%.*]], i64 [[INDEX]]
>>> +; UNROLL-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i32, i32* [[F]], i64 [[INDUCTION]]
>>>  ; UNROLL-NEXT:    [[TMP2:%.*]] = load i32, i32* [[TMP0]], align 4
>>>  ; UNROLL-NEXT:    [[TMP3:%.*]] = load i32, i32* [[TMP1]], align 4
>>>  ; UNROLL-NEXT:    [[TMP4:%.*]] = icmp sgt i32 [[TMP2]], 100
>>> @@ -26,12 +25,12 @@ define i32 @test(i32* nocapture %f) #0 {
>>>  ; UNROLL-NEXT:    store i32 [[TMP6]], i32* [[TMP0]], align 4
>>>  ; UNROLL-NEXT:    br label [[PRED_STORE_CONTINUE]]
>>>  ; UNROLL:       pred.store.continue:
>>> -; UNROLL-NEXT:    br i1 [[TMP5]], label [[PRED_STORE_IF2:%.*]], label [[PRED_STORE_CONTINUE3]]
>>> -; UNROLL:       pred.store.if2:
>>> +; UNROLL-NEXT:    br i1 [[TMP5]], label [[PRED_STORE_IF1:%.*]], label [[PRED_STORE_CONTINUE2]]
>>> +; UNROLL:       pred.store.if1:
>>>  ; UNROLL-NEXT:    [[TMP7:%.*]] = add nsw i32 [[TMP3]], 20
>>>  ; UNROLL-NEXT:    store i32 [[TMP7]], i32* [[TMP1]], align 4
>>> -; UNROLL-NEXT:    br label [[PRED_STORE_CONTINUE3]]
>>> -; UNROLL:       pred.store.continue3:
>>> +; UNROLL-NEXT:    br label [[PRED_STORE_CONTINUE2]]
>>> +; UNROLL:       pred.store.continue2:
>>>  ; UNROLL-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
>>>  ; UNROLL-NEXT:    [[TMP8:%.*]] = icmp eq i64 [[INDEX_NEXT]], 128
>>>  ; UNROLL-NEXT:    br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
>>> @@ -61,11 +60,10 @@ define i32 @test(i32* nocapture %f) #0 {
>>>  ; UNROLL-NOSIMPLIFY:       vector.ph:
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; UNROLL-NOSIMPLIFY:       vector.body:
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE3:%.*]] ]
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION:%.*]] = add i64 [[INDEX]], 0
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[TMP0:%.*]] = getelementptr inbounds i32, i32* [[F:%.*]], i64 [[INDUCTION]]
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i32, i32* [[F]], i64 [[INDUCTION1]]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE2:%.*]] ]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION:%.*]] = add i64 [[INDEX]], 1
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[TMP0:%.*]] = getelementptr inbounds i32, i32* [[F:%.*]], i64 [[INDEX]]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i32, i32* [[F]], i64 [[INDUCTION]]
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP2:%.*]] = load i32, i32* [[TMP0]], align 4
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP3:%.*]] = load i32, i32* [[TMP1]], align 4
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP4:%.*]] = icmp sgt i32 [[TMP2]], 100
>>> @@ -76,12 +74,12 @@ define i32 @test(i32* nocapture %f) #0 {
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    store i32 [[TMP6]], i32* [[TMP0]], align 4
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br label [[PRED_STORE_CONTINUE]]
>>>  ; UNROLL-NOSIMPLIFY:       pred.store.continue:
>>> -; UNROLL-NOSIMPLIFY-NEXT:    br i1 [[TMP5]], label [[PRED_STORE_IF2:%.*]], label [[PRED_STORE_CONTINUE3]]
>>> -; UNROLL-NOSIMPLIFY:       pred.store.if2:
>>> +; UNROLL-NOSIMPLIFY-NEXT:    br i1 [[TMP5]], label [[PRED_STORE_IF1:%.*]], label [[PRED_STORE_CONTINUE2]]
>>> +; UNROLL-NOSIMPLIFY:       pred.store.if1:
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP7:%.*]] = add nsw i32 [[TMP3]], 20
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    store i32 [[TMP7]], i32* [[TMP1]], align 4
>>> -; UNROLL-NOSIMPLIFY-NEXT:    br label [[PRED_STORE_CONTINUE3]]
>>> -; UNROLL-NOSIMPLIFY:       pred.store.continue3:
>>> +; UNROLL-NOSIMPLIFY-NEXT:    br label [[PRED_STORE_CONTINUE2]]
>>> +; UNROLL-NOSIMPLIFY:       pred.store.continue2:
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP8:%.*]] = icmp eq i64 [[INDEX_NEXT]], 128
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
>>> @@ -209,30 +207,29 @@ define void @bug18724(i1 %cond) {
>>>  ; UNROLL-NEXT:    [[IND_END:%.*]] = add i64 [[N_VEC]], undef
>>>  ; UNROLL-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; UNROLL:       vector.body:
>>> -; UNROLL-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE4:%.*]] ]
>>> -; UNROLL-NEXT:    [[VEC_PHI:%.*]] = phi i32 [ undef, [[VECTOR_PH]] ], [ [[PREDPHI:%.*]], [[PRED_STORE_CONTINUE4]] ]
>>> -; UNROLL-NEXT:    [[VEC_PHI2:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[PREDPHI5:%.*]], [[PRED_STORE_CONTINUE4]] ]
>>> +; UNROLL-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE3:%.*]] ]
>>> +; UNROLL-NEXT:    [[VEC_PHI:%.*]] = phi i32 [ undef, [[VECTOR_PH]] ], [ [[PREDPHI:%.*]], [[PRED_STORE_CONTINUE3]] ]
>>> +; UNROLL-NEXT:    [[VEC_PHI1:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[PREDPHI4:%.*]], [[PRED_STORE_CONTINUE3]] ]
>>>  ; UNROLL-NEXT:    [[OFFSET_IDX:%.*]] = add i64 [[INDEX]], undef
>>> -; UNROLL-NEXT:    [[INDUCTION:%.*]] = add i64 [[OFFSET_IDX]], 0
>>> -; UNROLL-NEXT:    [[INDUCTION1:%.*]] = add i64 [[OFFSET_IDX]], 1
>>> -; UNROLL-NEXT:    [[TMP4:%.*]] = getelementptr inbounds [768 x i32], [768 x i32]* undef, i64 0, i64 [[INDUCTION]]
>>> -; UNROLL-NEXT:    [[TMP5:%.*]] = getelementptr inbounds [768 x i32], [768 x i32]* undef, i64 0, i64 [[INDUCTION1]]
>>> +; UNROLL-NEXT:    [[INDUCTION:%.*]] = add i64 [[OFFSET_IDX]], 1
>>> +; UNROLL-NEXT:    [[TMP4:%.*]] = getelementptr inbounds [768 x i32], [768 x i32]* undef, i64 0, i64 [[OFFSET_IDX]]
>>> +; UNROLL-NEXT:    [[TMP5:%.*]] = getelementptr inbounds [768 x i32], [768 x i32]* undef, i64 0, i64 [[INDUCTION]]
>>>  ; UNROLL-NEXT:    [[TMP6:%.*]] = load i32, i32* [[TMP4]], align 4
>>>  ; UNROLL-NEXT:    [[TMP7:%.*]] = load i32, i32* [[TMP5]], align 4
>>> -; UNROLL-NEXT:    br i1 undef, label [[PRED_STORE_IF:%.*]], label [[PRED_STORE_CONTINUE4]]
>>> +; UNROLL-NEXT:    br i1 undef, label [[PRED_STORE_IF:%.*]], label [[PRED_STORE_CONTINUE3]]
>>>  ; UNROLL:       pred.store.if:
>>>  ; UNROLL-NEXT:    store i32 2, i32* [[TMP4]], align 4
>>> -; UNROLL-NEXT:    br label [[PRED_STORE_CONTINUE4]]
>>> -; UNROLL:       pred.store.continue4:
>>> +; UNROLL-NEXT:    br label [[PRED_STORE_CONTINUE3]]
>>> +; UNROLL:       pred.store.continue3:
>>>  ; UNROLL-NEXT:    [[TMP8:%.*]] = add i32 [[VEC_PHI]], 1
>>> -; UNROLL-NEXT:    [[TMP9:%.*]] = add i32 [[VEC_PHI2]], 1
>>> +; UNROLL-NEXT:    [[TMP9:%.*]] = add i32 [[VEC_PHI1]], 1
>>>  ; UNROLL-NEXT:    [[PREDPHI]] = select i1 undef, i32 [[VEC_PHI]], i32 [[TMP8]]
>>> -; UNROLL-NEXT:    [[PREDPHI5]] = select i1 undef, i32 [[VEC_PHI2]], i32 [[TMP9]]
>>> +; UNROLL-NEXT:    [[PREDPHI4]] = select i1 undef, i32 [[VEC_PHI1]], i32 [[TMP9]]
>>>  ; UNROLL-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
>>>  ; UNROLL-NEXT:    [[TMP10:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
>>>  ; UNROLL-NEXT:    br i1 [[TMP10]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
>>>  ; UNROLL:       middle.block:
>>> -; UNROLL-NEXT:    [[BIN_RDX:%.*]] = add i32 [[PREDPHI5]], [[PREDPHI]]
>>> +; UNROLL-NEXT:    [[BIN_RDX:%.*]] = add i32 [[PREDPHI4]], [[PREDPHI]]
>>>  ; UNROLL-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[TMP3]], [[N_VEC]]
>>>  ; UNROLL-NEXT:    [[TMP11:%.*]] = xor i1 [[CMP_N]], true
>>>  ; UNROLL-NEXT:    call void @llvm.assume(i1 [[TMP11]])
>>> @@ -277,14 +274,13 @@ define void @bug18724(i1 %cond) {
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[IND_END:%.*]] = add i64 [[N_VEC]], undef
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; UNROLL-NOSIMPLIFY:       vector.body:
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE4:%.*]] ]
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[VEC_PHI:%.*]] = phi i32 [ undef, [[VECTOR_PH]] ], [ [[PREDPHI:%.*]], [[PRED_STORE_CONTINUE4]] ]
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[VEC_PHI2:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[PREDPHI5:%.*]], [[PRED_STORE_CONTINUE4]] ]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE3:%.*]] ]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[VEC_PHI:%.*]] = phi i32 [ undef, [[VECTOR_PH]] ], [ [[PREDPHI:%.*]], [[PRED_STORE_CONTINUE3]] ]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[VEC_PHI1:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[PREDPHI4:%.*]], [[PRED_STORE_CONTINUE3]] ]
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[OFFSET_IDX:%.*]] = add i64 [[INDEX]], undef
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION:%.*]] = add i64 [[OFFSET_IDX]], 0
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION1:%.*]] = add i64 [[OFFSET_IDX]], 1
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[TMP3:%.*]] = getelementptr inbounds [768 x i32], [768 x i32]* undef, i64 0, i64 [[INDUCTION]]
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[TMP4:%.*]] = getelementptr inbounds [768 x i32], [768 x i32]* undef, i64 0, i64 [[INDUCTION1]]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION:%.*]] = add i64 [[OFFSET_IDX]], 1
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[TMP3:%.*]] = getelementptr inbounds [768 x i32], [768 x i32]* undef, i64 0, i64 [[OFFSET_IDX]]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[TMP4:%.*]] = getelementptr inbounds [768 x i32], [768 x i32]* undef, i64 0, i64 [[INDUCTION]]
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP5:%.*]] = load i32, i32* [[TMP3]], align 4
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP6:%.*]] = load i32, i32* [[TMP4]], align 4
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br i1 undef, label [[PRED_STORE_IF:%.*]], label [[PRED_STORE_CONTINUE:%.*]]
>>> @@ -292,20 +288,20 @@ define void @bug18724(i1 %cond) {
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    store i32 2, i32* [[TMP3]], align 4
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br label [[PRED_STORE_CONTINUE]]
>>>  ; UNROLL-NOSIMPLIFY:       pred.store.continue:
>>> -; UNROLL-NOSIMPLIFY-NEXT:    br i1 undef, label [[PRED_STORE_IF3:%.*]], label [[PRED_STORE_CONTINUE4]]
>>> -; UNROLL-NOSIMPLIFY:       pred.store.if3:
>>> +; UNROLL-NOSIMPLIFY-NEXT:    br i1 undef, label [[PRED_STORE_IF2:%.*]], label [[PRED_STORE_CONTINUE3]]
>>> +; UNROLL-NOSIMPLIFY:       pred.store.if2:
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    store i32 2, i32* [[TMP4]], align 4
>>> -; UNROLL-NOSIMPLIFY-NEXT:    br label [[PRED_STORE_CONTINUE4]]
>>> -; UNROLL-NOSIMPLIFY:       pred.store.continue4:
>>> +; UNROLL-NOSIMPLIFY-NEXT:    br label [[PRED_STORE_CONTINUE3]]
>>> +; UNROLL-NOSIMPLIFY:       pred.store.continue3:
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP7:%.*]] = add i32 [[VEC_PHI]], 1
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[TMP8:%.*]] = add i32 [[VEC_PHI2]], 1
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[TMP8:%.*]] = add i32 [[VEC_PHI1]], 1
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[PREDPHI]] = select i1 undef, i32 [[VEC_PHI]], i32 [[TMP7]]
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[PREDPHI5]] = select i1 undef, i32 [[VEC_PHI2]], i32 [[TMP8]]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[PREDPHI4]] = select i1 undef, i32 [[VEC_PHI1]], i32 [[TMP8]]
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP9:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br i1 [[TMP9]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
>>>  ; UNROLL-NOSIMPLIFY:       middle.block:
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[BIN_RDX:%.*]] = add i32 [[PREDPHI5]], [[PREDPHI]]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[BIN_RDX:%.*]] = add i32 [[PREDPHI4]], [[PREDPHI]]
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[TMP2]], [[N_VEC]]
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br i1 [[CMP_N]], label [[FOR_INC26_LOOPEXIT:%.*]], label [[SCALAR_PH]]
>>>  ; UNROLL-NOSIMPLIFY:       scalar.ph:
>>> @@ -437,26 +433,24 @@ define void @minimal_bit_widths(i1 %c) {
>>>  ; UNROLL-NEXT:  entry:
>>>  ; UNROLL-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; UNROLL:       vector.body:
>>> -; UNROLL-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE6:%.*]] ]
>>> +; UNROLL-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE4:%.*]] ]
>>>  ; UNROLL-NEXT:    [[OFFSET_IDX:%.*]] = sub i64 undef, [[INDEX]]
>>> -; UNROLL-NEXT:    [[INDUCTION3:%.*]] = add i64 [[OFFSET_IDX]], 0
>>> -; UNROLL-NEXT:    [[INDUCTION4:%.*]] = add i64 [[OFFSET_IDX]], -1
>>> -; UNROLL-NEXT:    br i1 [[C:%.*]], label [[PRED_STORE_IF:%.*]], label [[PRED_STORE_CONTINUE6]]
>>> +; UNROLL-NEXT:    [[INDUCTION2:%.*]] = add i64 [[OFFSET_IDX]], -1
>>> +; UNROLL-NEXT:    br i1 [[C:%.*]], label [[PRED_STORE_IF:%.*]], label [[PRED_STORE_CONTINUE4]]
>>>  ; UNROLL:       pred.store.if:
>>> -; UNROLL-NEXT:    [[INDUCTION:%.*]] = add i64 [[INDEX]], 0
>>> -; UNROLL-NEXT:    [[TMP0:%.*]] = getelementptr i8, i8* undef, i64 [[INDUCTION]]
>>> +; UNROLL-NEXT:    [[TMP0:%.*]] = getelementptr i8, i8* undef, i64 [[INDEX]]
>>>  ; UNROLL-NEXT:    [[TMP1:%.*]] = load i8, i8* [[TMP0]], align 1
>>>  ; UNROLL-NEXT:    [[TMP2:%.*]] = zext i8 [[TMP1]] to i32
>>>  ; UNROLL-NEXT:    [[TMP3:%.*]] = trunc i32 [[TMP2]] to i8
>>>  ; UNROLL-NEXT:    store i8 [[TMP3]], i8* [[TMP0]], align 1
>>> -; UNROLL-NEXT:    [[INDUCTION2:%.*]] = add i64 [[INDEX]], 1
>>> -; UNROLL-NEXT:    [[TMP4:%.*]] = getelementptr i8, i8* undef, i64 [[INDUCTION2]]
>>> +; UNROLL-NEXT:    [[INDUCTION:%.*]] = add i64 [[INDEX]], 1
>>> +; UNROLL-NEXT:    [[TMP4:%.*]] = getelementptr i8, i8* undef, i64 [[INDUCTION]]
>>>  ; UNROLL-NEXT:    [[TMP5:%.*]] = load i8, i8* [[TMP4]], align 1
>>>  ; UNROLL-NEXT:    [[TMP6:%.*]] = zext i8 [[TMP5]] to i32
>>>  ; UNROLL-NEXT:    [[TMP7:%.*]] = trunc i32 [[TMP6]] to i8
>>>  ; UNROLL-NEXT:    store i8 [[TMP7]], i8* [[TMP4]], align 1
>>> -; UNROLL-NEXT:    br label [[PRED_STORE_CONTINUE6]]
>>> -; UNROLL:       pred.store.continue6:
>>> +; UNROLL-NEXT:    br label [[PRED_STORE_CONTINUE4]]
>>> +; UNROLL:       pred.store.continue4:
>>>  ; UNROLL-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
>>>  ; UNROLL-NEXT:    [[TMP8:%.*]] = icmp eq i64 [[INDEX_NEXT]], undef
>>>  ; UNROLL-NEXT:    br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
>>> @@ -488,30 +482,28 @@ define void @minimal_bit_widths(i1 %c) {
>>>  ; UNROLL-NOSIMPLIFY:       vector.ph:
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; UNROLL-NOSIMPLIFY:       vector.body:
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE6:%.*]] ]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE4:%.*]] ]
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[OFFSET_IDX:%.*]] = sub i64 undef, [[INDEX]]
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION3:%.*]] = add i64 [[OFFSET_IDX]], 0
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION4:%.*]] = add i64 [[OFFSET_IDX]], -1
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION2:%.*]] = add i64 [[OFFSET_IDX]], -1
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br i1 [[C:%.*]], label [[PRED_STORE_IF:%.*]], label [[PRED_STORE_CONTINUE:%.*]]
>>>  ; UNROLL-NOSIMPLIFY:       pred.store.if:
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION:%.*]] = add i64 [[INDEX]], 0
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[TMP0:%.*]] = getelementptr i8, i8* undef, i64 [[INDUCTION]]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[TMP0:%.*]] = getelementptr i8, i8* undef, i64 [[INDEX]]
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP1:%.*]] = load i8, i8* [[TMP0]], align 1
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP2:%.*]] = zext i8 [[TMP1]] to i32
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP3:%.*]] = trunc i32 [[TMP2]] to i8
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    store i8 [[TMP3]], i8* [[TMP0]], align 1
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br label [[PRED_STORE_CONTINUE]]
>>>  ; UNROLL-NOSIMPLIFY:       pred.store.continue:
>>> -; UNROLL-NOSIMPLIFY-NEXT:    br i1 [[C]], label [[PRED_STORE_IF5:%.*]], label [[PRED_STORE_CONTINUE6]]
>>> -; UNROLL-NOSIMPLIFY:       pred.store.if5:
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION2:%.*]] = add i64 [[INDEX]], 1
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[TMP4:%.*]] = getelementptr i8, i8* undef, i64 [[INDUCTION2]]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    br i1 [[C]], label [[PRED_STORE_IF3:%.*]], label [[PRED_STORE_CONTINUE4]]
>>> +; UNROLL-NOSIMPLIFY:       pred.store.if3:
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION:%.*]] = add i64 [[INDEX]], 1
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[TMP4:%.*]] = getelementptr i8, i8* undef, i64 [[INDUCTION]]
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP5:%.*]] = load i8, i8* [[TMP4]], align 1
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP6:%.*]] = zext i8 [[TMP5]] to i32
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP7:%.*]] = trunc i32 [[TMP6]] to i8
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    store i8 [[TMP7]], i8* [[TMP4]], align 1
>>> -; UNROLL-NOSIMPLIFY-NEXT:    br label [[PRED_STORE_CONTINUE6]]
>>> -; UNROLL-NOSIMPLIFY:       pred.store.continue6:
>>> +; UNROLL-NOSIMPLIFY-NEXT:    br label [[PRED_STORE_CONTINUE4]]
>>> +; UNROLL-NOSIMPLIFY:       pred.store.continue4:
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP8:%.*]] = icmp eq i64 [[INDEX_NEXT]], undef
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]
>>> @@ -657,14 +649,12 @@ define void @minimal_bit_widths_with_aliasing_store(i1 %c, i8* %ptr) {
>>>  ; UNROLL-NOSIMPLIFY:       vector.ph:
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; UNROLL-NOSIMPLIFY:       vector.body:
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE6:%.*]] ]
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION:%.*]] = add i64 [[INDEX]], 0
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION2:%.*]] = add i64 [[INDEX]], 1
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE4:%.*]] ]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION:%.*]] = add i64 [[INDEX]], 1
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[OFFSET_IDX:%.*]] = sub i64 0, [[INDEX]]
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION3:%.*]] = add i64 [[OFFSET_IDX]], 0
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION4:%.*]] = add i64 [[OFFSET_IDX]], -1
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[TMP0:%.*]] = getelementptr i8, i8* [[PTR:%.*]], i64 [[INDUCTION]]
>>> -; UNROLL-NOSIMPLIFY-NEXT:    [[TMP1:%.*]] = getelementptr i8, i8* [[PTR]], i64 [[INDUCTION2]]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[INDUCTION2:%.*]] = add i64 [[OFFSET_IDX]], -1
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[TMP0:%.*]] = getelementptr i8, i8* [[PTR:%.*]], i64 [[INDEX]]
>>> +; UNROLL-NOSIMPLIFY-NEXT:    [[TMP1:%.*]] = getelementptr i8, i8* [[PTR]], i64 [[INDUCTION]]
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    store i8 0, i8* [[TMP0]], align 1
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    store i8 0, i8* [[TMP1]], align 1
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br i1 [[C:%.*]], label [[PRED_STORE_IF:%.*]], label [[PRED_STORE_CONTINUE:%.*]]
>>> @@ -675,14 +665,14 @@ define void @minimal_bit_widths_with_aliasing_store(i1 %c, i8* %ptr) {
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    store i8 [[TMP4]], i8* [[TMP0]], align 1
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br label [[PRED_STORE_CONTINUE]]
>>>  ; UNROLL-NOSIMPLIFY:       pred.store.continue:
>>> -; UNROLL-NOSIMPLIFY-NEXT:    br i1 [[C]], label [[PRED_STORE_IF5:%.*]], label [[PRED_STORE_CONTINUE6]]
>>> -; UNROLL-NOSIMPLIFY:       pred.store.if5:
>>> +; UNROLL-NOSIMPLIFY-NEXT:    br i1 [[C]], label [[PRED_STORE_IF3:%.*]], label [[PRED_STORE_CONTINUE4]]
>>> +; UNROLL-NOSIMPLIFY:       pred.store.if3:
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP5:%.*]] = load i8, i8* [[TMP1]], align 1
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP6:%.*]] = zext i8 [[TMP5]] to i32
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP7:%.*]] = trunc i32 [[TMP6]] to i8
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    store i8 [[TMP7]], i8* [[TMP1]], align 1
>>> -; UNROLL-NOSIMPLIFY-NEXT:    br label [[PRED_STORE_CONTINUE6]]
>>> -; UNROLL-NOSIMPLIFY:       pred.store.continue6:
>>> +; UNROLL-NOSIMPLIFY-NEXT:    br label [[PRED_STORE_CONTINUE4]]
>>> +; UNROLL-NOSIMPLIFY:       pred.store.continue4:
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    [[TMP8:%.*]] = icmp eq i64 [[INDEX_NEXT]], 0
>>>  ; UNROLL-NOSIMPLIFY-NEXT:    br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP7:![0-9]+]]
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/pointer-induction.ll b/llvm/test/Transforms/LoopVectorize/pointer-induction.ll
>>> index d77ce32e753ee..7a69c80c319c4 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/pointer-induction.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/pointer-induction.ll
>>> @@ -136,33 +136,31 @@ define void @pointer_induction_used_as_vector(i8** noalias %start.1, i8* noalias
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>>  ; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[TMP1:%.*]] = add i64 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8*, i8** [[START_1]], i64 [[TMP1]]
>>> -; CHECK-NEXT:    [[TMP2:%.*]] = add i64 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[NEXT_GEP4:%.*]] = getelementptr i8, i8* [[START_2]], i64 [[TMP2]]
>>> -; CHECK-NEXT:    [[TMP3:%.*]] = add i64 [[INDEX]], 1
>>> -; CHECK-NEXT:    [[NEXT_GEP5:%.*]] = getelementptr i8, i8* [[START_2]], i64 [[TMP3]]
>>> -; CHECK-NEXT:    [[TMP4:%.*]] = add i64 [[INDEX]], 2
>>> -; CHECK-NEXT:    [[NEXT_GEP6:%.*]] = getelementptr i8, i8* [[START_2]], i64 [[TMP4]]
>>> -; CHECK-NEXT:    [[TMP5:%.*]] = add i64 [[INDEX]], 3
>>> -; CHECK-NEXT:    [[NEXT_GEP7:%.*]] = getelementptr i8, i8* [[START_2]], i64 [[TMP5]]
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = insertelement <4 x i8*> poison, i8* [[NEXT_GEP4]], i32 0
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = insertelement <4 x i8*> [[TMP6]], i8* [[NEXT_GEP5]], i32 1
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = insertelement <4 x i8*> [[TMP7]], i8* [[NEXT_GEP6]], i32 2
>>> -; CHECK-NEXT:    [[TMP9:%.*]] = insertelement <4 x i8*> [[TMP8]], i8* [[NEXT_GEP7]], i32 3
>>> -; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr inbounds i8, <4 x i8*> [[TMP9]], i64 1
>>> -; CHECK-NEXT:    [[TMP11:%.*]] = getelementptr i8*, i8** [[NEXT_GEP]], i32 0
>>> -; CHECK-NEXT:    [[TMP12:%.*]] = bitcast i8** [[TMP11]] to <4 x i8*>*
>>> -; CHECK-NEXT:    store <4 x i8*> [[TMP10]], <4 x i8*>* [[TMP12]], align 8
>>> -; CHECK-NEXT:    [[TMP13:%.*]] = getelementptr i8, i8* [[NEXT_GEP4]], i32 0
>>> -; CHECK-NEXT:    [[TMP14:%.*]] = bitcast i8* [[TMP13]] to <4 x i8>*
>>> -; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <4 x i8>, <4 x i8>* [[TMP14]], align 1
>>> -; CHECK-NEXT:    [[TMP15:%.*]] = add <4 x i8> [[WIDE_LOAD]], <i8 1, i8 1, i8 1, i8 1>
>>> -; CHECK-NEXT:    [[TMP16:%.*]] = bitcast i8* [[TMP13]] to <4 x i8>*
>>> -; CHECK-NEXT:    store <4 x i8> [[TMP15]], <4 x i8>* [[TMP16]], align 1
>>> +; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8*, i8** [[START_1]], i64 [[INDEX]]
>>> +; CHECK-NEXT:    [[NEXT_GEP4:%.*]] = getelementptr i8, i8* [[START_2]], i64 [[INDEX]]
>>> +; CHECK-NEXT:    [[TMP1:%.*]] = add i64 [[INDEX]], 1
>>> +; CHECK-NEXT:    [[NEXT_GEP5:%.*]] = getelementptr i8, i8* [[START_2]], i64 [[TMP1]]
>>> +; CHECK-NEXT:    [[TMP2:%.*]] = add i64 [[INDEX]], 2
>>> +; CHECK-NEXT:    [[NEXT_GEP6:%.*]] = getelementptr i8, i8* [[START_2]], i64 [[TMP2]]
>>> +; CHECK-NEXT:    [[TMP3:%.*]] = add i64 [[INDEX]], 3
>>> +; CHECK-NEXT:    [[NEXT_GEP7:%.*]] = getelementptr i8, i8* [[START_2]], i64 [[TMP3]]
>>> +; CHECK-NEXT:    [[TMP4:%.*]] = insertelement <4 x i8*> poison, i8* [[NEXT_GEP4]], i32 0
>>> +; CHECK-NEXT:    [[TMP5:%.*]] = insertelement <4 x i8*> [[TMP4]], i8* [[NEXT_GEP5]], i32 1
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = insertelement <4 x i8*> [[TMP5]], i8* [[NEXT_GEP6]], i32 2
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = insertelement <4 x i8*> [[TMP6]], i8* [[NEXT_GEP7]], i32 3
>>> +; CHECK-NEXT:    [[TMP8:%.*]] = getelementptr inbounds i8, <4 x i8*> [[TMP7]], i64 1
>>> +; CHECK-NEXT:    [[TMP9:%.*]] = getelementptr i8*, i8** [[NEXT_GEP]], i32 0
>>> +; CHECK-NEXT:    [[TMP10:%.*]] = bitcast i8** [[TMP9]] to <4 x i8*>*
>>> +; CHECK-NEXT:    store <4 x i8*> [[TMP8]], <4 x i8*>* [[TMP10]], align 8
>>> +; CHECK-NEXT:    [[TMP11:%.*]] = getelementptr i8, i8* [[NEXT_GEP4]], i32 0
>>> +; CHECK-NEXT:    [[TMP12:%.*]] = bitcast i8* [[TMP11]] to <4 x i8>*
>>> +; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <4 x i8>, <4 x i8>* [[TMP12]], align 1
>>> +; CHECK-NEXT:    [[TMP13:%.*]] = add <4 x i8> [[WIDE_LOAD]], <i8 1, i8 1, i8 1, i8 1>
>>> +; CHECK-NEXT:    [[TMP14:%.*]] = bitcast i8* [[TMP11]] to <4 x i8>*
>>> +; CHECK-NEXT:    store <4 x i8> [[TMP13]], <4 x i8>* [[TMP14]], align 1
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
>>> -; CHECK-NEXT:    [[TMP17:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
>>> -; CHECK-NEXT:    br i1 [[TMP17]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
>>> +; CHECK-NEXT:    [[TMP15:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
>>> +; CHECK-NEXT:    br i1 [[TMP15]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[N]], [[N_VEC]]
>>>  ; CHECK-NEXT:    br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/pr30654-phiscev-sext-trunc.ll b/llvm/test/Transforms/LoopVectorize/pr30654-phiscev-sext-trunc.ll
>>> index 242e41195e841..238ba6ccc63db 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/pr30654-phiscev-sext-trunc.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/pr30654-phiscev-sext-trunc.ll
>>> @@ -51,20 +51,19 @@ define void @doit1(i32 %n, i32 %step) local_unnamed_addr {
>>>  ; CHECK-NEXT:    [[MUL:%.*]] = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 [[TMP4]], i8 [[TMP5]])
>>>  ; CHECK-NEXT:    [[MUL_RESULT:%.*]] = extractvalue { i8, i1 } [[MUL]], 0
>>>  ; CHECK-NEXT:    [[MUL_OVERFLOW:%.*]] = extractvalue { i8, i1 } [[MUL]], 1
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = add i8 [[MUL_RESULT]], 0
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = sub i8 0, [[MUL_RESULT]]
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i8 [[TMP7]], 0
>>> -; CHECK-NEXT:    [[TMP9:%.*]] = icmp slt i8 [[TMP6]], 0
>>> -; CHECK-NEXT:    [[TMP10:%.*]] = select i1 [[TMP3]], i1 [[TMP8]], i1 [[TMP9]]
>>> -; CHECK-NEXT:    [[TMP11:%.*]] = icmp ugt i64 [[TMP0]], 255
>>> -; CHECK-NEXT:    [[TMP12:%.*]] = icmp ne i8 [[TMP1]], 0
>>> -; CHECK-NEXT:    [[TMP13:%.*]] = and i1 [[TMP11]], [[TMP12]]
>>> -; CHECK-NEXT:    [[TMP14:%.*]] = or i1 [[TMP10]], [[TMP13]]
>>> -; CHECK-NEXT:    [[TMP15:%.*]] = or i1 [[TMP14]], [[MUL_OVERFLOW]]
>>> -; CHECK-NEXT:    [[TMP16:%.*]] = sext i8 [[TMP1]] to i32
>>> -; CHECK-NEXT:    [[IDENT_CHECK:%.*]] = icmp ne i32 [[STEP]], [[TMP16]]
>>> -; CHECK-NEXT:    [[TMP17:%.*]] = or i1 [[TMP15]], [[IDENT_CHECK]]
>>> -; CHECK-NEXT:    br i1 [[TMP17]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = sub i8 0, [[MUL_RESULT]]
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = icmp sgt i8 [[TMP6]], 0
>>> +; CHECK-NEXT:    [[TMP8:%.*]] = icmp slt i8 [[MUL_RESULT]], 0
>>> +; CHECK-NEXT:    [[TMP9:%.*]] = select i1 [[TMP3]], i1 [[TMP7]], i1 [[TMP8]]
>>> +; CHECK-NEXT:    [[TMP10:%.*]] = icmp ugt i64 [[TMP0]], 255
>>> +; CHECK-NEXT:    [[TMP11:%.*]] = icmp ne i8 [[TMP1]], 0
>>> +; CHECK-NEXT:    [[TMP12:%.*]] = and i1 [[TMP10]], [[TMP11]]
>>> +; CHECK-NEXT:    [[TMP13:%.*]] = or i1 [[TMP9]], [[TMP12]]
>>> +; CHECK-NEXT:    [[TMP14:%.*]] = or i1 [[TMP13]], [[MUL_OVERFLOW]]
>>> +; CHECK-NEXT:    [[TMP15:%.*]] = sext i8 [[TMP1]] to i32
>>> +; CHECK-NEXT:    [[IDENT_CHECK:%.*]] = icmp ne i32 [[STEP]], [[TMP15]]
>>> +; CHECK-NEXT:    [[TMP16:%.*]] = or i1 [[TMP14]], [[IDENT_CHECK]]
>>> +; CHECK-NEXT:    br i1 [[TMP16]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
>>>  ; CHECK:       vector.ph:
>>>  ; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i64 [[WIDE_TRIP_COUNT]], 4
>>>  ; CHECK-NEXT:    [[N_VEC:%.*]] = sub i64 [[WIDE_TRIP_COUNT]], [[N_MOD_VF]]
>>> @@ -72,24 +71,24 @@ define void @doit1(i32 %n, i32 %step) local_unnamed_addr {
>>>  ; CHECK-NEXT:    [[IND_END:%.*]] = mul i32 [[CAST_CRD]], [[STEP]]
>>>  ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[STEP]], i32 0
>>>  ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <4 x i32> [[DOTSPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
>>> -; CHECK-NEXT:    [[TMP18:%.*]] = mul <4 x i32> <i32 0, i32 1, i32 2, i32 3>, [[DOTSPLAT]]
>>> -; CHECK-NEXT:    [[INDUCTION:%.*]] = add <4 x i32> [[TMP18]], zeroinitializer
>>> -; CHECK-NEXT:    [[TMP19:%.*]] = mul i32 [[STEP]], 4
>>> -; CHECK-NEXT:    [[DOTSPLATINSERT2:%.*]] = insertelement <4 x i32> poison, i32 [[TMP19]], i32 0
>>> +; CHECK-NEXT:    [[TMP17:%.*]] = mul <4 x i32> <i32 0, i32 1, i32 2, i32 3>, [[DOTSPLAT]]
>>> +; CHECK-NEXT:    [[INDUCTION:%.*]] = add <4 x i32> [[TMP17]], zeroinitializer
>>> +; CHECK-NEXT:    [[TMP18:%.*]] = mul i32 [[STEP]], 4
>>> +; CHECK-NEXT:    [[DOTSPLATINSERT2:%.*]] = insertelement <4 x i32> poison, i32 [[TMP18]], i32 0
>>>  ; CHECK-NEXT:    [[DOTSPLAT3:%.*]] = shufflevector <4 x i32> [[DOTSPLATINSERT2]], <4 x i32> poison, <4 x i32> zeroinitializer
>>>  ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>>  ; CHECK-NEXT:    [[VEC_IND:%.*]] = phi <4 x i32> [ [[INDUCTION]], [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[VECTOR_BODY]] ]
>>> -; CHECK-NEXT:    [[TMP20:%.*]] = add i64 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[TMP21:%.*]] = getelementptr inbounds [250 x i32], [250 x i32]* @a, i64 0, i64 [[TMP20]]
>>> -; CHECK-NEXT:    [[TMP22:%.*]] = getelementptr inbounds i32, i32* [[TMP21]], i32 0
>>> -; CHECK-NEXT:    [[TMP23:%.*]] = bitcast i32* [[TMP22]] to <4 x i32>*
>>> -; CHECK-NEXT:    store <4 x i32> [[VEC_IND]], <4 x i32>* [[TMP23]], align 4
>>> +; CHECK-NEXT:    [[TMP19:%.*]] = add i64 [[INDEX]], 0
>>> +; CHECK-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [250 x i32], [250 x i32]* @a, i64 0, i64 [[TMP19]]
>>> +; CHECK-NEXT:    [[TMP21:%.*]] = getelementptr inbounds i32, i32* [[TMP20]], i32 0
>>> +; CHECK-NEXT:    [[TMP22:%.*]] = bitcast i32* [[TMP21]] to <4 x i32>*
>>> +; CHECK-NEXT:    store <4 x i32> [[VEC_IND]], <4 x i32>* [[TMP22]], align 4
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
>>>  ; CHECK-NEXT:    [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], [[DOTSPLAT3]]
>>> -; CHECK-NEXT:    [[TMP24:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
>>> -; CHECK-NEXT:    br i1 [[TMP24]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
>>> +; CHECK-NEXT:    [[TMP23:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
>>> +; CHECK-NEXT:    br i1 [[TMP23]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]
>>>  ; CHECK-NEXT:    br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]
>>> @@ -177,20 +176,19 @@ define void @doit2(i32 %n, i32 %step) local_unnamed_addr  {
>>>  ; CHECK-NEXT:    [[MUL:%.*]] = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 [[TMP4]], i8 [[TMP5]])
>>>  ; CHECK-NEXT:    [[MUL_RESULT:%.*]] = extractvalue { i8, i1 } [[MUL]], 0
>>>  ; CHECK-NEXT:    [[MUL_OVERFLOW:%.*]] = extractvalue { i8, i1 } [[MUL]], 1
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = add i8 [[MUL_RESULT]], 0
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = sub i8 0, [[MUL_RESULT]]
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = icmp ugt i8 [[TMP7]], 0
>>> -; CHECK-NEXT:    [[TMP9:%.*]] = icmp ult i8 [[TMP6]], 0
>>> -; CHECK-NEXT:    [[TMP10:%.*]] = select i1 [[TMP3]], i1 [[TMP8]], i1 [[TMP9]]
>>> -; CHECK-NEXT:    [[TMP11:%.*]] = icmp ugt i64 [[TMP0]], 255
>>> -; CHECK-NEXT:    [[TMP12:%.*]] = icmp ne i8 [[TMP1]], 0
>>> -; CHECK-NEXT:    [[TMP13:%.*]] = and i1 [[TMP11]], [[TMP12]]
>>> -; CHECK-NEXT:    [[TMP14:%.*]] = or i1 [[TMP10]], [[TMP13]]
>>> -; CHECK-NEXT:    [[TMP15:%.*]] = or i1 [[TMP14]], [[MUL_OVERFLOW]]
>>> -; CHECK-NEXT:    [[TMP16:%.*]] = sext i8 [[TMP1]] to i32
>>> -; CHECK-NEXT:    [[IDENT_CHECK:%.*]] = icmp ne i32 [[STEP]], [[TMP16]]
>>> -; CHECK-NEXT:    [[TMP17:%.*]] = or i1 [[TMP15]], [[IDENT_CHECK]]
>>> -; CHECK-NEXT:    br i1 [[TMP17]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = sub i8 0, [[MUL_RESULT]]
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = icmp ugt i8 [[TMP6]], 0
>>> +; CHECK-NEXT:    [[TMP8:%.*]] = icmp ult i8 [[MUL_RESULT]], 0
>>> +; CHECK-NEXT:    [[TMP9:%.*]] = select i1 [[TMP3]], i1 [[TMP7]], i1 [[TMP8]]
>>> +; CHECK-NEXT:    [[TMP10:%.*]] = icmp ugt i64 [[TMP0]], 255
>>> +; CHECK-NEXT:    [[TMP11:%.*]] = icmp ne i8 [[TMP1]], 0
>>> +; CHECK-NEXT:    [[TMP12:%.*]] = and i1 [[TMP10]], [[TMP11]]
>>> +; CHECK-NEXT:    [[TMP13:%.*]] = or i1 [[TMP9]], [[TMP12]]
>>> +; CHECK-NEXT:    [[TMP14:%.*]] = or i1 [[TMP13]], [[MUL_OVERFLOW]]
>>> +; CHECK-NEXT:    [[TMP15:%.*]] = sext i8 [[TMP1]] to i32
>>> +; CHECK-NEXT:    [[IDENT_CHECK:%.*]] = icmp ne i32 [[STEP]], [[TMP15]]
>>> +; CHECK-NEXT:    [[TMP16:%.*]] = or i1 [[TMP14]], [[IDENT_CHECK]]
>>> +; CHECK-NEXT:    br i1 [[TMP16]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
>>>  ; CHECK:       vector.ph:
>>>  ; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i64 [[WIDE_TRIP_COUNT]], 4
>>>  ; CHECK-NEXT:    [[N_VEC:%.*]] = sub i64 [[WIDE_TRIP_COUNT]], [[N_MOD_VF]]
>>> @@ -198,24 +196,24 @@ define void @doit2(i32 %n, i32 %step) local_unnamed_addr  {
>>>  ; CHECK-NEXT:    [[IND_END:%.*]] = mul i32 [[CAST_CRD]], [[STEP]]
>>>  ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[STEP]], i32 0
>>>  ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <4 x i32> [[DOTSPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
>>> -; CHECK-NEXT:    [[TMP18:%.*]] = mul <4 x i32> <i32 0, i32 1, i32 2, i32 3>, [[DOTSPLAT]]
>>> -; CHECK-NEXT:    [[INDUCTION:%.*]] = add <4 x i32> [[TMP18]], zeroinitializer
>>> -; CHECK-NEXT:    [[TMP19:%.*]] = mul i32 [[STEP]], 4
>>> -; CHECK-NEXT:    [[DOTSPLATINSERT2:%.*]] = insertelement <4 x i32> poison, i32 [[TMP19]], i32 0
>>> +; CHECK-NEXT:    [[TMP17:%.*]] = mul <4 x i32> <i32 0, i32 1, i32 2, i32 3>, [[DOTSPLAT]]
>>> +; CHECK-NEXT:    [[INDUCTION:%.*]] = add <4 x i32> [[TMP17]], zeroinitializer
>>> +; CHECK-NEXT:    [[TMP18:%.*]] = mul i32 [[STEP]], 4
>>> +; CHECK-NEXT:    [[DOTSPLATINSERT2:%.*]] = insertelement <4 x i32> poison, i32 [[TMP18]], i32 0
>>>  ; CHECK-NEXT:    [[DOTSPLAT3:%.*]] = shufflevector <4 x i32> [[DOTSPLATINSERT2]], <4 x i32> poison, <4 x i32> zeroinitializer
>>>  ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>>  ; CHECK-NEXT:    [[VEC_IND:%.*]] = phi <4 x i32> [ [[INDUCTION]], [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[VECTOR_BODY]] ]
>>> -; CHECK-NEXT:    [[TMP20:%.*]] = add i64 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[TMP21:%.*]] = getelementptr inbounds [250 x i32], [250 x i32]* @a, i64 0, i64 [[TMP20]]
>>> -; CHECK-NEXT:    [[TMP22:%.*]] = getelementptr inbounds i32, i32* [[TMP21]], i32 0
>>> -; CHECK-NEXT:    [[TMP23:%.*]] = bitcast i32* [[TMP22]] to <4 x i32>*
>>> -; CHECK-NEXT:    store <4 x i32> [[VEC_IND]], <4 x i32>* [[TMP23]], align 4
>>> +; CHECK-NEXT:    [[TMP19:%.*]] = add i64 [[INDEX]], 0
>>> +; CHECK-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [250 x i32], [250 x i32]* @a, i64 0, i64 [[TMP19]]
>>> +; CHECK-NEXT:    [[TMP21:%.*]] = getelementptr inbounds i32, i32* [[TMP20]], i32 0
>>> +; CHECK-NEXT:    [[TMP22:%.*]] = bitcast i32* [[TMP21]] to <4 x i32>*
>>> +; CHECK-NEXT:    store <4 x i32> [[VEC_IND]], <4 x i32>* [[TMP22]], align 4
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
>>>  ; CHECK-NEXT:    [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], [[DOTSPLAT3]]
>>> -; CHECK-NEXT:    [[TMP24:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
>>> -; CHECK-NEXT:    br i1 [[TMP24]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
>>> +; CHECK-NEXT:    [[TMP23:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
>>> +; CHECK-NEXT:    br i1 [[TMP23]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]
>>>  ; CHECK-NEXT:    br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]
>>> @@ -377,17 +375,16 @@ define void @doit4(i32 %n, i8 signext %cstep) local_unnamed_addr {
>>>  ; CHECK-NEXT:    [[MUL:%.*]] = call { i8, i1 } @llvm.umul.with.overflow.i8(i8 [[TMP3]], i8 [[TMP4]])
>>>  ; CHECK-NEXT:    [[MUL_RESULT:%.*]] = extractvalue { i8, i1 } [[MUL]], 0
>>>  ; CHECK-NEXT:    [[MUL_OVERFLOW:%.*]] = extractvalue { i8, i1 } [[MUL]], 1
>>> -; CHECK-NEXT:    [[TMP5:%.*]] = add i8 [[MUL_RESULT]], 0
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = sub i8 0, [[MUL_RESULT]]
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = icmp sgt i8 [[TMP6]], 0
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = icmp slt i8 [[TMP5]], 0
>>> -; CHECK-NEXT:    [[TMP9:%.*]] = select i1 [[TMP2]], i1 [[TMP7]], i1 [[TMP8]]
>>> -; CHECK-NEXT:    [[TMP10:%.*]] = icmp ugt i64 [[TMP0]], 255
>>> -; CHECK-NEXT:    [[TMP11:%.*]] = icmp ne i8 [[CSTEP]], 0
>>> -; CHECK-NEXT:    [[TMP12:%.*]] = and i1 [[TMP10]], [[TMP11]]
>>> -; CHECK-NEXT:    [[TMP13:%.*]] = or i1 [[TMP9]], [[TMP12]]
>>> -; CHECK-NEXT:    [[TMP14:%.*]] = or i1 [[TMP13]], [[MUL_OVERFLOW]]
>>> -; CHECK-NEXT:    br i1 [[TMP14]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
>>> +; CHECK-NEXT:    [[TMP5:%.*]] = sub i8 0, [[MUL_RESULT]]
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = icmp sgt i8 [[TMP5]], 0
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = icmp slt i8 [[MUL_RESULT]], 0
>>> +; CHECK-NEXT:    [[TMP8:%.*]] = select i1 [[TMP2]], i1 [[TMP6]], i1 [[TMP7]]
>>> +; CHECK-NEXT:    [[TMP9:%.*]] = icmp ugt i64 [[TMP0]], 255
>>> +; CHECK-NEXT:    [[TMP10:%.*]] = icmp ne i8 [[CSTEP]], 0
>>> +; CHECK-NEXT:    [[TMP11:%.*]] = and i1 [[TMP9]], [[TMP10]]
>>> +; CHECK-NEXT:    [[TMP12:%.*]] = or i1 [[TMP8]], [[TMP11]]
>>> +; CHECK-NEXT:    [[TMP13:%.*]] = or i1 [[TMP12]], [[MUL_OVERFLOW]]
>>> +; CHECK-NEXT:    br i1 [[TMP13]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
>>>  ; CHECK:       vector.ph:
>>>  ; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i64 [[WIDE_TRIP_COUNT]], 4
>>>  ; CHECK-NEXT:    [[N_VEC:%.*]] = sub i64 [[WIDE_TRIP_COUNT]], [[N_MOD_VF]]
>>> @@ -395,24 +392,24 @@ define void @doit4(i32 %n, i8 signext %cstep) local_unnamed_addr {
>>>  ; CHECK-NEXT:    [[IND_END:%.*]] = mul i32 [[CAST_CRD]], [[CONV]]
>>>  ; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[CONV]], i32 0
>>>  ; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <4 x i32> [[DOTSPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
>>> -; CHECK-NEXT:    [[TMP15:%.*]] = mul <4 x i32> <i32 0, i32 1, i32 2, i32 3>, [[DOTSPLAT]]
>>> -; CHECK-NEXT:    [[INDUCTION:%.*]] = add <4 x i32> [[TMP15]], zeroinitializer
>>> -; CHECK-NEXT:    [[TMP16:%.*]] = mul i32 [[CONV]], 4
>>> -; CHECK-NEXT:    [[DOTSPLATINSERT2:%.*]] = insertelement <4 x i32> poison, i32 [[TMP16]], i32 0
>>> +; CHECK-NEXT:    [[TMP14:%.*]] = mul <4 x i32> <i32 0, i32 1, i32 2, i32 3>, [[DOTSPLAT]]
>>> +; CHECK-NEXT:    [[INDUCTION:%.*]] = add <4 x i32> [[TMP14]], zeroinitializer
>>> +; CHECK-NEXT:    [[TMP15:%.*]] = mul i32 [[CONV]], 4
>>> +; CHECK-NEXT:    [[DOTSPLATINSERT2:%.*]] = insertelement <4 x i32> poison, i32 [[TMP15]], i32 0
>>>  ; CHECK-NEXT:    [[DOTSPLAT3:%.*]] = shufflevector <4 x i32> [[DOTSPLATINSERT2]], <4 x i32> poison, <4 x i32> zeroinitializer
>>>  ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>>  ; CHECK-NEXT:    [[VEC_IND:%.*]] = phi <4 x i32> [ [[INDUCTION]], [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[VECTOR_BODY]] ]
>>> -; CHECK-NEXT:    [[TMP17:%.*]] = add i64 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [250 x i32], [250 x i32]* @a, i64 0, i64 [[TMP17]]
>>> -; CHECK-NEXT:    [[TMP19:%.*]] = getelementptr inbounds i32, i32* [[TMP18]], i32 0
>>> -; CHECK-NEXT:    [[TMP20:%.*]] = bitcast i32* [[TMP19]] to <4 x i32>*
>>> -; CHECK-NEXT:    store <4 x i32> [[VEC_IND]], <4 x i32>* [[TMP20]], align 4
>>> +; CHECK-NEXT:    [[TMP16:%.*]] = add i64 [[INDEX]], 0
>>> +; CHECK-NEXT:    [[TMP17:%.*]] = getelementptr inbounds [250 x i32], [250 x i32]* @a, i64 0, i64 [[TMP16]]
>>> +; CHECK-NEXT:    [[TMP18:%.*]] = getelementptr inbounds i32, i32* [[TMP17]], i32 0
>>> +; CHECK-NEXT:    [[TMP19:%.*]] = bitcast i32* [[TMP18]] to <4 x i32>*
>>> +; CHECK-NEXT:    store <4 x i32> [[VEC_IND]], <4 x i32>* [[TMP19]], align 4
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
>>>  ; CHECK-NEXT:    [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], [[DOTSPLAT3]]
>>> -; CHECK-NEXT:    [[TMP21:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
>>> -; CHECK-NEXT:    br i1 [[TMP21]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]
>>> +; CHECK-NEXT:    [[TMP20:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
>>> +; CHECK-NEXT:    br i1 [[TMP20]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]
>>>  ; CHECK-NEXT:    br i1 [[CMP_N]], label [[FOR_END_LOOPEXIT:%.*]], label [[SCALAR_PH]]
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/pr45679-fold-tail-by-masking.ll b/llvm/test/Transforms/LoopVectorize/pr45679-fold-tail-by-masking.ll
>>> index d77abbd39f0d2..3780b3888a80c 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/pr45679-fold-tail-by-masking.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/pr45679-fold-tail-by-masking.ll
>>> @@ -55,7 +55,7 @@ define void @pr45679(i32* %A) optsize {
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add i32 [[INDEX]], 4
>>>  ; CHECK-NEXT:    [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], <i32 4, i32 4, i32 4, i32 4>
>>>  ; CHECK-NEXT:    [[TMP13:%.*]] = icmp eq i32 [[INDEX_NEXT]], 16
>>> -; CHECK-NEXT:    br i1 [[TMP13]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !0
>>> +; CHECK-NEXT:    br i1 [[TMP13]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    br i1 true, label [[EXIT:%.*]], label [[SCALAR_PH]]
>>>  ; CHECK:       scalar.ph:
>>> @@ -67,7 +67,7 @@ define void @pr45679(i32* %A) optsize {
>>>  ; CHECK-NEXT:    store i32 13, i32* [[ARRAYIDX]], align 1
>>>  ; CHECK-NEXT:    [[RIVPLUS1]] = add nuw nsw i32 [[RIV]], 1
>>>  ; CHECK-NEXT:    [[COND:%.*]] = icmp eq i32 [[RIVPLUS1]], 14
>>> -; CHECK-NEXT:    br i1 [[COND]], label [[EXIT]], label [[LOOP]], !llvm.loop !2
>>> +; CHECK-NEXT:    br i1 [[COND]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP2:![0-9]+]]
>>>  ; CHECK:       exit:
>>>  ; CHECK-NEXT:    ret void
>>>  ;
>>> @@ -117,7 +117,7 @@ define void @pr45679(i32* %A) optsize {
>>>  ; VF2UF2-NEXT:    [[INDEX_NEXT]] = add i32 [[INDEX]], 4
>>>  ; VF2UF2-NEXT:    [[VEC_IND_NEXT]] = add <2 x i32> [[STEP_ADD]], <i32 2, i32 2>
>>>  ; VF2UF2-NEXT:    [[TMP14:%.*]] = icmp eq i32 [[INDEX_NEXT]], 16
>>> -; VF2UF2-NEXT:    br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !0
>>> +; VF2UF2-NEXT:    br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
>>>  ; VF2UF2:       middle.block:
>>>  ; VF2UF2-NEXT:    br i1 true, label [[EXIT:%.*]], label [[SCALAR_PH]]
>>>  ; VF2UF2:       scalar.ph:
>>> @@ -129,7 +129,7 @@ define void @pr45679(i32* %A) optsize {
>>>  ; VF2UF2-NEXT:    store i32 13, i32* [[ARRAYIDX]], align 1
>>>  ; VF2UF2-NEXT:    [[RIVPLUS1]] = add nuw nsw i32 [[RIV]], 1
>>>  ; VF2UF2-NEXT:    [[COND:%.*]] = icmp eq i32 [[RIVPLUS1]], 14
>>> -; VF2UF2-NEXT:    br i1 [[COND]], label [[EXIT]], label [[LOOP]], !llvm.loop !2
>>> +; VF2UF2-NEXT:    br i1 [[COND]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP2:![0-9]+]]
>>>  ; VF2UF2:       exit:
>>>  ; VF2UF2-NEXT:    ret void
>>>  ;
>>> @@ -139,42 +139,41 @@ define void @pr45679(i32* %A) optsize {
>>>  ; VF1UF4:       vector.ph:
>>>  ; VF1UF4-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; VF1UF4:       vector.body:
>>> -; VF1UF4-NEXT:    [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE9:%.*]] ]
>>> -; VF1UF4-NEXT:    [[INDUCTION:%.*]] = add i32 [[INDEX]], 0
>>> -; VF1UF4-NEXT:    [[INDUCTION1:%.*]] = add i32 [[INDEX]], 1
>>> -; VF1UF4-NEXT:    [[INDUCTION2:%.*]] = add i32 [[INDEX]], 2
>>> -; VF1UF4-NEXT:    [[INDUCTION3:%.*]] = add i32 [[INDEX]], 3
>>> -; VF1UF4-NEXT:    [[TMP0:%.*]] = icmp ule i32 [[INDUCTION]], 13
>>> -; VF1UF4-NEXT:    [[TMP1:%.*]] = icmp ule i32 [[INDUCTION1]], 13
>>> -; VF1UF4-NEXT:    [[TMP2:%.*]] = icmp ule i32 [[INDUCTION2]], 13
>>> -; VF1UF4-NEXT:    [[TMP3:%.*]] = icmp ule i32 [[INDUCTION3]], 13
>>> +; VF1UF4-NEXT:    [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE8:%.*]] ]
>>> +; VF1UF4-NEXT:    [[INDUCTION:%.*]] = add i32 [[INDEX]], 1
>>> +; VF1UF4-NEXT:    [[INDUCTION1:%.*]] = add i32 [[INDEX]], 2
>>> +; VF1UF4-NEXT:    [[INDUCTION2:%.*]] = add i32 [[INDEX]], 3
>>> +; VF1UF4-NEXT:    [[TMP0:%.*]] = icmp ule i32 [[INDEX]], 13
>>> +; VF1UF4-NEXT:    [[TMP1:%.*]] = icmp ule i32 [[INDUCTION]], 13
>>> +; VF1UF4-NEXT:    [[TMP2:%.*]] = icmp ule i32 [[INDUCTION1]], 13
>>> +; VF1UF4-NEXT:    [[TMP3:%.*]] = icmp ule i32 [[INDUCTION2]], 13
>>>  ; VF1UF4-NEXT:    br i1 [[TMP0]], label [[PRED_STORE_IF:%.*]], label [[PRED_STORE_CONTINUE:%.*]]
>>>  ; VF1UF4:       pred.store.if:
>>> -; VF1UF4-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i32 [[INDUCTION]]
>>> +; VF1UF4-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i32 [[INDEX]]
>>>  ; VF1UF4-NEXT:    store i32 13, i32* [[TMP4]], align 1
>>>  ; VF1UF4-NEXT:    br label [[PRED_STORE_CONTINUE]]
>>>  ; VF1UF4:       pred.store.continue:
>>> -; VF1UF4-NEXT:    br i1 [[TMP1]], label [[PRED_STORE_IF4:%.*]], label [[PRED_STORE_CONTINUE5:%.*]]
>>> -; VF1UF4:       pred.store.if4:
>>> -; VF1UF4-NEXT:    [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[A]], i32 [[INDUCTION1]]
>>> +; VF1UF4-NEXT:    br i1 [[TMP1]], label [[PRED_STORE_IF3:%.*]], label [[PRED_STORE_CONTINUE4:%.*]]
>>> +; VF1UF4:       pred.store.if3:
>>> +; VF1UF4-NEXT:    [[TMP5:%.*]] = getelementptr inbounds i32, i32* [[A]], i32 [[INDUCTION]]
>>>  ; VF1UF4-NEXT:    store i32 13, i32* [[TMP5]], align 1
>>> -; VF1UF4-NEXT:    br label [[PRED_STORE_CONTINUE5]]
>>> -; VF1UF4:       pred.store.continue5:
>>> -; VF1UF4-NEXT:    br i1 [[TMP2]], label [[PRED_STORE_IF6:%.*]], label [[PRED_STORE_CONTINUE7:%.*]]
>>> -; VF1UF4:       pred.store.if6:
>>> -; VF1UF4-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[A]], i32 [[INDUCTION2]]
>>> +; VF1UF4-NEXT:    br label [[PRED_STORE_CONTINUE4]]
>>> +; VF1UF4:       pred.store.continue4:
>>> +; VF1UF4-NEXT:    br i1 [[TMP2]], label [[PRED_STORE_IF5:%.*]], label [[PRED_STORE_CONTINUE6:%.*]]
>>> +; VF1UF4:       pred.store.if5:
>>> +; VF1UF4-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[A]], i32 [[INDUCTION1]]
>>>  ; VF1UF4-NEXT:    store i32 13, i32* [[TMP6]], align 1
>>> -; VF1UF4-NEXT:    br label [[PRED_STORE_CONTINUE7]]
>>> -; VF1UF4:       pred.store.continue7:
>>> -; VF1UF4-NEXT:    br i1 [[TMP3]], label [[PRED_STORE_IF8:%.*]], label [[PRED_STORE_CONTINUE9]]
>>> -; VF1UF4:       pred.store.if8:
>>> -; VF1UF4-NEXT:    [[TMP7:%.*]] = getelementptr inbounds i32, i32* [[A]], i32 [[INDUCTION3]]
>>> +; VF1UF4-NEXT:    br label [[PRED_STORE_CONTINUE6]]
>>> +; VF1UF4:       pred.store.continue6:
>>> +; VF1UF4-NEXT:    br i1 [[TMP3]], label [[PRED_STORE_IF7:%.*]], label [[PRED_STORE_CONTINUE8]]
>>> +; VF1UF4:       pred.store.if7:
>>> +; VF1UF4-NEXT:    [[TMP7:%.*]] = getelementptr inbounds i32, i32* [[A]], i32 [[INDUCTION2]]
>>>  ; VF1UF4-NEXT:    store i32 13, i32* [[TMP7]], align 1
>>> -; VF1UF4-NEXT:    br label [[PRED_STORE_CONTINUE9]]
>>> -; VF1UF4:       pred.store.continue9:
>>> +; VF1UF4-NEXT:    br label [[PRED_STORE_CONTINUE8]]
>>> +; VF1UF4:       pred.store.continue8:
>>>  ; VF1UF4-NEXT:    [[INDEX_NEXT]] = add i32 [[INDEX]], 4
>>>  ; VF1UF4-NEXT:    [[TMP8:%.*]] = icmp eq i32 [[INDEX_NEXT]], 16
>>> -; VF1UF4-NEXT:    br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]]
>>> +; VF1UF4-NEXT:    br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
>>>  ; VF1UF4:       middle.block:
>>>  ; VF1UF4-NEXT:    br i1 true, label [[EXIT:%.*]], label [[SCALAR_PH]]
>>>  ; VF1UF4:       scalar.ph:
>>> @@ -186,7 +185,7 @@ define void @pr45679(i32* %A) optsize {
>>>  ; VF1UF4-NEXT:    store i32 13, i32* [[ARRAYIDX]], align 1
>>>  ; VF1UF4-NEXT:    [[RIVPLUS1]] = add nuw nsw i32 [[RIV]], 1
>>>  ; VF1UF4-NEXT:    [[COND:%.*]] = icmp eq i32 [[RIVPLUS1]], 14
>>> -; VF1UF4-NEXT:    br i1 [[COND]], label [[EXIT]], label [[LOOP]]
>>> +; VF1UF4-NEXT:    br i1 [[COND]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP2:![0-9]+]]
>>>  ; VF1UF4:       exit:
>>>  ; VF1UF4-NEXT:    ret void
>>>  ;
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/runtime-check-small-clamped-bounds.ll b/llvm/test/Transforms/LoopVectorize/runtime-check-small-clamped-bounds.ll
>>> index 8f7a3a6bd41ad..adb22dc56b910 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/runtime-check-small-clamped-bounds.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/runtime-check-small-clamped-bounds.ll
>>> @@ -20,20 +20,19 @@ define void @load_clamped_index(i32* %A, i32* %B, i32 %N) {
>>>  ; CHECK:       vector.scevcheck:
>>>  ; CHECK-NEXT:    [[TMP0:%.*]] = add i32 [[N]], -1
>>>  ; CHECK-NEXT:    [[TMP1:%.*]] = trunc i32 [[TMP0]] to i2
>>> -; CHECK-NEXT:    [[TMP2:%.*]] = add i2 [[TMP1]], 0
>>> -; CHECK-NEXT:    [[TMP3:%.*]] = sub i2 0, [[TMP1]]
>>> -; CHECK-NEXT:    [[TMP4:%.*]] = icmp ugt i2 [[TMP3]], 0
>>> -; CHECK-NEXT:    [[TMP5:%.*]] = icmp ult i2 [[TMP2]], 0
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = icmp ugt i32 [[TMP0]], 3
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = or i1 [[TMP5]], [[TMP6]]
>>> -; CHECK-NEXT:    br i1 [[TMP7]], label [[SCALAR_PH]], label [[VECTOR_MEMCHECK:%.*]]
>>> +; CHECK-NEXT:    [[TMP2:%.*]] = sub i2 0, [[TMP1]]
>>> +; CHECK-NEXT:    [[TMP3:%.*]] = icmp ugt i2 [[TMP2]], 0
>>> +; CHECK-NEXT:    [[TMP4:%.*]] = icmp ult i2 [[TMP1]], 0
>>> +; CHECK-NEXT:    [[TMP5:%.*]] = icmp ugt i32 [[TMP0]], 3
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = or i1 [[TMP4]], [[TMP5]]
>>> +; CHECK-NEXT:    br i1 [[TMP6]], label [[SCALAR_PH]], label [[VECTOR_MEMCHECK:%.*]]
>>>  ; CHECK:       vector.memcheck:
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = add i32 [[N]], -1
>>> -; CHECK-NEXT:    [[TMP9:%.*]] = zext i32 [[TMP8]] to i64
>>> -; CHECK-NEXT:    [[TMP10:%.*]] = add nuw nsw i64 [[TMP9]], 1
>>> -; CHECK-NEXT:    [[SCEVGEP:%.*]] = getelementptr i32, i32* [[B]], i64 [[TMP10]]
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = add i32 [[N]], -1
>>> +; CHECK-NEXT:    [[TMP8:%.*]] = zext i32 [[TMP7]] to i64
>>> +; CHECK-NEXT:    [[TMP9:%.*]] = add nuw nsw i64 [[TMP8]], 1
>>> +; CHECK-NEXT:    [[SCEVGEP:%.*]] = getelementptr i32, i32* [[B]], i64 [[TMP9]]
>>>  ; CHECK-NEXT:    [[SCEVGEP2:%.*]] = bitcast i32* [[SCEVGEP]] to i8*
>>> -; CHECK-NEXT:    [[SCEVGEP4:%.*]] = getelementptr i32, i32* [[A]], i64 [[TMP10]]
>>> +; CHECK-NEXT:    [[SCEVGEP4:%.*]] = getelementptr i32, i32* [[A]], i64 [[TMP9]]
>>>  ; CHECK-NEXT:    [[SCEVGEP45:%.*]] = bitcast i32* [[SCEVGEP4]] to i8*
>>>  ; CHECK-NEXT:    [[BOUND0:%.*]] = icmp ult i8* [[B1]], [[SCEVGEP45]]
>>>  ; CHECK-NEXT:    [[BOUND1:%.*]] = icmp ult i8* [[A3]], [[SCEVGEP2]]
>>> @@ -45,20 +44,20 @@ define void @load_clamped_index(i32* %A, i32* %B, i32 %N) {
>>>  ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>> -; CHECK-NEXT:    [[TMP11:%.*]] = add i32 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[TMP12:%.*]] = urem i32 [[TMP11]], 4
>>> -; CHECK-NEXT:    [[TMP13:%.*]] = getelementptr inbounds i32, i32* [[A]], i32 [[TMP12]]
>>> -; CHECK-NEXT:    [[TMP14:%.*]] = getelementptr inbounds i32, i32* [[TMP13]], i32 0
>>> -; CHECK-NEXT:    [[TMP15:%.*]] = bitcast i32* [[TMP14]] to <2 x i32>*
>>> -; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <2 x i32>, <2 x i32>* [[TMP15]], align 4, !alias.scope !0
>>> -; CHECK-NEXT:    [[TMP16:%.*]] = add <2 x i32> [[WIDE_LOAD]], <i32 10, i32 10>
>>> -; CHECK-NEXT:    [[TMP17:%.*]] = getelementptr inbounds i32, i32* [[B]], i32 [[TMP11]]
>>> -; CHECK-NEXT:    [[TMP18:%.*]] = getelementptr inbounds i32, i32* [[TMP17]], i32 0
>>> -; CHECK-NEXT:    [[TMP19:%.*]] = bitcast i32* [[TMP18]] to <2 x i32>*
>>> -; CHECK-NEXT:    store <2 x i32> [[TMP16]], <2 x i32>* [[TMP19]], align 4, !alias.scope !3, !noalias !0
>>> +; CHECK-NEXT:    [[TMP10:%.*]] = add i32 [[INDEX]], 0
>>> +; CHECK-NEXT:    [[TMP11:%.*]] = urem i32 [[TMP10]], 4
>>> +; CHECK-NEXT:    [[TMP12:%.*]] = getelementptr inbounds i32, i32* [[A]], i32 [[TMP11]]
>>> +; CHECK-NEXT:    [[TMP13:%.*]] = getelementptr inbounds i32, i32* [[TMP12]], i32 0
>>> +; CHECK-NEXT:    [[TMP14:%.*]] = bitcast i32* [[TMP13]] to <2 x i32>*
>>> +; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <2 x i32>, <2 x i32>* [[TMP14]], align 4, !alias.scope !0
>>> +; CHECK-NEXT:    [[TMP15:%.*]] = add <2 x i32> [[WIDE_LOAD]], <i32 10, i32 10>
>>> +; CHECK-NEXT:    [[TMP16:%.*]] = getelementptr inbounds i32, i32* [[B]], i32 [[TMP10]]
>>> +; CHECK-NEXT:    [[TMP17:%.*]] = getelementptr inbounds i32, i32* [[TMP16]], i32 0
>>> +; CHECK-NEXT:    [[TMP18:%.*]] = bitcast i32* [[TMP17]] to <2 x i32>*
>>> +; CHECK-NEXT:    store <2 x i32> [[TMP15]], <2 x i32>* [[TMP18]], align 4, !alias.scope !3, !noalias !0
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 2
>>> -; CHECK-NEXT:    [[TMP20:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> -; CHECK-NEXT:    br i1 [[TMP20]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]
>>> +; CHECK-NEXT:    [[TMP19:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> +; CHECK-NEXT:    br i1 [[TMP19]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i32 [[N]], [[N_VEC]]
>>>  ; CHECK-NEXT:    br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
>>> @@ -109,20 +108,19 @@ define void @store_clamped_index(i32* %A, i32* %B, i32 %N) {
>>>  ; CHECK:       vector.scevcheck:
>>>  ; CHECK-NEXT:    [[TMP0:%.*]] = add i32 [[N]], -1
>>>  ; CHECK-NEXT:    [[TMP1:%.*]] = trunc i32 [[TMP0]] to i2
>>> -; CHECK-NEXT:    [[TMP2:%.*]] = add i2 [[TMP1]], 0
>>> -; CHECK-NEXT:    [[TMP3:%.*]] = sub i2 0, [[TMP1]]
>>> -; CHECK-NEXT:    [[TMP4:%.*]] = icmp ugt i2 [[TMP3]], 0
>>> -; CHECK-NEXT:    [[TMP5:%.*]] = icmp ult i2 [[TMP2]], 0
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = icmp ugt i32 [[TMP0]], 3
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = or i1 [[TMP5]], [[TMP6]]
>>> -; CHECK-NEXT:    br i1 [[TMP7]], label [[SCALAR_PH]], label [[VECTOR_MEMCHECK:%.*]]
>>> +; CHECK-NEXT:    [[TMP2:%.*]] = sub i2 0, [[TMP1]]
>>> +; CHECK-NEXT:    [[TMP3:%.*]] = icmp ugt i2 [[TMP2]], 0
>>> +; CHECK-NEXT:    [[TMP4:%.*]] = icmp ult i2 [[TMP1]], 0
>>> +; CHECK-NEXT:    [[TMP5:%.*]] = icmp ugt i32 [[TMP0]], 3
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = or i1 [[TMP4]], [[TMP5]]
>>> +; CHECK-NEXT:    br i1 [[TMP6]], label [[SCALAR_PH]], label [[VECTOR_MEMCHECK:%.*]]
>>>  ; CHECK:       vector.memcheck:
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = add i32 [[N]], -1
>>> -; CHECK-NEXT:    [[TMP9:%.*]] = zext i32 [[TMP8]] to i64
>>> -; CHECK-NEXT:    [[TMP10:%.*]] = add nuw nsw i64 [[TMP9]], 1
>>> -; CHECK-NEXT:    [[SCEVGEP:%.*]] = getelementptr i32, i32* [[B]], i64 [[TMP10]]
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = add i32 [[N]], -1
>>> +; CHECK-NEXT:    [[TMP8:%.*]] = zext i32 [[TMP7]] to i64
>>> +; CHECK-NEXT:    [[TMP9:%.*]] = add nuw nsw i64 [[TMP8]], 1
>>> +; CHECK-NEXT:    [[SCEVGEP:%.*]] = getelementptr i32, i32* [[B]], i64 [[TMP9]]
>>>  ; CHECK-NEXT:    [[SCEVGEP2:%.*]] = bitcast i32* [[SCEVGEP]] to i8*
>>> -; CHECK-NEXT:    [[SCEVGEP4:%.*]] = getelementptr i32, i32* [[A]], i64 [[TMP10]]
>>> +; CHECK-NEXT:    [[SCEVGEP4:%.*]] = getelementptr i32, i32* [[A]], i64 [[TMP9]]
>>>  ; CHECK-NEXT:    [[SCEVGEP45:%.*]] = bitcast i32* [[SCEVGEP4]] to i8*
>>>  ; CHECK-NEXT:    [[BOUND0:%.*]] = icmp ult i8* [[B1]], [[SCEVGEP45]]
>>>  ; CHECK-NEXT:    [[BOUND1:%.*]] = icmp ult i8* [[A3]], [[SCEVGEP2]]
>>> @@ -134,20 +132,20 @@ define void @store_clamped_index(i32* %A, i32* %B, i32 %N) {
>>>  ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>> -; CHECK-NEXT:    [[TMP11:%.*]] = add i32 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[TMP12:%.*]] = urem i32 [[TMP11]], 4
>>> -; CHECK-NEXT:    [[TMP13:%.*]] = getelementptr inbounds i32, i32* [[B]], i32 [[TMP11]]
>>> -; CHECK-NEXT:    [[TMP14:%.*]] = getelementptr inbounds i32, i32* [[TMP13]], i32 0
>>> -; CHECK-NEXT:    [[TMP15:%.*]] = bitcast i32* [[TMP14]] to <2 x i32>*
>>> -; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <2 x i32>, <2 x i32>* [[TMP15]], align 4, !alias.scope !8, !noalias !11
>>> -; CHECK-NEXT:    [[TMP16:%.*]] = add <2 x i32> [[WIDE_LOAD]], <i32 10, i32 10>
>>> -; CHECK-NEXT:    [[TMP17:%.*]] = getelementptr inbounds i32, i32* [[A]], i32 [[TMP12]]
>>> -; CHECK-NEXT:    [[TMP18:%.*]] = getelementptr inbounds i32, i32* [[TMP17]], i32 0
>>> -; CHECK-NEXT:    [[TMP19:%.*]] = bitcast i32* [[TMP18]] to <2 x i32>*
>>> -; CHECK-NEXT:    store <2 x i32> [[TMP16]], <2 x i32>* [[TMP19]], align 4, !alias.scope !11
>>> +; CHECK-NEXT:    [[TMP10:%.*]] = add i32 [[INDEX]], 0
>>> +; CHECK-NEXT:    [[TMP11:%.*]] = urem i32 [[TMP10]], 4
>>> +; CHECK-NEXT:    [[TMP12:%.*]] = getelementptr inbounds i32, i32* [[B]], i32 [[TMP10]]
>>> +; CHECK-NEXT:    [[TMP13:%.*]] = getelementptr inbounds i32, i32* [[TMP12]], i32 0
>>> +; CHECK-NEXT:    [[TMP14:%.*]] = bitcast i32* [[TMP13]] to <2 x i32>*
>>> +; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <2 x i32>, <2 x i32>* [[TMP14]], align 4, !alias.scope !8, !noalias !11
>>> +; CHECK-NEXT:    [[TMP15:%.*]] = add <2 x i32> [[WIDE_LOAD]], <i32 10, i32 10>
>>> +; CHECK-NEXT:    [[TMP16:%.*]] = getelementptr inbounds i32, i32* [[A]], i32 [[TMP11]]
>>> +; CHECK-NEXT:    [[TMP17:%.*]] = getelementptr inbounds i32, i32* [[TMP16]], i32 0
>>> +; CHECK-NEXT:    [[TMP18:%.*]] = bitcast i32* [[TMP17]] to <2 x i32>*
>>> +; CHECK-NEXT:    store <2 x i32> [[TMP15]], <2 x i32>* [[TMP18]], align 4, !alias.scope !11
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 2
>>> -; CHECK-NEXT:    [[TMP20:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> -; CHECK-NEXT:    br i1 [[TMP20]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP13:![0-9]+]]
>>> +; CHECK-NEXT:    [[TMP19:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> +; CHECK-NEXT:    br i1 [[TMP19]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP13:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i32 [[N]], [[N_VEC]]
>>>  ; CHECK-NEXT:    br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
>>> @@ -277,31 +275,30 @@ define void @clamped_index_equal_dependence(i32* %A, i32* %B, i32 %N) {
>>>  ; CHECK:       vector.scevcheck:
>>>  ; CHECK-NEXT:    [[TMP0:%.*]] = add i32 [[N]], -1
>>>  ; CHECK-NEXT:    [[TMP1:%.*]] = trunc i32 [[TMP0]] to i2
>>> -; CHECK-NEXT:    [[TMP2:%.*]] = add i2 [[TMP1]], 0
>>> -; CHECK-NEXT:    [[TMP3:%.*]] = sub i2 0, [[TMP1]]
>>> -; CHECK-NEXT:    [[TMP4:%.*]] = icmp ugt i2 [[TMP3]], 0
>>> -; CHECK-NEXT:    [[TMP5:%.*]] = icmp ult i2 [[TMP2]], 0
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = icmp ugt i32 [[TMP0]], 3
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = or i1 [[TMP5]], [[TMP6]]
>>> -; CHECK-NEXT:    br i1 [[TMP7]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
>>> +; CHECK-NEXT:    [[TMP2:%.*]] = sub i2 0, [[TMP1]]
>>> +; CHECK-NEXT:    [[TMP3:%.*]] = icmp ugt i2 [[TMP2]], 0
>>> +; CHECK-NEXT:    [[TMP4:%.*]] = icmp ult i2 [[TMP1]], 0
>>> +; CHECK-NEXT:    [[TMP5:%.*]] = icmp ugt i32 [[TMP0]], 3
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = or i1 [[TMP4]], [[TMP5]]
>>> +; CHECK-NEXT:    br i1 [[TMP6]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
>>>  ; CHECK:       vector.ph:
>>>  ; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i32 [[N]], 2
>>>  ; CHECK-NEXT:    [[N_VEC:%.*]] = sub i32 [[N]], [[N_MOD_VF]]
>>>  ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = add i32 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[TMP9:%.*]] = urem i32 [[TMP8]], 4
>>> -; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i32 [[TMP9]]
>>> -; CHECK-NEXT:    [[TMP11:%.*]] = getelementptr inbounds i32, i32* [[TMP10]], i32 0
>>> -; CHECK-NEXT:    [[TMP12:%.*]] = bitcast i32* [[TMP11]] to <2 x i32>*
>>> -; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <2 x i32>, <2 x i32>* [[TMP12]], align 4
>>> -; CHECK-NEXT:    [[TMP13:%.*]] = add <2 x i32> [[WIDE_LOAD]], <i32 10, i32 10>
>>> -; CHECK-NEXT:    [[TMP14:%.*]] = bitcast i32* [[TMP11]] to <2 x i32>*
>>> -; CHECK-NEXT:    store <2 x i32> [[TMP13]], <2 x i32>* [[TMP14]], align 4
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = add i32 [[INDEX]], 0
>>> +; CHECK-NEXT:    [[TMP8:%.*]] = urem i32 [[TMP7]], 4
>>> +; CHECK-NEXT:    [[TMP9:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i32 [[TMP8]]
>>> +; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr inbounds i32, i32* [[TMP9]], i32 0
>>> +; CHECK-NEXT:    [[TMP11:%.*]] = bitcast i32* [[TMP10]] to <2 x i32>*
>>> +; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <2 x i32>, <2 x i32>* [[TMP11]], align 4
>>> +; CHECK-NEXT:    [[TMP12:%.*]] = add <2 x i32> [[WIDE_LOAD]], <i32 10, i32 10>
>>> +; CHECK-NEXT:    [[TMP13:%.*]] = bitcast i32* [[TMP10]] to <2 x i32>*
>>> +; CHECK-NEXT:    store <2 x i32> [[TMP12]], <2 x i32>* [[TMP13]], align 4
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 2
>>> -; CHECK-NEXT:    [[TMP15:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> -; CHECK-NEXT:    br i1 [[TMP15]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP15:![0-9]+]]
>>> +; CHECK-NEXT:    [[TMP14:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> +; CHECK-NEXT:    br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP15:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i32 [[N]], [[N_VEC]]
>>>  ; CHECK-NEXT:    br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/select-cmp-predicated.ll b/llvm/test/Transforms/LoopVectorize/select-cmp-predicated.ll
>>> index 4e64e94459a1d..7afa1b32f4e43 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/select-cmp-predicated.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/select-cmp-predicated.ll
>>> @@ -56,8 +56,8 @@ define i32 @pred_select_const_i32_from_icmp(i32* noalias nocapture readonly %src
>>>  ;
>>>  ; CHECK-VF1IC2-LABEL: @pred_select_const_i32_from_icmp(
>>>  ; CHECK-VF1IC2:       vector.body:
>>> -; CHECK-VF1IC2:         [[VEC_PHI:%.*]] = phi i32 [ 0, %vector.ph ], [ [[PREDPHI:%.*]], %pred.load.continue4 ]
>>> -; CHECK-VF1IC2-NEXT:    [[VEC_PHI2:%.*]] = phi i32 [ 0, %vector.ph ], [ [[PREDPHI5:%.*]], %pred.load.continue4 ]
>>> +; CHECK-VF1IC2:         [[VEC_PHI:%.*]] = phi i32 [ 0, %vector.ph ], [ [[PREDPHI:%.*]], %pred.load.continue3 ]
>>> +; CHECK-VF1IC2-NEXT:    [[VEC_PHI2:%.*]] = phi i32 [ 0, %vector.ph ], [ [[PREDPHI5:%.*]], %pred.load.continue3 ]
>>>  ; CHECK-VF1IC2:         [[TMP0:%.*]] = getelementptr inbounds i32, i32* [[SRC1:%.*]], i64 {{%.*}}
>>>  ; CHECK-VF1IC2-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i32, i32* [[SRC1]], i64 {{%.*}}
>>>  ; CHECK-VF1IC2-NEXT:    [[TMP2:%.*]] = load i32, i32* [[TMP0]], align 4
>>> @@ -71,13 +71,13 @@ define i32 @pred_select_const_i32_from_icmp(i32* noalias nocapture readonly %src
>>>  ; CHECK-VF1IC2-NEXT:    br label %pred.load.continue
>>>  ; CHECK-VF1IC2:       pred.load.continue:
>>>  ; CHECK-VF1IC2-NEXT:    [[TMP8:%.*]] = phi i32 [ poison, %vector.body ], [ [[TMP7]], %pred.load.if ]
>>> -; CHECK-VF1IC2-NEXT:    br i1 [[TMP5]], label %pred.load.if3, label %pred.load.continue4
>>> -; CHECK-VF1IC2:       pred.load.if3:
>>> +; CHECK-VF1IC2-NEXT:    br i1 [[TMP5]], label %pred.load.if2, label %pred.load.continue3
>>> +; CHECK-VF1IC2:       pred.load.if2:
>>>  ; CHECK-VF1IC2-NEXT:    [[TMP9:%.*]] = getelementptr inbounds i32, i32* [[SRC2]], i64 {{%.*}}
>>>  ; CHECK-VF1IC2-NEXT:    [[TMP10:%.*]] = load i32, i32* [[TMP9]], align 4
>>> -; CHECK-VF1IC2-NEXT:    br label %pred.load.continue4
>>> -; CHECK-VF1IC2:       pred.load.continue4:
>>> -; CHECK-VF1IC2-NEXT:    [[TMP11:%.*]] = phi i32 [ poison, %pred.load.continue ], [ [[TMP10]], %pred.load.if3 ]
>>> +; CHECK-VF1IC2-NEXT:    br label %pred.load.continue3
>>> +; CHECK-VF1IC2:       pred.load.continue3:
>>> +; CHECK-VF1IC2-NEXT:    [[TMP11:%.*]] = phi i32 [ poison, %pred.load.continue ], [ [[TMP10]], %pred.load.if2 ]
>>>  ; CHECK-VF1IC2-NEXT:    [[TMP12:%.*]] = icmp eq i32 [[TMP8]], 2
>>>  ; CHECK-VF1IC2-NEXT:    [[TMP13:%.*]] = icmp eq i32 [[TMP11]], 2
>>>  ; CHECK-VF1IC2-NEXT:    [[TMP14:%.*]] = select i1 [[TMP12]], i32 1, i32 [[VEC_PHI]]
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/tail-folding-vectorization-factor-1.ll b/llvm/test/Transforms/LoopVectorize/tail-folding-vectorization-factor-1.ll
>>> index bb901ed70a827..59c1221cec241 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/tail-folding-vectorization-factor-1.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/tail-folding-vectorization-factor-1.ll
>>> @@ -17,13 +17,12 @@ define void @VF1-VPlanExe() {
>>>  ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>> -; CHECK-NEXT:    [[INDUCTION:%.*]] = add i64 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1
>>> -; CHECK-NEXT:    [[INDUCTION2:%.*]] = add i64 [[INDEX]], 2
>>> -; CHECK-NEXT:    [[INDUCTION3:%.*]] = add i64 [[INDEX]], 3
>>> +; CHECK-NEXT:    [[INDUCTION:%.*]] = add i64 [[INDEX]], 1
>>> +; CHECK-NEXT:    [[INDUCTION1:%.*]] = add i64 [[INDEX]], 2
>>> +; CHECK-NEXT:    [[INDUCTION2:%.*]] = add i64 [[INDEX]], 3
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add i64 [[INDEX]], 4
>>>  ; CHECK-NEXT:    [[TMP0:%.*]] = icmp eq i64 [[INDEX_NEXT]], 16
>>> -; CHECK-NEXT:    br i1 [[TMP0]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !0
>>> +; CHECK-NEXT:    br i1 [[TMP0]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    br i1 true, label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
>>>  ; CHECK:       scalar.ph:
>>> @@ -35,7 +34,7 @@ define void @VF1-VPlanExe() {
>>>  ; CHECK-NEXT:    [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]
>>>  ; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
>>>  ; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], 15
>>> -; CHECK-NEXT:    br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop !2
>>> +; CHECK-NEXT:    br i1 [[EXITCOND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP2:![0-9]+]]
>>>  ;
>>>  entry:
>>>    br label %for.body
>>> @@ -60,17 +59,16 @@ define void @VF1-VPWidenCanonicalIVRecipeExe(double* %ptr1) {
>>>  ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>> -; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr double, double* [[PTR1]], i64 [[TMP0]]
>>> -; CHECK-NEXT:    [[TMP1:%.*]] = add i64 [[INDEX]], 1
>>> -; CHECK-NEXT:    [[NEXT_GEP1:%.*]] = getelementptr double, double* [[PTR1]], i64 [[TMP1]]
>>> -; CHECK-NEXT:    [[TMP2:%.*]] = add i64 [[INDEX]], 2
>>> -; CHECK-NEXT:    [[NEXT_GEP2:%.*]] = getelementptr double, double* [[PTR1]], i64 [[TMP2]]
>>> -; CHECK-NEXT:    [[TMP3:%.*]] = add i64 [[INDEX]], 3
>>> -; CHECK-NEXT:    [[NEXT_GEP3:%.*]] = getelementptr double, double* [[PTR1]], i64 [[TMP3]]
>>> +; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr double, double* [[PTR1]], i64 [[INDEX]]
>>> +; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[INDEX]], 1
>>> +; CHECK-NEXT:    [[NEXT_GEP1:%.*]] = getelementptr double, double* [[PTR1]], i64 [[TMP0]]
>>> +; CHECK-NEXT:    [[TMP1:%.*]] = add i64 [[INDEX]], 2
>>> +; CHECK-NEXT:    [[NEXT_GEP2:%.*]] = getelementptr double, double* [[PTR1]], i64 [[TMP1]]
>>> +; CHECK-NEXT:    [[TMP2:%.*]] = add i64 [[INDEX]], 3
>>> +; CHECK-NEXT:    [[NEXT_GEP3:%.*]] = getelementptr double, double* [[PTR1]], i64 [[TMP2]]
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add i64 [[INDEX]], 4
>>> -; CHECK-NEXT:    [[TMP4:%.*]] = icmp eq i64 [[INDEX_NEXT]], 16
>>> -; CHECK-NEXT:    br i1 [[TMP4]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !3
>>> +; CHECK-NEXT:    [[TMP3:%.*]] = icmp eq i64 [[INDEX_NEXT]], 16
>>> +; CHECK-NEXT:    br i1 [[TMP3]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    br i1 true, label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
>>>  ; CHECK:       scalar.ph:
>>> @@ -82,7 +80,7 @@ define void @VF1-VPWidenCanonicalIVRecipeExe(double* %ptr1) {
>>>  ; CHECK-NEXT:    [[ADDR:%.*]] = phi double* [ [[PTR:%.*]], [[FOR_BODY]] ], [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ]
>>>  ; CHECK-NEXT:    [[PTR]] = getelementptr inbounds double, double* [[ADDR]], i64 1
>>>  ; CHECK-NEXT:    [[COND:%.*]] = icmp eq double* [[PTR]], [[PTR2]]
>>> -; CHECK-NEXT:    br i1 [[COND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop !4
>>> +; CHECK-NEXT:    br i1 [[COND]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
>>>  ;
>>>  entry:
>>>    %ptr2 = getelementptr inbounds double, double* %ptr1, i64 15
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/unroll_nonlatch.ll b/llvm/test/Transforms/LoopVectorize/unroll_nonlatch.ll
>>> index e2d25cc42f56d..c4d95880736db 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/unroll_nonlatch.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/unroll_nonlatch.ll
>>> @@ -16,10 +16,9 @@ define void @test(double* %data) {
>>>  ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
>>>  ; CHECK:       vector.body:
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>> -; CHECK-NEXT:    [[INDUCTION:%.*]] = add i64 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1
>>> -; CHECK-NEXT:    [[TMP0:%.*]] = shl nuw nsw i64 [[INDUCTION]], 1
>>> -; CHECK-NEXT:    [[TMP1:%.*]] = shl nuw nsw i64 [[INDUCTION1]], 1
>>> +; CHECK-NEXT:    [[INDUCTION:%.*]] = add i64 [[INDEX]], 1
>>> +; CHECK-NEXT:    [[TMP0:%.*]] = shl nuw nsw i64 [[INDEX]], 1
>>> +; CHECK-NEXT:    [[TMP1:%.*]] = shl nuw nsw i64 [[INDUCTION]], 1
>>>  ; CHECK-NEXT:    [[TMP2:%.*]] = or i64 [[TMP0]], 1
>>>  ; CHECK-NEXT:    [[TMP3:%.*]] = or i64 [[TMP1]], 1
>>>  ; CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds double, double* [[DATA:%.*]], i64 [[TMP2]]
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVectorize/use-scalar-epilogue-if-tp-fails.ll b/llvm/test/Transforms/LoopVectorize/use-scalar-epilogue-if-tp-fails.ll
>>> index 0e02297fdde1a..74890576146ac 100644
>>> --- a/llvm/test/Transforms/LoopVectorize/use-scalar-epilogue-if-tp-fails.ll
>>> +++ b/llvm/test/Transforms/LoopVectorize/use-scalar-epilogue-if-tp-fails.ll
>>> @@ -28,18 +28,17 @@ define void @basic_loop(i8* nocapture readonly %ptr, i32 %size, i8** %pos) {
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>>  ; CHECK-NEXT:    [[OFFSET_IDX:%.*]] = sub i32 [[SIZE]], [[INDEX]]
>>>  ; CHECK-NEXT:    [[TMP0:%.*]] = add i32 [[OFFSET_IDX]], 0
>>> -; CHECK-NEXT:    [[TMP1:%.*]] = add i32 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8, i8* [[PTR]], i32 [[TMP1]]
>>> -; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds i8, i8* [[NEXT_GEP]], i32 1
>>> -; CHECK-NEXT:    [[TMP3:%.*]] = getelementptr inbounds i8, i8* [[TMP2]], i32 0
>>> -; CHECK-NEXT:    [[TMP4:%.*]] = bitcast i8* [[TMP3]] to <4 x i8>*
>>> -; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <4 x i8>, <4 x i8>* [[TMP4]], align 1
>>> -; CHECK-NEXT:    [[TMP5:%.*]] = getelementptr i8, i8* [[NEXT_GEP]], i32 0
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = bitcast i8* [[TMP5]] to <4 x i8>*
>>> -; CHECK-NEXT:    store <4 x i8> [[WIDE_LOAD]], <4 x i8>* [[TMP6]], align 1
>>> +; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8, i8* [[PTR]], i32 [[INDEX]]
>>> +; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i8, i8* [[NEXT_GEP]], i32 1
>>> +; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds i8, i8* [[TMP1]], i32 0
>>> +; CHECK-NEXT:    [[TMP3:%.*]] = bitcast i8* [[TMP2]] to <4 x i8>*
>>> +; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <4 x i8>, <4 x i8>* [[TMP3]], align 1
>>> +; CHECK-NEXT:    [[TMP4:%.*]] = getelementptr i8, i8* [[NEXT_GEP]], i32 0
>>> +; CHECK-NEXT:    [[TMP5:%.*]] = bitcast i8* [[TMP4]] to <4 x i8>*
>>> +; CHECK-NEXT:    store <4 x i8> [[WIDE_LOAD]], <4 x i8>* [[TMP5]], align 1
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> -; CHECK-NEXT:    br i1 [[TMP7]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !0
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> +; CHECK-NEXT:    br i1 [[TMP6]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i32 [[SIZE]], [[N_VEC]]
>>>  ; CHECK-NEXT:    br i1 [[CMP_N]], label [[END:%.*]], label [[SCALAR_PH]]
>>> @@ -52,10 +51,10 @@ define void @basic_loop(i8* nocapture readonly %ptr, i32 %size, i8** %pos) {
>>>  ; CHECK-NEXT:    [[BUFF:%.*]] = phi i8* [ [[INCDEC_PTR:%.*]], [[BODY]] ], [ [[BC_RESUME_VAL1]], [[SCALAR_PH]] ]
>>>  ; CHECK-NEXT:    [[INCDEC_PTR]] = getelementptr inbounds i8, i8* [[BUFF]], i32 1
>>>  ; CHECK-NEXT:    [[DEC]] = add nsw i32 [[DEC66]], -1
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = load i8, i8* [[INCDEC_PTR]], align 1
>>> -; CHECK-NEXT:    store i8 [[TMP8]], i8* [[BUFF]], align 1
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = load i8, i8* [[INCDEC_PTR]], align 1
>>> +; CHECK-NEXT:    store i8 [[TMP7]], i8* [[BUFF]], align 1
>>>  ; CHECK-NEXT:    [[TOBOOL11:%.*]] = icmp eq i32 [[DEC]], 0
>>> -; CHECK-NEXT:    br i1 [[TOBOOL11]], label [[END]], label [[BODY]], !llvm.loop !2
>>> +; CHECK-NEXT:    br i1 [[TOBOOL11]], label [[END]], label [[BODY]], !llvm.loop [[LOOP2:![0-9]+]]
>>>  ; CHECK:       end:
>>>  ; CHECK-NEXT:    [[INCDEC_PTR_LCSSA:%.*]] = phi i8* [ [[INCDEC_PTR]], [[BODY]] ], [ [[IND_END2]], [[MIDDLE_BLOCK]] ]
>>>  ; CHECK-NEXT:    store i8* [[INCDEC_PTR_LCSSA]], i8** [[POS]], align 4
>>> @@ -96,18 +95,17 @@ define void @metadata(i8* nocapture readonly %ptr, i32 %size, i8** %pos) {
>>>  ; CHECK-NEXT:    [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
>>>  ; CHECK-NEXT:    [[OFFSET_IDX:%.*]] = sub i32 [[SIZE]], [[INDEX]]
>>>  ; CHECK-NEXT:    [[TMP0:%.*]] = add i32 [[OFFSET_IDX]], 0
>>> -; CHECK-NEXT:    [[TMP1:%.*]] = add i32 [[INDEX]], 0
>>> -; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8, i8* [[PTR]], i32 [[TMP1]]
>>> -; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds i8, i8* [[NEXT_GEP]], i32 1
>>> -; CHECK-NEXT:    [[TMP3:%.*]] = getelementptr inbounds i8, i8* [[TMP2]], i32 0
>>> -; CHECK-NEXT:    [[TMP4:%.*]] = bitcast i8* [[TMP3]] to <4 x i8>*
>>> -; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <4 x i8>, <4 x i8>* [[TMP4]], align 1
>>> -; CHECK-NEXT:    [[TMP5:%.*]] = getelementptr i8, i8* [[NEXT_GEP]], i32 0
>>> -; CHECK-NEXT:    [[TMP6:%.*]] = bitcast i8* [[TMP5]] to <4 x i8>*
>>> -; CHECK-NEXT:    store <4 x i8> [[WIDE_LOAD]], <4 x i8>* [[TMP6]], align 1
>>> +; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8, i8* [[PTR]], i32 [[INDEX]]
>>> +; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i8, i8* [[NEXT_GEP]], i32 1
>>> +; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds i8, i8* [[TMP1]], i32 0
>>> +; CHECK-NEXT:    [[TMP3:%.*]] = bitcast i8* [[TMP2]] to <4 x i8>*
>>> +; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <4 x i8>, <4 x i8>* [[TMP3]], align 1
>>> +; CHECK-NEXT:    [[TMP4:%.*]] = getelementptr i8, i8* [[NEXT_GEP]], i32 0
>>> +; CHECK-NEXT:    [[TMP5:%.*]] = bitcast i8* [[TMP4]] to <4 x i8>*
>>> +; CHECK-NEXT:    store <4 x i8> [[WIDE_LOAD]], <4 x i8>* [[TMP5]], align 1
>>>  ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4
>>> -; CHECK-NEXT:    [[TMP7:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> -; CHECK-NEXT:    br i1 [[TMP7]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop !4
>>> +; CHECK-NEXT:    [[TMP6:%.*]] = icmp eq i32 [[INDEX_NEXT]], [[N_VEC]]
>>> +; CHECK-NEXT:    br i1 [[TMP6]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
>>>  ; CHECK:       middle.block:
>>>  ; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i32 [[SIZE]], [[N_VEC]]
>>>  ; CHECK-NEXT:    br i1 [[CMP_N]], label [[END:%.*]], label [[SCALAR_PH]]
>>> @@ -120,10 +118,10 @@ define void @metadata(i8* nocapture readonly %ptr, i32 %size, i8** %pos) {
>>>  ; CHECK-NEXT:    [[BUFF:%.*]] = phi i8* [ [[INCDEC_PTR:%.*]], [[BODY]] ], [ [[BC_RESUME_VAL1]], [[SCALAR_PH]] ]
>>>  ; CHECK-NEXT:    [[INCDEC_PTR]] = getelementptr inbounds i8, i8* [[BUFF]], i32 1
>>>  ; CHECK-NEXT:    [[DEC]] = add nsw i32 [[DEC66]], -1
>>> -; CHECK-NEXT:    [[TMP8:%.*]] = load i8, i8* [[INCDEC_PTR]], align 1
>>> -; CHECK-NEXT:    store i8 [[TMP8]], i8* [[BUFF]], align 1
>>> +; CHECK-NEXT:    [[TMP7:%.*]] = load i8, i8* [[INCDEC_PTR]], align 1
>>> +; CHECK-NEXT:    store i8 [[TMP7]], i8* [[BUFF]], align 1
>>>  ; CHECK-NEXT:    [[TOBOOL11:%.*]] = icmp eq i32 [[DEC]], 0
>>> -; CHECK-NEXT:    br i1 [[TOBOOL11]], label [[END]], label [[BODY]], !llvm.loop !5
>>> +; CHECK-NEXT:    br i1 [[TOBOOL11]], label [[END]], label [[BODY]], !llvm.loop [[LOOP5:![0-9]+]]
>>>  ; CHECK:       end:
>>>  ; CHECK-NEXT:    [[INCDEC_PTR_LCSSA:%.*]] = phi i8* [ [[INCDEC_PTR]], [[BODY]] ], [ [[IND_END2]], [[MIDDLE_BLOCK]] ]
>>>  ; CHECK-NEXT:    store i8* [[INCDEC_PTR_LCSSA]], i8** [[POS]], align 4
>>>
>>> diff  --git a/llvm/test/Transforms/LoopVersioning/wrapping-pointer-versioning.ll b/llvm/test/Transforms/LoopVersioning/wrapping-pointer-versioning.ll
>>> index 17862145f466a..f6135d627c09f 100644
>>> --- a/llvm/test/Transforms/LoopVersioning/wrapping-pointer-versioning.ll
>>> +++ b/llvm/test/Transforms/LoopVersioning/wrapping-pointer-versioning.ll
>>> @@ -34,24 +34,23 @@ define void @f1(i16* noalias %a,
>>>  ; LV-NEXT:    [[MUL1:%.*]] = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 2, i32 [[TMP1]])
>>>  ; LV-NEXT:    [[MUL_RESULT:%.*]] = extractvalue { i32, i1 } [[MUL1]], 0
>>>  ; LV-NEXT:    [[MUL_OVERFLOW:%.*]] = extractvalue { i32, i1 } [[MUL1]], 1
>>> -; LV-NEXT:    [[TMP2:%.*]] = add i32 [[MUL_RESULT]], 0
>>> -; LV-NEXT:    [[TMP3:%.*]] = sub i32 0, [[MUL_RESULT]]
>>> -; LV-NEXT:    [[TMP4:%.*]] = icmp ugt i32 [[TMP3]], 0
>>> -; LV-NEXT:    [[TMP5:%.*]] = icmp ult i32 [[TMP2]], 0
>>> -; LV-NEXT:    [[TMP6:%.*]] = icmp ugt i64 [[TMP0]], 4294967295
>>> -; LV-NEXT:    [[TMP7:%.*]] = or i1 [[TMP5]], [[TMP6]]
>>> -; LV-NEXT:    [[TMP8:%.*]] = or i1 [[TMP7]], [[MUL_OVERFLOW]]
>>> +; LV-NEXT:    [[TMP2:%.*]] = sub i32 0, [[MUL_RESULT]]
>>> +; LV-NEXT:    [[TMP3:%.*]] = icmp ugt i32 [[TMP2]], 0
>>> +; LV-NEXT:    [[TMP4:%.*]] = icmp ult i32 [[MUL_RESULT]], 0
>>> +; LV-NEXT:    [[TMP5:%.*]] = icmp ugt i64 [[TMP0]], 4294967295
>>> +; LV-NEXT:    [[TMP6:%.*]] = or i1 [[TMP4]], [[TMP5]]
>>> +; LV-NEXT:    [[TMP7:%.*]] = or i1 [[TMP6]], [[MUL_OVERFLOW]]
>>>  ; LV-NEXT:    [[MUL2:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 4, i64 [[TMP0]])
>>>  ; LV-NEXT:    [[MUL_RESULT3:%.*]] = extractvalue { i64, i1 } [[MUL2]], 0
>>>  ; LV-NEXT:    [[MUL_OVERFLOW4:%.*]] = extractvalue { i64, i1 } [[MUL2]], 1
>>> -; LV-NEXT:    [[TMP9:%.*]] = sub i64 0, [[MUL_RESULT3]]
>>> -; LV-NEXT:    [[TMP10:%.*]] = getelementptr i8, i8* [[A5]], i64 [[MUL_RESULT3]]
>>> -; LV-NEXT:    [[TMP11:%.*]] = getelementptr i8, i8* [[A5]], i64 [[TMP9]]
>>> -; LV-NEXT:    [[TMP12:%.*]] = icmp ugt i8* [[TMP11]], [[A5]]
>>> -; LV-NEXT:    [[TMP13:%.*]] = icmp ult i8* [[TMP10]], [[A5]]
>>> -; LV-NEXT:    [[TMP14:%.*]] = or i1 [[TMP13]], [[MUL_OVERFLOW4]]
>>> -; LV-NEXT:    [[TMP15:%.*]] = or i1 [[TMP8]], [[TMP14]]
>>> -; LV-NEXT:    br i1 [[TMP15]], label [[FOR_BODY_PH_LVER_ORIG:%.*]], label [[FOR_BODY_PH:%.*]]
>>> +; LV-NEXT:    [[TMP8:%.*]] = sub i64 0, [[MUL_RESULT3]]
>>> +; LV-NEXT:    [[TMP9:%.*]] = getelementptr i8, i8* [[A5]], i64 [[MUL_RESULT3]]
>>> +; LV-NEXT:    [[TMP10:%.*]] = getelementptr i8, i8* [[A5]], i64 [[TMP8]]
>>> +; LV-NEXT:    [[TMP11:%.*]] = icmp ugt i8* [[TMP10]], [[A5]]
>>> +; LV-NEXT:    [[TMP12:%.*]] = icmp ult i8* [[TMP9]], [[A5]]
>>> +; LV-NEXT:    [[TMP13:%.*]] = or i1 [[TMP12]], [[MUL_OVERFLOW4]]
>>> +; LV-NEXT:    [[TMP14:%.*]] = or i1 [[TMP7]], [[TMP13]]
>>> +; LV-NEXT:    br i1 [[TMP14]], label [[FOR_BODY_PH_LVER_ORIG:%.*]], label [[FOR_BODY_PH:%.*]]
>>>  ; LV:       for.body.ph.lver.orig:
>>>  ; LV-NEXT:    br label [[FOR_BODY_LVER_ORIG:%.*]]
>>>  ; LV:       for.body.lver.orig:
>>> @@ -271,24 +270,23 @@ define void @f3(i16* noalias %a,
>>>  ; LV-NEXT:    [[MUL1:%.*]] = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 2, i32 [[TMP1]])
>>>  ; LV-NEXT:    [[MUL_RESULT:%.*]] = extractvalue { i32, i1 } [[MUL1]], 0
>>>  ; LV-NEXT:    [[MUL_OVERFLOW:%.*]] = extractvalue { i32, i1 } [[MUL1]], 1
>>> -; LV-NEXT:    [[TMP2:%.*]] = add i32 [[MUL_RESULT]], 0
>>> -; LV-NEXT:    [[TMP3:%.*]] = sub i32 0, [[MUL_RESULT]]
>>> -; LV-NEXT:    [[TMP4:%.*]] = icmp sgt i32 [[TMP3]], 0
>>> -; LV-NEXT:    [[TMP5:%.*]] = icmp slt i32 [[TMP2]], 0
>>> -; LV-NEXT:    [[TMP6:%.*]] = icmp ugt i64 [[TMP0]], 4294967295
>>> -; LV-NEXT:    [[TMP7:%.*]] = or i1 [[TMP5]], [[TMP6]]
>>> -; LV-NEXT:    [[TMP8:%.*]] = or i1 [[TMP7]], [[MUL_OVERFLOW]]
>>> +; LV-NEXT:    [[TMP2:%.*]] = sub i32 0, [[MUL_RESULT]]
>>> +; LV-NEXT:    [[TMP3:%.*]] = icmp sgt i32 [[TMP2]], 0
>>> +; LV-NEXT:    [[TMP4:%.*]] = icmp slt i32 [[MUL_RESULT]], 0
>>> +; LV-NEXT:    [[TMP5:%.*]] = icmp ugt i64 [[TMP0]], 4294967295
>>> +; LV-NEXT:    [[TMP6:%.*]] = or i1 [[TMP4]], [[TMP5]]
>>> +; LV-NEXT:    [[TMP7:%.*]] = or i1 [[TMP6]], [[MUL_OVERFLOW]]
>>>  ; LV-NEXT:    [[MUL2:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 4, i64 [[TMP0]])
>>>  ; LV-NEXT:    [[MUL_RESULT3:%.*]] = extractvalue { i64, i1 } [[MUL2]], 0
>>>  ; LV-NEXT:    [[MUL_OVERFLOW4:%.*]] = extractvalue { i64, i1 } [[MUL2]], 1
>>> -; LV-NEXT:    [[TMP9:%.*]] = sub i64 0, [[MUL_RESULT3]]
>>> -; LV-NEXT:    [[TMP10:%.*]] = getelementptr i8, i8* [[A5]], i64 [[MUL_RESULT3]]
>>> -; LV-NEXT:    [[TMP11:%.*]] = getelementptr i8, i8* [[A5]], i64 [[TMP9]]
>>> -; LV-NEXT:    [[TMP12:%.*]] = icmp ugt i8* [[TMP11]], [[A5]]
>>> -; LV-NEXT:    [[TMP13:%.*]] = icmp ult i8* [[TMP10]], [[A5]]
>>> -; LV-NEXT:    [[TMP14:%.*]] = or i1 [[TMP13]], [[MUL_OVERFLOW4]]
>>> -; LV-NEXT:    [[TMP15:%.*]] = or i1 [[TMP8]], [[TMP14]]
>>> -; LV-NEXT:    br i1 [[TMP15]], label [[FOR_BODY_PH_LVER_ORIG:%.*]], label [[FOR_BODY_PH:%.*]]
>>> +; LV-NEXT:    [[TMP8:%.*]] = sub i64 0, [[MUL_RESULT3]]
>>> +; LV-NEXT:    [[TMP9:%.*]] = getelementptr i8, i8* [[A5]], i64 [[MUL_RESULT3]]
>>> +; LV-NEXT:    [[TMP10:%.*]] = getelementptr i8, i8* [[A5]], i64 [[TMP8]]
>>> +; LV-NEXT:    [[TMP11:%.*]] = icmp ugt i8* [[TMP10]], [[A5]]
>>> +; LV-NEXT:    [[TMP12:%.*]] = icmp ult i8* [[TMP9]], [[A5]]
>>> +; LV-NEXT:    [[TMP13:%.*]] = or i1 [[TMP12]], [[MUL_OVERFLOW4]]
>>> +; LV-NEXT:    [[TMP14:%.*]] = or i1 [[TMP7]], [[TMP13]]
>>> +; LV-NEXT:    br i1 [[TMP14]], label [[FOR_BODY_PH_LVER_ORIG:%.*]], label [[FOR_BODY_PH:%.*]]
>>>  ; LV:       for.body.ph.lver.orig:
>>>  ; LV-NEXT:    br label [[FOR_BODY_LVER_ORIG:%.*]]
>>>  ; LV:       for.body.lver.orig:
>>>
>>> diff  --git a/llvm/unittests/IR/PatternMatch.cpp b/llvm/unittests/IR/PatternMatch.cpp
>>> index 598dcdff943f8..7023e5a41fe7b 100644
>>> --- a/llvm/unittests/IR/PatternMatch.cpp
>>> +++ b/llvm/unittests/IR/PatternMatch.cpp
>>> @@ -479,19 +479,19 @@ TEST_F(PatternMatchTest, SpecificIntSLE) {
>>>  }
>>>
>>>  TEST_F(PatternMatchTest, Unless) {
>>> -  Value *X = IRB.CreateAdd(IRB.getInt32(1), IRB.getInt32(0));
>>> +  Value *X = IRB.CreateAdd(IRB.getInt32(1), IRB.getInt32(-1));
>>>
>>> -  EXPECT_TRUE(m_Add(m_One(), m_Zero()).match(X));
>>> -  EXPECT_FALSE(m_Add(m_Zero(), m_One()).match(X));
>>> +  EXPECT_TRUE(m_Add(m_One(), m_AllOnes()).match(X));
>>> +  EXPECT_FALSE(m_Add(m_AllOnes(), m_One()).match(X));
>>>
>>> -  EXPECT_FALSE(m_Unless(m_Add(m_One(), m_Zero())).match(X));
>>> -  EXPECT_TRUE(m_Unless(m_Add(m_Zero(), m_One())).match(X));
>>> +  EXPECT_FALSE(m_Unless(m_Add(m_One(), m_AllOnes())).match(X));
>>> +  EXPECT_TRUE(m_Unless(m_Add(m_AllOnes(), m_One())).match(X));
>>>
>>> -  EXPECT_TRUE(m_c_Add(m_One(), m_Zero()).match(X));
>>> -  EXPECT_TRUE(m_c_Add(m_Zero(), m_One()).match(X));
>>> +  EXPECT_TRUE(m_c_Add(m_One(), m_AllOnes()).match(X));
>>> +  EXPECT_TRUE(m_c_Add(m_AllOnes(), m_One()).match(X));
>>>
>>> -  EXPECT_FALSE(m_Unless(m_c_Add(m_One(), m_Zero())).match(X));
>>> -  EXPECT_FALSE(m_Unless(m_c_Add(m_Zero(), m_One())).match(X));
>>> +  EXPECT_FALSE(m_Unless(m_c_Add(m_One(), m_AllOnes())).match(X));
>>> +  EXPECT_FALSE(m_Unless(m_c_Add(m_AllOnes(), m_One())).match(X));
>>>  }
>>>
>>>  TEST_F(PatternMatchTest, ZExtSExtSelf) {
>>>
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits


More information about the llvm-commits mailing list