[llvm] 2e9f869 - Reland "[LLVM] Add IRNormalizer Pass" (#113780)

Thu Nov 14 09:56:25 PST 2024

Author: Justin Fargnoli
Date: 2024-11-14T09:56:22-08:00
New Revision: 2e9f8696e9533fdd464e025bd504302fa1a22f14

URL: https://github.com/llvm/llvm-project/commit/2e9f8696e9533fdd464e025bd504302fa1a22f14
DIFF: https://github.com/llvm/llvm-project/commit/2e9f8696e9533fdd464e025bd504302fa1a22f14.diff

LOG: Reland "[LLVM] Add IRNormalizer Pass" (#113780)

`IRNormalizer` will reorder instructions. Thus, we need to invalidate
analyses. Done in cd500d28cba3177c213f2f2faf50f14ea56e230b. This should
resolve the [BuildBot
failure](https://github.com/llvm/llvm-project/pull/68176#issuecomment-2428243474).

---

Original PR: #68176
Original commit: 1295d2e6da2fe90f3b770ab1d35bf5caecd38bed
Reverted with: 8a12e0131f3d84b470fac63af042aa96a1b19f56

---

Add the llvm-canon tool. Description from the [original
PR](https://reviews.llvm.org/D66029#change-wZv3yOpDdxIu):

> Added a new llvm-canon tool which aims to transform LLVM Modules into
a canonical form by reordering and renaming instructions while
preserving the same semantics. This tool makes it easier to spot
semantic differences while diffing two modules which have undergone
different transformation passes.

The current version of this tool can:

- Reorder instructions within a function.
- Rename instructions based on the operands.
- Sort commutative operands.
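
As a rough illustration of the last bullet, here is a minimal standalone sketch that does not use any LLVM APIs. `ToyInst`, `sortFirstTwoIfCommutative`, and `render` are hypothetical names invented for this example; the pass itself operates on `llvm::Instruction`. The sketch assumes, as the pass's helper does, that only the first two operands take part in the sort.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction: an opcode name, the textual
// names of its operands, and whether the operation is commutative.
struct ToyInst {
  std::string Opcode;
  std::vector<std::string> Operands;
  bool Commutative;
};

// Sort the first two operand names lexicographically, but only when the
// operation is commutative; swapping operands of anything else could
// change the meaning of the instruction.
void sortFirstTwoIfCommutative(ToyInst &I) {
  if (!I.Commutative || I.Operands.size() < 2)
    return;
  std::sort(I.Operands.begin(), I.Operands.begin() + 2);
}

// Render the instruction as "opcode(op0, op1, ...)".
std::string render(const ToyInst &I) {
  std::string Out = I.Opcode + "(";
  for (std::size_t Idx = 0; Idx < I.Operands.size(); ++Idx) {
    Out += I.Operands[Idx];
    if (Idx + 1 < I.Operands.size())
      Out += ", ";
  }
  return Out + ")";
}
```

With this sketch, `add(b, a)` and `add(a, b)` render identically after sorting, while a non-commutative `sub(b, a)` is left untouched.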

This code was originally written by @michalpaszkowski and [submitted to
mainline
LLVM](https://github.com/llvm/llvm-project/commit/14d358537f124a732adad1ec6edf3981dc9baece).
However, it was quickly
[reverted](https://github.com/llvm/llvm-project/commit/335de55fa3384946f1e62050f2545c0966163236)
due to BuildBot errors.

Michal presented his version of the tool in [LLVM-Canon: Shooting for
Clear Diffs](https://www.youtube.com/watch?v=c9WMijSOEUg).

@AidanGoldfarb and I ported the code to the new pass manager, added more
tests, and fixed some bugs related to PHI nodes that may have been the
root cause of the BuildBot errors that caused the patch to be reverted.
Additionally, we rewrote the implementation of instruction reordering to
fix cases where the original algorithm would break use-def chains.
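
The idea behind a reordering that keeps use-def chains intact can be sketched as a post-order walk over operands within one basic block. This is a simplified illustration under stated assumptions, not the pass's implementation: `ToyNode`, `visit`, and `scheduleBlock` are hypothetical names, operands are modelled as indices into the same block, and details such as PHI nodes, terminators, and cross-block uses are ignored.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy instruction. Operands holds the indices of the
// instructions (in the same block) whose values this one uses.
struct ToyNode {
  std::vector<std::size_t> Operands;
  bool IsOutput; // models "may have side effects or is a return"
};

// Post-order walk: every operand is emitted before its user, so a
// definition always precedes its uses in the resulting order.
static void visit(const std::vector<ToyNode> &Block, std::size_t Idx,
                  std::vector<bool> &Visited,
                  std::vector<std::size_t> &Order) {
  if (Visited[Idx])
    return;
  Visited[Idx] = true;
  for (std::size_t Op : Block[Idx].Operands)
    visit(Block, Op, Visited, Order);
  Order.push_back(Idx);
}

// Returns a new instruction order for the block as a list of indices.
std::vector<std::size_t> scheduleBlock(const std::vector<ToyNode> &Block) {
  std::vector<bool> Visited(Block.size(), false);
  std::vector<std::size_t> Order;
  // Start from outputs in their original order so that side-effecting
  // instructions keep their relative positions.
  for (std::size_t Idx = 0; Idx < Block.size(); ++Idx)
    if (Block[Idx].IsOutput)
      visit(Block, Idx, Visited, Order);
  // Then place whatever was not reachable from an output.
  for (std::size_t Idx = 0; Idx < Block.size(); ++Idx)
    visit(Block, Idx, Visited, Order);
  return Order;
}
```

Starting the walk from outputs pulls each definition close to its first user, which is the property that makes two differently scheduled but equivalent blocks converge on the same order.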

Note that this is the first time @AidanGoldfarb and I are submitting to
LLVM. Please liberally critique the PR!

CC @plotfi for initial review.

---------

Co-authored-by: Aidan <aidan.goldfarb at mail.mcgill.ca>

Added: 
    llvm/include/llvm/Transforms/Utils/IRNormalizer.h
    llvm/lib/Transforms/Utils/IRNormalizer.cpp
    llvm/test/Transforms/IRNormalizer/naming-args-instr-blocks.ll
    llvm/test/Transforms/IRNormalizer/naming-arguments.ll
    llvm/test/Transforms/IRNormalizer/naming.ll
    llvm/test/Transforms/IRNormalizer/regression-convergence-tokens.ll
    llvm/test/Transforms/IRNormalizer/regression-coro-elide-musttail.ll
    llvm/test/Transforms/IRNormalizer/regression-deoptimize.ll
    llvm/test/Transforms/IRNormalizer/regression-dont-hoist-deoptimize.ll
    llvm/test/Transforms/IRNormalizer/regression-infinite-loop.ll
    llvm/test/Transforms/IRNormalizer/reordering-basic.ll
    llvm/test/Transforms/IRNormalizer/reordering.ll

Modified: 
    llvm/docs/Passes.rst
    llvm/docs/ReleaseNotes.md
    llvm/lib/Passes/PassBuilder.cpp
    llvm/lib/Passes/PassRegistry.def
    llvm/lib/Transforms/Utils/CMakeLists.txt

Removed: 
    


################################################################################
diff --git a/llvm/docs/Passes.rst b/llvm/docs/Passes.rst
index 49f633e98d16fe..5e436db62be3a1 100644
--- a/llvm/docs/Passes.rst
+++ b/llvm/docs/Passes.rst
@@ -543,6 +543,14 @@ variables with initializers are marked as internal.
 An interprocedural variant of :ref:`Sparse Conditional Constant Propagation
 <passes-sccp>`.
 
+``ir-normalizer``: Transforms IR into a normal form that's easier to diff
+----------------------------------------------------------------------------
+
+This pass aims to transform LLVM Modules into a normal form by reordering and
+renaming instructions while preserving the same semantics. The normalizer makes
+it easier to spot semantic differences while diffing two modules which have
+undergone two different passes.
+
 ``jump-threading``: Jump Threading
 ----------------------------------
 

diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index ec57b71f82ce9c..0c1019c2fdca03 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -42,6 +42,11 @@ point (e.g. maybe you would like to give an example of the
 functionality, or simply have a lot to talk about), see the comment below
 for adding a new subsection. -->
 
+* Added a new IRNormalizer pass which aims to transform LLVM modules into
+  a normal form by reordering and renaming instructions while preserving the
+  same semantics. The normalizer makes it easier to spot semantic differences
+  when diffing two modules which have undergone different passes.
+
 * ...
 
 <!-- If you would like to document a larger change, then you can add a

diff --git a/llvm/include/llvm/Transforms/Utils/IRNormalizer.h b/llvm/include/llvm/Transforms/Utils/IRNormalizer.h
new file mode 100644
index 00000000000000..af1f715d4940d8
--- /dev/null
+++ b/llvm/include/llvm/Transforms/Utils/IRNormalizer.h
@@ -0,0 +1,15 @@
+#ifndef LLVM_TRANSFORMS_UTILS_IRNORMALIZER_H
+#define LLVM_TRANSFORMS_UTILS_IRNORMALIZER_H
+
+#include "llvm/IR/PassManager.h"
+
+namespace llvm {
+
+/// IRNormalizer aims to transform LLVM IR into normal form.
+struct IRNormalizerPass : public PassInfoMixin<IRNormalizerPass> {
+  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) const;
+};
+
+} // namespace llvm
+
+#endif // LLVM_TRANSFORMS_UTILS_IRNORMALIZER_H

diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index 1bdf88e48b4ae2..df7c9a4fbb9387 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -306,9 +306,9 @@
 #include "llvm/Transforms/Utils/EntryExitInstrumenter.h"
 #include "llvm/Transforms/Utils/FixIrreducible.h"
 #include "llvm/Transforms/Utils/HelloWorld.h"
+#include "llvm/Transforms/Utils/IRNormalizer.h"
 #include "llvm/Transforms/Utils/InjectTLIMappings.h"
 #include "llvm/Transforms/Utils/InstructionNamer.h"
-#include "llvm/Transforms/Utils/Instrumentation.h"
 #include "llvm/Transforms/Utils/LCSSA.h"
 #include "llvm/Transforms/Utils/LibCallsShrinkWrap.h"
 #include "llvm/Transforms/Utils/LoopSimplify.h"

diff --git a/llvm/lib/Passes/PassRegistry.def b/llvm/lib/Passes/PassRegistry.def
index 21d53e5d01fbc0..f2ae933d650f91 100644
--- a/llvm/lib/Passes/PassRegistry.def
+++ b/llvm/lib/Passes/PassRegistry.def
@@ -416,6 +416,7 @@ FUNCTION_PASS("move-auto-init", MoveAutoInitPass())
 FUNCTION_PASS("nary-reassociate", NaryReassociatePass())
 FUNCTION_PASS("newgvn", NewGVNPass())
 FUNCTION_PASS("no-op-function", NoOpFunctionPass())
+FUNCTION_PASS("normalize", IRNormalizerPass())
 FUNCTION_PASS("objc-arc", ObjCARCOptPass())
 FUNCTION_PASS("objc-arc-contract", ObjCARCContractPass())
 FUNCTION_PASS("objc-arc-expand", ObjCARCExpandPass())

diff --git a/llvm/lib/Transforms/Utils/CMakeLists.txt b/llvm/lib/Transforms/Utils/CMakeLists.txt
index 36761cf3569741..65bd3080662c4d 100644
--- a/llvm/lib/Transforms/Utils/CMakeLists.txt
+++ b/llvm/lib/Transforms/Utils/CMakeLists.txt
@@ -37,6 +37,7 @@ add_llvm_component_library(LLVMTransformUtils
   InstructionNamer.cpp
   Instrumentation.cpp
   IntegerDivision.cpp
+  IRNormalizer.cpp
   LCSSA.cpp
   LibCallsShrinkWrap.cpp
   Local.cpp

diff --git a/llvm/lib/Transforms/Utils/IRNormalizer.cpp b/llvm/lib/Transforms/Utils/IRNormalizer.cpp
new file mode 100644
index 00000000000000..47ec7f3177db73
--- /dev/null
+++ b/llvm/lib/Transforms/Utils/IRNormalizer.cpp
@@ -0,0 +1,697 @@
+//===--------------- IRNormalizer.cpp - IR Normalizer ---------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+/// \file
+/// This file implements the IRNormalizer class which aims to transform LLVM
+/// Modules into a normal form by reordering and renaming instructions while
+/// preserving the same semantics. The normalizer makes it easier to spot
+/// semantic differences while diffing two modules which have undergone
+/// different passes.
+///
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/Utils/IRNormalizer.h"
+#include "llvm/ADT/SetVector.h"
+#include "llvm/ADT/SmallPtrSet.h"
+#include "llvm/ADT/SmallString.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/IR/BasicBlock.h"
+#include "llvm/IR/Function.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstIterator.h"
+#include "llvm/IR/Module.h"
+#include "llvm/InitializePasses.h"
+#include "llvm/Pass.h"
+#include "llvm/PassRegistry.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Transforms/Utils.h"
+#include <algorithm>
+#include <stack>
+
+#define DEBUG_TYPE "normalize"
+
+using namespace llvm;
+
+namespace {
+/// IRNormalizer aims to transform LLVM IR into normal form.
+class IRNormalizer {
+public:
+  /// \name Normalizer flags.
+  /// @{
+  /// Preserves original order of instructions.
+  static cl::opt<bool> PreserveOrder;
+  /// Renames all instructions (including user-named).
+  static cl::opt<bool> RenameAll; // TODO: Don't rename on empty name
+  /// Folds all regular instructions (including pre-outputs).
+  static cl::opt<bool> FoldPreOutputs;
+  /// Sorts and reorders operands in commutative instructions.
+  static cl::opt<bool> ReorderOperands;
+  /// @}
+
+  bool runOnFunction(Function &F);
+
+private:
+  // Random constant for hashing, so the state isn't zero.
+  const uint64_t MagicHashConstant = 0x6acaa36bef8325c5ULL;
+  DenseSet<const Instruction *> NamedInstructions;
+
+  SmallVector<Instruction *, 16> Outputs;
+
+  /// \name Naming.
+  /// @{
+  void nameFunctionArguments(Function &F) const;
+  void nameBasicBlocks(Function &F) const;
+  void nameInstruction(Instruction *I);
+  void nameAsInitialInstruction(Instruction *I) const;
+  void nameAsRegularInstruction(Instruction *I);
+  void foldInstructionName(Instruction *I) const;
+  /// @}
+
+  /// \name Reordering.
+  /// @{
+  void reorderInstructions(Function &F) const;
+  void reorderDefinition(Instruction *Definition,
+                         std::stack<Instruction *> &TopologicalSort,
+                         SmallPtrSet<const Instruction *, 32> &Visited) const;
+  void reorderInstructionOperandsByNames(Instruction *I) const;
+  void reorderPHIIncomingValues(PHINode *Phi) const;
+  /// @}
+
+  /// \name Utility methods.
+  /// @{
+  template <typename T>
+  void sortCommutativeOperands(Instruction *I, T &Operands) const;
+  SmallVector<Instruction *, 16> collectOutputInstructions(Function &F) const;
+  bool isOutput(const Instruction *I) const;
+  bool isInitialInstruction(const Instruction *I) const;
+  bool hasOnlyImmediateOperands(const Instruction *I) const;
+  SetVector<int>
+  getOutputFootprint(Instruction *I,
+                     SmallPtrSet<const Instruction *, 32> &Visited) const;
+  /// @}
+};
+} // namespace
+
+cl::opt<bool> IRNormalizer::PreserveOrder(
+    "norm-preserve-order", cl::Hidden, cl::init(false),
+    cl::desc("Preserves original instruction order"));
+cl::opt<bool> IRNormalizer::RenameAll(
+    "norm-rename-all", cl::Hidden, cl::init(true),
+    cl::desc("Renames all instructions (including user-named)"));
+cl::opt<bool> IRNormalizer::FoldPreOutputs(
+    "norm-fold-all", cl::Hidden, cl::init(true),
+    cl::desc("Folds all regular instructions (including pre-outputs)"));
+cl::opt<bool> IRNormalizer::ReorderOperands(
+    "norm-reorder-operands", cl::Hidden, cl::init(true),
+    cl::desc("Sorts and reorders operands in commutative instructions"));
+
+/// Entry method to the IRNormalizer.
+///
+/// \param F Function to normalize.
+bool IRNormalizer::runOnFunction(Function &F) {
+  nameFunctionArguments(F);
+  nameBasicBlocks(F);
+
+  Outputs = collectOutputInstructions(F);
+
+  if (!PreserveOrder)
+    reorderInstructions(F);
+
+  // TODO: Reorder basic blocks via a topological sort.
+
+  for (auto &I : Outputs)
+    nameInstruction(I);
+
+  for (auto &I : instructions(F)) {
+    if (!PreserveOrder) {
+      if (ReorderOperands)
+        reorderInstructionOperandsByNames(&I);
+
+      if (auto *Phi = dyn_cast<PHINode>(&I))
+        reorderPHIIncomingValues(Phi);
+    }
+    foldInstructionName(&I);
+  }
+
+  return true;
+}
+
+/// Numbers arguments.
+///
+/// \param F Function whose arguments will be renamed.
+void IRNormalizer::nameFunctionArguments(Function &F) const {
+  int ArgumentCounter = 0;
+  for (auto &A : F.args()) {
+    if (RenameAll || A.getName().empty()) {
+      A.setName("a" + Twine(ArgumentCounter));
+      ArgumentCounter += 1;
+    }
+  }
+}
+
+/// Names basic blocks using a generated hash for each basic block in
+/// a function considering the opcode and the order of output instructions.
+///
+/// \param F Function containing basic blocks to rename.
+void IRNormalizer::nameBasicBlocks(Function &F) const {
+  for (auto &B : F) {
+    // Initialize to a magic constant, so the state isn't zero.
+    uint64_t Hash = MagicHashConstant;
+
+    // Hash considering output instruction opcodes.
+    for (auto &I : B)
+      if (isOutput(&I))
+        Hash = hashing::detail::hash_16_bytes(Hash, I.getOpcode());
+
+    if (RenameAll || B.getName().empty()) {
+      // Name basic block. Substring hash to make diffs more readable.
+      B.setName("bb" + std::to_string(Hash).substr(0, 5));
+    }
+  }
+}
+
+/// Names instructions graphically (recursive) in accordance with the
+/// def-use tree, starting from the initial instructions (defs), finishing at
+/// the output (top-most user) instructions (depth-first).
+///
+/// \param I Instruction to be renamed.
+void IRNormalizer::nameInstruction(Instruction *I) {
+  // Ensure instructions are not renamed more than once. This is done
+  // to prevent situations where instructions are used
+  // before their definition (in phi nodes).
+  if (NamedInstructions.contains(I))
+    return;
+  NamedInstructions.insert(I);
+  if (isInitialInstruction(I)) {
+    nameAsInitialInstruction(I);
+  } else {
+    // This must be a regular instruction.
+    nameAsRegularInstruction(I);
+  }
+}
+
+template <typename T>
+void IRNormalizer::sortCommutativeOperands(Instruction *I, T &Operands) const {
+  if (!(I->isCommutative() && Operands.size() >= 2))
+    return;
+  auto CommutativeEnd = Operands.begin();
+  std::advance(CommutativeEnd, 2);
+  llvm::sort(Operands.begin(), CommutativeEnd);
+}
+
+/// Names instruction following the scheme:
+/// vl00000Callee(Operands)
+///
+/// Where 00000 is a hash calculated considering instruction's opcode and output
+/// footprint. Callee's name is only included when instruction's type is
+/// CallInst. In cases where instruction is commutative, operands list is also
+/// sorted.
+///
+/// Renames instruction only when RenameAll flag is raised or instruction is
+/// unnamed.
+///
+/// \see getOutputFootprint()
+/// \param I Instruction to be renamed.
+void IRNormalizer::nameAsInitialInstruction(Instruction *I) const {
+  if (I->getType()->isVoidTy())
+    return;
+  if (!(I->getName().empty() || RenameAll))
+    return;
+  LLVM_DEBUG(dbgs() << "Naming initial instruction: " << *I << "\n");
+
+  // Instruction operands for further sorting.
+  SmallVector<SmallString<64>, 4> Operands;
+
+  // Collect operands.
+  for (auto &Op : I->operands()) {
+    if (!isa<Function>(Op)) {
+      std::string TextRepresentation;
+      raw_string_ostream Stream(TextRepresentation);
+      Op->printAsOperand(Stream, false);
+      Operands.push_back(StringRef(Stream.str()));
+    }
+  }
+
+  sortCommutativeOperands(I, Operands);
+
+  // Initialize to a magic constant, so the state isn't zero.
+  uint64_t Hash = MagicHashConstant;
+
+  // Consider instruction's opcode in the hash.
+  Hash = hashing::detail::hash_16_bytes(Hash, I->getOpcode());
+
+  SmallPtrSet<const Instruction *, 32> Visited;
+  // Get output footprint for I.
+  SetVector<int> OutputFootprint = getOutputFootprint(I, Visited);
+
+  // Consider output footprint in the hash.
+  for (const int &Output : OutputFootprint)
+    Hash = hashing::detail::hash_16_bytes(Hash, Output);
+
+  // Base instruction name.
+  SmallString<256> Name;
+  Name.append("vl" + std::to_string(Hash).substr(0, 5));
+
+  // In case of CallInst, consider callee in the instruction name.
+  if (const auto *CI = dyn_cast<CallInst>(I)) {
+    Function *F = CI->getCalledFunction();
+
+    if (F != nullptr)
+      Name.append(F->getName());
+  }
+
+  Name.append("(");
+  for (size_t i = 0; i < Operands.size(); ++i) {
+    Name.append(Operands[i]);
+
+    if (i < Operands.size() - 1)
+      Name.append(", ");
+  }
+  Name.append(")");
+
+  I->setName(Name);
+}
+
+/// Names instruction following the scheme:
+/// op00000Callee(Operands)
+///
+/// Where 00000 is a hash calculated considering instruction's opcode, its
+/// operands' opcodes and order. Callee's name is only included when
+/// instruction's type is CallInst. In cases where instruction is commutative,
+/// operand list is also sorted.
+///
+/// Names instructions recursively in accordance with the def-use tree,
+/// starting from the initial instructions (defs), finishing at
+/// the output (top-most user) instructions (depth-first).
+///
+/// Renames instruction only when RenameAll flag is raised or instruction is
+/// unnamed.
+///
+/// \see getOutputFootprint()
+/// \param I Instruction to be renamed.
+void IRNormalizer::nameAsRegularInstruction(Instruction *I) {
+  LLVM_DEBUG(dbgs() << "Naming regular instruction: " << *I << "\n");
+
+  // Instruction operands for further sorting.
+  SmallVector<SmallString<128>, 4> Operands;
+
+  // The name of a regular instruction depends
+  // on the names of its operands. Hence, all
+  // operands must be named first in the use-def
+  // walk.
+
+  // Collect operands.
+  for (auto &Op : I->operands()) {
+    if (auto *I = dyn_cast<Instruction>(Op)) {
+      // Walk down the use-def chain.
+      nameInstruction(I);
+      Operands.push_back(I->getName());
+    } else if (!isa<Function>(Op)) {
+      // This must be an immediate value.
+      std::string TextRepresentation;
+      raw_string_ostream Stream(TextRepresentation);
+      Op->printAsOperand(Stream, false);
+      Operands.push_back(StringRef(Stream.str()));
+    }
+  }
+
+  sortCommutativeOperands(I, Operands);
+
+  // Initialize to a magic constant, so the state isn't zero.
+  uint64_t Hash = MagicHashConstant;
+
+  // Consider instruction opcode in the hash.
+  Hash = hashing::detail::hash_16_bytes(Hash, I->getOpcode());
+
+  // Operand opcodes for further sorting (commutative).
+  SmallVector<int, 4> OperandsOpcodes;
+
+  // Collect operand opcodes for hashing.
+  for (auto &Op : I->operands())
+    if (auto *I = dyn_cast<Instruction>(Op))
+      OperandsOpcodes.push_back(I->getOpcode());
+
+  sortCommutativeOperands(I, OperandsOpcodes);
+
+  // Consider operand opcodes in the hash.
+  for (const int Code : OperandsOpcodes)
+    Hash = hashing::detail::hash_16_bytes(Hash, Code);
+
+  // Base instruction name.
+  SmallString<512> Name;
+  Name.append("op" + std::to_string(Hash).substr(0, 5));
+
+  // In case of CallInst, consider callee in the instruction name.
+  if (const auto *CI = dyn_cast<CallInst>(I))
+    if (const Function *F = CI->getCalledFunction())
+      Name.append(F->getName());
+
+  Name.append("(");
+  for (size_t i = 0; i < Operands.size(); ++i) {
+    Name.append(Operands[i]);
+
+    if (i < Operands.size() - 1)
+      Name.append(", ");
+  }
+  Name.append(")");
+
+  if ((I->getName().empty() || RenameAll) && !I->getType()->isVoidTy())
+    I->setName(Name);
+}
+
+/// Shortens instruction's name. This method removes called function name from
+/// the instruction name and substitutes the call chain with a corresponding
+/// list of operands.
+///
+/// Examples:
+/// op00000Callee(op00001Callee(...), vl00000Callee(1, 2), ...)  ->
+/// op00000(op00001, vl00000, ...) vl00000Callee(1, 2)  ->  vl00000(1, 2)
+///
+/// This method omits output instructions and pre-output (instructions directly
+/// used by an output instruction) instructions (by default). By default it also
+/// does not affect user named instructions.
+///
+/// \param I Instruction whose name will be folded.
+void IRNormalizer::foldInstructionName(Instruction *I) const {
+  // If this flag is raised, fold all regular
+  // instructions (including pre-outputs).
+  if (!FoldPreOutputs) {
+    // Don't fold if one of the users is an output instruction.
+    for (auto *U : I->users())
+      if (auto *IU = dyn_cast<Instruction>(U))
+        if (isOutput(IU))
+          return;
+  }
+
+  // Don't fold if it is an output instruction or has no op prefix.
+  if (isOutput(I) || I->getName().substr(0, 2) != "op")
+    return;
+
+  // Instruction operands.
+  SmallVector<SmallString<64>, 4> Operands;
+
+  for (auto &Op : I->operands()) {
+    if (const auto *I = dyn_cast<Instruction>(Op)) {
+      bool HasNormalName = I->getName().substr(0, 2) == "op" ||
+                           I->getName().substr(0, 2) == "vl";
+
+      Operands.push_back(HasNormalName ? I->getName().substr(0, 7)
+                                       : I->getName());
+    }
+  }
+
+  sortCommutativeOperands(I, Operands);
+
+  SmallString<256> Name;
+  Name.append(I->getName().substr(0, 7));
+
+  Name.append("(");
+  for (size_t i = 0; i < Operands.size(); ++i) {
+    Name.append(Operands[i]);
+
+    if (i < Operands.size() - 1)
+      Name.append(", ");
+  }
+  Name.append(")");
+
+  I->setName(Name);
+}
+
+/// Reorders instructions by walking up the tree from each operand of an output
+/// instruction and reducing the def-use distance.
+/// This method assumes that output instructions were collected top-down,
+/// otherwise the def-use chain may be broken.
+/// This method is a wrapper for recursive reorderDefinition().
+///
+/// \see reorderDefinition()
+void IRNormalizer::reorderInstructions(Function &F) const {
+  for (auto &BB : F) {
+    LLVM_DEBUG(dbgs() << "Reordering instructions in basic block: "
+                      << BB.getName() << "\n");
+    // Find the source nodes of the DAG of instructions in this basic block.
+    // Source nodes are instructions that have side effects, are terminators, or
+    // don't have a parent in the DAG of instructions.
+    //
+    // We must iterate from the first to the last instruction otherwise side
+    // effecting instructions could be reordered.
+
+    std::stack<Instruction *> TopologicalSort;
+    SmallPtrSet<const Instruction *, 32> Visited;
+    for (auto &I : BB) {
+      // First process side effecting and terminating instructions.
+      if (!(isOutput(&I) || I.isTerminator()))
+        continue;
+      LLVM_DEBUG(dbgs() << "\tReordering from source effecting instruction: ";
+                 I.dump());
+      reorderDefinition(&I, TopologicalSort, Visited);
+    }
+
+    for (auto &I : BB) {
+      // Process the remaining instructions.
+      //
+      // TODO: Do a more intelligent sorting of these instructions. For example,
+      // separate dead instructions from instructions used in another
+      // block. Use properties of the CFG to order instructions that are used
+      // in another block.
+      if (Visited.contains(&I))
+        continue;
+      LLVM_DEBUG(dbgs() << "\tReordering from source instruction: "; I.dump());
+      reorderDefinition(&I, TopologicalSort, Visited);
+    }
+
+    LLVM_DEBUG(dbgs() << "Inserting instructions into: " << BB.getName()
+                      << "\n");
+    // Reorder based on the topological sort.
+    while (!TopologicalSort.empty()) {
+      auto *Instruction = TopologicalSort.top();
+      auto FirstNonPHIOrDbgOrAlloca = BB.getFirstNonPHIOrDbgOrAlloca();
+      if (auto *Call = dyn_cast<CallInst>(&*FirstNonPHIOrDbgOrAlloca)) {
+        if (Call->getIntrinsicID() ==
+                Intrinsic::experimental_convergence_entry ||
+            Call->getIntrinsicID() == Intrinsic::experimental_convergence_loop)
+          FirstNonPHIOrDbgOrAlloca++;
+      }
+      Instruction->moveBefore(&*FirstNonPHIOrDbgOrAlloca);
+      TopologicalSort.pop();
+    }
+  }
+}
+
+void IRNormalizer::reorderDefinition(
+    Instruction *Definition, std::stack<Instruction *> &TopologicalSort,
+    SmallPtrSet<const Instruction *, 32> &Visited) const {
+  if (Visited.contains(Definition))
+    return;
+  Visited.insert(Definition);
+
+  {
+    const auto *BasicBlock = Definition->getParent();
+    const auto FirstNonPHIOrDbgOrAlloca =
+        BasicBlock->getFirstNonPHIOrDbgOrAlloca();
+    if (FirstNonPHIOrDbgOrAlloca == BasicBlock->end())
+      return; // TODO: Is this necessary?
+    if (Definition->comesBefore(&*FirstNonPHIOrDbgOrAlloca))
+      return; // TODO: Do some kind of ordering for these instructions.
+  }
+
+  for (auto &Operand : Definition->operands()) {
+    if (auto *Op = dyn_cast<Instruction>(Operand)) {
+      if (Op->getParent() != Definition->getParent())
+        continue; // Only reorder instruction within the same basic block
+      reorderDefinition(Op, TopologicalSort, Visited);
+    }
+  }
+
+  LLVM_DEBUG(dbgs() << "\t\tNext in topological sort: "; Definition->dump());
+  if (Definition->isTerminator())
+    return;
+  if (auto *Call = dyn_cast<CallInst>(Definition)) {
+    if (Call->isMustTailCall())
+      return;
+    if (Call->getIntrinsicID() == Intrinsic::experimental_deoptimize)
+      return;
+    if (Call->getIntrinsicID() == Intrinsic::experimental_convergence_entry)
+      return;
+    if (Call->getIntrinsicID() == Intrinsic::experimental_convergence_loop)
+      return;
+  }
+  if (auto *BitCast = dyn_cast<BitCastInst>(Definition)) {
+    if (auto *Call = dyn_cast<CallInst>(BitCast->getOperand(0))) {
+      if (Call->isMustTailCall())
+        return;
+    }
+  }
+
+  TopologicalSort.emplace(Definition);
+}
+
+/// Reorders instruction's operands alphabetically. This method assumes
+/// that passed instruction is commutative. Changing the operand order
+/// in other instructions may change the semantics.
+///
+/// \param I Instruction whose operands will be reordered.
+void IRNormalizer::reorderInstructionOperandsByNames(Instruction *I) const {
+  // This method assumes that passed I is commutative,
+  // changing the order of operands in other instructions
+  // may change the semantics.
+
+  // Instruction operands for further sorting.
+  SmallVector<std::pair<std::string, Value *>, 4> Operands;
+
+  // Collect operands.
+  for (auto &Op : I->operands()) {
+    if (auto *V = dyn_cast<Value>(Op)) {
+      if (isa<Instruction>(V)) {
+        // This is an instruction.
+        Operands.push_back(std::pair<std::string, Value *>(V->getName(), V));
+      } else {
+        std::string TextRepresentation;
+        raw_string_ostream Stream(TextRepresentation);
+        Op->printAsOperand(Stream, false);
+        Operands.push_back(std::pair<std::string, Value *>(Stream.str(), V));
+      }
+    }
+  }
+
+  // Sort operands.
+  sortCommutativeOperands(I, Operands);
+
+  // Reorder operands.
+  unsigned Position = 0;
+  for (auto &Op : I->operands()) {
+    Op.set(Operands[Position].second);
+    Position += 1;
+  }
+}
+
+/// Reorders PHI node's values according to the names of corresponding basic
+/// blocks.
+///
+/// \param Phi PHI node to normalize.
+void IRNormalizer::reorderPHIIncomingValues(PHINode *Phi) const {
+  // Values for further sorting.
+  SmallVector<std::pair<Value *, BasicBlock *>, 2> Values;
+
+  // Collect blocks and corresponding values.
+  for (auto &BB : Phi->blocks()) {
+    Value *V = Phi->getIncomingValueForBlock(BB);
+    Values.push_back(std::pair<Value *, BasicBlock *>(V, BB));
+  }
+
+  // Sort values according to the name of a basic block.
+  llvm::sort(Values, [](const std::pair<Value *, BasicBlock *> &LHS,
+                        const std::pair<Value *, BasicBlock *> &RHS) {
+    return LHS.second->getName() < RHS.second->getName();
+  });
+
+  // Swap.
+  for (unsigned i = 0; i < Values.size(); ++i) {
+    Phi->setIncomingBlock(i, Values[i].second);
+    Phi->setIncomingValue(i, Values[i].first);
+  }
+}
+
+/// Returns a vector of output instructions. An output is an instruction which
+/// has side-effects or is ReturnInst. Uses isOutput().
+///
+/// \see isOutput()
+/// \param F Function to collect outputs from.
+SmallVector<Instruction *, 16>
+IRNormalizer::collectOutputInstructions(Function &F) const {
+  // Output instructions are collected top-down in each function,
+  // any change may break the def-use chain in reordering methods.
+  SmallVector<Instruction *, 16> Outputs;
+  for (auto &I : instructions(F))
+    if (isOutput(&I))
+      Outputs.push_back(&I);
+  return Outputs;
+}
+
+/// Helper method checking whether the instruction may have side effects or is
+/// ReturnInst.
+///
+/// \param I Considered instruction.
+bool IRNormalizer::isOutput(const Instruction *I) const {
+  // Outputs are instructions that may have side effects or are ReturnInst.
+  return I->mayHaveSideEffects() || isa<ReturnInst>(I);
+}
+
+/// Helper method checking whether the instruction has users and only
+/// immediate operands.
+///
+/// \param I Considered instruction.
+bool IRNormalizer::isInitialInstruction(const Instruction *I) const {
+  // Initial instructions are such instructions whose values are used by
+  // other instructions, yet they only depend on immediate values.
+  return !I->user_empty() && hasOnlyImmediateOperands(I);
+}
+
+/// Helper method checking whether the instruction has only immediate operands.
+///
+/// \param I Considered instruction.
+bool IRNormalizer::hasOnlyImmediateOperands(const Instruction *I) const {
+  for (const auto &Op : I->operands())
+    if (isa<Instruction>(Op))
+      return false; // Found non-immediate operand (instruction).
+  return true;
+}
+
+/// Helper method returning indices (distance from the beginning of the basic
+/// block) of outputs using the \p I (eliminates repetitions). Walks down the
+/// def-use tree recursively.
+///
+/// \param I Considered instruction.
+/// \param Visited Set of visited instructions.
+SetVector<int> IRNormalizer::getOutputFootprint(
+    Instruction *I, SmallPtrSet<const Instruction *, 32> &Visited) const {
+
+  // Vector containing indexes of outputs (no repetitions),
+  // which use I in the order of walking down the def-use tree.
+  SetVector<int> Outputs;
+
+  if (!Visited.count(I)) {
+    Visited.insert(I);
+
+    if (isOutput(I)) {
+      // Gets output instruction's parent function.
+      Function *Func = I->getParent()->getParent();
+
+      // Finds and inserts the index of the output to the vector.
+      unsigned Count = 0;
+      for (const auto &B : *Func) {
+        for (const auto &E : B) {
+          if (&E == I)
+            Outputs.insert(Count);
+          Count += 1;
+        }
+      }
+
+      // Returns to the used instruction.
+      return Outputs;
+    }
+
+    for (auto *U : I->users()) {
+      if (auto *UI = dyn_cast<Instruction>(U)) {
+        // Vector for outputs which use UI.
+        SetVector<int> OutputsUsingUI = getOutputFootprint(UI, Visited);
+        // Insert the indexes of outputs using UI.
+        Outputs.insert(OutputsUsingUI.begin(), OutputsUsingUI.end());
+      }
+    }
+  }
+
+  // Return to the used instruction.
+  return Outputs;
+}
+
+PreservedAnalyses IRNormalizerPass::run(Function &F,
+                                        FunctionAnalysisManager &AM) const {
+  IRNormalizer{}.runOnFunction(F);
+  PreservedAnalyses PA;
+  PA.preserveSet<CFGAnalyses>();
+  return PA;
+}

diff  --git a/llvm/test/Transforms/IRNormalizer/naming-args-instr-blocks.ll b/llvm/test/Transforms/IRNormalizer/naming-args-instr-blocks.ll
new file mode 100644
index 00000000000000..7845a22e3ae174
--- /dev/null
+++ b/llvm/test/Transforms/IRNormalizer/naming-args-instr-blocks.ll
@@ -0,0 +1,18 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=normalize < %s | FileCheck %s
+
+define i32 @foo(i32) {
+; CHECK-LABEL: define i32 @foo(
+; CHECK-SAME: i32 [[A0:%.*]]) {
+; CHECK-NEXT:  [[BB17254:.*:]]
+; CHECK-NEXT:    %"vl24903([[A0]], 2)" = add i32 [[A0]], 2
+; CHECK-NEXT:    %"op10412(vl24903)" = add i32 6, %"vl24903([[A0]], 2)"
+; CHECK-NEXT:    ret i32 %"op10412(vl24903)"
+;
+entry:
+  %a = add i32 %0, 2
+
+  %b = add i32 %a, 6
+
+  ret i32 %b
+}

diff  --git a/llvm/test/Transforms/IRNormalizer/naming-arguments.ll b/llvm/test/Transforms/IRNormalizer/naming-arguments.ll
new file mode 100644
index 00000000000000..a0066c97ca3db3
--- /dev/null
+++ b/llvm/test/Transforms/IRNormalizer/naming-arguments.ll
@@ -0,0 +1,13 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=normalize < %s | FileCheck %s
+
+define i32 @foo(i32, i32) {
+; CHECK-LABEL: define i32 @foo(
+; CHECK-SAME: i32 [[A0:%.*]], i32 [[A1:%.*]]) {
+; CHECK-NEXT:  [[BB17254:.*:]]
+; CHECK-NEXT:    %"vl20416([[A0]], [[A1]])" = mul i32 [[A0]], [[A1]]
+; CHECK-NEXT:    ret i32 %"vl20416([[A0]], [[A1]])"
+;
+  %tmp = mul i32 %0, %1
+  ret i32 %tmp
+}

diff  --git a/llvm/test/Transforms/IRNormalizer/naming.ll b/llvm/test/Transforms/IRNormalizer/naming.ll
new file mode 100644
index 00000000000000..7c93a6213a4a9f
--- /dev/null
+++ b/llvm/test/Transforms/IRNormalizer/naming.ll
@@ -0,0 +1,30 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=normalize < %s | FileCheck %s
+
+define i32 @foo(i32 %a0) {
+; CHECK-LABEL: define i32 @foo(
+; CHECK-SAME: i32 [[A0:%.*]]) {
+; CHECK-NEXT:  [[BB17254:.*:]]
+; CHECK-NEXT:    %"vl12603([[A0]], 2)" = add i32 [[A0]], 2
+; CHECK-NEXT:    ret i32 %"vl12603([[A0]], 2)"
+;
+entry:
+  %a = add i32 %a0, 2
+  ret i32 %a
+}
+
+define i32 @bar(i32 %a0) {
+; CHECK-LABEL: define i32 @bar(
+; CHECK-SAME: i32 [[A0:%.*]]) {
+; CHECK-NEXT:  [[BB17254:.*:]]
+; CHECK-NEXT:    %"vl76167([[A0]], 2)" = add i32 [[A0]], 2
+; CHECK-NEXT:    %"op10412(vl76167)" = add i32 6, %"vl76167([[A0]], 2)"
+; CHECK-NEXT:    %"op10412(op10412)" = add i32 8, %"op10412(vl76167)"
+; CHECK-NEXT:    ret i32 %"op10412(op10412)"
+;
+entry:
+  %a = add i32 %a0, 2
+  %b = add i32 %a, 6
+  %c = add i32 %b, 8
+  ret i32 %c
+}

diff  --git a/llvm/test/Transforms/IRNormalizer/regression-convergence-tokens.ll b/llvm/test/Transforms/IRNormalizer/regression-convergence-tokens.ll
new file mode 100644
index 00000000000000..633cf091da5085
--- /dev/null
+++ b/llvm/test/Transforms/IRNormalizer/regression-convergence-tokens.ll
@@ -0,0 +1,27 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=normalize < %s | FileCheck %s
+
+; Function Attrs: convergent nounwind readnone
+define i32 @nested(i32 %src) #0 {
+; CHECK-LABEL: define i32 @nested(
+; CHECK-SAME: i32 [[A0:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  [[BB15160:.*:]]
+; CHECK-NEXT:    [[T1:%.*]] = call token @llvm.experimental.convergence.entry()
+; CHECK-NEXT:    %"vl15001llvm.experimental.convergence.anchor()" = call token @llvm.experimental.convergence.anchor()
+; CHECK-NEXT:    %"op68297llvm.amdgcn.readfirstlane.i32([[A0]], vl15001llvm.experimental.convergence.anchor())" = call i32 @llvm.amdgcn.readfirstlane.i32(i32 [[A0]]) [ "convergencectrl"(token %"vl15001llvm.experimental.convergence.anchor()") ]
+; CHECK-NEXT:    ret i32 undef
+;
+  %t1 = call token @llvm.experimental.convergence.entry()
+  %t2 = call token @llvm.experimental.convergence.anchor()
+  %r2 = call i32 @llvm.amdgcn.readfirstlane(i32 %src) [ "convergencectrl"(token %t2) ]
+  ret i32 undef
+}
+
+; Function Attrs: convergent nounwind readnone
+declare i32 @llvm.amdgcn.readfirstlane(i32) #0
+
+declare token @llvm.experimental.convergence.entry()
+
+declare token @llvm.experimental.convergence.anchor()
+
+attributes #0 = { convergent nounwind readnone }

diff  --git a/llvm/test/Transforms/IRNormalizer/regression-coro-elide-musttail.ll b/llvm/test/Transforms/IRNormalizer/regression-coro-elide-musttail.ll
new file mode 100644
index 00000000000000..891eae0954a7c1
--- /dev/null
+++ b/llvm/test/Transforms/IRNormalizer/regression-coro-elide-musttail.ll
@@ -0,0 +1,21 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=normalize < %s | FileCheck %s
+
+define fastcc void @foo.resume_musttail(ptr %FramePtr) {
+; CHECK-LABEL: define fastcc void @foo.resume_musttail(
+; CHECK-SAME: ptr [[A0:%.*]]) {
+; CHECK-NEXT:  [[BB15160:.*:]]
+; CHECK-NEXT:    [[TMP0:%.*]] = tail call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
+; CHECK-NEXT:    musttail call fastcc void undef(ptr null)
+; CHECK-NEXT:    ret void
+;
+entry:
+  %0 = tail call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
+  musttail call fastcc void undef(ptr null)
+  ret void
+}
+
+; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: read)
+declare token @llvm.coro.id(i32, ptr readnone, ptr nocapture readonly, ptr) #0
+
+attributes #0 = { nocallback nofree nosync nounwind willreturn memory(argmem: read) }

diff  --git a/llvm/test/Transforms/IRNormalizer/regression-deoptimize.ll b/llvm/test/Transforms/IRNormalizer/regression-deoptimize.ll
new file mode 100644
index 00000000000000..594ea6109499a1
--- /dev/null
+++ b/llvm/test/Transforms/IRNormalizer/regression-deoptimize.ll
@@ -0,0 +1,18 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=normalize < %s | FileCheck %s
+
+target datalayout = "e-p:32:32:32-i1:8:32-i8:8:32-i16:16:32-i32:32:32-i64:32:32-f32:32:32-f64:32:32-v64:32:64-v128:32:128-a0:0:32-n32"
+
+define void @test1() {
+; CHECK-LABEL: define void @test1() {
+; CHECK-NEXT:  [[BB15160:.*:]]
+; CHECK-NEXT:    [[TMP0:%.*]] = load i8, ptr null, align 1
+; CHECK-NEXT:    call void (...) @llvm.experimental.deoptimize.isVoid() [ "deopt"() ]
+; CHECK-NEXT:    ret void
+;
+  %1 = load i8, ptr null, align 1
+  call void (...) @llvm.experimental.deoptimize.isVoid() [ "deopt"() ]
+  ret void
+}
+
+declare void @llvm.experimental.deoptimize.isVoid(...)

diff  --git a/llvm/test/Transforms/IRNormalizer/regression-dont-hoist-deoptimize.ll b/llvm/test/Transforms/IRNormalizer/regression-dont-hoist-deoptimize.ll
new file mode 100644
index 00000000000000..d3d3e33723f029
--- /dev/null
+++ b/llvm/test/Transforms/IRNormalizer/regression-dont-hoist-deoptimize.ll
@@ -0,0 +1,20 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=normalize < %s | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128-ni:1-p2:32:8:8:32-ni:2"
+target triple = "x86_64-unknown-linux-gnu"
+
+declare void @llvm.experimental.deoptimize.isVoid(...)
+
+define void @widget() {
+; CHECK-LABEL: define void @widget() {
+; CHECK-NEXT:  [[BB15160:.*:]]
+; CHECK-NEXT:    [[TMP3:%.*]] = trunc i64 0 to i32
+; CHECK-NEXT:    call void (...) @llvm.experimental.deoptimize.isVoid(i32 0) [ "deopt"() ]
+; CHECK-NEXT:    ret void
+;
+bb:
+  %tmp3 = trunc i64 0 to i32
+  call void (...) @llvm.experimental.deoptimize.isVoid(i32 0) [ "deopt"() ]
+  ret void
+}

diff  --git a/llvm/test/Transforms/IRNormalizer/regression-infinite-loop.ll b/llvm/test/Transforms/IRNormalizer/regression-infinite-loop.ll
new file mode 100644
index 00000000000000..35ac0fd8c329a7
--- /dev/null
+++ b/llvm/test/Transforms/IRNormalizer/regression-infinite-loop.ll
@@ -0,0 +1,195 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=normalize < %s | FileCheck %s
+
+define void @test(ptr, i32) {
+; CHECK-LABEL: define void @test(
+; CHECK-SAME: ptr [[A0:%.*]], i32 [[A1:%.*]]) {
+; CHECK-NEXT:  [[BB76951:.*]]:
+; CHECK-NEXT:    %"vl72693([[A1]], 1)" = add i32 [[A1]], 1
+; CHECK-NEXT:    br label %[[BB16110:.*]]
+; CHECK:       [[BB16110]]:
+; CHECK-NEXT:    %"op10912(op18080, vl72693)" = phi i32 [ %"op18080(op10412, op17645)", %[[BB16110]] ], [ %"vl72693([[A1]], 1)", %[[BB76951]] ]
+; CHECK-NEXT:    %"op10912(op17645, vl72693)" = phi i32 [ %"op17645(op10912)70", %[[BB16110]] ], [ %"vl72693([[A1]], 1)", %[[BB76951]] ]
+; CHECK-NEXT:    %"op15084(op10912)" = mul i32 %"op10912(op18080, vl72693)", undef
+; CHECK-NEXT:    %"op16562(op15084)" = xor i32 -1, %"op15084(op10912)"
+; CHECK-NEXT:    %"op44627(op10912, op16562)" = add i32 %"op10912(op18080, vl72693)", %"op16562(op15084)"
+; CHECK-NEXT:    %"op17645(op10912)" = add i32 -1, %"op10912(op17645, vl72693)"
+; CHECK-NEXT:    %"op18080(op17645, op44627)" = add i32 %"op17645(op10912)", %"op44627(op10912, op16562)"
+; CHECK-NEXT:    %"op17720(op15084, op18080)" = mul i32 %"op15084(op10912)", %"op18080(op17645, op44627)"
+; CHECK-NEXT:    %"op16562(op17720)" = xor i32 -1, %"op17720(op15084, op18080)"
+; CHECK-NEXT:    %"op17430(op16562, op18080)" = add i32 %"op16562(op17720)", %"op18080(op17645, op44627)"
+; CHECK-NEXT:    %"op10412(op17430)" = add i32 %"op17430(op16562, op18080)", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)" = mul i32 %"op10412(op17430)", %"op17720(op15084, op18080)"
+; CHECK-NEXT:    %"op16562(op17720)1" = xor i32 -1, %"op17720(op10412, op17720)"
+; CHECK-NEXT:    %"op17430(op10412, op16562)" = add i32 %"op10412(op17430)", %"op16562(op17720)1"
+; CHECK-NEXT:    %"op10412(op17430)2" = add i32 %"op17430(op10412, op16562)", undef
+; CHECK-NEXT:    %"op10412(op10412)" = add i32 %"op10412(op17430)2", undef
+; CHECK-NEXT:    %"op10412(op10412)3" = add i32 %"op10412(op10412)", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)4" = mul i32 %"op10412(op17430)2", %"op17720(op10412, op17720)"
+; CHECK-NEXT:    %"op17720(op10412, op17720)5" = mul i32 %"op10412(op10412)3", %"op17720(op10412, op17720)4"
+; CHECK-NEXT:    %"op16562(op17720)6" = xor i32 -1, %"op17720(op10412, op17720)5"
+; CHECK-NEXT:    %"op17430(op10412, op16562)7" = add i32 %"op10412(op10412)3", %"op16562(op17720)6"
+; CHECK-NEXT:    %"op10412(op17430)8" = add i32 %"op17430(op10412, op16562)7", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)9" = mul i32 %"op10412(op17430)8", %"op17720(op10412, op17720)5"
+; CHECK-NEXT:    %"op16562(op17720)10" = xor i32 -1, %"op17720(op10412, op17720)9"
+; CHECK-NEXT:    %"op17430(op10412, op16562)11" = add i32 %"op10412(op17430)8", %"op16562(op17720)10"
+; CHECK-NEXT:    %"op10412(op17430)12" = add i32 %"op17430(op10412, op16562)11", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)13" = mul i32 %"op10412(op17430)12", %"op17720(op10412, op17720)9"
+; CHECK-NEXT:    %"op16562(op17720)14" = xor i32 -1, %"op17720(op10412, op17720)13"
+; CHECK-NEXT:    %"op17430(op10412, op16562)15" = add i32 %"op10412(op17430)12", %"op16562(op17720)14"
+; CHECK-NEXT:    %"op10412(op17430)16" = add i32 %"op17430(op10412, op16562)15", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)17" = mul i32 %"op10412(op17430)16", %"op17720(op10412, op17720)13"
+; CHECK-NEXT:    %"op16562(op17720)18" = xor i32 -1, %"op17720(op10412, op17720)17"
+; CHECK-NEXT:    %"op17430(op10412, op16562)19" = add i32 %"op10412(op17430)16", %"op16562(op17720)18"
+; CHECK-NEXT:    %"op10412(op17430)20" = add i32 %"op17430(op10412, op16562)19", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)21" = mul i32 %"op10412(op17430)20", %"op17720(op10412, op17720)17"
+; CHECK-NEXT:    %"op16562(op17720)22" = xor i32 -1, %"op17720(op10412, op17720)21"
+; CHECK-NEXT:    %"op17430(op10412, op16562)23" = add i32 %"op10412(op17430)20", %"op16562(op17720)22"
+; CHECK-NEXT:    %"op17645(op10912)24" = add i32 -9, %"op10912(op17645, vl72693)"
+; CHECK-NEXT:    %"op18080(op17430, op17645)" = add i32 %"op17430(op10412, op16562)23", %"op17645(op10912)24"
+; CHECK-NEXT:    %"op17720(op17720, op18080)" = mul i32 %"op17720(op10412, op17720)21", %"op18080(op17430, op17645)"
+; CHECK-NEXT:    %"op16562(op17720)25" = xor i32 -1, %"op17720(op17720, op18080)"
+; CHECK-NEXT:    %"op17430(op16562, op18080)26" = add i32 %"op16562(op17720)25", %"op18080(op17430, op17645)"
+; CHECK-NEXT:    %"op10412(op17430)27" = add i32 %"op17430(op16562, op18080)26", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)28" = mul i32 %"op10412(op17430)27", %"op17720(op17720, op18080)"
+; CHECK-NEXT:    %"op16562(op17720)29" = xor i32 -1, %"op17720(op10412, op17720)28"
+; CHECK-NEXT:    %"op17430(op10412, op16562)30" = add i32 %"op10412(op17430)27", %"op16562(op17720)29"
+; CHECK-NEXT:    %"op10412(op17430)31" = add i32 %"op17430(op10412, op16562)30", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)32" = mul i32 %"op10412(op17430)31", %"op17720(op10412, op17720)28"
+; CHECK-NEXT:    %"op16562(op17720)33" = xor i32 -1, %"op17720(op10412, op17720)32"
+; CHECK-NEXT:    %"op17430(op10412, op16562)34" = add i32 %"op10412(op17430)31", %"op16562(op17720)33"
+; CHECK-NEXT:    %"op10412(op17430)35" = add i32 %"op17430(op10412, op16562)34", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)36" = mul i32 %"op10412(op17430)35", %"op17720(op10412, op17720)32"
+; CHECK-NEXT:    %"op16562(op17720)37" = xor i32 -1, %"op17720(op10412, op17720)36"
+; CHECK-NEXT:    %"op17430(op10412, op16562)38" = add i32 %"op10412(op17430)35", %"op16562(op17720)37"
+; CHECK-NEXT:    %"op10412(op17430)39" = add i32 %"op17430(op10412, op16562)38", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)40" = mul i32 %"op10412(op17430)39", %"op17720(op10412, op17720)36"
+; CHECK-NEXT:    %"op16562(op17720)41" = xor i32 -1, %"op17720(op10412, op17720)40"
+; CHECK-NEXT:    %"op17430(op10412, op16562)42" = add i32 %"op10412(op17430)39", %"op16562(op17720)41"
+; CHECK-NEXT:    %"op17645(op10912)43" = add i32 -14, %"op10912(op17645, vl72693)"
+; CHECK-NEXT:    %"op18080(op17430, op17645)44" = add i32 %"op17430(op10412, op16562)42", %"op17645(op10912)43"
+; CHECK-NEXT:    %"op17720(op17720, op18080)45" = mul i32 %"op17720(op10412, op17720)40", %"op18080(op17430, op17645)44"
+; CHECK-NEXT:    %"op16562(op17720)46" = xor i32 -1, %"op17720(op17720, op18080)45"
+; CHECK-NEXT:    %"op17430(op16562, op18080)47" = add i32 %"op16562(op17720)46", %"op18080(op17430, op17645)44"
+; CHECK-NEXT:    %"op10412(op17430)48" = add i32 %"op17430(op16562, op18080)47", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)49" = mul i32 %"op10412(op17430)48", %"op17720(op17720, op18080)45"
+; CHECK-NEXT:    %"op16562(op17720)50" = xor i32 -1, %"op17720(op10412, op17720)49"
+; CHECK-NEXT:    %"op17430(op10412, op16562)51" = add i32 %"op10412(op17430)48", %"op16562(op17720)50"
+; CHECK-NEXT:    %"op10412(op17430)52" = add i32 %"op17430(op10412, op16562)51", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)53" = mul i32 %"op10412(op17430)52", %"op17720(op10412, op17720)49"
+; CHECK-NEXT:    %"op16562(op17720)54" = xor i32 -1, %"op17720(op10412, op17720)53"
+; CHECK-NEXT:    %"op17430(op10412, op16562)55" = add i32 %"op10412(op17430)52", %"op16562(op17720)54"
+; CHECK-NEXT:    %"op10412(op17430)56" = add i32 %"op17430(op10412, op16562)55", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)57" = mul i32 %"op10412(op17430)56", %"op17720(op10412, op17720)53"
+; CHECK-NEXT:    %"op16562(op17720)58" = xor i32 -1, %"op17720(op10412, op17720)57"
+; CHECK-NEXT:    %"op17430(op10412, op16562)59" = add i32 %"op10412(op17430)56", %"op16562(op17720)58"
+; CHECK-NEXT:    %"op10412(op17430)60" = add i32 %"op17430(op10412, op16562)59", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)61" = mul i32 %"op10412(op17430)60", %"op17720(op10412, op17720)57"
+; CHECK-NEXT:    %"op16562(op17720)62" = xor i32 -1, %"op17720(op10412, op17720)61"
+; CHECK-NEXT:    %"op17430(op10412, op16562)63" = add i32 %"op10412(op17430)60", %"op16562(op17720)62"
+; CHECK-NEXT:    %"op10412(op17430)64" = add i32 %"op17430(op10412, op16562)63", undef
+; CHECK-NEXT:    %"op17720(op10412, op17720)65" = mul i32 %"op10412(op17430)64", %"op17720(op10412, op17720)61"
+; CHECK-NEXT:    %"op16562(op17720)66" = xor i32 -1, %"op17720(op10412, op17720)65"
+; CHECK-NEXT:    %"op17430(op10412, op16562)67" = add i32 %"op10412(op17430)64", %"op16562(op17720)66"
+; CHECK-NEXT:    %"op10412(op17430)68" = add i32 %"op17430(op10412, op16562)67", undef
+; CHECK-NEXT:    %"op10412(op10412)69" = add i32 %"op10412(op17430)68", undef
+; CHECK-NEXT:    %"op17645(op10912)70" = add i32 -21, %"op10912(op17645, vl72693)"
+; CHECK-NEXT:    %"op18080(op10412, op17645)" = add i32 %"op10412(op10412)69", %"op17645(op10912)70"
+; CHECK-NEXT:    store i32 %"op18080(op10412, op17645)", ptr [[A0]], align 4
+; CHECK-NEXT:    br label %[[BB16110]]
+;
+bb:
+  %a = add i32 %1, 1
+  br label %bb1
+
+bb1:                                              ; preds = %bb1, %bb
+  %tmp = phi i32 [ %a, %bb ], [ %tmp87, %bb1 ]
+  %tmp2 = phi i32 [ %a, %bb ], [ %tmp86, %bb1 ]
+  %tmp3 = mul i32 %tmp, undef
+  %tmp4 = xor i32 %tmp3, -1
+  %tmp5 = add i32 %tmp, %tmp4
+  %tmp6 = add i32 %tmp2, -1
+  %tmp7 = add i32 %tmp5, %tmp6
+  %tmp8 = mul i32 %tmp7, %tmp3
+  %tmp9 = xor i32 %tmp8, -1
+  %tmp10 = add i32 %tmp7, %tmp9
+  %tmp11 = add i32 %tmp10, undef
+  %tmp12 = mul i32 %tmp11, %tmp8
+  %tmp13 = xor i32 %tmp12, -1
+  %tmp14 = add i32 %tmp11, %tmp13
+  %tmp15 = add i32 %tmp14, undef
+  %tmp16 = mul i32 %tmp15, %tmp12
+  %tmp17 = add i32 %tmp15, undef
+  %tmp18 = add i32 %tmp17, undef
+  %tmp19 = mul i32 %tmp18, %tmp16
+  %tmp20 = xor i32 %tmp19, -1
+  %tmp21 = add i32 %tmp18, %tmp20
+  %tmp22 = add i32 %tmp21, undef
+  %tmp23 = mul i32 %tmp22, %tmp19
+  %tmp24 = xor i32 %tmp23, -1
+  %tmp25 = add i32 %tmp22, %tmp24
+  %tmp26 = add i32 %tmp25, undef
+  %tmp27 = mul i32 %tmp26, %tmp23
+  %tmp28 = xor i32 %tmp27, -1
+  %tmp29 = add i32 %tmp26, %tmp28
+  %tmp30 = add i32 %tmp29, undef
+  %tmp31 = mul i32 %tmp30, %tmp27
+  %tmp32 = xor i32 %tmp31, -1
+  %tmp33 = add i32 %tmp30, %tmp32
+  %tmp34 = add i32 %tmp33, undef
+  %tmp35 = mul i32 %tmp34, %tmp31
+  %tmp36 = xor i32 %tmp35, -1
+  %tmp37 = add i32 %tmp34, %tmp36
+  %tmp38 = add i32 %tmp2, -9
+  %tmp39 = add i32 %tmp37, %tmp38
+  %tmp40 = mul i32 %tmp39, %tmp35
+  %tmp41 = xor i32 %tmp40, -1
+  %tmp42 = add i32 %tmp39, %tmp41
+  %tmp43 = add i32 %tmp42, undef
+  %tmp44 = mul i32 %tmp43, %tmp40
+  %tmp45 = xor i32 %tmp44, -1
+  %tmp46 = add i32 %tmp43, %tmp45
+  %tmp47 = add i32 %tmp46, undef
+  %tmp48 = mul i32 %tmp47, %tmp44
+  %tmp49 = xor i32 %tmp48, -1
+  %tmp50 = add i32 %tmp47, %tmp49
+  %tmp51 = add i32 %tmp50, undef
+  %tmp52 = mul i32 %tmp51, %tmp48
+  %tmp53 = xor i32 %tmp52, -1
+  %tmp54 = add i32 %tmp51, %tmp53
+  %tmp55 = add i32 %tmp54, undef
+  %tmp56 = mul i32 %tmp55, %tmp52
+  %tmp57 = xor i32 %tmp56, -1
+  %tmp58 = add i32 %tmp55, %tmp57
+  %tmp59 = add i32 %tmp2, -14
+  %tmp60 = add i32 %tmp58, %tmp59
+  %tmp61 = mul i32 %tmp60, %tmp56
+  %tmp62 = xor i32 %tmp61, -1
+  %tmp63 = add i32 %tmp60, %tmp62
+  %tmp64 = add i32 %tmp63, undef
+  %tmp65 = mul i32 %tmp64, %tmp61
+  %tmp66 = xor i32 %tmp65, -1
+  %tmp67 = add i32 %tmp64, %tmp66
+  %tmp68 = add i32 %tmp67, undef
+  %tmp69 = mul i32 %tmp68, %tmp65
+  %tmp70 = xor i32 %tmp69, -1
+  %tmp71 = add i32 %tmp68, %tmp70
+  %tmp72 = add i32 %tmp71, undef
+  %tmp73 = mul i32 %tmp72, %tmp69
+  %tmp74 = xor i32 %tmp73, -1
+  %tmp75 = add i32 %tmp72, %tmp74
+  %tmp76 = add i32 %tmp75, undef
+  %tmp77 = mul i32 %tmp76, %tmp73
+  %tmp78 = xor i32 %tmp77, -1
+  %tmp79 = add i32 %tmp76, %tmp78
+  %tmp80 = add i32 %tmp79, undef
+  %tmp81 = mul i32 %tmp80, %tmp77
+  %tmp82 = xor i32 %tmp81, -1
+  %tmp83 = add i32 %tmp80, %tmp82
+  %tmp84 = add i32 %tmp83, undef
+  %tmp85 = add i32 %tmp84, undef
+  %tmp86 = add i32 %tmp2, -21
+  %tmp87 = add i32 %tmp85, %tmp86
+  store i32 %tmp87, ptr %0
+  br label %bb1
+}

diff  --git a/llvm/test/Transforms/IRNormalizer/reordering-basic.ll b/llvm/test/Transforms/IRNormalizer/reordering-basic.ll
new file mode 100644
index 00000000000000..fd09ce016addbc
--- /dev/null
+++ b/llvm/test/Transforms/IRNormalizer/reordering-basic.ll
@@ -0,0 +1,58 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=normalize < %s | FileCheck %s
+
+define double @foo(double %a0, double %a1) {
+; CHECK-LABEL: define double @foo(
+; CHECK-SAME: double [[A0:%.*]], double [[A1:%.*]]) {
+; CHECK-NEXT:  [[BB17254:.*:]]
+; CHECK-NEXT:    %"vl93562([[A0]], 2.000000e+00)" = fmul double [[A0]], 2.000000e+00
+; CHECK-NEXT:    %"op95858(vl93562)" = fmul double 6.000000e+00, %"vl93562([[A0]], 2.000000e+00)"
+; CHECK-NEXT:    [[A:%.*]] = fmul double [[A0]], [[A1]]
+; CHECK-NEXT:    [[C:%.*]] = fmul double 6.000000e+00, [[A]]
+; CHECK-NEXT:    ret double %"op95858(vl93562)"
+;
+entry:
+  %a = fmul double %a0, %a1
+  %b = fmul double %a0, 2.000000e+00
+  %c = fmul double %a, 6.000000e+00
+  %d = fmul double %b, 6.000000e+00
+  ret double %d
+}
+
+declare double @bir()
+
+declare double @bar()
+
+define double @baz(double %x) {
+; CHECK-LABEL: define double @baz(
+; CHECK-SAME: double [[A0:%.*]]) {
+; CHECK-NEXT:  [[BB76951:.*:]]
+; CHECK-NEXT:    [[IFCOND:%.*]] = fcmp one double [[A0]], 0.000000e+00
+; CHECK-NEXT:    br i1 [[IFCOND]], label %[[BB91455:.*]], label %[[BB914551:.*]]
+; CHECK:       [[BB91455]]:
+; CHECK-NEXT:    %"vl15001bir()" = call double @bir()
+; CHECK-NEXT:    br label %[[BB17254:.*]]
+; CHECK:       [[BB914551]]:
+; CHECK-NEXT:    %"vl69719bar()" = call double @bar()
+; CHECK-NEXT:    br label %[[BB17254]]
+; CHECK:       [[BB17254]]:
+; CHECK-NEXT:    %"op19734(vl15001, vl69719)" = phi double [ %"vl15001bir()", %[[BB91455]] ], [ %"vl69719bar()", %[[BB914551]] ]
+; CHECK-NEXT:    ret double %"op19734(vl15001, vl69719)"
+;
+entry:
+  %ifcond = fcmp one double %x, 0.000000e+00
+  br i1 %ifcond, label %then, label %else
+
+then:       ; preds = %entry
+  %calltmp = call double @bir()
+  br label %ifcont
+
+else:       ; preds = %entry
+  %calltmp1 = call double @bar()
+  br label %ifcont
+
+ifcont:     ; preds = %else, %then
+  %iftmp = phi double [ %calltmp, %then ], [ %calltmp1, %else ]
+  ret double %iftmp
+}
+

diff  --git a/llvm/test/Transforms/IRNormalizer/reordering.ll b/llvm/test/Transforms/IRNormalizer/reordering.ll
new file mode 100644
index 00000000000000..313d44a88e3cb7
--- /dev/null
+++ b/llvm/test/Transforms/IRNormalizer/reordering.ll
@@ -0,0 +1,163 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes=normalize -verify-each -norm-rename-all=false < %s | FileCheck %s
+
+define void @foo() {
+; CHECK-LABEL: define void @foo() {
+; CHECK-NEXT:  bb17254:
+; CHECK-NEXT:    ret void
+;
+  ret void
+}
+
+define void @empty_basic_block() {
+; CHECK-LABEL: define void @empty_basic_block() {
+; CHECK-NEXT:  exit:
+; CHECK-NEXT:    ret void
+;
+exit:
+  ret void
+}
+
+declare void @effecting()
+
+; Place dead instruction(s) before the terminator
+define void @call_effecting() {
+; CHECK-LABEL: define void @call_effecting() {
+; CHECK-NEXT:  bb15160:
+; CHECK-NEXT:    call void @effecting()
+; CHECK-NEXT:    [[TMP0:%.*]] = add i32 0, 1
+; CHECK-NEXT:    ret void
+;
+  %1 = add i32 0, 1
+  call void @effecting()
+  ret void
+}
+
+define void @dont_move_above_phi() {
+; CHECK-LABEL: define void @dont_move_above_phi() {
+; CHECK-NEXT:  bb76951:
+; CHECK-NEXT:    br label [[EXIT:%.*]]
+; CHECK:       exit:
+; CHECK-NEXT:    [[TMP0:%.*]] = phi i32 [ 0, [[BB76951:%.*]] ]
+; CHECK-NEXT:    call void @effecting()
+; CHECK-NEXT:    ret void
+;
+  br label %exit
+exit:
+  %1 = phi i32 [0, %0]
+  call void @effecting()
+  ret void
+}
+
+define void @dont_move_above_alloca() {
+; CHECK-LABEL: define void @dont_move_above_alloca() {
+; CHECK-NEXT:  bb15160:
+; CHECK-NEXT:    [[TMP0:%.*]] = alloca i32, align 4
+; CHECK-NEXT:    call void @effecting()
+; CHECK-NEXT:    ret void
+;
+  %1 = alloca i32
+  call void @effecting()
+  ret void
+}
+
+declare void @effecting1()
+
+define void @dont_reorder_effecting() {
+; CHECK-LABEL: define void @dont_reorder_effecting() {
+; CHECK-NEXT:  bb10075:
+; CHECK-NEXT:    call void @effecting()
+; CHECK-NEXT:    call void @effecting1()
+; CHECK-NEXT:    ret void
+;
+  call void @effecting()
+  call void @effecting1()
+  ret void
+}
+
+declare void @effecting2(i32)
+
+define void @dont_reorder_effecting1() {
+; CHECK-LABEL: define void @dont_reorder_effecting1() {
+; CHECK-NEXT:  bb10075:
+; CHECK-NEXT:    [[ONE:%.*]] = add i32 1, 1
+; CHECK-NEXT:    call void @effecting2(i32 [[ONE]])
+; CHECK-NEXT:    [[TWO:%.*]] = add i32 2, 2
+; CHECK-NEXT:    call void @effecting2(i32 [[TWO]])
+; CHECK-NEXT:    ret void
+;
+  %one = add i32 1, 1
+  %two = add i32 2, 2
+  call void @effecting2(i32 %one)
+  call void @effecting2(i32 %two)
+  ret void
+}
+
+define void @dont_reorder_across_blocks() {
+; CHECK-LABEL: define void @dont_reorder_across_blocks() {
+; CHECK-NEXT:  bb76951:
+; CHECK-NEXT:    [[ONE:%.*]] = add i32 1, 1
+; CHECK-NEXT:    br label [[EXIT:%.*]]
+; CHECK:       exit:
+; CHECK-NEXT:    call void @effecting2(i32 [[ONE]])
+; CHECK-NEXT:    ret void
+;
+  %one = add i32 1, 1
+  br label %exit
+exit:
+  call void @effecting2(i32 %one)
+  ret void
+}
+
+define void @independentldst(ptr %a, ptr %b) {
+; CHECK-LABEL: define void @independentldst(
+; CHECK-SAME: ptr [[A:%.*]], ptr [[B:%.*]]) {
+; CHECK-NEXT:  bb10495:
+; CHECK-NEXT:    %"vl12961([[B]])" = load i32, ptr [[B]], align 4
+; CHECK-NEXT:    store i32 %"vl12961([[B]])", ptr [[A]], align 4
+; CHECK-NEXT:    %"vl89528([[A]])" = load i32, ptr [[A]], align 4
+; CHECK-NEXT:    store i32 %"vl89528([[A]])", ptr [[B]], align 4
+; CHECK-NEXT:    ret void
+;
+  %2 = load i32, ptr %a
+  %3 = load i32, ptr %b
+  store i32 %3, ptr %a
+  store i32 %2, ptr %b
+  ret void
+}
+
+define void @multiple_use_ld(ptr %a, ptr %b) {
+; CHECK-LABEL: define void @multiple_use_ld(
+; CHECK-SAME: ptr [[A:%.*]], ptr [[B:%.*]]) {
+; CHECK-NEXT:  bb14927:
+; CHECK-NEXT:    %"vl16793([[A]])" = load i32, ptr [[A]], align 4
+; CHECK-NEXT:    store i32 %"vl16793([[A]])", ptr [[A]], align 4
+; CHECK-NEXT:    %"vl89528([[B]])" = load i32, ptr [[B]], align 4
+; CHECK-NEXT:    store i32 %"vl89528([[B]])", ptr [[B]], align 4
+; CHECK-NEXT:    store i32 %"vl16793([[A]])", ptr [[A]], align 4
+; CHECK-NEXT:    ret void
+;
+  %2 = load i32, ptr %a
+  store i32 %2, ptr %a
+  %3 = load i32, ptr %b
+  store i32 %3, ptr %b
+  store i32 %2, ptr %a
+  ret void
+}
+
+; This is an incorrect transformation. Moving the store above the load could
+; change the loaded value. To do this, the pointers would need to be `noalias`.
+define void @undef_st(ptr %a, ptr %b) {
+; CHECK-LABEL: define void @undef_st(
+; CHECK-SAME: ptr [[A:%.*]], ptr [[B:%.*]]) {
+; CHECK-NEXT:  bb10495:
+; CHECK-NEXT:    store i32 undef, ptr [[B]], align 4
+; CHECK-NEXT:    %"vl16028([[A]])" = load i32, ptr [[A]], align 4
+; CHECK-NEXT:    store i32 %"vl16028([[A]])", ptr [[A]], align 4
+; CHECK-NEXT:    ret void
+;
+  %2 = load i32, ptr %a
+  store i32 undef, ptr %b
+  store i32 %2, ptr %a
+  ret void
+}