[Mlir-commits] [clang] [clang-tools-extra] [flang] [libcxx] [lldb] [llvm] [mlir] [openmp] [Clang][Sema] Diagnosis for constexpr constructor not initializing a union member (PR #81042)

llvmlistbot at llvm.org llvmlistbot at llvm.org
Thu Feb 8 10:39:34 PST 2024


Timm Bäder <tbaeder at redhat.com>,David Green
 <david.green at arm.com>,Martin Storsjö <martin at martin.st>,Evgeniy
 <evgeniy.tyurin at intel.com>,
Balázs Kéri <balazs.keri at ericsson.com>,Simon Camphausen
 <simon.camphausen at iml.fraunhofer.de>,Jeremy Morse <jeremy.morse at sony.com>,Jeremy
 Morse <jeremy.morse at sony.com>,David Green <david.green at arm.com>,Alex
 Bradbury <asb at igalia.com>,Michael Buch <michaelbuch12 at gmail.com>,Simon
 Pilgrim <llvm-dev at redking.me.uk>,Jeremy Morse <jeremy.morse at sony.com>,Simon
 Pilgrim <llvm-dev at redking.me.uk>,Sergio Afonso <safonsof at amd.com>,Zain
 Jaffal <zain at jjaffal.com>,Zain Jaffal <zain at jjaffal.com>,whisperity
 <whisperity at gmail.com>,Simon Pilgrim <llvm-dev at redking.me.uk>,Jeremy Morse
 <jeremy.morse at sony.com>,agozillon <Andrew.Gozillon at amd.com>,
Martin Storsjö <martin at martin.st>,Mariya Podchishchaeva
 <mariya.podchishchaeva at intel.com>,Uday Bondhugula <uday at polymagelabs.com>,
Timm Bäder <tbaeder at redhat.com>,Nikita Popov
 <npopov at redhat.com>,Yingwei Zheng <dtcxzyw2333 at gmail.com>,
Timm Bäder <tbaeder at redhat.com>,Shilei Tian <i at tianshilei.me>,
Timm Bäder <tbaeder at redhat.com>,Tarun Prabhu
 <tarun at lanl.gov>,Timm Bäder <tbaeder at redhat.com>,Louis
 Dionne <ldionne.2 at gmail.com>,ostannard <oliver.stannard at arm.com>,Daniel Chen
 <cdchen at ca.ibm.com>,Nikita Popov <npopov at redhat.com>,Francesco Petrogalli
 <francesco.petrogalli at apple.com>,erichkeane <ekeane at nvidia.com>,ian Bearman
 <ianb at microsoft.com>,Jeremy Morse <jeremy.morse at sony.com>,Ivan Kosarev
 <ivan.kosarev at amd.com>,Simon Pilgrim <llvm-dev at redking.me.uk>,Simon Pilgrim
 <llvm-dev at redking.me.uk>,Simon Pilgrim <llvm-dev at redking.me.uk>,stephenpeckham
 <118857872+stephenpeckham at users.noreply.github.com>,
Valentin Clement =?utf-8?b?KOODkOODrOODsw=?=,Adrian Prantl
 <aprantl at apple.com>,Adrian Prantl <aprantl at apple.com>,Jeremy Morse
 <jeremy.morse at sony.com>,Jason Molenda <jmolenda at apple.com>,Dave Lee
 <davelee.com at gmail.com>,Simon Pilgrim <llvm-dev at redking.me.uk>,Philip Reames
 <preames at rivosinc.com>,Cooper Partin <coopp at microsoft.com>,S. Bharadwaj
 Yadavalli <Bharadwaj.Yadavalli at microsoft.com>,Valentin Clement =?utf-8?b?KOODkOODrOODsw=?=,Krystian Stasiowski
 <sdkrystian at gmail.com>,Peiming Liu
 <36770114+PeimingLiu at users.noreply.github.com>,Jan Svoboda
 <jan_svoboda at apple.com>,Jeremy Morse <jeremy.morse at sony.com>,Nikolas Klauser
 <nikolasklauser at berlin.de>,Nikolas Klauser <nikolasklauser at berlin.de>,Nikolas
 Klauser <nikolasklauser at berlin.de>,Valentin Clement <clementval at gmail.com>,
Nicolai Hähnle <nicolai.haehnle at amd.com>,mahtohappy
 <Happy.Kumar at windriver.com>
Message-ID:
In-Reply-To: <llvm.org/llvm/llvm-project/pull/81042 at github.com>


https://github.com/mahtohappy updated https://github.com/llvm/llvm-project/pull/81042

>From 9271e67ab27f850413e3d6d6f1383454067efe75 Mon Sep 17 00:00:00 2001
From: mahtohappy <Happy.Kumar at windriver.com>
Date: Wed, 7 Feb 2024 13:29:45 -0800
Subject: [PATCH 01/72] Diagnosis for constexpr constructor not initializing a
 union member

---
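Note (illustration only, not part of the patch): a minimal sketch of the
situation the new diagnostic targets, going by the diagnostic text used in
the tests below. The names Initialized, Uninitialized and readActiveMember
are hypothetical. The first class initializes one variant member and should
be accepted in any constexpr-capable mode; the second leaves the anonymous
union untouched, which is only kept here behind a C++20 guard because
earlier modes are expected to flag it as an extension.

```cpp
#include <cassert>

// Hypothetical example: a class with an anonymous union whose constexpr
// constructor initializes exactly one variant member.
struct Initialized {
  union {
    int i;
    float f;
  };
  constexpr Initialized() : i(42) {}
};

// Reads the active member in a constant expression.
constexpr int readActiveMember() { return Initialized().i; }
static_assert(readActiveMember() == 42, "active member is readable");

#if __cplusplus >= 202002L
// Assumption: this form is the one the patch reports before C++20.
// The constructor initializes no member of the anonymous union.
template <class> struct Uninitialized {
  union {
    int i;
  };
  constexpr Uninitialized() {}
};
#endif
```

The guard keeps the sketch compilable under older -std= settings; dropping
it under -std=c++14 is the case the new tests exercise.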
 clang/lib/Sema/SemaDeclCXX.cpp                | 19 +++++++++++++++++++
 .../CXX/dcl.dcl/dcl.spec/dcl.constexpr/p4.cpp |  3 +++
 .../SemaCXX/constexpr-union-temp-ctor-cxx.cpp | 18 ++++++++++++++++++
 3 files changed, 40 insertions(+)
 create mode 100644 clang/test/SemaCXX/constexpr-union-temp-ctor-cxx.cpp

diff --git a/clang/lib/Sema/SemaDeclCXX.cpp b/clang/lib/Sema/SemaDeclCXX.cpp
index 5adc262cd6bc97..ef4e274389f60b 100644
--- a/clang/lib/Sema/SemaDeclCXX.cpp
+++ b/clang/lib/Sema/SemaDeclCXX.cpp
@@ -2393,6 +2393,25 @@ static bool CheckConstexprFunctionBody(Sema &SemaRef, const FunctionDecl *Dcl,
                                              Kind))
             return false;
       }
+    } else if(!Constructor->isDelegatingConstructor()){
+      for(const Decl* decl : RD->decls()){
+        if(const auto* inner = dyn_cast<CXXRecordDecl>(decl)){
+          if(inner->isUnion()){
+              if (Constructor->getNumCtorInitializers() == 0 &&
+                RD->hasVariantMembers()) {
+              if (Kind == Sema::CheckConstexprKind::Diagnose) {
+                SemaRef.Diag(
+                    Dcl->getLocation(),
+                    SemaRef.getLangOpts().CPlusPlus20
+                        ? diag::warn_cxx17_compat_constexpr_union_ctor_no_init
+                        : diag::ext_constexpr_union_ctor_no_init);
+              } else if (!SemaRef.getLangOpts().CPlusPlus20) {
+                return false;
+              }
+            }
+          }
+        }
+      }
     }
   } else {
     if (ReturnStmts.empty()) {
diff --git a/clang/test/CXX/dcl.dcl/dcl.spec/dcl.constexpr/p4.cpp b/clang/test/CXX/dcl.dcl/dcl.spec/dcl.constexpr/p4.cpp
index f1f677ebfcd341..0d9b4d740a7c11 100644
--- a/clang/test/CXX/dcl.dcl/dcl.spec/dcl.constexpr/p4.cpp
+++ b/clang/test/CXX/dcl.dcl/dcl.spec/dcl.constexpr/p4.cpp
@@ -224,6 +224,9 @@ struct TemplateInit {
   };
   // FIXME: This is ill-formed (no diagnostic required). We should diagnose it.
   constexpr TemplateInit() {} // desired-error {{must initialize all members}}
+#ifndef CXX2A
+  // expected-error at 226 {{constexpr union constructor that does not initialize any member is a C++20 extension}}
+#endif
 };
 template<typename T> struct TemplateInit2 {
   Literal l;
diff --git a/clang/test/SemaCXX/constexpr-union-temp-ctor-cxx.cpp b/clang/test/SemaCXX/constexpr-union-temp-ctor-cxx.cpp
new file mode 100644
index 00000000000000..1300641f28f1c6
--- /dev/null
+++ b/clang/test/SemaCXX/constexpr-union-temp-ctor-cxx.cpp
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -std=c++14 -verify -fcxx-exceptions -Werror=c++14-extensions -Werror=c++20-extensions %s
+
+template <class> struct C {
+    union {
+      int i;
+    };
+    constexpr C() {} // expected-error {{constexpr union constructor that does not initialize any member is a C++20 extension}}
+};
+constexpr C<int> c;
+
+template <class> class D {
+    union {
+      int i;
+    };
+public:
+    constexpr D() {} // expected-error {{constexpr union constructor that does not initialize any member is a C++20 extension}}
+};
+constexpr D<int> d;
\ No newline at end of file

>From a6e1168db911fbb6778f2711f685a0feaf97a9f5 Mon Sep 17 00:00:00 2001
From: mahtohappy <Happy.Kumar at windriver.com>
Date: Thu, 8 Feb 2024 00:55:58 -0800
Subject: [PATCH 02/72] [Clang][Sema] Diagnosis for constexpr constructor not
 initializing a union member

---
 clang/docs/ReleaseNotes.rst                          | 2 +-
 clang/test/CXX/dcl.dcl/dcl.spec/dcl.constexpr/p4.cpp | 2 +-
 clang/test/SemaCXX/constexpr-union-temp-ctor-cxx.cpp | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 1d278fe032d264..46a03b7c91220d 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -150,7 +150,7 @@ Improvements to Clang's diagnostics
 
 - Clang now diagnoses member template declarations with multiple declarators.
 - Clang now diagnoses use of the ``template`` keyword after declarative nested name specifiers.
-- Clang now diagnoses constexpr constructor for not initializing atleast one member
+- Clang now diagnoses constexpr constructor for not initializing atleast one member of union
 - Fixes(`#46689 Constexpr constructor not initializing a union member is not diagnosed`)
 
 Improvements to Clang's time-trace
diff --git a/clang/test/CXX/dcl.dcl/dcl.spec/dcl.constexpr/p4.cpp b/clang/test/CXX/dcl.dcl/dcl.spec/dcl.constexpr/p4.cpp
index 0d9b4d740a7c11..37c9e2c36ad657 100644
--- a/clang/test/CXX/dcl.dcl/dcl.spec/dcl.constexpr/p4.cpp
+++ b/clang/test/CXX/dcl.dcl/dcl.spec/dcl.constexpr/p4.cpp
@@ -225,7 +225,7 @@ struct TemplateInit {
   // FIXME: This is ill-formed (no diagnostic required). We should diagnose it.
   constexpr TemplateInit() {} // desired-error {{must initialize all members}}
 #ifndef CXX2A
-  // expected-error at 226 {{constexpr union constructor that does not initialize any member is a C++20 extension}}
+  // expected-error at -2 {{constexpr union constructor that does not initialize any member is a C++20 extension}}
 #endif
 };
 template<typename T> struct TemplateInit2 {
diff --git a/clang/test/SemaCXX/constexpr-union-temp-ctor-cxx.cpp b/clang/test/SemaCXX/constexpr-union-temp-ctor-cxx.cpp
index 1300641f28f1c6..519e0557a0e393 100644
--- a/clang/test/SemaCXX/constexpr-union-temp-ctor-cxx.cpp
+++ b/clang/test/SemaCXX/constexpr-union-temp-ctor-cxx.cpp
@@ -15,4 +15,4 @@ template <class> class D {
 public:
     constexpr D() {} // expected-error {{constexpr union constructor that does not initialize any member is a C++20 extension}}
 };
-constexpr D<int> d;
\ No newline at end of file
+constexpr D<int> d;

>From ed3e358d13da3e984584400e08e8d3e0b637b426 Mon Sep 17 00:00:00 2001
From: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: Thu, 8 Feb 2024 14:12:39 +0530
Subject: [PATCH 03/72] Reapply "InstCombine: Introduce
 SimplifyDemandedUseFPClass"" (#74056)

This reverts commit ef388334ee5a3584255b9ef5b3fefdb244fa3fd7.

The referenced issue violates the spec for finite-only math solely by
returning a constant infinity. If results and arguments are
interpreted as never violating nofpclass, then any
std::numeric_limits<T>::infinity() result is invalid under
-ffinite-math-only. Without this interpretation, nofpclass loses
most of its utility.
---
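Note (illustration only, not part of the patch): a toy model of the
folding idea, assuming a simplified class bitmask. FPClass, Folded,
foldSingleClass and simplifyReturn are hypothetical names and do not
match LLVM's FPClassTest values or API. The sketch removes the classes
excluded on the return from the classes the value may have; if nothing
remains the result is poison, and if exactly one class with a unique bit
pattern remains the result is that constant.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a floating-point class mask.
enum FPClass : std::uint8_t {
  kNone = 0,
  kNaN = 1 << 0,
  kNegInf = 1 << 1,
  kNegFinite = 1 << 2, // negative normal and subnormal values
  kNegZero = 1 << 3,
  kPosZero = 1 << 4,
  kPosFinite = 1 << 5, // positive normal and subnormal values
  kPosInf = 1 << 6,
  kAll = 0x7F,
};

enum class Folded { Keep, Poison, PosZero, NegZero, PosInf, NegInf };

// Classes that resolve to a single bit pattern fold to that constant;
// an empty mask folds to poison; anything else is left alone.
inline Folded foldSingleClass(unsigned mask) {
  switch (mask) {
  case kPosZero: return Folded::PosZero;
  case kNegZero: return Folded::NegZero;
  case kPosInf:  return Folded::PosInf;
  case kNegInf:  return Folded::NegInf;
  case kNone:    return Folded::Poison;
  default:       return Folded::Keep;
  }
}

// excludedOnReturn models the classes a return attribute rules out;
// knownClasses models the classes the returned value may belong to.
inline Folded simplifyReturn(unsigned excludedOnReturn, unsigned knownClasses) {
  const unsigned demanded = kAll & ~excludedOnReturn;
  return foldSingleClass(demanded & knownClasses);
}
```

Under these assumptions, returning a constant +inf where both infinities
are excluded yields Folded::Poison, and a value known to be either
infinity with only -inf excluded yields Folded::PosInf, which is the
shape of the updated CHECK lines in the test file.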
 llvm/include/llvm/Analysis/ValueTracking.h    |   4 +
 .../InstCombine/InstCombineInternal.h         |   9 +
 .../InstCombineSimplifyDemanded.cpp           | 136 ++++++++++++
 .../InstCombine/InstructionCombining.cpp      |  27 ++-
 .../InstCombine/simplify-demanded-fpclass.ll  | 209 +++++++-----------
 5 files changed, 251 insertions(+), 134 deletions(-)

diff --git a/llvm/include/llvm/Analysis/ValueTracking.h b/llvm/include/llvm/Analysis/ValueTracking.h
index d9287ae9e5e986..06f94f58ae5eff 100644
--- a/llvm/include/llvm/Analysis/ValueTracking.h
+++ b/llvm/include/llvm/Analysis/ValueTracking.h
@@ -248,6 +248,10 @@ struct KnownFPClass {
   /// definitely set or false if the sign bit is definitely unset.
   std::optional<bool> SignBit;
 
+  bool operator==(KnownFPClass Other) const {
+    return KnownFPClasses == Other.KnownFPClasses && SignBit == Other.SignBit;
+  }
+
   /// Return true if it's known this can never be one of the mask entries.
   bool isKnownNever(FPClassTest Mask) const {
     return (KnownFPClasses & Mask) == fcNone;
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineInternal.h b/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
index 97459a8fbd6574..7f6618fc5b737c 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
+++ b/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
@@ -566,6 +566,15 @@ class LLVM_LIBRARY_VISIBILITY InstCombinerImpl final
                                     APInt &PoisonElts, unsigned Depth = 0,
                                     bool AllowMultipleUsers = false) override;
 
+  /// Attempts to replace V with a simpler value based on the demanded
+  /// floating-point classes
+  Value *SimplifyDemandedUseFPClass(Value *V, FPClassTest DemandedMask,
+                                    KnownFPClass &Known, unsigned Depth,
+                                    Instruction *CxtI);
+  bool SimplifyDemandedFPClass(Instruction *I, unsigned Op,
+                               FPClassTest DemandedMask, KnownFPClass &Known,
+                               unsigned Depth = 0);
+
   /// Canonicalize the position of binops relative to shufflevector.
   Instruction *foldVectorBinop(BinaryOperator &Inst);
   Instruction *foldVectorSelect(SelectInst &Sel);
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
index 79873a9b4cbb4c..be6ee9d96d2630 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
@@ -1877,3 +1877,139 @@ Value *InstCombinerImpl::SimplifyDemandedVectorElts(Value *V,
 
   return MadeChange ? I : nullptr;
 }
+
+/// For floating-point classes that resolve to a single bit pattern, return that
+/// value.
+static Constant *getFPClassConstant(Type *Ty, FPClassTest Mask) {
+  switch (Mask) {
+  case fcPosZero:
+    return ConstantFP::getZero(Ty);
+  case fcNegZero:
+    return ConstantFP::getZero(Ty, true);
+  case fcPosInf:
+    return ConstantFP::getInfinity(Ty);
+  case fcNegInf:
+    return ConstantFP::getInfinity(Ty, true);
+  case fcNone:
+    return PoisonValue::get(Ty);
+  default:
+    return nullptr;
+  }
+}
+
+Value *InstCombinerImpl::SimplifyDemandedUseFPClass(
+    Value *V, const FPClassTest DemandedMask, KnownFPClass &Known,
+    unsigned Depth, Instruction *CxtI) {
+  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  Type *VTy = V->getType();
+
+  assert(Known == KnownFPClass() && "expected uninitialized state");
+
+  if (DemandedMask == fcNone)
+    return isa<UndefValue>(V) ? nullptr : PoisonValue::get(VTy);
+
+  if (Depth == MaxAnalysisRecursionDepth)
+    return nullptr;
+
+  Instruction *I = dyn_cast<Instruction>(V);
+  if (!I) {
+    // Handle constants and arguments
+    Known = computeKnownFPClass(V, fcAllFlags, CxtI, Depth + 1);
+    Value *FoldedToConst =
+        getFPClassConstant(VTy, DemandedMask & Known.KnownFPClasses);
+    return FoldedToConst == V ? nullptr : FoldedToConst;
+  }
+
+  if (!I->hasOneUse())
+    return nullptr;
+
+  // TODO: Should account for nofpclass/FastMathFlags on current instruction
+  switch (I->getOpcode()) {
+  case Instruction::FNeg: {
+    if (SimplifyDemandedFPClass(I, 0, llvm::fneg(DemandedMask), Known,
+                                Depth + 1))
+      return I;
+    Known.fneg();
+    break;
+  }
+  case Instruction::Call: {
+    CallInst *CI = cast<CallInst>(I);
+    switch (CI->getIntrinsicID()) {
+    case Intrinsic::fabs:
+      if (SimplifyDemandedFPClass(I, 0, llvm::inverse_fabs(DemandedMask), Known,
+                                  Depth + 1))
+        return I;
+      Known.fabs();
+      break;
+    case Intrinsic::arithmetic_fence:
+      if (SimplifyDemandedFPClass(I, 0, DemandedMask, Known, Depth + 1))
+        return I;
+      break;
+    case Intrinsic::copysign: {
+      // Flip on more potentially demanded classes
+      const FPClassTest DemandedMaskAnySign = llvm::unknown_sign(DemandedMask);
+      if (SimplifyDemandedFPClass(I, 0, DemandedMaskAnySign, Known, Depth + 1))
+        return I;
+
+      if ((DemandedMask & fcPositive) == fcNone) {
+        // Roundabout way of replacing with fneg(fabs)
+        I->setOperand(1, ConstantFP::get(VTy, -1.0));
+        return I;
+      }
+
+      if ((DemandedMask & fcNegative) == fcNone) {
+        // Roundabout way of replacing with fabs
+        I->setOperand(1, ConstantFP::getZero(VTy));
+        return I;
+      }
+
+      KnownFPClass KnownSign =
+          computeKnownFPClass(I->getOperand(1), fcAllFlags, CxtI, Depth + 1);
+      Known.copysign(KnownSign);
+      break;
+    }
+    default:
+      Known = computeKnownFPClass(I, ~DemandedMask, CxtI, Depth + 1);
+      break;
+    }
+
+    break;
+  }
+  case Instruction::Select: {
+    KnownFPClass KnownLHS, KnownRHS;
+    if (SimplifyDemandedFPClass(I, 2, DemandedMask, KnownRHS, Depth + 1) ||
+        SimplifyDemandedFPClass(I, 1, DemandedMask, KnownLHS, Depth + 1))
+      return I;
+
+    if (KnownLHS.isKnownNever(DemandedMask))
+      return I->getOperand(2);
+    if (KnownRHS.isKnownNever(DemandedMask))
+      return I->getOperand(1);
+
+    // TODO: Recognize clamping patterns
+    Known = KnownLHS | KnownRHS;
+    break;
+  }
+  default:
+    Known = computeKnownFPClass(I, ~DemandedMask, CxtI, Depth + 1);
+    break;
+  }
+
+  return getFPClassConstant(VTy, DemandedMask & Known.KnownFPClasses);
+}
+
+bool InstCombinerImpl::SimplifyDemandedFPClass(Instruction *I, unsigned OpNo,
+                                               FPClassTest DemandedMask,
+                                               KnownFPClass &Known,
+                                               unsigned Depth) {
+  Use &U = I->getOperandUse(OpNo);
+  Value *NewVal =
+      SimplifyDemandedUseFPClass(U.get(), DemandedMask, Known, Depth, I);
+  if (!NewVal)
+    return false;
+  if (Instruction *OpInst = dyn_cast<Instruction>(U))
+    salvageDebugInfo(*OpInst);
+
+  replaceUse(U, NewVal);
+  return true;
+}
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 9e8bcbc8e156e2..b1e2262fac4794 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -142,6 +142,12 @@ static cl::opt<unsigned>
 MaxArraySize("instcombine-maxarray-size", cl::init(1024),
              cl::desc("Maximum array size considered when doing a combine"));
 
+// TODO: Remove this option
+static cl::opt<bool> EnableSimplifyDemandedUseFPClass(
+    "instcombine-simplify-demanded-fp-class",
+    cl::desc("Enable demanded floating-point class optimizations"),
+    cl::init(false));
+
 // FIXME: Remove this flag when it is no longer necessary to convert
 // llvm.dbg.declare to avoid inaccurate debug info. Setting this to false
 // increases variable availability at the cost of accuracy. Variables that
@@ -3105,8 +3111,25 @@ Instruction *InstCombinerImpl::visitFree(CallInst &FI, Value *Op) {
 }
 
 Instruction *InstCombinerImpl::visitReturnInst(ReturnInst &RI) {
-  // Nothing for now.
-  return nullptr;
+  if (!EnableSimplifyDemandedUseFPClass)
+    return nullptr;
+
+  Value *RetVal = RI.getReturnValue();
+  if (!RetVal || !AttributeFuncs::isNoFPClassCompatibleType(RetVal->getType()))
+    return nullptr;
+
+  Function *F = RI.getFunction();
+  FPClassTest ReturnClass = F->getAttributes().getRetNoFPClass();
+  if (ReturnClass == fcNone)
+    return nullptr;
+
+  KnownFPClass KnownClass;
+  Value *Simplified =
+      SimplifyDemandedUseFPClass(RetVal, ~ReturnClass, KnownClass, 0, &RI);
+  if (!Simplified)
+    return nullptr;
+
+  return ReturnInst::Create(RI.getContext(), Simplified);
 }
 
 // WARNING: keep in sync with SimplifyCFGOpt::simplifyUnreachable()!
diff --git a/llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll b/llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
index 9817b6e13ca8ae..dd9b71415bd6d9 100644
--- a/llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+++ b/llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 2
-; RUN: opt -S -passes=instcombine < %s | FileCheck %s
+; RUN: opt -S -passes=instcombine -instcombine-simplify-demanded-fp-class < %s | FileCheck %s
 
 declare float @llvm.fabs.f32(float)
 declare float @llvm.copysign.f32(float, float)
@@ -42,7 +42,7 @@ define nofpclass(inf) float @ret_nofpclass_inf_undef() {
 define nofpclass(all) float @ret_nofpclass_all_var(float %arg) {
 ; CHECK-LABEL: define nofpclass(all) float @ret_nofpclass_all_var
 ; CHECK-SAME: (float [[ARG:%.*]]) {
-; CHECK-NEXT:    ret float [[ARG]]
+; CHECK-NEXT:    ret float poison
 ;
   ret float %arg
 }
@@ -51,7 +51,7 @@ define nofpclass(all) float @ret_nofpclass_all_var(float %arg) {
 define nofpclass(all) <2 x float> @ret_nofpclass_all_var_vector(<2 x float> %arg) {
 ; CHECK-LABEL: define nofpclass(all) <2 x float> @ret_nofpclass_all_var_vector
 ; CHECK-SAME: (<2 x float> [[ARG:%.*]]) {
-; CHECK-NEXT:    ret <2 x float> [[ARG]]
+; CHECK-NEXT:    ret <2 x float> poison
 ;
   ret <2 x float> %arg
 }
@@ -65,14 +65,14 @@ define nofpclass(inf) float @ret_nofpclass_inf__0() {
 
 define nofpclass(inf) float @ret_nofpclass_inf__pinf() {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__pinf() {
-; CHECK-NEXT:    ret float 0x7FF0000000000000
+; CHECK-NEXT:    ret float poison
 ;
   ret float 0x7FF0000000000000
 }
 
 define nofpclass(pinf) float @ret_nofpclass_pinf__pinf() {
 ; CHECK-LABEL: define nofpclass(pinf) float @ret_nofpclass_pinf__pinf() {
-; CHECK-NEXT:    ret float 0x7FF0000000000000
+; CHECK-NEXT:    ret float poison
 ;
   ret float 0x7FF0000000000000
 }
@@ -86,7 +86,7 @@ define nofpclass(pinf) float @ret_nofpclass_pinf__ninf() {
 
 define nofpclass(inf) float @ret_nofpclass_inf__ninf() {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__ninf() {
-; CHECK-NEXT:    ret float 0xFFF0000000000000
+; CHECK-NEXT:    ret float poison
 ;
   ret float 0xFFF0000000000000
 }
@@ -106,8 +106,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__select_nofpclass_inf_lhs(i1 %con
 define nofpclass(inf) float @ret_nofpclass_inf__select_nofpclass_arg_only_inf_lhs(i1 %cond, float nofpclass(nan norm zero sub) %x, float %y) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__select_nofpclass_arg_only_inf_lhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float nofpclass(nan zero sub norm) [[X:%.*]], float [[Y:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float [[Y]]
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float [[Y]]
 ;
   %select = select i1 %cond, float %x, float %y
   ret float %select
@@ -117,8 +116,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__select_nofpclass_arg_only_inf_lh
 define nofpclass(inf) float @ret_nofpclass_inf__select_nofpclass_arg_only_inf_rhs(i1 %cond, float %x, float nofpclass(nan norm zero sub) %y) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__select_nofpclass_arg_only_inf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]], float nofpclass(nan zero sub norm) [[Y:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float [[Y]]
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float [[X]]
 ;
   %select = select i1 %cond, float %x, float %y
   ret float %select
@@ -128,8 +126,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__select_nofpclass_arg_only_inf_rh
 define nofpclass(inf) [3 x [2 x float]] @ret_float_array(i1 %cond, [3 x [2 x float]] nofpclass(nan norm zero sub) %x, [3 x [2 x float]] %y) {
 ; CHECK-LABEL: define nofpclass(inf) [3 x [2 x float]] @ret_float_array
 ; CHECK-SAME: (i1 [[COND:%.*]], [3 x [2 x float]] nofpclass(nan zero sub norm) [[X:%.*]], [3 x [2 x float]] [[Y:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], [3 x [2 x float]] [[X]], [3 x [2 x float]] [[Y]]
-; CHECK-NEXT:    ret [3 x [2 x float]] [[SELECT]]
+; CHECK-NEXT:    ret [3 x [2 x float]] [[Y]]
 ;
   %select = select i1 %cond, [3 x [2 x float]] %x, [3 x [2 x float]] %y
   ret [3 x [2 x float ]] %select
@@ -139,8 +136,7 @@ define nofpclass(inf) [3 x [2 x float]] @ret_float_array(i1 %cond, [3 x [2 x flo
 define nofpclass(inf) float @ret_nofpclass_inf__select_pinf_lhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__select_pinf_lhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float 0x7FF0000000000000, float [[X]]
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float [[X]]
 ;
   %select = select i1 %cond, float 0x7FF0000000000000, float %x
   ret float %select
@@ -150,8 +146,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__select_pinf_lhs(i1 %cond, float
 define nofpclass(inf) float @ret_nofpclass_inf__select_pinf_rhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float [[X]]
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
   ret float %select
@@ -161,8 +156,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__select_pinf_rhs(i1 %cond, float
 define nofpclass(inf) float @ret_nofpclass_inf__select_pinf_or_ninf(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__select_pinf_or_ninf
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float 0x7FF0000000000000, float 0xFFF0000000000000
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float poison
 ;
   %select = select i1 %cond, float 0x7FF0000000000000, float 0xFFF0000000000000
   ret float %select
@@ -172,8 +166,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__select_pinf_or_ninf(i1 %cond, fl
 define nofpclass(inf) float @ret_nofpclass_inf__select_ninf_or_pinf(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__select_ninf_or_pinf
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float 0xFFF0000000000000, float 0x7FF0000000000000
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float poison
 ;
   %select = select i1 %cond, float 0xFFF0000000000000, float 0x7FF0000000000000
   ret float %select
@@ -183,8 +176,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__select_ninf_or_pinf(i1 %cond, fl
 define nofpclass(ninf) float @ret_nofpclass_ninf__select_ninf_or_pinf(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(ninf) float @ret_nofpclass_ninf__select_ninf_or_pinf
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float 0xFFF0000000000000, float 0x7FF0000000000000
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float 0x7FF0000000000000
 ;
   %select = select i1 %cond, float 0xFFF0000000000000, float 0x7FF0000000000000
   ret float %select
@@ -194,8 +186,7 @@ define nofpclass(ninf) float @ret_nofpclass_ninf__select_ninf_or_pinf(i1 %cond,
 define nofpclass(pinf) float @ret_nofpclass_pinf__select_ninf_or_pinf(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(pinf) float @ret_nofpclass_pinf__select_ninf_or_pinf
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float 0xFFF0000000000000, float 0x7FF0000000000000
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float 0xFFF0000000000000
 ;
   %select = select i1 %cond, float 0xFFF0000000000000, float 0x7FF0000000000000
   ret float %select
@@ -205,8 +196,7 @@ define nofpclass(pinf) float @ret_nofpclass_pinf__select_ninf_or_pinf(i1 %cond,
 define nofpclass(zero) float @ret_nofpclass_zero__select_pzero_or_nzero(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(zero) float @ret_nofpclass_zero__select_pzero_or_nzero
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float 0.000000e+00, float -0.000000e+00
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float poison
 ;
   %select = select i1 %cond, float 0.0, float -0.0
   ret float %select
@@ -216,8 +206,7 @@ define nofpclass(zero) float @ret_nofpclass_zero__select_pzero_or_nzero(i1 %cond
 define nofpclass(nzero) float @ret_nofpclass_nzero__select_pzero_or_nzero(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(nzero) float @ret_nofpclass_nzero__select_pzero_or_nzero
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float 0.000000e+00, float -0.000000e+00
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float 0.000000e+00
 ;
   %select = select i1 %cond, float 0.0, float -0.0
   ret float %select
@@ -227,8 +216,7 @@ define nofpclass(nzero) float @ret_nofpclass_nzero__select_pzero_or_nzero(i1 %co
 define nofpclass(pzero) float @ret_nofpclass_pzero__select_pzero_or_nzero(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(pzero) float @ret_nofpclass_pzero__select_pzero_or_nzero
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float 0.000000e+00, float -0.000000e+00
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float -0.000000e+00
 ;
   %select = select i1 %cond, float 0.0, float -0.0
   ret float %select
@@ -238,8 +226,7 @@ define nofpclass(pzero) float @ret_nofpclass_pzero__select_pzero_or_nzero(i1 %co
 define nofpclass(inf) <2 x float> @ret_nofpclass_inf__select_pinf_lhs_vector(<2 x i1> %cond, <2 x float> %x) {
 ; CHECK-LABEL: define nofpclass(inf) <2 x float> @ret_nofpclass_inf__select_pinf_lhs_vector
 ; CHECK-SAME: (<2 x i1> [[COND:%.*]], <2 x float> [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select <2 x i1> [[COND]], <2 x float> <float 0x7FF0000000000000, float 0x7FF0000000000000>, <2 x float> [[X]]
-; CHECK-NEXT:    ret <2 x float> [[SELECT]]
+; CHECK-NEXT:    ret <2 x float> [[X]]
 ;
   %select = select <2 x i1> %cond, <2 x float> <float 0x7FF0000000000000, float 0x7FF0000000000000>, <2 x float> %x
   ret <2 x float> %select
@@ -249,8 +236,7 @@ define nofpclass(inf) <2 x float> @ret_nofpclass_inf__select_pinf_lhs_vector(<2
 define nofpclass(inf) <2 x float> @ret_nofpclass_inf__select_pinf_lhs_vector_undef(<2 x i1> %cond, <2 x float> %x) {
 ; CHECK-LABEL: define nofpclass(inf) <2 x float> @ret_nofpclass_inf__select_pinf_lhs_vector_undef
 ; CHECK-SAME: (<2 x i1> [[COND:%.*]], <2 x float> [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select <2 x i1> [[COND]], <2 x float> <float 0x7FF0000000000000, float poison>, <2 x float> [[X]]
-; CHECK-NEXT:    ret <2 x float> [[SELECT]]
+; CHECK-NEXT:    ret <2 x float> [[X]]
 ;
   %select = select <2 x i1> %cond, <2 x float> <float 0x7FF0000000000000, float poison>, <2 x float> %x
   ret <2 x float> %select
@@ -260,8 +246,7 @@ define nofpclass(inf) <2 x float> @ret_nofpclass_inf__select_pinf_lhs_vector_und
 define nofpclass(inf) <2 x float> @ret_nofpclass_inf__select_mixed_inf_lhs_vector(<2 x i1> %cond, <2 x float> %x) {
 ; CHECK-LABEL: define nofpclass(inf) <2 x float> @ret_nofpclass_inf__select_mixed_inf_lhs_vector
 ; CHECK-SAME: (<2 x i1> [[COND:%.*]], <2 x float> [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select <2 x i1> [[COND]], <2 x float> <float 0x7FF0000000000000, float 0xFFF0000000000000>, <2 x float> [[X]]
-; CHECK-NEXT:    ret <2 x float> [[SELECT]]
+; CHECK-NEXT:    ret <2 x float> [[X]]
 ;
   %select = select <2 x i1> %cond, <2 x float> <float 0x7FF0000000000000, float 0xFFF0000000000000>, <2 x float> %x
   ret <2 x float> %select
@@ -327,8 +312,7 @@ define nofpclass(nan) float @ret_nofpclass_nan__select_pinf_rhs(i1 %cond, float
 define nofpclass(inf nan) float @ret_nofpclass_inf_nan__select_chain_inf_nan_0(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(nan inf) float @ret_nofpclass_inf_nan__select_chain_inf_nan_0
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT1:%.*]] = select i1 [[COND]], float 0x7FF0000000000000, float [[X]]
-; CHECK-NEXT:    ret float [[SELECT1]]
+; CHECK-NEXT:    ret float [[X]]
 ;
   %select0 = select i1 %cond, float 0x7FF8000000000000, float %x
   %select1 = select i1 %cond, float 0x7FF0000000000000, float %select0
@@ -338,8 +322,7 @@ define nofpclass(inf nan) float @ret_nofpclass_inf_nan__select_chain_inf_nan_0(i
 define nofpclass(inf nan) float @ret_nofpclass_inf_nan__select_chain_inf_nan_1(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(nan inf) float @ret_nofpclass_inf_nan__select_chain_inf_nan_1
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT1:%.*]] = select i1 [[COND]], float 0x7FF0000000000000, float 0x7FF8000000000000
-; CHECK-NEXT:    ret float [[SELECT1]]
+; CHECK-NEXT:    ret float poison
 ;
   %select0 = select i1 %cond, float %x, float 0x7FF8000000000000
   %select1 = select i1 %cond, float 0x7FF0000000000000, float %select0
@@ -360,8 +343,7 @@ define nofpclass(nan) float @ret_nofpclass_nan__select_chain_inf_nan(i1 %cond, f
 define nofpclass(inf) float @ret_nofpclass_inf__select_chain_inf_nan_0(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__select_chain_inf_nan_0
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT1:%.*]] = select i1 [[COND]], float 0x7FF0000000000000, float [[X]]
-; CHECK-NEXT:    ret float [[SELECT1]]
+; CHECK-NEXT:    ret float [[X]]
 ;
   %select0 = select i1 %cond, float 0x7FF8000000000000, float %x
   %select1 = select i1 %cond, float 0x7FF0000000000000, float %select0
@@ -371,8 +353,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__select_chain_inf_nan_0(i1 %cond,
 define nofpclass(inf) float @ret_nofpclass_inf__select_chain_inf_nan_1(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__select_chain_inf_nan_1
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT1:%.*]] = select i1 [[COND]], float 0x7FF8000000000000, float 0x7FF0000000000000
-; CHECK-NEXT:    ret float [[SELECT1]]
+; CHECK-NEXT:    ret float 0x7FF8000000000000
 ;
   %select0 = select i1 %cond, float 0x7FF8000000000000, float %x
   %select1 = select i1 %cond, float %select0, float 0x7FF0000000000000
@@ -383,8 +364,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__select_chain_inf_nan_1(i1 %cond,
 define nofpclass(inf) float @ret_nofpclass_inf__fabs_select_ninf_rhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__fabs_select_ninf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0xFFF0000000000000
-; CHECK-NEXT:    [[FABS:%.*]] = call float @llvm.fabs.f32(float [[SELECT]])
+; CHECK-NEXT:    [[FABS:%.*]] = call float @llvm.fabs.f32(float [[X]])
 ; CHECK-NEXT:    ret float [[FABS]]
 ;
   %select = select i1 %cond, float %x, float 0xFFF0000000000000
@@ -396,8 +376,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__fabs_select_ninf_rhs(i1 %cond, f
 define nofpclass(inf) float @ret_nofpclass_inf__fabs_select_pinf_rhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__fabs_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[FABS:%.*]] = call float @llvm.fabs.f32(float [[SELECT]])
+; CHECK-NEXT:    [[FABS:%.*]] = call float @llvm.fabs.f32(float [[X]])
 ; CHECK-NEXT:    ret float [[FABS]]
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
@@ -421,8 +400,7 @@ define nofpclass(ninf nnorm nsub nzero) float @ret_nofpclass_no_negatives__fabs_
 define nofpclass(pinf pnorm psub pzero) float @ret_nofpclass_no_positives__fabs_select_pinf_rhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(pinf pzero psub pnorm) float @ret_nofpclass_no_positives__fabs_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[FABS:%.*]] = call float @llvm.fabs.f32(float [[SELECT]])
+; CHECK-NEXT:    [[FABS:%.*]] = call float @llvm.fabs.f32(float [[X]])
 ; CHECK-NEXT:    ret float [[FABS]]
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
@@ -446,9 +424,7 @@ define nofpclass(nan ninf nnorm nsub nzero) float @ret_nofpclass_no_negatives_na
 define nofpclass(nan pinf pnorm psub pzero) float @ret_nofpclass_no_positives_nan__fabs_select_pinf_rhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(nan pinf pzero psub pnorm) float @ret_nofpclass_no_positives_nan__fabs_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[FABS:%.*]] = call float @llvm.fabs.f32(float [[SELECT]])
-; CHECK-NEXT:    ret float [[FABS]]
+; CHECK-NEXT:    ret float poison
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
   %fabs = call float @llvm.fabs.f32(float %select)
@@ -459,8 +435,7 @@ define nofpclass(nan pinf pnorm psub pzero) float @ret_nofpclass_no_positives_na
 define nofpclass(inf) float @ret_nofpclass_inf__fneg_select_ninf_rhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__fneg_select_ninf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0xFFF0000000000000
-; CHECK-NEXT:    [[FNEG:%.*]] = fneg float [[SELECT]]
+; CHECK-NEXT:    [[FNEG:%.*]] = fneg float [[X]]
 ; CHECK-NEXT:    ret float [[FNEG]]
 ;
   %select = select i1 %cond, float %x, float 0xFFF0000000000000
@@ -472,8 +447,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__fneg_select_ninf_rhs(i1 %cond, f
 define nofpclass(inf nnorm nsub nzero) float @ret_nofpclass_nonegatives_noinf___fneg_select_pinf_rhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf nzero nsub nnorm) float @ret_nofpclass_nonegatives_noinf___fneg_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[FNEG:%.*]] = fneg float [[SELECT]]
+; CHECK-NEXT:    [[FNEG:%.*]] = fneg float [[X]]
 ; CHECK-NEXT:    ret float [[FNEG]]
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
@@ -485,8 +459,7 @@ define nofpclass(inf nnorm nsub nzero) float @ret_nofpclass_nonegatives_noinf___
 define nofpclass(inf nnorm nsub nzero) float @ret_nofpclass_nonegatives_noinf___fneg_select_ninf_lhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf nzero nsub nnorm) float @ret_nofpclass_nonegatives_noinf___fneg_select_ninf_lhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float 0xFFF0000000000000, float [[X]]
-; CHECK-NEXT:    [[FNEG:%.*]] = fneg float [[SELECT]]
+; CHECK-NEXT:    [[FNEG:%.*]] = fneg float [[X]]
 ; CHECK-NEXT:    ret float [[FNEG]]
 ;
   %select = select i1 %cond, float 0xFFF0000000000000, float %x
@@ -510,8 +483,7 @@ define nofpclass(pzero psub pnorm pinf) float @ret_nofpclass_nopositives___fneg_
 define nofpclass(inf) float @ret_nofpclass_inf__fneg_fabs_select_pinf_rhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__fneg_fabs_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[FABS:%.*]] = call float @llvm.fabs.f32(float [[SELECT]])
+; CHECK-NEXT:    [[FABS:%.*]] = call float @llvm.fabs.f32(float [[X]])
 ; CHECK-NEXT:    [[FNEG:%.*]] = fneg float [[FABS]]
 ; CHECK-NEXT:    ret float [[FNEG]]
 ;
@@ -525,8 +497,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__fneg_fabs_select_pinf_rhs(i1 %co
 define nofpclass(ninf nnorm nsub nzero) float @ret_nofpclass_nonegatives__fneg_fabs_select_pinf_rhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(ninf nzero nsub nnorm) float @ret_nofpclass_nonegatives__fneg_fabs_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[FABS:%.*]] = call float @llvm.fabs.f32(float [[SELECT]])
+; CHECK-NEXT:    [[FABS:%.*]] = call float @llvm.fabs.f32(float [[X]])
 ; CHECK-NEXT:    [[FNEG:%.*]] = fneg float [[FABS]]
 ; CHECK-NEXT:    ret float [[FNEG]]
 ;
@@ -541,10 +512,7 @@ define nofpclass(ninf nnorm nsub nzero) float @ret_nofpclass_nonegatives__fneg_f
 define nofpclass(nan ninf nnorm nsub nzero) float @ret_nofpclass_nonegatives_nonan__fneg_fabs_select_pinf_rhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(nan ninf nzero nsub nnorm) float @ret_nofpclass_nonegatives_nonan__fneg_fabs_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[FABS:%.*]] = call float @llvm.fabs.f32(float [[SELECT]])
-; CHECK-NEXT:    [[FNEG:%.*]] = fneg float [[FABS]]
-; CHECK-NEXT:    ret float [[FNEG]]
+; CHECK-NEXT:    ret float poison
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
   %fabs = call float @llvm.fabs.f32(float %select)
@@ -556,8 +524,7 @@ define nofpclass(nan ninf nnorm nsub nzero) float @ret_nofpclass_nonegatives_non
 define nofpclass(inf) float @ret_nofpclass_inf__copysign_unknown_select_pinf_rhs(i1 %cond, float %x, float %unknown.sign) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__copysign_unknown_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]], float [[UNKNOWN_SIGN:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.copysign.f32(float [[SELECT]], float [[UNKNOWN_SIGN]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.copysign.f32(float [[X]], float [[UNKNOWN_SIGN]])
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
@@ -568,8 +535,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__copysign_unknown_select_pinf_rhs
 define nofpclass(inf) float @ret_nofpclass_inf__copysign_positive_select_pinf_rhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__copysign_positive_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.fabs.f32(float [[SELECT]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.fabs.f32(float [[X]])
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
@@ -580,8 +546,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__copysign_positive_select_pinf_rh
 define nofpclass(inf) float @ret_nofpclass_inf__copysign_negative_select_pinf_rhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__copysign_negative_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[TMP1:%.*]] = call float @llvm.fabs.f32(float [[SELECT]])
+; CHECK-NEXT:    [[TMP1:%.*]] = call float @llvm.fabs.f32(float [[X]])
 ; CHECK-NEXT:    [[COPYSIGN:%.*]] = fneg float [[TMP1]]
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
@@ -594,7 +559,8 @@ define nofpclass(inf) float @ret_nofpclass_inf__copysign_negative_select_pinf_rh
 define nofpclass(pinf pnorm psub pzero) float @ret_nofpclass_nopositives_copysign(float %x, float %unknown.sign) {
 ; CHECK-LABEL: define nofpclass(pinf pzero psub pnorm) float @ret_nofpclass_nopositives_copysign
 ; CHECK-SAME: (float [[X:%.*]], float [[UNKNOWN_SIGN:%.*]]) {
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.copysign.f32(float [[X]], float [[UNKNOWN_SIGN]])
+; CHECK-NEXT:    [[TMP1:%.*]] = call float @llvm.fabs.f32(float [[X]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = fneg float [[TMP1]]
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %copysign = call float @llvm.copysign.f32(float %x, float %unknown.sign)
@@ -605,7 +571,8 @@ define nofpclass(pinf pnorm psub pzero) float @ret_nofpclass_nopositives_copysig
 define nofpclass(pinf pnorm psub pzero) float @ret_nofpclass_nopositives_copysign_nnan_flag(float %x, float %unknown.sign) {
 ; CHECK-LABEL: define nofpclass(pinf pzero psub pnorm) float @ret_nofpclass_nopositives_copysign_nnan_flag
 ; CHECK-SAME: (float [[X:%.*]], float [[UNKNOWN_SIGN:%.*]]) {
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call nnan float @llvm.copysign.f32(float [[X]], float [[UNKNOWN_SIGN]])
+; CHECK-NEXT:    [[TMP1:%.*]] = call nnan float @llvm.fabs.f32(float [[X]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = fneg nnan float [[TMP1]]
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %copysign = call nnan float @llvm.copysign.f32(float %x, float %unknown.sign)
@@ -616,7 +583,8 @@ define nofpclass(pinf pnorm psub pzero) float @ret_nofpclass_nopositives_copysig
 define nofpclass(nan pinf pnorm psub pzero) float @ret_nofpclass_nopositives_nonan_copysign(float %x, float %unknown.sign) {
 ; CHECK-LABEL: define nofpclass(nan pinf pzero psub pnorm) float @ret_nofpclass_nopositives_nonan_copysign
 ; CHECK-SAME: (float [[X:%.*]], float [[UNKNOWN_SIGN:%.*]]) {
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.copysign.f32(float [[X]], float [[UNKNOWN_SIGN]])
+; CHECK-NEXT:    [[TMP1:%.*]] = call float @llvm.fabs.f32(float [[X]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = fneg float [[TMP1]]
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %copysign = call float @llvm.copysign.f32(float %x, float %unknown.sign)
@@ -627,7 +595,7 @@ define nofpclass(nan pinf pnorm psub pzero) float @ret_nofpclass_nopositives_non
 define nofpclass(ninf nnorm nsub nzero) float @ret_nofpclass_nonegatives_copysign(float %x, float %unknown.sign) {
 ; CHECK-LABEL: define nofpclass(ninf nzero nsub nnorm) float @ret_nofpclass_nonegatives_copysign
 ; CHECK-SAME: (float [[X:%.*]], float [[UNKNOWN_SIGN:%.*]]) {
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.copysign.f32(float [[X]], float [[UNKNOWN_SIGN]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.fabs.f32(float [[X]])
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %copysign = call float @llvm.copysign.f32(float %x, float %unknown.sign)
@@ -638,7 +606,7 @@ define nofpclass(ninf nnorm nsub nzero) float @ret_nofpclass_nonegatives_copysig
 define nofpclass(ninf nnorm nsub nzero) float @ret_nofpclass_nonegatives_copysign_nnan_flag(float %x, float %unknown.sign) {
 ; CHECK-LABEL: define nofpclass(ninf nzero nsub nnorm) float @ret_nofpclass_nonegatives_copysign_nnan_flag
 ; CHECK-SAME: (float [[X:%.*]], float [[UNKNOWN_SIGN:%.*]]) {
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call nnan float @llvm.copysign.f32(float [[X]], float [[UNKNOWN_SIGN]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = call nnan float @llvm.fabs.f32(float [[X]])
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %copysign = call nnan float @llvm.copysign.f32(float %x, float %unknown.sign)
@@ -649,7 +617,7 @@ define nofpclass(ninf nnorm nsub nzero) float @ret_nofpclass_nonegatives_copysig
 define nofpclass(nan ninf nnorm nsub nzero) float @ret_nofpclass_nonegatives_nonan_copysign(float %x, float %unknown.sign) {
 ; CHECK-LABEL: define nofpclass(nan ninf nzero nsub nnorm) float @ret_nofpclass_nonegatives_nonan_copysign
 ; CHECK-SAME: (float [[X:%.*]], float [[UNKNOWN_SIGN:%.*]]) {
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.copysign.f32(float [[X]], float [[UNKNOWN_SIGN]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.fabs.f32(float [[X]])
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %copysign = call float @llvm.copysign.f32(float %x, float %unknown.sign)
@@ -659,8 +627,7 @@ define nofpclass(nan ninf nnorm nsub nzero) float @ret_nofpclass_nonegatives_non
 define nofpclass(pinf pnorm psub pzero) float @ret_nofpclass_nopositives__copysign_fabs_select_pinf_rhs(i1 %cond, float %x, float %sign) {
 ; CHECK-LABEL: define nofpclass(pinf pzero psub pnorm) float @ret_nofpclass_nopositives__copysign_fabs_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]], float [[SIGN:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.fabs.f32(float [[SELECT]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.fabs.f32(float [[X]])
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
@@ -673,8 +640,7 @@ define nofpclass(pinf pnorm psub pzero) float @ret_nofpclass_nopositives__copysi
 define nofpclass(inf nnorm nsub nzero) float @ret_nofpclass_no_negatives_noinf__copysign_unknown_select_pinf_rhs(i1 %cond, float %x, float %unknown.sign) {
 ; CHECK-LABEL: define nofpclass(inf nzero nsub nnorm) float @ret_nofpclass_no_negatives_noinf__copysign_unknown_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]], float [[UNKNOWN_SIGN:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.copysign.f32(float [[SELECT]], float [[UNKNOWN_SIGN]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.fabs.f32(float [[X]])
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
@@ -686,8 +652,8 @@ define nofpclass(inf nnorm nsub nzero) float @ret_nofpclass_no_negatives_noinf__
 define nofpclass(inf pnorm psub pzero) float @ret_nofpclass_no_positives_noinf__copysign_unknown_select_pinf_rhs(i1 %cond, float %x, float %unknown.sign) {
 ; CHECK-LABEL: define nofpclass(inf pzero psub pnorm) float @ret_nofpclass_no_positives_noinf__copysign_unknown_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]], float [[UNKNOWN_SIGN:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.copysign.f32(float [[SELECT]], float [[UNKNOWN_SIGN]])
+; CHECK-NEXT:    [[TMP1:%.*]] = call float @llvm.fabs.f32(float [[X]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = fneg float [[TMP1]]
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
@@ -700,7 +666,7 @@ define nofpclass(ninf nnorm nsub nzero) float @ret_nofpclass_no_negatives__copys
 ; CHECK-LABEL: define nofpclass(ninf nzero nsub nnorm) float @ret_nofpclass_no_negatives__copysign_unknown_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]], float [[UNKNOWN_SIGN:%.*]]) {
 ; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.copysign.f32(float [[SELECT]], float [[UNKNOWN_SIGN]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.fabs.f32(float [[SELECT]])
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
@@ -713,7 +679,8 @@ define nofpclass(pinf pnorm psub pzero) float @ret_nofpclass_no_positives__copys
 ; CHECK-LABEL: define nofpclass(pinf pzero psub pnorm) float @ret_nofpclass_no_positives__copysign_unknown_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]], float [[UNKNOWN_SIGN:%.*]]) {
 ; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.copysign.f32(float [[SELECT]], float [[UNKNOWN_SIGN]])
+; CHECK-NEXT:    [[TMP1:%.*]] = call float @llvm.fabs.f32(float [[SELECT]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = fneg float [[TMP1]]
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
@@ -726,7 +693,7 @@ define nofpclass(nan ninf nnorm nsub nzero) float @ret_nofpclass_no_negatives_no
 ; CHECK-LABEL: define nofpclass(nan ninf nzero nsub nnorm) float @ret_nofpclass_no_negatives_nonan__copysign_unknown_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]], float [[UNKNOWN_SIGN:%.*]]) {
 ; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.copysign.f32(float [[SELECT]], float [[UNKNOWN_SIGN]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.fabs.f32(float [[SELECT]])
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
@@ -739,7 +706,8 @@ define nofpclass(nan pinf pnorm psub pzero) float @ret_nofpclass_no_positives_no
 ; CHECK-LABEL: define nofpclass(nan pinf pzero psub pnorm) float @ret_nofpclass_no_positives_nonan__copysign_unknown_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]], float [[UNKNOWN_SIGN:%.*]]) {
 ; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[COPYSIGN:%.*]] = call float @llvm.copysign.f32(float [[SELECT]], float [[UNKNOWN_SIGN]])
+; CHECK-NEXT:    [[TMP1:%.*]] = call float @llvm.fabs.f32(float [[SELECT]])
+; CHECK-NEXT:    [[COPYSIGN:%.*]] = fneg float [[TMP1]]
 ; CHECK-NEXT:    ret float [[COPYSIGN]]
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000
@@ -790,9 +758,7 @@ define nofpclass(nan ninf nnorm nsub nzero) float @ret_nofpclass_nan_negatives__
 define nofpclass(nan ninf nnorm nsub zero) float @ret_nofpclass_nan_negatives_zero__select_clamp_pos_to_zero(float %x) {
 ; CHECK-LABEL: define nofpclass(nan ninf zero nsub nnorm) float @ret_nofpclass_nan_negatives_zero__select_clamp_pos_to_zero
 ; CHECK-SAME: (float [[X:%.*]]) {
-; CHECK-NEXT:    [[IS_GT_ZERO:%.*]] = fcmp ogt float [[X]], 0.000000e+00
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[IS_GT_ZERO]], float 0.000000e+00, float [[X]]
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float [[X]]
 ;
   %is.gt.zero = fcmp ogt float %x, 0.0
   %select = select i1 %is.gt.zero, float 0.0, float %x
@@ -803,9 +769,7 @@ define nofpclass(nan ninf nnorm nsub zero) float @ret_nofpclass_nan_negatives_ze
 define nofpclass(ninf nnorm nsub zero) float @ret_nofpclass_negatives_zero__select_clamp_pos_to_zero(float %x) {
 ; CHECK-LABEL: define nofpclass(ninf zero nsub nnorm) float @ret_nofpclass_negatives_zero__select_clamp_pos_to_zero
 ; CHECK-SAME: (float [[X:%.*]]) {
-; CHECK-NEXT:    [[IS_GT_ZERO:%.*]] = fcmp ogt float [[X]], 0.000000e+00
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[IS_GT_ZERO]], float 0.000000e+00, float [[X]]
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float [[X]]
 ;
   %is.gt.zero = fcmp ogt float %x, 0.0
   %select = select i1 %is.gt.zero, float 0.0, float %x
@@ -819,8 +783,7 @@ define nofpclass(inf) float @ret_nofpclass_noinfs__assumed_isinf__select_pinf_lh
 ; CHECK-NEXT:    [[FABS_X:%.*]] = call float @llvm.fabs.f32(float [[X]])
 ; CHECK-NEXT:    [[X_IS_INF:%.*]] = fcmp oeq float [[FABS_X]], 0x7FF0000000000000
 ; CHECK-NEXT:    call void @llvm.assume(i1 [[X_IS_INF]])
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float [[Y]]
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float [[Y]]
 ;
   %fabs.x = call float @llvm.fabs.f32(float %x)
   %x.is.inf = fcmp oeq float %fabs.x, 0x7FF0000000000000
@@ -838,18 +801,13 @@ define nofpclass(nan inf nzero nsub nnorm) float @powr_issue64870(float nofpclas
 ; CHECK-NEXT:    [[I1:%.*]] = tail call float @llvm.log2.f32(float [[I]])
 ; CHECK-NEXT:    [[I2:%.*]] = fmul float [[I1]], [[Y]]
 ; CHECK-NEXT:    [[I3:%.*]] = tail call nofpclass(ninf nzero nsub nnorm) float @llvm.exp2.f32(float [[I2]])
-; CHECK-NEXT:    [[I4:%.*]] = fcmp olt float [[Y]], 0.000000e+00
-; CHECK-NEXT:    [[I5:%.*]] = select i1 [[I4]], float 0x7FF0000000000000, float 0.000000e+00
 ; CHECK-NEXT:    [[I6:%.*]] = fcmp oeq float [[X]], 0.000000e+00
-; CHECK-NEXT:    [[I7:%.*]] = select i1 [[I6]], float [[I5]], float [[I3]]
+; CHECK-NEXT:    [[I7:%.*]] = select i1 [[I6]], float 0.000000e+00, float [[I3]]
 ; CHECK-NEXT:    [[I8:%.*]] = fcmp oeq float [[Y]], 0.000000e+00
-; CHECK-NEXT:    [[I9:%.*]] = select i1 [[I6]], float 0x7FF8000000000000, float 1.000000e+00
-; CHECK-NEXT:    [[I10:%.*]] = select i1 [[I8]], float [[I9]], float [[I7]]
 ; CHECK-NEXT:    [[I11:%.*]] = fcmp oeq float [[X]], 1.000000e+00
-; CHECK-NEXT:    [[I12:%.*]] = select i1 [[I11]], float 1.000000e+00, float [[I10]]
-; CHECK-NEXT:    [[I13:%.*]] = fcmp olt float [[X]], 0.000000e+00
-; CHECK-NEXT:    [[I14:%.*]] = select i1 [[I13]], float 0x7FF8000000000000, float [[I12]]
-; CHECK-NEXT:    ret float [[I14]]
+; CHECK-NEXT:    [[TMP0:%.*]] = select i1 [[I11]], i1 true, i1 [[I8]]
+; CHECK-NEXT:    [[I12:%.*]] = select i1 [[TMP0]], float 1.000000e+00, float [[I7]]
+; CHECK-NEXT:    ret float [[I12]]
 ;
 entry:
   %i = tail call float @llvm.fabs.f32(float %x)
@@ -881,12 +839,8 @@ define nofpclass(nan inf nzero nsub nnorm) float @test_powr_issue64870_2(float n
 ; CHECK-NEXT:    [[I4:%.*]] = select i1 [[I]], float 0x7FF8000000000000, float [[ARG1]]
 ; CHECK-NEXT:    [[I5:%.*]] = fmul float [[I4]], [[I3]]
 ; CHECK-NEXT:    [[I6:%.*]] = tail call noundef nofpclass(ninf nzero nsub nnorm) float @llvm.exp2.f32(float noundef [[I5]])
-; CHECK-NEXT:    [[I7:%.*]] = fcmp olt float [[I4]], 0.000000e+00
-; CHECK-NEXT:    [[I8:%.*]] = select i1 [[I7]], float 0x7FF0000000000000, float 0.000000e+00
-; CHECK-NEXT:    [[I9:%.*]] = fcmp ueq float [[I4]], 0.000000e+00
 ; CHECK-NEXT:    [[I10:%.*]] = fcmp oeq float [[I2]], 0.000000e+00
-; CHECK-NEXT:    [[I11:%.*]] = select i1 [[I9]], float 0x7FF8000000000000, float [[I8]]
-; CHECK-NEXT:    [[I12:%.*]] = select i1 [[I10]], float [[I11]], float [[I6]]
+; CHECK-NEXT:    [[I12:%.*]] = select i1 [[I10]], float 0.000000e+00, float [[I6]]
 ; CHECK-NEXT:    ret float [[I12]]
 ;
 bb:
@@ -923,16 +877,10 @@ define nofpclass(nan inf) float @pow_f32(float nofpclass(nan inf) %arg, float no
 ; CHECK-NEXT:    [[I11:%.*]] = and i1 [[I7]], [[I10]]
 ; CHECK-NEXT:    [[I12:%.*]] = select i1 [[I11]], float [[ARG]], float 1.000000e+00
 ; CHECK-NEXT:    [[I13:%.*]] = tail call noundef float @llvm.copysign.f32(float noundef [[I4]], float noundef [[I12]])
-; CHECK-NEXT:    [[I14:%.*]] = fcmp olt float [[ARG]], 0.000000e+00
-; CHECK-NEXT:    [[I15:%.*]] = select i1 [[I7]], float [[I13]], float 0x7FF8000000000000
-; CHECK-NEXT:    [[I16:%.*]] = select i1 [[I14]], float [[I15]], float [[I13]]
 ; CHECK-NEXT:    [[I17:%.*]] = fcmp oeq float [[ARG]], 0.000000e+00
-; CHECK-NEXT:    [[I18:%.*]] = fcmp olt float [[ARG1]], 0.000000e+00
-; CHECK-NEXT:    [[I19:%.*]] = xor i1 [[I17]], [[I18]]
-; CHECK-NEXT:    [[I20:%.*]] = select i1 [[I19]], float 0.000000e+00, float 0x7FF0000000000000
 ; CHECK-NEXT:    [[I21:%.*]] = select i1 [[I11]], float [[ARG]], float 0.000000e+00
-; CHECK-NEXT:    [[I22:%.*]] = tail call noundef nofpclass(nan sub norm) float @llvm.copysign.f32(float noundef [[I20]], float noundef [[I21]])
-; CHECK-NEXT:    [[I23:%.*]] = select i1 [[I17]], float [[I22]], float [[I16]]
+; CHECK-NEXT:    [[I22:%.*]] = tail call noundef nofpclass(nan sub norm) float @llvm.copysign.f32(float noundef 0.000000e+00, float noundef [[I21]])
+; CHECK-NEXT:    [[I23:%.*]] = select i1 [[I17]], float [[I22]], float [[I13]]
 ; CHECK-NEXT:    [[I24:%.*]] = fcmp oeq float [[ARG]], 1.000000e+00
 ; CHECK-NEXT:    [[I25:%.*]] = fcmp oeq float [[ARG1]], 0.000000e+00
 ; CHECK-NEXT:    [[I26:%.*]] = or i1 [[I24]], [[I25]]
@@ -977,8 +925,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__select_nofpclass_call_only_inf(i
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__select_nofpclass_call_only_inf
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[Y:%.*]]) {
 ; CHECK-NEXT:    [[MUST_BE_INF:%.*]] = call nofpclass(nan zero sub norm) float @extern()
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[MUST_BE_INF]], float [[Y]]
-; CHECK-NEXT:    ret float [[SELECT]]
+; CHECK-NEXT:    ret float [[Y]]
 ;
   %must.be.inf = call nofpclass(nan norm zero sub) float @extern()
   %select = select i1 %cond, float %must.be.inf, float %y
@@ -989,7 +936,7 @@ define nofpclass(pinf) float @ret_nofpclass_pinf__nofpclass_call_only_inf(i1 %co
 ; CHECK-LABEL: define nofpclass(pinf) float @ret_nofpclass_pinf__nofpclass_call_only_inf
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[Y:%.*]]) {
 ; CHECK-NEXT:    [[MUST_BE_INF:%.*]] = call nofpclass(nan zero sub norm) float @extern()
-; CHECK-NEXT:    ret float [[MUST_BE_INF]]
+; CHECK-NEXT:    ret float 0xFFF0000000000000
 ;
   %must.be.inf = call nofpclass(nan norm zero sub) float @extern()
   ret float %must.be.inf
@@ -999,7 +946,7 @@ define nofpclass(ninf) float @ret_nofpclass_ninf__nofpclass_call_only_inf(i1 %co
 ; CHECK-LABEL: define nofpclass(ninf) float @ret_nofpclass_ninf__nofpclass_call_only_inf
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[Y:%.*]]) {
 ; CHECK-NEXT:    [[MUST_BE_INF:%.*]] = call nofpclass(nan zero sub norm) float @extern()
-; CHECK-NEXT:    ret float [[MUST_BE_INF]]
+; CHECK-NEXT:    ret float 0x7FF0000000000000
 ;
   %must.be.inf = call nofpclass(nan norm zero sub) float @extern()
   ret float %must.be.inf
@@ -1009,7 +956,7 @@ define nofpclass(nzero) float @ret_nofpclass_nzero__nofpclass_call_only_zero(i1
 ; CHECK-LABEL: define nofpclass(nzero) float @ret_nofpclass_nzero__nofpclass_call_only_zero
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[Y:%.*]]) {
 ; CHECK-NEXT:    [[MUST_BE_ZERO:%.*]] = call nofpclass(nan inf sub norm) float @extern()
-; CHECK-NEXT:    ret float [[MUST_BE_ZERO]]
+; CHECK-NEXT:    ret float 0.000000e+00
 ;
   %must.be.zero = call nofpclass(nan sub norm inf) float @extern()
   ret float %must.be.zero
@@ -1019,7 +966,7 @@ define nofpclass(pzero) float @ret_nofpclass_pzero__nofpclass_call_only_zero(i1
 ; CHECK-LABEL: define nofpclass(pzero) float @ret_nofpclass_pzero__nofpclass_call_only_zero
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[Y:%.*]]) {
 ; CHECK-NEXT:    [[MUST_BE_ZERO:%.*]] = call nofpclass(nan inf sub norm) float @extern()
-; CHECK-NEXT:    ret float [[MUST_BE_ZERO]]
+; CHECK-NEXT:    ret float -0.000000e+00
 ;
   %must.be.zero = call nofpclass(nan sub norm inf) float @extern()
   ret float %must.be.zero
@@ -1133,8 +1080,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__recursive_phi_0(i1 %cond0, float
 ; CHECK-NEXT:    [[LOOP_COND:%.*]] = call i1 @loop.cond()
 ; CHECK-NEXT:    br i1 [[LOOP_COND]], label [[RET]], label [[LOOP]]
 ; CHECK:       ret:
-; CHECK-NEXT:    [[PHI_RET:%.*]] = phi float [ 0.000000e+00, [[ENTRY:%.*]] ], [ 0x7FF0000000000000, [[LOOP]] ]
-; CHECK-NEXT:    ret float [[PHI_RET]]
+; CHECK-NEXT:    ret float 0.000000e+00
 ;
 entry:
   br i1 %cond0, label %loop, label %ret
@@ -1159,7 +1105,7 @@ define nofpclass(inf) float @ret_nofpclass_inf__recursive_phi_1(i1 %cond0, float
 ; CHECK-NEXT:    [[LOOP_COND:%.*]] = call i1 @loop.cond()
 ; CHECK-NEXT:    br i1 [[LOOP_COND]], label [[RET]], label [[LOOP]]
 ; CHECK:       ret:
-; CHECK-NEXT:    ret float 0x7FF0000000000000
+; CHECK-NEXT:    ret float poison
 ;
 entry:
   br i1 %cond0, label %loop, label %ret
@@ -1180,8 +1126,8 @@ define nofpclass(inf) float @ret_nofpclass_inf__phi_switch_repeated_predecessor(
 ; CHECK-SAME: (i32 [[SWITCH:%.*]], float [[UNKNOWN:%.*]]) {
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    switch i32 [[SWITCH]], label [[RET:%.*]] [
-; CHECK-NEXT:    i32 0, label [[LOOP:%.*]]
-; CHECK-NEXT:    i32 1, label [[LOOP]]
+; CHECK-NEXT:      i32 0, label [[LOOP:%.*]]
+; CHECK-NEXT:      i32 1, label [[LOOP]]
 ; CHECK-NEXT:    ]
 ; CHECK:       loop:
 ; CHECK-NEXT:    [[PHI_LOOP:%.*]] = phi float [ 0x7FF0000000000000, [[ENTRY:%.*]] ], [ 0x7FF0000000000000, [[ENTRY]] ], [ [[UNKNOWN]], [[LOOP]] ]
@@ -1211,8 +1157,7 @@ ret:
 define nofpclass(inf) float @ret_nofpclass_inf__arithmetic_fence_select_pinf_rhs(i1 %cond, float %x) {
 ; CHECK-LABEL: define nofpclass(inf) float @ret_nofpclass_inf__arithmetic_fence_select_pinf_rhs
 ; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
-; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[COND]], float [[X]], float 0x7FF0000000000000
-; CHECK-NEXT:    [[FENCE:%.*]] = call float @llvm.arithmetic.fence.f32(float [[SELECT]])
+; CHECK-NEXT:    [[FENCE:%.*]] = call float @llvm.arithmetic.fence.f32(float [[X]])
 ; CHECK-NEXT:    ret float [[FENCE]]
 ;
   %select = select i1 %cond, float %x, float 0x7FF0000000000000

From dcbcad721c9069dd4acb3c4990c94ac79df6c75a Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Thu, 8 Feb 2024 09:44:51 +0100
Subject: [PATCH 04/72] [InstCombine] Handle multi-use in
 simplifyAndOrWithOpReplaced() (#81006)

Slightly generalize simplifyAndOrWithOpReplaced() by allowing it to
perform simplifications (without creating new instructions) in multi-use
cases. This way we can remove existing patterns without worrying about
multi-use edge cases.

I've opted to change the general way the implementation works to be more
similar to the standard simplifyWithOpReplaced(). We perform the operand
replacement generically, and then try to simplify the result or create a
new instruction if we're allowed to do so.
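
The bit-level identities behind the operand replacement can be checked
exhaustively on small values. The following is a standalone sketch, not
LLVM code: the function names are hypothetical, and it only shows why
replacing Y with 0 inside X is sound for X | Y (and with -1 for X & Y),
using the pattern removed from visitOr() and the multi-use shape from
or.ll as examples.

```cpp
#include <cassert>
#include <cstdint>

// Standalone illustration, not LLVM code; all function names are hypothetical.
// 4-bit operands are enough because every operation involved is bitwise.

// or: ((A & B) ^ C) | B == ((A & 0) ^ C) | B == C | B
bool orIdentityHolds() {
  for (uint32_t A = 0; A < 16; ++A)
    for (uint32_t B = 0; B < 16; ++B)
      for (uint32_t C = 0; C < 16; ++C)
        if ((((A & B) ^ C) | B) != (C | B))
          return false;
  return true;
}

// or, multi-use shape from or.ll: ((A & B) & C) | A == ((0 & B) & C) | A == A
bool multiUseOrIdentityHolds() {
  for (uint32_t A = 0; A < 16; ++A)
    for (uint32_t B = 0; B < 16; ++B)
      for (uint32_t C = 0; C < 16; ++C)
        if ((((A & B) & C) | A) != A)
          return false;
  return true;
}

// and: ((A | B) ^ C) & B == ((A | -1) ^ C) & B == ~C & B
bool andIdentityHolds() {
  for (uint32_t A = 0; A < 16; ++A)
    for (uint32_t B = 0; B < 16; ++B)
      for (uint32_t C = 0; C < 16; ++C)
        if ((((A | B) ^ C) & B) != (~C & B))
          return false;
  return true;
}

// Convenience wrapper that asserts all three identities.
void checkAll() {
  assert(orIdentityHolds());
  assert(multiUseOrIdentityHolds());
  assert(andIdentityHolds());
}
```

In the multi-use case the replaced expression folds all the way down to
a constant, so no new instruction is needed, which is the situation the
SimplifyOnly mode is meant to cover.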
---
 .../InstCombine/InstCombineAndOrXor.cpp       | 92 +++++++++----------
 llvm/test/Transforms/InstCombine/or.ll        |  3 +-
 2 files changed, 47 insertions(+), 48 deletions(-)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
index aa3b9da924aa0b..a53eb39ad5b0e4 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
@@ -2217,47 +2217,47 @@ foldBitwiseLogicWithIntrinsics(BinaryOperator &I,
   }
 }
 
-// Try to simplify X | Y by replacing occurrences of Y in X with 0.
-// Similarly, simplify X & Y by replacing occurrences of Y in X with -1.
+// Try to simplify V by replacing occurrences of Op with RepOp, but only look
+// through bitwise operations. In particular, for X | Y we try to replace Y with
+// 0 inside X and for X & Y we try to replace Y with -1 inside X.
 // Return the simplified result of X if successful, and nullptr otherwise.
-static Value *simplifyAndOrWithOpReplaced(Value *X, Value *Y, bool IsAnd,
+// If SimplifyOnly is true, no new instructions will be created.
+static Value *simplifyAndOrWithOpReplaced(Value *V, Value *Op, Value *RepOp,
+                                          bool SimplifyOnly,
                                           InstCombinerImpl &IC,
                                           unsigned Depth = 0) {
-  if (isa<Constant>(X) || X == Y)
+  if (Op == RepOp)
     return nullptr;
 
-  Value *RHS;
-  if (match(X, m_c_And(m_Specific(Y), m_Value(RHS)))) {
-    return IsAnd ? RHS : Constant::getNullValue(X->getType());
-  } else if (match(X, m_c_Or(m_Specific(Y), m_Value(RHS)))) {
-    return IsAnd ? Constant::getAllOnesValue(X->getType()) : RHS;
-  } else if (match(X, m_c_Xor(m_Specific(Y), m_Value(RHS)))) {
-    if (IsAnd) {
-      if (X->hasOneUse())
-        return IC.Builder.CreateNot(RHS);
+  if (V == Op)
+    return RepOp;
 
-      if (Value *NotRHS =
-              IC.getFreelyInverted(RHS, RHS->hasOneUse(), &IC.Builder))
-        return NotRHS;
-    } else
-      return RHS;
-  }
+  auto *I = dyn_cast<BinaryOperator>(V);
+  if (!I || !I->isBitwiseLogicOp() || Depth >= 3)
+    return nullptr;
 
-  // Replace uses of Y in X recursively.
-  Value *Op0, *Op1;
-  if (Depth < 2 && match(X, m_BitwiseLogic(m_Value(Op0), m_Value(Op1)))) {
-    // TODO: Relax the one-use constraint to clean up existing hard-coded
-    // simplifications.
-    if (!X->hasOneUse())
-      return nullptr;
-    Value *NewOp0 = simplifyAndOrWithOpReplaced(Op0, Y, IsAnd, IC, Depth + 1);
-    Value *NewOp1 = simplifyAndOrWithOpReplaced(Op1, Y, IsAnd, IC, Depth + 1);
-    if (!NewOp0 && !NewOp1)
-      return nullptr;
-    return IC.Builder.CreateBinOp(cast<BinaryOperator>(X)->getOpcode(),
-                                  NewOp0 ? NewOp0 : Op0, NewOp1 ? NewOp1 : Op1);
-  }
-  return nullptr;
+  if (!I->hasOneUse())
+    SimplifyOnly = true;
+
+  Value *NewOp0 = simplifyAndOrWithOpReplaced(I->getOperand(0), Op, RepOp,
+                                              SimplifyOnly, IC, Depth + 1);
+  Value *NewOp1 = simplifyAndOrWithOpReplaced(I->getOperand(1), Op, RepOp,
+                                              SimplifyOnly, IC, Depth + 1);
+  if (!NewOp0 && !NewOp1)
+    return nullptr;
+
+  if (!NewOp0)
+    NewOp0 = I->getOperand(0);
+  if (!NewOp1)
+    NewOp1 = I->getOperand(1);
+
+  if (Value *Res = simplifyBinOp(I->getOpcode(), NewOp0, NewOp1,
+                                 IC.getSimplifyQuery().getWithInstruction(I)))
+    return Res;
+
+  if (SimplifyOnly)
+    return nullptr;
+  return IC.Builder.CreateBinOp(I->getOpcode(), NewOp0, NewOp1);
 }
 
 // FIXME: We use commutative matchers (m_c_*) for some, but not all, matches
@@ -2781,9 +2781,13 @@ Instruction *InstCombinerImpl::visitAnd(BinaryOperator &I) {
   if (Instruction *Res = foldBitwiseLogicWithIntrinsics(I, Builder))
     return Res;
 
-  if (Value *V = simplifyAndOrWithOpReplaced(Op0, Op1, /*IsAnd*/ true, *this))
+  if (Value *V =
+          simplifyAndOrWithOpReplaced(Op0, Op1, Constant::getAllOnesValue(Ty),
+                                      /*SimplifyOnly*/ false, *this))
     return BinaryOperator::CreateAnd(V, Op1);
-  if (Value *V = simplifyAndOrWithOpReplaced(Op1, Op0, /*IsAnd*/ true, *this))
+  if (Value *V =
+          simplifyAndOrWithOpReplaced(Op1, Op0, Constant::getAllOnesValue(Ty),
+                                      /*SimplifyOnly*/ false, *this))
     return BinaryOperator::CreateAnd(Op0, V);
 
   return nullptr;
@@ -3602,14 +3606,6 @@ Instruction *InstCombinerImpl::visitOr(BinaryOperator &I) {
     if (match(Op1, m_Xor(m_Specific(B), m_Specific(A))))
       return BinaryOperator::CreateOr(Op1, C);
 
-  // ((A & B) ^ C) | B -> C | B
-  if (match(Op0, m_c_Xor(m_c_And(m_Value(A), m_Specific(Op1)), m_Value(C))))
-    return BinaryOperator::CreateOr(C, Op1);
-
-  // B | ((A & B) ^ C) -> B | C
-  if (match(Op1, m_c_Xor(m_c_And(m_Value(A), m_Specific(Op0)), m_Value(C))))
-    return BinaryOperator::CreateOr(Op0, C);
-
   if (Instruction *DeMorgan = matchDeMorgansLaws(I, *this))
     return DeMorgan;
 
@@ -3965,9 +3961,13 @@ Instruction *InstCombinerImpl::visitOr(BinaryOperator &I) {
   if (Instruction *Res = foldBitwiseLogicWithIntrinsics(I, Builder))
     return Res;
 
-  if (Value *V = simplifyAndOrWithOpReplaced(Op0, Op1, /*IsAnd*/ false, *this))
+  if (Value *V =
+          simplifyAndOrWithOpReplaced(Op0, Op1, Constant::getNullValue(Ty),
+                                      /*SimplifyOnly*/ false, *this))
     return BinaryOperator::CreateOr(V, Op1);
-  if (Value *V = simplifyAndOrWithOpReplaced(Op1, Op0, /*IsAnd*/ false, *this))
+  if (Value *V =
+          simplifyAndOrWithOpReplaced(Op1, Op0, Constant::getNullValue(Ty),
+                                      /*SimplifyOnly*/ false, *this))
     return BinaryOperator::CreateOr(Op0, V);
 
   return nullptr;
diff --git a/llvm/test/Transforms/InstCombine/or.ll b/llvm/test/Transforms/InstCombine/or.ll
index 51863af37c131c..1b1a6ffbf0f2d3 100644
--- a/llvm/test/Transforms/InstCombine/or.ll
+++ b/llvm/test/Transforms/InstCombine/or.ll
@@ -1938,8 +1938,7 @@ define i32 @test_or_and_and_multiuse(i32 %a, i32 %b, i32 %c) {
 ; CHECK-NEXT:    [[AND2:%.*]] = and i32 [[AND1]], [[C:%.*]]
 ; CHECK-NEXT:    call void @use(i32 [[AND1]])
 ; CHECK-NEXT:    call void @use(i32 [[AND2]])
-; CHECK-NEXT:    [[OR:%.*]] = or i32 [[AND2]], [[A]]
-; CHECK-NEXT:    ret i32 [[OR]]
+; CHECK-NEXT:    ret i32 [[A]]
 ;
   %and1 = and i32 %a, %b
   %and2 = and i32 %and1, %c

From 52649bb1691ba6ac7d03f4c543e6b2de60174228 Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Thu, 8 Feb 2024 09:47:49 +0100
Subject: [PATCH 05/72] [ValueTracking] Support dominating known bits condition
 in and/or (#74728)

This extends computeKnownBits() support for dominating conditions to
also handle and/or conditions. We look through a logical and on the true
edge and through a logical or on the false edge.

This change is mainly for the sake of completeness, so we don't start
missing optimizations if SimplifyCFG decides to merge some branches.
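
Why the edge decides which operator can be looked through is easy to
check by brute force over i8. The following is a standalone sketch, not
LLVM code: the function names are hypothetical, and the constants mirror
the `or i8 %x, -4` folds in known-bits.ll (0xFC is -4 as an unsigned
byte).

```cpp
#include <cassert>
#include <cstdint>

// Standalone illustration, not LLVM code; all function names are hypothetical.

// True edge of `br ((x & 3) == 0 && c)`: both operands of the `and` hold,
// so the low two bits of x are known zero and `x | -4` is always -4.
bool andTrueEdgeFolds() {
  for (int v = 0; v < 256; ++v) {
    uint8_t x = static_cast<uint8_t>(v);
    for (int c = 0; c < 2; ++c) {
      bool cond = ((x & 3) == 0) && (c != 0);
      if (cond && (x | 0xFC) != 0xFC)
        return false;
    }
  }
  return true;
}

// False edge of `br ((x & 3) != 0 || c)`: both operands of the `or` are
// false, so again (x & 3) == 0 and `x | -4` is always -4.
bool orFalseEdgeFolds() {
  for (int v = 0; v < 256; ++v) {
    uint8_t x = static_cast<uint8_t>(v);
    for (int c = 0; c < 2; ++c) {
      bool cond = ((x & 3) != 0) || (c != 0);
      if (!cond && (x | 0xFC) != 0xFC)
        return false;
    }
  }
  return true;
}

// True edge of an `or` tells us nothing about x: the branch may have been
// taken because of c alone. Returns true if such a counterexample exists.
bool orTrueEdgeHasCounterexample() {
  for (int v = 0; v < 256; ++v) {
    uint8_t x = static_cast<uint8_t>(v);
    for (int c = 0; c < 2; ++c) {
      bool cond = ((x & 3) == 0) || (c != 0);
      if (cond && (x | 0xFC) != 0xFC)
        return true;
    }
  }
  return false;
}
```

The third function shows the case that must not be looked through: on
the true edge of an or, x = 1 with c set reaches the branch and gives
`x | -4 == -3`.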
---
 llvm/lib/Analysis/DomConditionCache.cpp       | 48 ++++++++++++-------
 llvm/lib/Analysis/ValueTracking.cpp           | 32 +++++++++----
 .../test/Transforms/InstCombine/known-bits.ll | 15 ++----
 .../Transforms/LoopVectorize/induction.ll     | 30 ++++++------
 4 files changed, 74 insertions(+), 51 deletions(-)

diff --git a/llvm/lib/Analysis/DomConditionCache.cpp b/llvm/lib/Analysis/DomConditionCache.cpp
index c7f4cab4158880..3dad0c2e07133b 100644
--- a/llvm/lib/Analysis/DomConditionCache.cpp
+++ b/llvm/lib/Analysis/DomConditionCache.cpp
@@ -34,23 +34,39 @@ static void findAffectedValues(Value *Cond,
     }
   };
 
-  ICmpInst::Predicate Pred;
-  Value *A;
-  if (match(Cond, m_ICmp(Pred, m_Value(A), m_Constant()))) {
-    AddAffected(A);
+  bool TopLevelIsAnd = match(Cond, m_LogicalAnd());
+  SmallVector<Value *, 8> Worklist;
+  SmallPtrSet<Value *, 8> Visited;
+  Worklist.push_back(Cond);
+  while (!Worklist.empty()) {
+    Value *V = Worklist.pop_back_val();
+    if (!Visited.insert(V).second)
+      continue;
 
-    if (ICmpInst::isEquality(Pred)) {
-      Value *X;
-      // (X & C) or (X | C) or (X ^ C).
-      // (X << C) or (X >>_s C) or (X >>_u C).
-      if (match(A, m_BitwiseLogic(m_Value(X), m_ConstantInt())) ||
-          match(A, m_Shift(m_Value(X), m_ConstantInt())))
-        AddAffected(X);
-    } else {
-      Value *X;
-      // Handle (A + C1) u< C2, which is the canonical form of A > C3 && A < C4.
-      if (match(A, m_Add(m_Value(X), m_ConstantInt())))
-        AddAffected(X);
+    ICmpInst::Predicate Pred;
+    Value *A, *B;
+    // Only recurse into and/or if it matches the top-level and/or type.
+    if (TopLevelIsAnd ? match(V, m_LogicalAnd(m_Value(A), m_Value(B)))
+                      : match(V, m_LogicalOr(m_Value(A), m_Value(B)))) {
+      Worklist.push_back(A);
+      Worklist.push_back(B);
+    } else if (match(V, m_ICmp(Pred, m_Value(A), m_Constant()))) {
+      AddAffected(A);
+
+      if (ICmpInst::isEquality(Pred)) {
+        Value *X;
+        // (X & C) or (X | C) or (X ^ C).
+        // (X << C) or (X >>_s C) or (X >>_u C).
+        if (match(A, m_BitwiseLogic(m_Value(X), m_ConstantInt())) ||
+            match(A, m_Shift(m_Value(X), m_ConstantInt())))
+          AddAffected(X);
+      } else {
+        Value *X;
+        // Handle (A + C1) u< C2, which is the canonical form of
+        // A > C3 && A < C4.
+        if (match(A, m_Add(m_Value(X), m_ConstantInt())))
+          AddAffected(X);
+      }
     }
   }
 }
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 58db81f470130e..0e40a02fd4de67 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -706,28 +706,40 @@ static void computeKnownBitsFromCmp(const Value *V, CmpInst::Predicate Pred,
   }
 }
 
+static void computeKnownBitsFromCond(const Value *V, Value *Cond,
+                                     KnownBits &Known, unsigned Depth,
+                                     const SimplifyQuery &SQ, bool Invert) {
+  Value *A, *B;
+  if (Depth < MaxAnalysisRecursionDepth &&
+      (Invert ? match(Cond, m_LogicalOr(m_Value(A), m_Value(B)))
+              : match(Cond, m_LogicalAnd(m_Value(A), m_Value(B))))) {
+    computeKnownBitsFromCond(V, A, Known, Depth + 1, SQ, Invert);
+    computeKnownBitsFromCond(V, B, Known, Depth + 1, SQ, Invert);
+  }
+
+  if (auto *Cmp = dyn_cast<ICmpInst>(Cond))
+    computeKnownBitsFromCmp(
+        V, Invert ? Cmp->getInversePredicate() : Cmp->getPredicate(),
+        Cmp->getOperand(0), Cmp->getOperand(1), Known, SQ);
+}
+
 void llvm::computeKnownBitsFromContext(const Value *V, KnownBits &Known,
-                                      unsigned Depth, const SimplifyQuery &Q) {
+                                       unsigned Depth, const SimplifyQuery &Q) {
   if (!Q.CxtI)
     return;
 
   if (Q.DC && Q.DT) {
     // Handle dominating conditions.
     for (BranchInst *BI : Q.DC->conditionsFor(V)) {
-      auto *Cmp = dyn_cast<ICmpInst>(BI->getCondition());
-      if (!Cmp)
-        continue;
-
       BasicBlockEdge Edge0(BI->getParent(), BI->getSuccessor(0));
       if (Q.DT->dominates(Edge0, Q.CxtI->getParent()))
-        computeKnownBitsFromCmp(V, Cmp->getPredicate(), Cmp->getOperand(0),
-                                Cmp->getOperand(1), Known, Q);
+        computeKnownBitsFromCond(V, BI->getCondition(), Known, Depth, Q,
+                                 /*Invert*/ false);
 
       BasicBlockEdge Edge1(BI->getParent(), BI->getSuccessor(1));
       if (Q.DT->dominates(Edge1, Q.CxtI->getParent()))
-        computeKnownBitsFromCmp(V, Cmp->getInversePredicate(),
-                                Cmp->getOperand(0), Cmp->getOperand(1), Known,
-                                Q);
+        computeKnownBitsFromCond(V, BI->getCondition(), Known, Depth, Q,
+                                 /*Invert*/ true);
     }
 
     if (Known.hasConflict())
diff --git a/llvm/test/Transforms/InstCombine/known-bits.ll b/llvm/test/Transforms/InstCombine/known-bits.ll
index e346330aa5b1e3..246579cc4cd0c0 100644
--- a/llvm/test/Transforms/InstCombine/known-bits.ll
+++ b/llvm/test/Transforms/InstCombine/known-bits.ll
@@ -105,8 +105,7 @@ define i8 @test_cond_and(i8 %x, i1 %c) {
 ; CHECK-NEXT:    [[COND:%.*]] = and i1 [[CMP]], [[C:%.*]]
 ; CHECK-NEXT:    br i1 [[COND]], label [[IF:%.*]], label [[EXIT:%.*]]
 ; CHECK:       if:
-; CHECK-NEXT:    [[OR1:%.*]] = or i8 [[X]], -4
-; CHECK-NEXT:    ret i8 [[OR1]]
+; CHECK-NEXT:    ret i8 -4
 ; CHECK:       exit:
 ; CHECK-NEXT:    [[OR2:%.*]] = or i8 [[X]], -4
 ; CHECK-NEXT:    ret i8 [[OR2]]
@@ -133,8 +132,7 @@ define i8 @test_cond_and_commuted(i8 %x, i1 %c1, i1 %c2) {
 ; CHECK-NEXT:    [[COND:%.*]] = and i1 [[C3]], [[CMP]]
 ; CHECK-NEXT:    br i1 [[COND]], label [[IF:%.*]], label [[EXIT:%.*]]
 ; CHECK:       if:
-; CHECK-NEXT:    [[OR1:%.*]] = or i8 [[X]], -4
-; CHECK-NEXT:    ret i8 [[OR1]]
+; CHECK-NEXT:    ret i8 -4
 ; CHECK:       exit:
 ; CHECK-NEXT:    [[OR2:%.*]] = or i8 [[X]], -4
 ; CHECK-NEXT:    ret i8 [[OR2]]
@@ -161,8 +159,7 @@ define i8 @test_cond_logical_and(i8 %x, i1 %c) {
 ; CHECK-NEXT:    [[COND:%.*]] = select i1 [[CMP]], i1 [[C:%.*]], i1 false
 ; CHECK-NEXT:    br i1 [[COND]], label [[IF:%.*]], label [[EXIT:%.*]]
 ; CHECK:       if:
-; CHECK-NEXT:    [[OR1:%.*]] = or i8 [[X]], -4
-; CHECK-NEXT:    ret i8 [[OR1]]
+; CHECK-NEXT:    ret i8 -4
 ; CHECK:       exit:
 ; CHECK-NEXT:    [[OR2:%.*]] = or i8 [[X]], -4
 ; CHECK-NEXT:    ret i8 [[OR2]]
@@ -218,8 +215,7 @@ define i8 @test_cond_inv_or(i8 %x, i1 %c) {
 ; CHECK-NEXT:    [[OR1:%.*]] = or i8 [[X]], -4
 ; CHECK-NEXT:    ret i8 [[OR1]]
 ; CHECK:       exit:
-; CHECK-NEXT:    [[OR2:%.*]] = or i8 [[X]], -4
-; CHECK-NEXT:    ret i8 [[OR2]]
+; CHECK-NEXT:    ret i8 -4
 ;
   %and = and i8 %x, 3
   %cmp = icmp ne i8 %and, 0
@@ -242,8 +238,7 @@ define i8 @test_cond_inv_logical_or(i8 %x, i1 %c) {
 ; CHECK-NEXT:    [[COND:%.*]] = select i1 [[CMP_NOT]], i1 [[C:%.*]], i1 false
 ; CHECK-NEXT:    br i1 [[COND]], label [[IF:%.*]], label [[EXIT:%.*]]
 ; CHECK:       if:
-; CHECK-NEXT:    [[OR1:%.*]] = or i8 [[X]], -4
-; CHECK-NEXT:    ret i8 [[OR1]]
+; CHECK-NEXT:    ret i8 -4
 ; CHECK:       exit:
 ; CHECK-NEXT:    [[OR2:%.*]] = or i8 [[X]], -4
 ; CHECK-NEXT:    ret i8 [[OR2]]
diff --git a/llvm/test/Transforms/LoopVectorize/induction.ll b/llvm/test/Transforms/LoopVectorize/induction.ll
index 29d8719db9b298..50a5cc6774c5c6 100644
--- a/llvm/test/Transforms/LoopVectorize/induction.ll
+++ b/llvm/test/Transforms/LoopVectorize/induction.ll
@@ -3523,10 +3523,10 @@ define void @wrappingindvars1(i8 %t, i32 %len, ptr %A) {
 ; IND-NEXT:    [[TMP9:%.*]] = or i1 [[TMP3]], [[TMP8]]
 ; IND-NEXT:    br i1 [[TMP9]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
 ; IND:       vector.ph:
-; IND-NEXT:    [[N_VEC:%.*]] = and i32 [[TMP0]], -2
+; IND-NEXT:    [[N_VEC:%.*]] = and i32 [[TMP0]], 510
 ; IND-NEXT:    [[DOTCAST:%.*]] = trunc i32 [[N_VEC]] to i8
 ; IND-NEXT:    [[IND_END:%.*]] = add i8 [[DOTCAST]], [[T]]
-; IND-NEXT:    [[IND_END2:%.*]] = add i32 [[N_VEC]], [[EXT]]
+; IND-NEXT:    [[IND_END2:%.*]] = add nuw nsw i32 [[N_VEC]], [[EXT]]
 ; IND-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <2 x i32> poison, i32 [[EXT]], i64 0
 ; IND-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <2 x i32> [[DOTSPLATINSERT]], <2 x i32> poison, <2 x i32> zeroinitializer
 ; IND-NEXT:    [[INDUCTION:%.*]] = add nuw nsw <2 x i32> [[DOTSPLAT]], <i32 0, i32 1>
@@ -3589,10 +3589,10 @@ define void @wrappingindvars1(i8 %t, i32 %len, ptr %A) {
 ; UNROLL-NEXT:    [[TMP9:%.*]] = or i1 [[TMP3]], [[TMP8]]
 ; UNROLL-NEXT:    br i1 [[TMP9]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
 ; UNROLL:       vector.ph:
-; UNROLL-NEXT:    [[N_VEC:%.*]] = and i32 [[TMP0]], -4
+; UNROLL-NEXT:    [[N_VEC:%.*]] = and i32 [[TMP0]], 508
 ; UNROLL-NEXT:    [[DOTCAST:%.*]] = trunc i32 [[N_VEC]] to i8
 ; UNROLL-NEXT:    [[IND_END:%.*]] = add i8 [[DOTCAST]], [[T]]
-; UNROLL-NEXT:    [[IND_END2:%.*]] = add i32 [[N_VEC]], [[EXT]]
+; UNROLL-NEXT:    [[IND_END2:%.*]] = add nuw nsw i32 [[N_VEC]], [[EXT]]
 ; UNROLL-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <2 x i32> poison, i32 [[EXT]], i64 0
 ; UNROLL-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <2 x i32> [[DOTSPLATINSERT]], <2 x i32> poison, <2 x i32> zeroinitializer
 ; UNROLL-NEXT:    [[INDUCTION:%.*]] = add nuw nsw <2 x i32> [[DOTSPLAT]], <i32 0, i32 1>
@@ -3733,10 +3733,10 @@ define void @wrappingindvars1(i8 %t, i32 %len, ptr %A) {
 ; INTERLEAVE-NEXT:    [[TMP9:%.*]] = or i1 [[TMP3]], [[TMP8]]
 ; INTERLEAVE-NEXT:    br i1 [[TMP9]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
 ; INTERLEAVE:       vector.ph:
-; INTERLEAVE-NEXT:    [[N_VEC:%.*]] = and i32 [[TMP0]], -8
+; INTERLEAVE-NEXT:    [[N_VEC:%.*]] = and i32 [[TMP0]], 504
 ; INTERLEAVE-NEXT:    [[DOTCAST:%.*]] = trunc i32 [[N_VEC]] to i8
 ; INTERLEAVE-NEXT:    [[IND_END:%.*]] = add i8 [[DOTCAST]], [[T]]
-; INTERLEAVE-NEXT:    [[IND_END2:%.*]] = add i32 [[N_VEC]], [[EXT]]
+; INTERLEAVE-NEXT:    [[IND_END2:%.*]] = add nuw nsw i32 [[N_VEC]], [[EXT]]
 ; INTERLEAVE-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[EXT]], i64 0
 ; INTERLEAVE-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <4 x i32> [[DOTSPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
 ; INTERLEAVE-NEXT:    [[INDUCTION:%.*]] = add nuw nsw <4 x i32> [[DOTSPLAT]], <i32 0, i32 1, i32 2, i32 3>
@@ -3907,11 +3907,11 @@ define void @wrappingindvars2(i8 %t, i32 %len, ptr %A) {
 ; IND-NEXT:    [[TMP9:%.*]] = or i1 [[TMP3]], [[TMP8]]
 ; IND-NEXT:    br i1 [[TMP9]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
 ; IND:       vector.ph:
-; IND-NEXT:    [[N_VEC:%.*]] = and i32 [[TMP0]], -2
+; IND-NEXT:    [[N_VEC:%.*]] = and i32 [[TMP0]], 510
 ; IND-NEXT:    [[DOTCAST:%.*]] = trunc i32 [[N_VEC]] to i8
 ; IND-NEXT:    [[IND_END:%.*]] = add i8 [[DOTCAST]], [[T]]
-; IND-NEXT:    [[EXT_MUL5:%.*]] = add i32 [[N_VEC]], [[EXT]]
-; IND-NEXT:    [[IND_END1:%.*]] = shl i32 [[EXT_MUL5]], 2
+; IND-NEXT:    [[EXT_MUL5:%.*]] = add nuw nsw i32 [[N_VEC]], [[EXT]]
+; IND-NEXT:    [[IND_END1:%.*]] = shl nuw nsw i32 [[EXT_MUL5]], 2
 ; IND-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <2 x i32> poison, i32 [[EXT_MUL]], i64 0
 ; IND-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <2 x i32> [[DOTSPLATINSERT]], <2 x i32> poison, <2 x i32> zeroinitializer
 ; IND-NEXT:    [[INDUCTION:%.*]] = add nuw nsw <2 x i32> [[DOTSPLAT]], <i32 0, i32 4>
@@ -3976,11 +3976,11 @@ define void @wrappingindvars2(i8 %t, i32 %len, ptr %A) {
 ; UNROLL-NEXT:    [[TMP9:%.*]] = or i1 [[TMP3]], [[TMP8]]
 ; UNROLL-NEXT:    br i1 [[TMP9]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
 ; UNROLL:       vector.ph:
-; UNROLL-NEXT:    [[N_VEC:%.*]] = and i32 [[TMP0]], -4
+; UNROLL-NEXT:    [[N_VEC:%.*]] = and i32 [[TMP0]], 508
 ; UNROLL-NEXT:    [[DOTCAST:%.*]] = trunc i32 [[N_VEC]] to i8
 ; UNROLL-NEXT:    [[IND_END:%.*]] = add i8 [[DOTCAST]], [[T]]
-; UNROLL-NEXT:    [[EXT_MUL6:%.*]] = add i32 [[N_VEC]], [[EXT]]
-; UNROLL-NEXT:    [[IND_END1:%.*]] = shl i32 [[EXT_MUL6]], 2
+; UNROLL-NEXT:    [[EXT_MUL6:%.*]] = add nuw nsw i32 [[N_VEC]], [[EXT]]
+; UNROLL-NEXT:    [[IND_END1:%.*]] = shl nuw nsw i32 [[EXT_MUL6]], 2
 ; UNROLL-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <2 x i32> poison, i32 [[EXT_MUL]], i64 0
 ; UNROLL-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <2 x i32> [[DOTSPLATINSERT]], <2 x i32> poison, <2 x i32> zeroinitializer
 ; UNROLL-NEXT:    [[INDUCTION:%.*]] = add nuw nsw <2 x i32> [[DOTSPLAT]], <i32 0, i32 4>
@@ -4126,11 +4126,11 @@ define void @wrappingindvars2(i8 %t, i32 %len, ptr %A) {
 ; INTERLEAVE-NEXT:    [[TMP9:%.*]] = or i1 [[TMP3]], [[TMP8]]
 ; INTERLEAVE-NEXT:    br i1 [[TMP9]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
 ; INTERLEAVE:       vector.ph:
-; INTERLEAVE-NEXT:    [[N_VEC:%.*]] = and i32 [[TMP0]], -8
+; INTERLEAVE-NEXT:    [[N_VEC:%.*]] = and i32 [[TMP0]], 504
 ; INTERLEAVE-NEXT:    [[DOTCAST:%.*]] = trunc i32 [[N_VEC]] to i8
 ; INTERLEAVE-NEXT:    [[IND_END:%.*]] = add i8 [[DOTCAST]], [[T]]
-; INTERLEAVE-NEXT:    [[EXT_MUL6:%.*]] = add i32 [[N_VEC]], [[EXT]]
-; INTERLEAVE-NEXT:    [[IND_END1:%.*]] = shl i32 [[EXT_MUL6]], 2
+; INTERLEAVE-NEXT:    [[EXT_MUL6:%.*]] = add nuw nsw i32 [[N_VEC]], [[EXT]]
+; INTERLEAVE-NEXT:    [[IND_END1:%.*]] = shl nuw nsw i32 [[EXT_MUL6]], 2
 ; INTERLEAVE-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[EXT_MUL]], i64 0
 ; INTERLEAVE-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <4 x i32> [[DOTSPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
 ; INTERLEAVE-NEXT:    [[INDUCTION:%.*]] = add nuw nsw <4 x i32> [[DOTSPLAT]], <i32 0, i32 4, i32 8, i32 12>

From 433edf7d3e66f231bd2e73666862b0b75245abbd Mon Sep 17 00:00:00 2001
From: Sven van Haastregt <sven.vanhaastregt at arm.com>
Date: Thu, 8 Feb 2024 08:58:13 +0000
Subject: [PATCH 06/72] [DAG] Fix typos in comments; NFC

---
 llvm/include/llvm/CodeGen/SelectionDAG.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/SelectionDAG.h b/llvm/include/llvm/CodeGen/SelectionDAG.h
index b9ec30754f0c32..886ec0b7940ca8 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAG.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAG.h
@@ -1613,10 +1613,10 @@ class SelectionDAG {
   /// Expand the specified \c ISD::VACOPY node as the Legalize pass would.
   SDValue expandVACopy(SDNode *Node);
 
-  /// Returs an GlobalAddress of the function from the current module with
+  /// Return a GlobalAddress of the function from the current module with
   /// name matching the given ExternalSymbol. Additionally can provide the
   /// matched function.
-  /// Panics the function doesn't exists.
+  /// Panic if the function doesn't exist.
   SDValue getSymbolFunctionGlobalAddress(SDValue Op,
                                          Function **TargetFunction = nullptr);
 
@@ -2255,7 +2255,7 @@ class SelectionDAG {
   std::pair<EVT, EVT> GetDependentSplitDestVTs(const EVT &VT, const EVT &EnvVT,
                                                bool *HiIsEmpty) const;
 
-  /// Split the vector with EXTRACT_SUBVECTOR using the provides
+  /// Split the vector with EXTRACT_SUBVECTOR using the provided
   /// VTs and return the low/high part.
   std::pair<SDValue, SDValue> SplitVector(const SDValue &N, const SDLoc &DL,
                                           const EVT &LoVT, const EVT &HiVT);

From 604f2f1306259f1beae49275e4e6e6027c01e4bb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timm=20B=C3=A4der?= <tbaeder at redhat.com>
Date: Wed, 7 Feb 2024 16:06:59 +0100
Subject: [PATCH 07/72] [clang][Interp][NFC] Convert test case to
 verify=expected,both style

---
 clang/test/AST/Interp/builtin-functions.cpp | 103 ++++++++------------
 1 file changed, 39 insertions(+), 64 deletions(-)

diff --git a/clang/test/AST/Interp/builtin-functions.cpp b/clang/test/AST/Interp/builtin-functions.cpp
index d6ed2d862b0949..3aa01d501a3e2a 100644
--- a/clang/test/AST/Interp/builtin-functions.cpp
+++ b/clang/test/AST/Interp/builtin-functions.cpp
@@ -1,11 +1,11 @@
-// RUN: %clang_cc1 -Wno-string-plus-int -fexperimental-new-constant-interpreter %s -verify
-// RUN: %clang_cc1 -Wno-string-plus-int -fexperimental-new-constant-interpreter -triple i686 %s -verify
-// RUN: %clang_cc1 -Wno-string-plus-int -verify=ref %s -Wno-constant-evaluated
-// RUN: %clang_cc1 -std=c++20 -Wno-string-plus-int -fexperimental-new-constant-interpreter %s -verify
-// RUN: %clang_cc1 -std=c++20 -Wno-string-plus-int -fexperimental-new-constant-interpreter -triple i686 %s -verify
-// RUN: %clang_cc1 -std=c++20 -Wno-string-plus-int -verify=ref %s -Wno-constant-evaluated
-// RUN: %clang_cc1 -triple avr -std=c++20 -Wno-string-plus-int -fexperimental-new-constant-interpreter %s -verify
-// RUN: %clang_cc1 -triple avr -std=c++20 -Wno-string-plus-int -verify=ref %s -Wno-constant-evaluated
+// RUN: %clang_cc1 -Wno-string-plus-int -fexperimental-new-constant-interpreter %s -verify=expected,both
+// RUN: %clang_cc1 -Wno-string-plus-int -fexperimental-new-constant-interpreter -triple i686 %s -verify=expected,both
+// RUN: %clang_cc1 -Wno-string-plus-int -verify=ref,both %s -Wno-constant-evaluated
+// RUN: %clang_cc1 -std=c++20 -Wno-string-plus-int -fexperimental-new-constant-interpreter %s -verify=expected,both
+// RUN: %clang_cc1 -std=c++20 -Wno-string-plus-int -fexperimental-new-constant-interpreter -triple i686 %s -verify=expected,both
+// RUN: %clang_cc1 -std=c++20 -Wno-string-plus-int -verify=ref,both %s -Wno-constant-evaluated
+// RUN: %clang_cc1 -triple avr -std=c++20 -Wno-string-plus-int -fexperimental-new-constant-interpreter %s -verify=expected,both
+// RUN: %clang_cc1 -triple avr -std=c++20 -Wno-string-plus-int -verify=ref,both %s -Wno-constant-evaluated
 
 
 namespace strcmp {
@@ -23,23 +23,17 @@ namespace strcmp {
   static_assert(__builtin_strcmp("abab\0banana", "abab") == 0, "");
   static_assert(__builtin_strcmp("abab", "abab\0banana") == 0, "");
   static_assert(__builtin_strcmp("abab\0banana", "abab\0canada") == 0, "");
-  static_assert(__builtin_strcmp(0, "abab") == 0, ""); // expected-error {{not an integral constant}} \
-                                                       // expected-note {{dereferenced null}} \
-                                                       // expected-note {{in call to}} \
-                                                       // ref-error {{not an integral constant}} \
-                                                       // ref-note {{dereferenced null}}
-  static_assert(__builtin_strcmp("abab", 0) == 0, ""); // expected-error {{not an integral constant}} \
-                                                       // expected-note {{dereferenced null}} \
-                                                       // expected-note {{in call to}} \
-                                                       // ref-error {{not an integral constant}} \
-                                                       // ref-note {{dereferenced null}}
+  static_assert(__builtin_strcmp(0, "abab") == 0, ""); // both-error {{not an integral constant}} \
+                                                       // both-note {{dereferenced null}} \
+                                                       // expected-note {{in call to}}
+  static_assert(__builtin_strcmp("abab", 0) == 0, ""); // both-error {{not an integral constant}} \
+                                                       // both-note {{dereferenced null}} \
+                                                       // expected-note {{in call to}}
 
   static_assert(__builtin_strcmp(kFoobar, kFoobazfoobar) == -1, "");
-  static_assert(__builtin_strcmp(kFoobar, kFoobazfoobar + 6) == 0, ""); // expected-error {{not an integral constant}} \
-                                                                        // expected-note {{dereferenced one-past-the-end}} \
-                                                                        // expected-note {{in call to}} \
-                                                                        // ref-error {{not an integral constant}} \
-                                                                        // ref-note {{dereferenced one-past-the-end}}
+  static_assert(__builtin_strcmp(kFoobar, kFoobazfoobar + 6) == 0, ""); // both-error {{not an integral constant}} \
+                                                                        // both-note {{dereferenced one-past-the-end}} \
+                                                                        // expected-note {{in call to}}
 }
 
 /// Copied from constant-expression-cxx11.cpp
@@ -69,41 +63,27 @@ constexpr const char *a = "foo\0quux";
   static_assert(check(b), "");
   static_assert(check(c), "");
 
-  constexpr int over1 = __builtin_strlen(a + 9); // expected-error {{constant expression}} \
-                                                 // expected-note {{one-past-the-end}} \
-                                                 // expected-note {{in call to}} \
-                                                 // ref-error {{constant expression}} \
-                                                 // ref-note {{one-past-the-end}}
-  constexpr int over2 = __builtin_strlen(b + 9); // expected-error {{constant expression}} \
-                                                 // expected-note {{one-past-the-end}} \
-                                                 // expected-note {{in call to}} \
-                                                 // ref-error {{constant expression}} \
-                                                 // ref-note {{one-past-the-end}}
-  constexpr int over3 = __builtin_strlen(c + 9); // expected-error {{constant expression}} \
-                                                 // expected-note {{one-past-the-end}} \
-                                                 // expected-note {{in call to}} \
-                                                 // ref-error {{constant expression}} \
-                                                 // ref-note {{one-past-the-end}}
-
-  constexpr int under1 = __builtin_strlen(a - 1); // expected-error {{constant expression}} \
-                                                  // expected-note {{cannot refer to element -1}} \
-                                                  // ref-error {{constant expression}} \
-                                                  // ref-note {{cannot refer to element -1}}
-  constexpr int under2 = __builtin_strlen(b - 1); // expected-error {{constant expression}} \
-                                                  // expected-note {{cannot refer to element -1}} \
-                                                  // ref-error {{constant expression}} \
-                                                  // ref-note {{cannot refer to element -1}}
-  constexpr int under3 = __builtin_strlen(c - 1); // expected-error {{constant expression}} \
-                                                  // expected-note {{cannot refer to element -1}} \
-                                                  // ref-error {{constant expression}} \
-                                                  // ref-note {{cannot refer to element -1}}
+  constexpr int over1 = __builtin_strlen(a + 9); // both-error {{constant expression}} \
+                                                 // both-note {{one-past-the-end}} \
+                                                 // expected-note {{in call to}}
+  constexpr int over2 = __builtin_strlen(b + 9); // both-error {{constant expression}} \
+                                                 // both-note {{one-past-the-end}} \
+                                                 // expected-note {{in call to}}
+  constexpr int over3 = __builtin_strlen(c + 9); // both-error {{constant expression}} \
+                                                 // both-note {{one-past-the-end}} \
+                                                 // expected-note {{in call to}}
+
+  constexpr int under1 = __builtin_strlen(a - 1); // both-error {{constant expression}} \
+                                                  // both-note {{cannot refer to element -1}}
+  constexpr int under2 = __builtin_strlen(b - 1); // both-error {{constant expression}} \
+                                                  // both-note {{cannot refer to element -1}}
+  constexpr int under3 = __builtin_strlen(c - 1); // both-error {{constant expression}} \
+                                                  // both-note {{cannot refer to element -1}}
 
   constexpr char d[] = { 'f', 'o', 'o' }; // no nul terminator.
-  constexpr int bad = __builtin_strlen(d); // expected-error {{constant expression}} \
-                                           // expected-note {{one-past-the-end}} \
-                                           // expected-note {{in call to}} \
-                                           // ref-error {{constant expression}} \
-                                           // ref-note {{one-past-the-end}}
+  constexpr int bad = __builtin_strlen(d); // both-error {{constant expression}} \
+                                           // both-note {{one-past-the-end}} \
+                                           // expected-note {{in call to}}
 }
 
 namespace nan {
@@ -115,8 +95,7 @@ namespace nan {
   // expected-error at -2 {{must be initialized by a constant expression}}
 #endif
 
-  constexpr double NaN3 = __builtin_nan("foo"); // expected-error {{must be initialized by a constant expression}} \
-                                                // ref-error {{must be initialized by a constant expression}}
+  constexpr double NaN3 = __builtin_nan("foo"); // both-error {{must be initialized by a constant expression}}
   constexpr float NaN4 = __builtin_nanf("");
   //constexpr long double NaN5 = __builtin_nanf128("");
 
@@ -126,8 +105,7 @@ namespace nan {
 
   /// FIXME: Current interpreter misses diagnostics.
   constexpr char f2[] = {'0', 'x', 'A', 'E'}; /// No trailing 0 byte.
-  constexpr double NaN7 = __builtin_nan(f2); // ref-error {{must be initialized by a constant expression}} \
-                                             // expected-error {{must be initialized by a constant expression}} \
+  constexpr double NaN7 = __builtin_nan(f2); // both-error {{must be initialized by a constant expression}} \
                                              // expected-note {{read of dereferenced one-past-the-end pointer}} \
                                              // expected-note {{in call to}}
   static_assert(!__builtin_issignaling(__builtin_nan("")), "");
@@ -370,9 +348,6 @@ namespace EhReturnDataRegno {
       case __builtin_eh_return_data_regno(0):  // constant foldable.
       break;
     }
-
-    __builtin_eh_return_data_regno(X);  // expected-error {{argument to '__builtin_eh_return_data_regno' must be a constant integer}} \
-                                        // ref-error {{argument to '__builtin_eh_return_data_regno' must be a constant integer}}
-
+    __builtin_eh_return_data_regno(X);  // both-error {{argument to '__builtin_eh_return_data_regno' must be a constant integer}}
   }
 }

>From d709a4d22eb85d0fde8a47d1d131bb148ebe707f Mon Sep 17 00:00:00 2001
From: David Green <david.green at arm.com>
Date: Thu, 8 Feb 2024 09:31:26 +0000
Subject: [PATCH 08/72] [BasicAA] More vscale tests. NFC

This time with i8 geps and scale intrinsics, along with multiple vscale
intrinsics that can be treated as identical.
---
 llvm/test/Analysis/BasicAA/vscale.ll | 168 +++++++++++++++++++++++++++
 1 file changed, 168 insertions(+)

diff --git a/llvm/test/Analysis/BasicAA/vscale.ll b/llvm/test/Analysis/BasicAA/vscale.ll
index 3fff435463a6d2..1b9118bf3853ab 100644
--- a/llvm/test/Analysis/BasicAA/vscale.ll
+++ b/llvm/test/Analysis/BasicAA/vscale.ll
@@ -309,6 +309,174 @@ define void @v1v2types(ptr %p) vscale_range(1,16) {
   ret void
 }
 
+; VScale intrinsic offset tests
+
+; CHECK-LABEL: vscale_neg_notscalable
+; CHECK-DAG:   NoAlias:     <4 x i32>* %p, <4 x i32>* %vm16
+; CHECK-DAG:   NoAlias:     <4 x i32>* %m16, <4 x i32>* %p
+; CHECK-DAG:   MayAlias:    <4 x i32>* %m16, <4 x i32>* %vm16
+; CHECK-DAG:   MayAlias:    <4 x i32>* %p, <4 x i32>* %vm16m16
+; CHECK-DAG:   NoAlias:     <4 x i32>* %vm16, <4 x i32>* %vm16m16
+; CHECK-DAG:   NoAlias:     <4 x i32>* %m16, <4 x i32>* %vm16m16
+; CHECK-DAG:   MayAlias:    <4 x i32>* %m16pv16, <4 x i32>* %p
+; CHECK-DAG:   NoAlias:     <4 x i32>* %m16pv16, <4 x i32>* %vm16
+; CHECK-DAG:   NoAlias:     <4 x i32>* %m16, <4 x i32>* %m16pv16
+; CHECK-DAG:   MayAlias:    <4 x i32>* %m16pv16, <4 x i32>* %vm16m16
+define void @vscale_neg_notscalable(ptr %p) {
+  %v = call i64 @llvm.vscale.i64()
+  %vp = mul nsw i64 %v, 16
+  %vm = mul nsw i64 %v, -16
+  %vm16 = getelementptr i8, ptr %p, i64 %vm
+  %m16 = getelementptr <4 x i32>, ptr %p, i64 -1
+  %vm16m16 = getelementptr <4 x i32>, ptr %vm16, i64 -1
+  %m16pv16 = getelementptr i8, ptr %m16, i64 %vp
+  load <4 x i32>, ptr %p
+  load <4 x i32>, ptr %vm16
+  load <4 x i32>, ptr %m16
+  load <4 x i32>, ptr %vm16m16
+  load <4 x i32>, ptr %m16pv16
+  ret void
+}
+
+; CHECK-LABEL: vscale_neg_scalable
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %p
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16m16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %vm16, <vscale x 4 x i32>* %vm16m16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16m16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %p
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %m16pv16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %vm16m16
+define void @vscale_neg_scalable(ptr %p) {
+  %v = call i64 @llvm.vscale.i64()
+  %vp = mul nsw i64 %v, 16
+  %vm = mul nsw i64 %v, -16
+  %vm16 = getelementptr i8, ptr %p, i64 %vm
+  %m16 = getelementptr <4 x i32>, ptr %p, i64 -1
+  %vm16m16 = getelementptr <4 x i32>, ptr %vm16, i64 -1
+  %m16pv16 = getelementptr i8, ptr %m16, i64 %vp
+  load <vscale x 4 x i32>, ptr %p
+  load <vscale x 4 x i32>, ptr %vm16
+  load <vscale x 4 x i32>, ptr %m16
+  load <vscale x 4 x i32>, ptr %vm16m16
+  load <vscale x 4 x i32>, ptr %m16pv16
+  ret void
+}
+
+; CHECK-LABEL: vscale_pos_notscalable
+; CHECK-DAG:   NoAlias:      <4 x i32>* %p, <4 x i32>* %vm16
+; CHECK-DAG:   NoAlias:      <4 x i32>* %m16, <4 x i32>* %p
+; CHECK-DAG:   MayAlias:     <4 x i32>* %m16, <4 x i32>* %vm16
+; CHECK-DAG:   MayAlias:     <4 x i32>* %p, <4 x i32>* %vm16m16
+; CHECK-DAG:   NoAlias:      <4 x i32>* %vm16, <4 x i32>* %vm16m16
+; CHECK-DAG:   NoAlias:      <4 x i32>* %m16, <4 x i32>* %vm16m16
+; CHECK-DAG:   MayAlias:     <4 x i32>* %m16pv16, <4 x i32>* %p
+; CHECK-DAG:   NoAlias:      <4 x i32>* %m16pv16, <4 x i32>* %vm16
+; CHECK-DAG:   NoAlias:      <4 x i32>* %m16, <4 x i32>* %m16pv16
+; CHECK-DAG:   MayAlias:     <4 x i32>* %m16pv16, <4 x i32>* %vm16m16
+define void @vscale_pos_notscalable(ptr %p) {
+  %v = call i64 @llvm.vscale.i64()
+  %vp = mul nsw i64 %v, 16
+  %vm = mul nsw i64 %v, -16
+  %vm16 = getelementptr i8, ptr %p, i64 %vp
+  %m16 = getelementptr <4 x i32>, ptr %p, i64 1
+  %vm16m16 = getelementptr <4 x i32>, ptr %vm16, i64 1
+  %m16pv16 = getelementptr i8, ptr %m16, i64 %vm
+  load <4 x i32>, ptr %p
+  load <4 x i32>, ptr %vm16
+  load <4 x i32>, ptr %m16
+  load <4 x i32>, ptr %vm16m16
+  load <4 x i32>, ptr %m16pv16
+  ret void
+}
+
+; CHECK-LABEL: vscale_pos_scalable
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %p
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16m16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %vm16, <vscale x 4 x i32>* %vm16m16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16m16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %p
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %m16pv16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %vm16m16
+define void @vscale_pos_scalable(ptr %p) {
+  %v = call i64 @llvm.vscale.i64()
+  %vp = mul nsw i64 %v, 16
+  %vm = mul nsw i64 %v, -16
+  %vm16 = getelementptr i8, ptr %p, i64 %vp
+  %m16 = getelementptr <4 x i32>, ptr %p, i64 1
+  %vm16m16 = getelementptr <4 x i32>, ptr %vm16, i64 1
+  %m16pv16 = getelementptr i8, ptr %m16, i64 %vm
+  load <vscale x 4 x i32>, ptr %p
+  load <vscale x 4 x i32>, ptr %vm16
+  load <vscale x 4 x i32>, ptr %m16
+  load <vscale x 4 x i32>, ptr %vm16m16
+  load <vscale x 4 x i32>, ptr %m16pv16
+  ret void
+}
+
+; CHECK-LABEL: vscale_v1v2types
+; CHECK-DAG:   MustAlias:    <4 x i32>* %p, <vscale x 4 x i32>* %p
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   MayAlias:     <4 x i32>* %p, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %p, <4 x i32>* %vm16
+; CHECK-DAG:   NoAlias:      <4 x i32>* %p, <4 x i32>* %vm16
+; CHECK-DAG:   MustAlias:    <4 x i32>* %vm16, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %p
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <4 x i32>* %p
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <4 x i32>* %vm16
+; CHECK-DAG:   NoAlias:      <4 x i32>* %m16, <vscale x 4 x i32>* %p
+; CHECK-DAG:   NoAlias:      <4 x i32>* %m16, <4 x i32>* %p
+; CHECK-DAG:   MayAlias:     <4 x i32>* %m16, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   MayAlias:     <4 x i32>* %m16, <4 x i32>* %vm16
+; CHECK-DAG:   MustAlias:    <4 x i32>* %m16, <vscale x 4 x i32>* %m16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vp16
+; CHECK-DAG:   MayAlias:     <4 x i32>* %p, <vscale x 4 x i32>* %vp16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %vm16, <vscale x 4 x i32>* %vp16
+; CHECK-DAG:   MayAlias:     <4 x i32>* %vm16, <vscale x 4 x i32>* %vp16
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vp16
+; CHECK-DAG:   MayAlias:     <4 x i32>* %m16, <vscale x 4 x i32>* %vp16
+define void @vscale_v1v2types(ptr %p) {
+  %v = call i64 @llvm.vscale.i64()
+  %vp = mul nsw i64 %v, 16
+  %vm = mul nsw i64 %v, -16
+  %vp16 = getelementptr i8, ptr %p, i64 %vp
+  %vm16 = getelementptr i8, ptr %p, i64 %vm
+  %m16 = getelementptr <4 x i32>, ptr %p, i64 -1
+  load <vscale x 4 x i32>, ptr %p
+  load <4 x i32>, ptr %p
+  load <vscale x 4 x i32>, ptr %vm16
+  load <4 x i32>, ptr %vm16
+  load <vscale x 4 x i32>, ptr %m16
+  load <4 x i32>, ptr %m16
+  load <vscale x 4 x i32>, ptr %vp16
+  ret void
+}
+
+; CHECK-LABEL: twovscales
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %vp161, <vscale x 4 x i32>* %vp162
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %vp161, <vscale x 4 x i32>* %vp161b
+; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %vp161b, <vscale x 4 x i32>* %vp162
+define void @twovscales(ptr %p) {
+  %v1 = call i64 @llvm.vscale.i64()
+  %v2 = call i64 @llvm.vscale.i64()
+  %vp1 = mul nsw i64 %v1, 16
+  %vp2 = mul nsw i64 %v2, 16
+  %vp3 = mul nsw i64 %v1, 17
+  %vp161 = getelementptr i8, ptr %p, i64 %vp1
+  %vp162 = getelementptr i8, ptr %p, i64 %vp2
+  %vp161b = getelementptr i8, ptr %vp161, i64 %vp3
+  load <vscale x 4 x i32>, ptr %vp161
+  load <vscale x 4 x i32>, ptr %vp162
+  load <vscale x 4 x i32>, ptr %vp161b
+  ret void
+}
+
 ; getelementptr recursion
 
 ; CHECK-LABEL: gep_recursion_level_1

>From 452094ac0585860d7c87972a979f7d2ef947db5d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Martin=20Storsj=C3=B6?= <martin at martin.st>
Date: Thu, 8 Feb 2024 11:45:57 +0200
Subject: [PATCH 09/72] [OpenMP] [cmake] In standalone mode, make
 Python3_EXECUTABLE available (#80828)

When running the tests, we try to invoke them as
"${Python3_EXECUTABLE} ${OPENMP_LLVM_LIT_EXECUTABLE}", but when running
"find_package(Python3)" within the function
"find_standalone_test_dependencies", the variable "Python3_EXECUTABLE"
only gets set within the function scope.

Tests have worked regardless of this in many cases, where executing the
python script directly succeeds. But for consistency, and for working in
cases when the python script can't be executed as such, make the
Python3_EXECUTABLE variable available as intended.
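
The scoping behaviour described above can be sketched as a small
stand-alone script. This is only an illustration, not part of the patch:
the function names and DEMO_EXECUTABLE are made up, with DEMO_EXECUTABLE
standing in for Python3_EXECUTABLE, and a plain set() standing in for
what find_package(Python3) does inside the function.

```cmake
# Hypothetical demo; run with: cmake -P scope_demo.cmake
cmake_minimum_required(VERSION 3.12)

# Stand-in for the function before the patch: the variable is set in
# the function's own scope and disappears when the function returns.
function(find_deps_local_only)
  set(DEMO_EXECUTABLE "/usr/bin/python3")
endfunction()

# Stand-in for the function after the patch: the value is re-exported
# to the caller's scope with PARENT_SCOPE.
function(find_deps_exported)
  set(DEMO_EXECUTABLE "/usr/bin/python3")
  set(DEMO_EXECUTABLE ${DEMO_EXECUTABLE} PARENT_SCOPE)
endfunction()

find_deps_local_only()
if(DEFINED DEMO_EXECUTABLE)
  message(STATUS "after local-only call: ${DEMO_EXECUTABLE}")
else()
  message(STATUS "after local-only call: DEMO_EXECUTABLE is not defined")
endif()

find_deps_exported()
message(STATUS "after exported call: ${DEMO_EXECUTABLE}")
```

Only the second call leaves the variable visible to the caller, which is
the case the check targets rely on when they expand
"${Python3_EXECUTABLE}".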
---
 openmp/cmake/OpenMPTesting.cmake | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/openmp/cmake/OpenMPTesting.cmake b/openmp/cmake/OpenMPTesting.cmake
index df41956dadd4f4..ab2348ae59b5f0 100644
--- a/openmp/cmake/OpenMPTesting.cmake
+++ b/openmp/cmake/OpenMPTesting.cmake
@@ -10,6 +10,8 @@ function(find_standalone_test_dependencies)
     message(WARNING "The check targets will not be available!")
     set(ENABLE_CHECK_TARGETS FALSE PARENT_SCOPE)
     return()
+  else()
+    set(Python3_EXECUTABLE ${Python3_EXECUTABLE} PARENT_SCOPE)
   endif()
 
   # Find executables.

>From 158b1637fcb4802870a88eb86659582415968078 Mon Sep 17 00:00:00 2001
From: Evgeniy <evgeniy.tyurin at intel.com>
Date: Thu, 8 Feb 2024 02:06:22 -0800
Subject: [PATCH 10/72] [X86][GlobalISel] Reorganize br/brcond tests (NFC)
 (#80204)

Remove duplicate tests under GlobalISel and consolidate them so that the
checks run with all three selectors.
---
 llvm/test/CodeGen/X86/GlobalISel/br.ll        |   19 -
 llvm/test/CodeGen/X86/GlobalISel/brcond.ll    |   91 --
 .../test/CodeGen/X86/fast-isel-cmp-branch2.ll |  293 ----
 .../test/CodeGen/X86/fast-isel-cmp-branch3.ll |  469 ------
 llvm/test/CodeGen/X86/isel-br.ll              |   31 +
 llvm/test/CodeGen/X86/isel-brcond-fcmp.ll     | 1341 +++++++++++++++++
 llvm/test/CodeGen/X86/isel-brcond-icmp.ll     | 1107 ++++++++++++++
 7 files changed, 2479 insertions(+), 872 deletions(-)
 delete mode 100644 llvm/test/CodeGen/X86/GlobalISel/br.ll
 delete mode 100644 llvm/test/CodeGen/X86/GlobalISel/brcond.ll
 delete mode 100644 llvm/test/CodeGen/X86/fast-isel-cmp-branch2.ll
 delete mode 100644 llvm/test/CodeGen/X86/fast-isel-cmp-branch3.ll
 create mode 100644 llvm/test/CodeGen/X86/isel-br.ll
 create mode 100644 llvm/test/CodeGen/X86/isel-brcond-fcmp.ll
 create mode 100644 llvm/test/CodeGen/X86/isel-brcond-icmp.ll

diff --git a/llvm/test/CodeGen/X86/GlobalISel/br.ll b/llvm/test/CodeGen/X86/GlobalISel/br.ll
deleted file mode 100644
index 878fe981c98844..00000000000000
--- a/llvm/test/CodeGen/X86/GlobalISel/br.ll
+++ /dev/null
@@ -1,19 +0,0 @@
-; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -O0 -mtriple=x86_64-linux-gnu    -global-isel -verify-machineinstrs %s -o - | FileCheck %s
-
-define void @uncondbr() {
-; CHECK-LABEL: uncondbr:
-; CHECK:       # %bb.1: # %entry
-; CHECK-NEXT:    jmp .LBB0_3
-; CHECK-NEXT:  .LBB0_2: # %end
-; CHECK-NEXT:    retq
-; CHECK-NEXT:  .LBB0_3: # %bb2
-; CHECK-NEXT:    jmp .LBB0_2
-entry:
-  br label %bb2
-end:
-  ret void
-bb2:
-  br label %end
-}
-
diff --git a/llvm/test/CodeGen/X86/GlobalISel/brcond.ll b/llvm/test/CodeGen/X86/GlobalISel/brcond.ll
deleted file mode 100644
index b38fbfdcc83c8b..00000000000000
--- a/llvm/test/CodeGen/X86/GlobalISel/brcond.ll
+++ /dev/null
@@ -1,91 +0,0 @@
-; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=x86_64-linux-gnu    -global-isel -verify-machineinstrs %s -o - | FileCheck %s --check-prefix=X64
-; RUN: llc -mtriple=i386-linux-gnu      -global-isel -verify-machineinstrs %s -o - | FileCheck %s --check-prefix=X86
-
-define i32 @test_1(i32 %a, i32 %b, i32 %tValue, i32 %fValue) {
-; X64-LABEL: test_1:
-; X64:       # %bb.0: # %entry
-; X64-NEXT:    cmpl %esi, %edi
-; X64-NEXT:    setl %al
-; X64-NEXT:    testb $1, %al
-; X64-NEXT:    je .LBB0_2
-; X64-NEXT:  # %bb.1: # %if.then
-; X64-NEXT:    movl %edx, -{{[0-9]+}}(%rsp)
-; X64-NEXT:    movl -{{[0-9]+}}(%rsp), %eax
-; X64-NEXT:    retq
-; X64-NEXT:  .LBB0_2: # %if.else
-; X64-NEXT:    movl %ecx, -{{[0-9]+}}(%rsp)
-; X64-NEXT:    movl -{{[0-9]+}}(%rsp), %eax
-; X64-NEXT:    retq
-;
-; X86-LABEL: test_1:
-; X86:       # %bb.0: # %entry
-; X86-NEXT:    pushl %eax
-; X86-NEXT:    .cfi_def_cfa_offset 8
-; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
-; X86-NEXT:    cmpl %eax, {{[0-9]+}}(%esp)
-; X86-NEXT:    setl %al
-; X86-NEXT:    testb $1, %al
-; X86-NEXT:    je .LBB0_2
-; X86-NEXT:  # %bb.1: # %if.then
-; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
-; X86-NEXT:    jmp .LBB0_3
-; X86-NEXT:  .LBB0_2: # %if.else
-; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
-; X86-NEXT:  .LBB0_3: # %return
-; X86-NEXT:    movl %eax, (%esp)
-; X86-NEXT:    movl (%esp), %eax
-; X86-NEXT:    popl %ecx
-; X86-NEXT:    .cfi_def_cfa_offset 4
-; X86-NEXT:    retl
-entry:
-  %retval = alloca i32, align 4
-  %cmp = icmp slt i32 %a, %b
-  br i1 %cmp, label %if.then, label %if.else
-
-if.then:
-  store i32 %tValue, ptr %retval, align 4
-  br label %return
-
-if.else:
-  store i32 %fValue, ptr %retval, align 4
-  br label %return
-
-return:
-  %0 = load i32, ptr %retval, align 4
-  ret i32 %0
-}
-
-define i32 @test_2(i32 %a) {
-; X64-LABEL: test_2:
-; X64:       # %bb.0: # %entry
-; X64-NEXT:    testb $1, %dil
-; X64-NEXT:    je .LBB1_2
-; X64-NEXT:  # %bb.1: # %if.then
-; X64-NEXT:    xorl %eax, %eax
-; X64-NEXT:    retq
-; X64-NEXT:  .LBB1_2: # %if.else
-; X64-NEXT:    movl $1, %eax
-; X64-NEXT:    retq
-;
-; X86-LABEL: test_2:
-; X86:       # %bb.0: # %entry
-; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
-; X86-NEXT:    testb $1, %al
-; X86-NEXT:    je .LBB1_2
-; X86-NEXT:  # %bb.1: # %if.then
-; X86-NEXT:    xorl %eax, %eax
-; X86-NEXT:    retl
-; X86-NEXT:  .LBB1_2: # %if.else
-; X86-NEXT:    movl $1, %eax
-; X86-NEXT:    retl
-entry:
-  %cmp = trunc i32 %a to i1
-  br i1 %cmp, label %if.then, label %if.else
-
-if.then:
-  ret i32 0
-if.else:
-  ret i32 1
-}
-
diff --git a/llvm/test/CodeGen/X86/fast-isel-cmp-branch2.ll b/llvm/test/CodeGen/X86/fast-isel-cmp-branch2.ll
deleted file mode 100644
index 475d8fcf7f35a7..00000000000000
--- a/llvm/test/CodeGen/X86/fast-isel-cmp-branch2.ll
+++ /dev/null
@@ -1,293 +0,0 @@
-; RUN: llc < %s                             -mtriple=x86_64-apple-darwin10 | FileCheck %s
-; RUN: llc < %s -fast-isel -fast-isel-abort=1 -mtriple=x86_64-apple-darwin10 | FileCheck %s
-
-define i32 @fcmp_oeq(float %x, float %y) {
-; CHECK-LABEL: fcmp_oeq
-; CHECK:       ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  jne {{LBB.+_1}}
-; CHECK-NEXT:  jp {{LBB.+_1}}
-  %1 = fcmp oeq float %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ogt(float %x, float %y) {
-; CHECK-LABEL: fcmp_ogt
-; CHECK:       ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  jbe {{LBB.+_1}}
-  %1 = fcmp ogt float %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_oge(float %x, float %y) {
-; CHECK-LABEL: fcmp_oge
-; CHECK:       ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  jb {{LBB.+_1}}
-  %1 = fcmp oge float %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_olt(float %x, float %y) {
-; CHECK-LABEL: fcmp_olt
-; CHECK:       ucomiss  %xmm0, %xmm1
-; CHECK-NEXT:  jbe {{LBB.+_1}}
-  %1 = fcmp olt float %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ole(float %x, float %y) {
-; CHECK-LABEL: fcmp_ole
-; CHECK:       ucomiss  %xmm0, %xmm1
-; CHECK-NEXT:  jb {{LBB.+_1}}
-  %1 = fcmp ole float %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_one(float %x, float %y) {
-; CHECK-LABEL: fcmp_one
-; CHECK:       ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  je {{LBB.+_1}}
-  %1 = fcmp one float %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ord(float %x, float %y) {
-; CHECK-LABEL: fcmp_ord
-; CHECK:       ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  jp {{LBB.+_1}}
-  %1 = fcmp ord float %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_uno(float %x, float %y) {
-; CHECK-LABEL: fcmp_uno
-; CHECK:       ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  jp {{LBB.+_2}}
-  %1 = fcmp uno float %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ueq(float %x, float %y) {
-; CHECK-LABEL: fcmp_ueq
-; CHECK:       ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  je {{LBB.+_2}}
-  %1 = fcmp ueq float %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ugt(float %x, float %y) {
-; CHECK-LABEL: fcmp_ugt
-; CHECK:       ucomiss  %xmm0, %xmm1
-; CHECK-NEXT:  jae {{LBB.+_1}}
-  %1 = fcmp ugt float %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_uge(float %x, float %y) {
-; CHECK-LABEL: fcmp_uge
-; CHECK:       ucomiss  %xmm0, %xmm1
-; CHECK-NEXT:  ja {{LBB.+_1}}
-  %1 = fcmp uge float %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ult(float %x, float %y) {
-; CHECK-LABEL: fcmp_ult
-; CHECK:       ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  jae {{LBB.+_1}}
-  %1 = fcmp ult float %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ule(float %x, float %y) {
-; CHECK-LABEL: fcmp_ule
-; CHECK:       ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  ja {{LBB.+_1}}
-  %1 = fcmp ule float %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_une(float %x, float %y) {
-; CHECK-LABEL: fcmp_une
-; CHECK:       ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  jne {{LBB.+_2}}
-; CHECK-NEXT:  jnp {{LBB.+_1}}
-  %1 = fcmp une float %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_eq(i32 %x, i32 %y) {
-; CHECK-LABEL: icmp_eq
-; CHECK:       cmpl %esi, %edi
-; CHECK-NEXT:  jne {{LBB.+_1}}
-  %1 = icmp eq i32 %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_ne(i32 %x, i32 %y) {
-; CHECK-LABEL: icmp_ne
-; CHECK:       cmpl %esi, %edi
-; CHECK-NEXT:  je {{LBB.+_1}}
-  %1 = icmp ne i32 %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_ugt(i32 %x, i32 %y) {
-; CHECK-LABEL: icmp_ugt
-; CHECK:       cmpl %esi, %edi
-; CHECK-NEXT:  jbe {{LBB.+_1}}
-  %1 = icmp ugt i32 %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_uge(i32 %x, i32 %y) {
-; CHECK-LABEL: icmp_uge
-; CHECK:       cmpl %esi, %edi
-; CHECK-NEXT:  jb {{LBB.+_1}}
-  %1 = icmp uge i32 %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_ult(i32 %x, i32 %y) {
-; CHECK-LABEL: icmp_ult
-; CHECK:       cmpl %esi, %edi
-; CHECK-NEXT:  jae {{LBB.+_1}}
-  %1 = icmp ult i32 %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_ule(i32 %x, i32 %y) {
-; CHECK-LABEL: icmp_ule
-; CHECK:       cmpl %esi, %edi
-; CHECK-NEXT:  ja {{LBB.+_1}}
-  %1 = icmp ule i32 %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_sgt(i32 %x, i32 %y) {
-; CHECK-LABEL: icmp_sgt
-; CHECK:       cmpl %esi, %edi
-; CHECK-NEXT:  jle {{LBB.+_1}}
-  %1 = icmp sgt i32 %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_sge(i32 %x, i32 %y) {
-; CHECK-LABEL: icmp_sge
-; CHECK:       cmpl %esi, %edi
-; CHECK-NEXT:  jl {{LBB.+_1}}
-  %1 = icmp sge i32 %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_slt(i32 %x, i32 %y) {
-; CHECK-LABEL: icmp_slt
-; CHECK:       cmpl %esi, %edi
-; CHECK-NEXT:  jge {{LBB.+_1}}
-  %1 = icmp slt i32 %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_sle(i32 %x, i32 %y) {
-; CHECK-LABEL: icmp_sle
-; CHECK:       cmpl %esi, %edi
-; CHECK-NEXT:  jg {{LBB.+_1}}
-  %1 = icmp sle i32 %x, %y
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
diff --git a/llvm/test/CodeGen/X86/fast-isel-cmp-branch3.ll b/llvm/test/CodeGen/X86/fast-isel-cmp-branch3.ll
deleted file mode 100644
index 8f09b2e3835679..00000000000000
--- a/llvm/test/CodeGen/X86/fast-isel-cmp-branch3.ll
+++ /dev/null
@@ -1,469 +0,0 @@
-; RUN: llc < %s -fast-isel -fast-isel-abort=1 -mtriple=x86_64-apple-darwin10 | FileCheck %s
-
-define i32 @fcmp_oeq1(float %x) {
-; CHECK-LABEL: fcmp_oeq1
-; CHECK:       ucomiss  %xmm0, %xmm0
-; CHECK-NEXT:  jp {{LBB.+_1}}
-  %1 = fcmp oeq float %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_oeq2(float %x) {
-; CHECK-LABEL: fcmp_oeq2
-; CHECK:       xorps    %xmm1, %xmm1
-; CHECK-NEXT:  ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  jne {{LBB.+_1}}
-; CHECK-NEXT:  jp {{LBB.+_1}}
-  %1 = fcmp oeq float %x, 0.000000e+00
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ogt1(float %x) {
-; CHECK-LABEL: fcmp_ogt1
-; CHECK-NOT:   ucomiss
-; CHECK:       movl $1, %eax
-  %1 = fcmp ogt float %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ogt2(float %x) {
-; CHECK-LABEL: fcmp_ogt2
-; CHECK:       xorps    %xmm1, %xmm1
-; CHECK-NEXT:  ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  jbe {{LBB.+_1}}
-  %1 = fcmp ogt float %x, 0.000000e+00
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_oge1(float %x) {
-; CHECK-LABEL: fcmp_oge1
-; CHECK:       ucomiss  %xmm0, %xmm0
-; CHECK-NEXT:  jp {{LBB.+_1}}
-  %1 = fcmp oge float %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_oge2(float %x) {
-; CHECK-LABEL: fcmp_oge2
-; CHECK:       xorps    %xmm1, %xmm1
-; CHECK-NEXT:  ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  jb {{LBB.+_1}}
-  %1 = fcmp oge float %x, 0.000000e+00
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_olt1(float %x) {
-; CHECK-LABEL: fcmp_olt1
-; CHECK-NOT:   ucomiss
-; CHECK:       movl $1, %eax
-  %1 = fcmp olt float %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_olt2(float %x) {
-; CHECK-LABEL: fcmp_olt2
-; CHECK:       xorps    %xmm1, %xmm1
-; CHECK-NEXT:  ucomiss  %xmm0, %xmm1
-; CHECK-NEXT:  jbe {{LBB.+_1}}
-  %1 = fcmp olt float %x, 0.000000e+00
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ole1(float %x) {
-; CHECK-LABEL: fcmp_ole1
-; CHECK:       ucomiss  %xmm0, %xmm0
-; CHECK-NEXT:  jp {{LBB.+_1}}
-  %1 = fcmp ole float %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ole2(float %x) {
-; CHECK-LABEL: fcmp_ole2
-; CHECK:       xorps    %xmm1, %xmm1
-; CHECK-NEXT:  ucomiss  %xmm0, %xmm1
-; CHECK-NEXT:  jb {{LBB.+_1}}
-  %1 = fcmp ole float %x, 0.000000e+00
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_one1(float %x) {
-; CHECK-LABEL: fcmp_one1
-; CHECK-NOT:   ucomiss
-; CHECK:       movl $1, %eax
-  %1 = fcmp one float %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_one2(float %x) {
-; CHECK-LABEL: fcmp_one2
-; CHECK:       xorps    %xmm1, %xmm1
-; CHECK-NEXT:  ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  je {{LBB.+_1}}
-  %1 = fcmp one float %x, 0.000000e+00
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ord1(float %x) {
-; CHECK-LABEL: fcmp_ord1
-; CHECK:       ucomiss  %xmm0, %xmm0
-; CHECK-NEXT:  jp {{LBB.+_1}}
-  %1 = fcmp ord float %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ord2(float %x) {
-; CHECK-LABEL: fcmp_ord2
-; CHECK:       ucomiss  %xmm0, %xmm0
-; CHECK-NEXT:  jp {{LBB.+_1}}
-  %1 = fcmp ord float %x, 0.000000e+00
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_uno1(float %x) {
-; CHECK-LABEL: fcmp_uno1
-; CHECK:       ucomiss  %xmm0, %xmm0
-; CHECK-NEXT:  jp {{LBB.+_2}}
-  %1 = fcmp uno float %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_uno2(float %x) {
-; CHECK-LABEL: fcmp_uno2
-; CHECK:       ucomiss  %xmm0, %xmm0
-; CHECK-NEXT:  jp {{LBB.+_2}}
-  %1 = fcmp uno float %x, 0.000000e+00
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ueq1(float %x) {
-; CHECK-LABEL: fcmp_ueq1
-; CHECK-NOT:   ucomiss
-  %1 = fcmp ueq float %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ueq2(float %x) {
-; CHECK-LABEL: fcmp_ueq2
-; CHECK:       xorps    %xmm1, %xmm1
-; CHECK-NEXT:  ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  je {{LBB.+_2}}
-  %1 = fcmp ueq float %x, 0.000000e+00
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ugt1(float %x) {
-; CHECK-LABEL: fcmp_ugt1
-; CHECK:       ucomiss  %xmm0, %xmm0
-; CHECK-NEXT:  jnp {{LBB.+_1}}
-  %1 = fcmp ugt float %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ugt2(float %x) {
-; CHECK-LABEL: fcmp_ugt2
-; CHECK:       xorps    %xmm1, %xmm1
-; CHECK-NEXT:  ucomiss  %xmm0, %xmm1
-; CHECK-NEXT:  jae {{LBB.+_1}}
-  %1 = fcmp ugt float %x, 0.000000e+00
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_uge1(float %x) {
-; CHECK-LABEL: fcmp_uge1
-; CHECK-NOT:   ucomiss
-  %1 = fcmp uge float %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_uge2(float %x) {
-; CHECK-LABEL: fcmp_uge2
-; CHECK:       xorps    %xmm1, %xmm1
-; CHECK-NEXT:  ucomiss  %xmm0, %xmm1
-; CHECK-NEXT:  ja {{LBB.+_1}}
-  %1 = fcmp uge float %x, 0.000000e+00
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ult1(float %x) {
-; CHECK-LABEL: fcmp_ult1
-; CHECK:       ucomiss  %xmm0, %xmm0
-; CHECK-NEXT:  jnp {{LBB.+_1}}
-  %1 = fcmp ult float %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ult2(float %x) {
-; CHECK-LABEL: fcmp_ult2
-; CHECK:       xorps    %xmm1, %xmm1
-; CHECK-NEXT:  ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  jae {{LBB.+_1}}
-  %1 = fcmp ult float %x, 0.000000e+00
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ule1(float %x) {
-; CHECK-LABEL: fcmp_ule1
-; CHECK-NOT:   ucomiss
-  %1 = fcmp ule float %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_ule2(float %x) {
-; CHECK-LABEL: fcmp_ule2
-; CHECK:       xorps    %xmm1, %xmm1
-; CHECK-NEXT:  ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  ja {{LBB.+_1}}
-  %1 = fcmp ule float %x, 0.000000e+00
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_une1(float %x) {
-; CHECK-LABEL: fcmp_une1
-; CHECK:       ucomiss  %xmm0, %xmm0
-; CHECK-NEXT:  jnp {{LBB.+_1}}
-  %1 = fcmp une float %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @fcmp_une2(float %x) {
-; CHECK-LABEL: fcmp_une2
-; CHECK:       xorps    %xmm1, %xmm1
-; CHECK-NEXT:  ucomiss  %xmm1, %xmm0
-; CHECK-NEXT:  jne {{LBB.+_2}}
-; CHECK-NEXT:  jnp {{LBB.+_1}}
-  %1 = fcmp une float %x, 0.000000e+00
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_eq(i32 %x) {
-; CHECK-LABEL: icmp_eq
-; CHECK-NOT:   cmpl
-; CHECK:       xorl %eax, %eax
-  %1 = icmp eq i32 %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_ne(i32 %x) {
-; CHECK-LABEL: icmp_ne
-; CHECK-NOT:   cmpl
-; CHECK:       movl $1, %eax
-  %1 = icmp ne i32 %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_ugt(i32 %x) {
-; CHECK-LABEL: icmp_ugt
-; CHECK-NOT:   cmpl
-; CHECK:       movl $1, %eax
-  %1 = icmp ugt i32 %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_uge(i32 %x) {
-; CHECK-LABEL: icmp_uge
-; CHECK-NOT:   cmpl
-; CHECK:       xorl %eax, %eax
-  %1 = icmp uge i32 %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_ult(i32 %x) {
-; CHECK-LABEL: icmp_ult
-; CHECK-NOT:   cmpl
-; CHECK:       movl $1, %eax
-  %1 = icmp ult i32 %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_ule(i32 %x) {
-; CHECK-LABEL: icmp_ule
-; CHECK-NOT:   cmpl
-; CHECK:       xorl %eax, %eax
-  %1 = icmp ule i32 %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_sgt(i32 %x) {
-; CHECK-LABEL: icmp_sgt
-; CHECK-NOT:   cmpl
-; CHECK:       movl $1, %eax
-  %1 = icmp sgt i32 %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_sge(i32 %x) {
-; CHECK-LABEL: icmp_sge
-; CHECK-NOT:   cmpl
-; CHECK:       xorl %eax, %eax
-  %1 = icmp sge i32 %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_slt(i32 %x) {
-; CHECK-LABEL: icmp_slt
-; CHECK-NOT:   cmpl
-; CHECK:       movl $1, %eax
-  %1 = icmp slt i32 %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
-define i32 @icmp_sle(i32 %x) {
-; CHECK-LABEL: icmp_sle
-; CHECK-NOT:   cmpl
-; CHECK:       xorl %eax, %eax
-  %1 = icmp sle i32 %x, %x
-  br i1 %1, label %bb1, label %bb2
-bb2:
-  ret i32 1
-bb1:
-  ret i32 0
-}
-
diff --git a/llvm/test/CodeGen/X86/isel-br.ll b/llvm/test/CodeGen/X86/isel-br.ll
new file mode 100644
index 00000000000000..5388c89e18199e
--- /dev/null
+++ b/llvm/test/CodeGen/X86/isel-br.ll
@@ -0,0 +1,31 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+; RUN: llc < %s -O0 -mtriple=i686-linux-gnu -global-isel=0 -verify-machineinstrs | FileCheck %s --check-prefix=DAG
+; RUN: llc < %s -O0 -mtriple=i686-linux-gnu -fast-isel -fast-isel-abort=1        | FileCheck %s --check-prefix=DAG
+; RUN: llc < %s -O0 -mtriple=i686-linux-gnu -global-isel -global-isel-abort=1 -verify-machineinstrs | FileCheck %s --check-prefix=GISEL
+; RUN: llc < %s -O0 -mtriple=x86_64-linux-gnu -global-isel=0                     | FileCheck %s --check-prefix=DAG
+; RUN: llc < %s -O0 -mtriple=x86_64-linux-gnu -fast-isel -fast-isel-abort=1      | FileCheck %s --check-prefix=DAG
+; RUN: llc < %s -O0 -mtriple=x86_64-linux-gnu -global-isel -global-isel-abort=1  | FileCheck %s --check-prefix=GISEL
+
+define void @uncondbr() {
+; DAG-LABEL: uncondbr:
+; DAG:       # %bb.0: # %entry
+; DAG-NEXT:    jmp .LBB0_2
+; DAG-NEXT:  .LBB0_1: # %end
+; DAG-NEXT:    ret{{[l|q]}}
+; DAG-NEXT:  .LBB0_2: # %bb2
+; DAG-NEXT:    jmp .LBB0_1
+;
+; GISEL-LABEL: uncondbr:
+; GISEL:       # %bb.1: # %entry
+; GISEL-NEXT:    jmp .LBB0_3
+; GISEL-NEXT:  .LBB0_2: # %end
+; GISEL-NEXT:    ret{{[l|q]}}
+; GISEL-NEXT:  .LBB0_3: # %bb2
+; GISEL-NEXT:    jmp .LBB0_2
+entry:
+  br label %bb2
+end:
+  ret void
+bb2:
+  br label %end
+}
diff --git a/llvm/test/CodeGen/X86/isel-brcond-fcmp.ll b/llvm/test/CodeGen/X86/isel-brcond-fcmp.ll
new file mode 100644
index 00000000000000..5a28e094f8a3c9
--- /dev/null
+++ b/llvm/test/CodeGen/X86/isel-brcond-fcmp.ll
@@ -0,0 +1,1341 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+; RUN: llc < %s -global-isel=0                    -mtriple=x86_64-apple-darwin10 | FileCheck %s --check-prefixes=X64,SDAG-X64
+; RUN: llc < %s -fast-isel -fast-isel-abort=1     -mtriple=x86_64-apple-darwin10 | FileCheck %s --check-prefixes=X64,FASTISEL-X64
+; RUN: llc < %s -global-isel -global-isel-abort=1 -mtriple=x86_64-apple-darwin10 | FileCheck %s --check-prefixes=GISEL-X64
+
+define i32 @fcmp_oeq(float %x, float %y) {
+; X64-LABEL: fcmp_oeq:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    jne LBB0_1
+; X64-NEXT:    jp LBB0_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB0_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_oeq:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    sete %al
+; GISEL-X64-NEXT:    setnp %cl
+; GISEL-X64-NEXT:    andb %al, %cl
+; GISEL-X64-NEXT:    testb $1, %cl
+; GISEL-X64-NEXT:    je LBB0_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB0_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp oeq float %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ogt(float %x, float %y) {
+; X64-LABEL: fcmp_ogt:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    jbe LBB1_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB1_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ogt:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    seta %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB1_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB1_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ogt float %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_oge(float %x, float %y) {
+; X64-LABEL: fcmp_oge:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    jb LBB2_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB2_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_oge:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    setae %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB2_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB2_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp oge float %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_olt(float %x, float %y) {
+; X64-LABEL: fcmp_olt:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm0, %xmm1
+; X64-NEXT:    jbe LBB3_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB3_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_olt:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm1
+; GISEL-X64-NEXT:    seta %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB3_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB3_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp olt float %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ole(float %x, float %y) {
+; X64-LABEL: fcmp_ole:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm0, %xmm1
+; X64-NEXT:    jb LBB4_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB4_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ole:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm1
+; GISEL-X64-NEXT:    setae %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB4_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB4_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ole float %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_one(float %x, float %y) {
+; X64-LABEL: fcmp_one:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    je LBB5_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB5_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_one:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    setne %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB5_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB5_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp one float %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ord(float %x, float %y) {
+; X64-LABEL: fcmp_ord:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    jp LBB6_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB6_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ord:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    setnp %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB6_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB6_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ord float %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_uno(float %x, float %y) {
+; X64-LABEL: fcmp_uno:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    jp LBB7_2
+; X64-NEXT:  ## %bb.1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB7_2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_uno:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    setp %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    jne LBB7_2
+; GISEL-X64-NEXT:  ## %bb.1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB7_2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp uno float %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ueq(float %x, float %y) {
+; X64-LABEL: fcmp_ueq:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    je LBB8_2
+; X64-NEXT:  ## %bb.1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB8_2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ueq:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    sete %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    jne LBB8_2
+; GISEL-X64-NEXT:  ## %bb.1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB8_2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ueq float %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ugt(float %x, float %y) {
+; X64-LABEL: fcmp_ugt:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm0, %xmm1
+; X64-NEXT:    jae LBB9_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB9_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ugt:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm1
+; GISEL-X64-NEXT:    setb %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB9_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB9_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ugt float %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_uge(float %x, float %y) {
+; X64-LABEL: fcmp_uge:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm0, %xmm1
+; X64-NEXT:    ja LBB10_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB10_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_uge:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm1
+; GISEL-X64-NEXT:    setbe %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB10_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB10_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp uge float %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ult(float %x, float %y) {
+; X64-LABEL: fcmp_ult:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    jae LBB11_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB11_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ult:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    setb %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB11_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB11_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ult float %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ule(float %x, float %y) {
+; X64-LABEL: fcmp_ule:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    ja LBB12_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB12_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ule:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    setbe %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB12_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB12_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ule float %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_une(float %x, float %y) {
+; X64-LABEL: fcmp_une:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    jne LBB13_2
+; X64-NEXT:    jnp LBB13_1
+; X64-NEXT:  LBB13_2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB13_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_une:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    setne %al
+; GISEL-X64-NEXT:    setp %cl
+; GISEL-X64-NEXT:    orb %al, %cl
+; GISEL-X64-NEXT:    testb $1, %cl
+; GISEL-X64-NEXT:    je LBB13_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB13_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp une float %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_oeq1(float %x) {
+; X64-LABEL: fcmp_oeq1:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm0, %xmm0
+; X64-NEXT:    jp LBB14_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB14_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_oeq1:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm0
+; GISEL-X64-NEXT:    sete %al
+; GISEL-X64-NEXT:    setnp %cl
+; GISEL-X64-NEXT:    andb %al, %cl
+; GISEL-X64-NEXT:    testb $1, %cl
+; GISEL-X64-NEXT:    je LBB14_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB14_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp oeq float %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_oeq2(float %x) {
+; X64-LABEL: fcmp_oeq2:
+; X64:       ## %bb.0:
+; X64-NEXT:    xorps %xmm1, %xmm1
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    jne LBB15_1
+; X64-NEXT:    jp LBB15_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB15_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_oeq2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    movss {{.*#+}} xmm1 = [0.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    sete %al
+; GISEL-X64-NEXT:    setnp %cl
+; GISEL-X64-NEXT:    andb %al, %cl
+; GISEL-X64-NEXT:    testb $1, %cl
+; GISEL-X64-NEXT:    je LBB15_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB15_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp oeq float %x, 0.000000e+00
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ogt1(float %x) {
+; SDAG-X64-LABEL: fcmp_ogt1:
+; SDAG-X64:       ## %bb.0:
+; SDAG-X64-NEXT:    xorl    %eax, %eax
+; SDAG-X64-NEXT:    testb   %al, %al
+; SDAG-X64-NEXT:    je      LBB16_1
+; SDAG-X64-NEXT:  ## %bb.2: ## %bb1
+; SDAG-X64-NEXT:    xorl    %eax, %eax
+; SDAG-X64-NEXT:    retq
+; SDAG-X64-NEXT:  LBB16_1: ## %bb2
+; SDAG-X64-NEXT:    movl    $1, %eax
+; SDAG-X64-NEXT:    retq
+
+; FASTISEL-X64-LABEL: fcmp_ogt1:
+; FASTISEL-X64:       ## %bb.0:
+; FASTISEL-X64:         movl    $1, %eax
+; FASTISEL-X64:         retq
+
+; GISEL-X64-LABEL: fcmp_ogt1:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm0
+; GISEL-X64-NEXT:    seta %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB16_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB16_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ogt float %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ogt2(float %x) {
+; X64-LABEL: fcmp_ogt2:
+; X64:       ## %bb.0:
+; X64-NEXT:    xorps %xmm1, %xmm1
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    jbe LBB17_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB17_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ogt2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    movss {{.*#+}} xmm1 = [0.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    seta %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB17_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB17_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ogt float %x, 0.000000e+00
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_oge1(float %x) {
+; X64-LABEL: fcmp_oge1:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm0, %xmm0
+; X64-NEXT:    jp LBB18_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB18_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_oge1:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm0
+; GISEL-X64-NEXT:    setae %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB18_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB18_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp oge float %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_oge2(float %x) {
+; X64-LABEL: fcmp_oge2:
+; X64:       ## %bb.0:
+; X64-NEXT:    xorps %xmm1, %xmm1
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    jb LBB19_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB19_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_oge2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    movss {{.*#+}} xmm1 = [0.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    setae %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB19_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB19_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp oge float %x, 0.000000e+00
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_olt1(float %x) {
+; GISEL-X64-LABEL: fcmp_olt1:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm0
+; GISEL-X64-NEXT:    seta %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB20_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB20_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp olt float %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_olt2(float %x) {
+; X64-LABEL: fcmp_olt2:
+; X64:       ## %bb.0:
+; X64-NEXT:    xorps %xmm1, %xmm1
+; X64-NEXT:    ucomiss %xmm0, %xmm1
+; X64-NEXT:    jbe LBB21_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB21_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_olt2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    movss {{.*#+}} xmm1 = [0.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm1
+; GISEL-X64-NEXT:    seta %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB21_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB21_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp olt float %x, 0.000000e+00
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ole1(float %x) {
+; X64-LABEL: fcmp_ole1:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm0, %xmm0
+; X64-NEXT:    jp LBB22_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB22_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ole1:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm0
+; GISEL-X64-NEXT:    setae %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB22_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB22_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ole float %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ole2(float %x) {
+; X64-LABEL: fcmp_ole2:
+; X64:       ## %bb.0:
+; X64-NEXT:    xorps %xmm1, %xmm1
+; X64-NEXT:    ucomiss %xmm0, %xmm1
+; X64-NEXT:    jb LBB23_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB23_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ole2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    movss {{.*#+}} xmm1 = [0.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm1
+; GISEL-X64-NEXT:    setae %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB23_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB23_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ole float %x, 0.000000e+00
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_one1(float %x) {
+; GISEL-X64-LABEL: fcmp_one1:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm0
+; GISEL-X64-NEXT:    setne %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB24_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB24_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp one float %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_one2(float %x) {
+; X64-LABEL: fcmp_one2:
+; X64:       ## %bb.0:
+; X64-NEXT:    xorps %xmm1, %xmm1
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    je LBB25_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB25_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_one2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    movss {{.*#+}} xmm1 = [0.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    setne %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB25_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB25_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp one float %x, 0.000000e+00
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ord1(float %x) {
+; X64-LABEL: fcmp_ord1:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm0, %xmm0
+; X64-NEXT:    jp LBB26_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB26_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ord1:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm0
+; GISEL-X64-NEXT:    setnp %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB26_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB26_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ord float %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ord2(float %x) {
+; X64-LABEL: fcmp_ord2:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm0, %xmm0
+; X64-NEXT:    jp LBB27_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB27_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ord2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    movss {{.*#+}} xmm1 = [0.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    setnp %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB27_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB27_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ord float %x, 0.000000e+00
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_uno1(float %x) {
+; X64-LABEL: fcmp_uno1:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm0, %xmm0
+; X64-NEXT:    jp LBB28_2
+; X64-NEXT:  ## %bb.1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB28_2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_uno1:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm0
+; GISEL-X64-NEXT:    setp %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    jne LBB28_2
+; GISEL-X64-NEXT:  ## %bb.1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB28_2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp uno float %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_uno2(float %x) {
+; X64-LABEL: fcmp_uno2:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm0, %xmm0
+; X64-NEXT:    jp LBB29_2
+; X64-NEXT:  ## %bb.1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB29_2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_uno2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    movss {{.*#+}} xmm1 = [0.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    setp %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    jne LBB29_2
+; GISEL-X64-NEXT:  ## %bb.1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB29_2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp uno float %x, 0.000000e+00
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ueq1(float %x) {
+; GISEL-X64-LABEL: fcmp_ueq1:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm0
+; GISEL-X64-NEXT:    sete %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    jne LBB30_2
+; GISEL-X64-NEXT:  ## %bb.1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB30_2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ueq float %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ueq2(float %x) {
+; X64-LABEL: fcmp_ueq2:
+; X64:       ## %bb.0:
+; X64-NEXT:    xorps %xmm1, %xmm1
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    je LBB31_2
+; X64-NEXT:  ## %bb.1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB31_2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ueq2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    movss {{.*#+}} xmm1 = [0.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    sete %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    jne LBB31_2
+; GISEL-X64-NEXT:  ## %bb.1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB31_2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ueq float %x, 0.000000e+00
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ugt1(float %x) {
+; X64-LABEL: fcmp_ugt1:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm0, %xmm0
+; X64-NEXT:    jnp LBB32_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB32_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ugt1:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm0
+; GISEL-X64-NEXT:    setb %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB32_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB32_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ugt float %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ugt2(float %x) {
+; X64-LABEL: fcmp_ugt2:
+; X64:       ## %bb.0:
+; X64-NEXT:    xorps %xmm1, %xmm1
+; X64-NEXT:    ucomiss %xmm0, %xmm1
+; X64-NEXT:    jae LBB33_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB33_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ugt2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    movss {{.*#+}} xmm1 = [0.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm1
+; GISEL-X64-NEXT:    setb %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB33_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB33_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ugt float %x, 0.000000e+00
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_uge1(float %x) {
+; GISEL-X64-LABEL: fcmp_uge1:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm0
+; GISEL-X64-NEXT:    setbe %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB34_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB34_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp uge float %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_uge2(float %x) {
+; X64-LABEL: fcmp_uge2:
+; X64:       ## %bb.0:
+; X64-NEXT:    xorps %xmm1, %xmm1
+; X64-NEXT:    ucomiss %xmm0, %xmm1
+; X64-NEXT:    ja LBB35_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB35_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_uge2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    movss {{.*#+}} xmm1 = [0.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm1
+; GISEL-X64-NEXT:    setbe %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB35_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB35_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp uge float %x, 0.000000e+00
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ult1(float %x) {
+; X64-LABEL: fcmp_ult1:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm0, %xmm0
+; X64-NEXT:    jnp LBB36_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB36_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ult1:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm0
+; GISEL-X64-NEXT:    setb %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB36_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB36_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ult float %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ult2(float %x) {
+; X64-LABEL: fcmp_ult2:
+; X64:       ## %bb.0:
+; X64-NEXT:    xorps %xmm1, %xmm1
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    jae LBB37_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB37_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ult2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    movss {{.*#+}} xmm1 = [0.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    setb %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB37_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB37_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ult float %x, 0.000000e+00
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ule1(float %x) {
+; GISEL-X64-LABEL: fcmp_ule1:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm0
+; GISEL-X64-NEXT:    setbe %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB38_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB38_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ule float %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_ule2(float %x) {
+; X64-LABEL: fcmp_ule2:
+; X64:       ## %bb.0:
+; X64-NEXT:    xorps %xmm1, %xmm1
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    ja LBB39_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB39_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_ule2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    movss {{.*#+}} xmm1 = [0.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    setbe %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB39_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB39_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp ule float %x, 0.000000e+00
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_une1(float %x) {
+; X64-LABEL: fcmp_une1:
+; X64:       ## %bb.0:
+; X64-NEXT:    ucomiss %xmm0, %xmm0
+; X64-NEXT:    jnp LBB40_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB40_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_une1:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    ucomiss %xmm0, %xmm0
+; GISEL-X64-NEXT:    setne %al
+; GISEL-X64-NEXT:    setp %cl
+; GISEL-X64-NEXT:    orb %al, %cl
+; GISEL-X64-NEXT:    testb $1, %cl
+; GISEL-X64-NEXT:    je LBB40_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB40_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp une float %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @fcmp_une2(float %x) {
+; X64-LABEL: fcmp_une2:
+; X64:       ## %bb.0:
+; X64-NEXT:    xorps %xmm1, %xmm1
+; X64-NEXT:    ucomiss %xmm1, %xmm0
+; X64-NEXT:    jne LBB41_2
+; X64-NEXT:    jnp LBB41_1
+; X64-NEXT:  LBB41_2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB41_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: fcmp_une2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    movss {{.*#+}} xmm1 = [0.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; GISEL-X64-NEXT:    ucomiss %xmm1, %xmm0
+; GISEL-X64-NEXT:    setne %al
+; GISEL-X64-NEXT:    setp %cl
+; GISEL-X64-NEXT:    orb %al, %cl
+; GISEL-X64-NEXT:    testb $1, %cl
+; GISEL-X64-NEXT:    je LBB41_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB41_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+  %1 = fcmp une float %x, 0.000000e+00
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
diff --git a/llvm/test/CodeGen/X86/isel-brcond-icmp.ll b/llvm/test/CodeGen/X86/isel-brcond-icmp.ll
new file mode 100644
index 00000000000000..59a45d9d72f5b7
--- /dev/null
+++ b/llvm/test/CodeGen/X86/isel-brcond-icmp.ll
@@ -0,0 +1,1107 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -global-isel=0                    -mtriple=x86_64-apple-darwin10 -verify-machineinstrs | FileCheck %s --check-prefixes=X64,SDAG
+; RUN: llc < %s -fast-isel -fast-isel-abort=1     -mtriple=x86_64-apple-darwin10 -verify-machineinstrs | FileCheck %s --check-prefixes=X64,FASTISEL
+; RUN: llc < %s -global-isel -global-isel-abort=1 -mtriple=x86_64-apple-darwin10 -verify-machineinstrs | FileCheck %s --check-prefixes=GISEL-X64
+; RUN: llc < %s -global-isel=0                    -mtriple=i686-apple-darwin10   -verify-machineinstrs | FileCheck %s --check-prefixes=X86,SDAG
+; RUN: llc < %s -fast-isel -fast-isel-abort=1     -mtriple=i686-apple-darwin10   -verify-machineinstrs | FileCheck %s --check-prefixes=X86,FASTISEL
+; RUN: llc < %s -global-isel -global-isel-abort=1 -mtriple=i686-apple-darwin10   -verify-machineinstrs | FileCheck %s --check-prefixes=GISEL-X86
+
+define i32 @icmp_eq_2(i32 %x, i32 %y) {
+; X64-LABEL: icmp_eq_2:
+; X64:       ## %bb.0:
+; X64-NEXT:    cmpl %esi, %edi
+; X64-NEXT:    jne LBB0_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB0_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: icmp_eq_2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %esi, %edi
+; GISEL-X64-NEXT:    sete %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB0_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB0_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; X86-LABEL: icmp_eq_2:
+; X86:       ## %bb.0:
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; X86-NEXT:    cmpl {{[0-9]+\(%esp\), %eax|%eax, [0-9]+\(%esp\)}}
+; X86-NEXT:    jne LBB0_1
+; X86-NEXT:  ## %bb.2: ## %bb1
+; X86-NEXT:    xorl %eax, %eax
+; X86-NEXT:    retl
+; X86-NEXT:  LBB0_1: ## %bb2
+; X86-NEXT:    movl $1, %eax
+; X86-NEXT:    retl
+;
+; GISEL-X86-LABEL: icmp_eq_2:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, {{[0-9]+}}(%esp)
+; GISEL-X86-NEXT:    sete %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB0_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB0_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp eq i32 %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_ne_2(i32 %x, i32 %y) {
+; X64-LABEL: icmp_ne_2:
+; X64:       ## %bb.0:
+; X64-NEXT:    cmpl %esi, %edi
+; X64-NEXT:    je LBB1_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB1_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: icmp_ne_2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %esi, %edi
+; GISEL-X64-NEXT:    setne %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB1_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB1_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; X86-LABEL: icmp_ne_2:
+; X86:       ## %bb.0:
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; X86-NEXT:    cmpl {{[0-9]+\(%esp\), %eax|%eax, [0-9]+\(%esp\)}}
+; X86-NEXT:    je LBB1_1
+; X86-NEXT:  ## %bb.2: ## %bb1
+; X86-NEXT:    xorl %eax, %eax
+; X86-NEXT:    retl
+; X86-NEXT:  LBB1_1: ## %bb2
+; X86-NEXT:    movl $1, %eax
+; X86-NEXT:    retl
+;
+; GISEL-X86-LABEL: icmp_ne_2:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, {{[0-9]+}}(%esp)
+; GISEL-X86-NEXT:    setne %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB1_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB1_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp ne i32 %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_ugt_2(i32 %x, i32 %y) {
+; X64-LABEL: icmp_ugt_2:
+; X64:       ## %bb.0:
+; X64-NEXT:    cmpl %esi, %edi
+; X64-NEXT:    jbe LBB2_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB2_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: icmp_ugt_2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %esi, %edi
+; GISEL-X64-NEXT:    seta %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB2_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB2_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; X86-LABEL: icmp_ugt_2:
+; X86:       ## %bb.0:
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; X86-NEXT:    cmpl {{[0-9]+\(%esp\), %eax|%eax, [0-9]+\(%esp\)}}
+; X86-NEXT:    jbe LBB2_1
+; X86-NEXT:  ## %bb.2: ## %bb1
+; X86-NEXT:    xorl %eax, %eax
+; X86-NEXT:    retl
+; X86-NEXT:  LBB2_1: ## %bb2
+; X86-NEXT:    movl $1, %eax
+; X86-NEXT:    retl
+;
+; GISEL-X86-LABEL: icmp_ugt_2:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, {{[0-9]+}}(%esp)
+; GISEL-X86-NEXT:    seta %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB2_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB2_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp ugt i32 %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_uge_2(i32 %x, i32 %y) {
+; X64-LABEL: icmp_uge_2:
+; X64:       ## %bb.0:
+; X64-NEXT:    cmpl %esi, %edi
+; X64-NEXT:    jb LBB3_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB3_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: icmp_uge_2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %esi, %edi
+; GISEL-X64-NEXT:    setae %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB3_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB3_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; X86-LABEL: icmp_uge_2:
+; X86:       ## %bb.0:
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; X86-NEXT:    cmpl {{[0-9]+\(%esp\), %eax|%eax, [0-9]+\(%esp\)}}
+; X86-NEXT:    jb LBB3_1
+; X86-NEXT:  ## %bb.2: ## %bb1
+; X86-NEXT:    xorl %eax, %eax
+; X86-NEXT:    retl
+; X86-NEXT:  LBB3_1: ## %bb2
+; X86-NEXT:    movl $1, %eax
+; X86-NEXT:    retl
+;
+; GISEL-X86-LABEL: icmp_uge_2:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, {{[0-9]+}}(%esp)
+; GISEL-X86-NEXT:    setae %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB3_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB3_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp uge i32 %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_ult_2(i32 %x, i32 %y) {
+; X64-LABEL: icmp_ult_2:
+; X64:       ## %bb.0:
+; X64-NEXT:    cmpl %esi, %edi
+; X64-NEXT:    jae LBB4_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB4_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: icmp_ult_2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %esi, %edi
+; GISEL-X64-NEXT:    setb %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB4_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB4_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; X86-LABEL: icmp_ult_2:
+; X86:       ## %bb.0:
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; X86-NEXT:    cmpl {{[0-9]+\(%esp\), %eax|%eax, [0-9]+\(%esp\)}}
+; X86-NEXT:    jae LBB4_1
+; X86-NEXT:  ## %bb.2: ## %bb1
+; X86-NEXT:    xorl %eax, %eax
+; X86-NEXT:    retl
+; X86-NEXT:  LBB4_1: ## %bb2
+; X86-NEXT:    movl $1, %eax
+; X86-NEXT:    retl
+;
+; GISEL-X86-LABEL: icmp_ult_2:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, {{[0-9]+}}(%esp)
+; GISEL-X86-NEXT:    setb %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB4_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB4_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp ult i32 %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_ule_2(i32 %x, i32 %y) {
+; X64-LABEL: icmp_ule_2:
+; X64:       ## %bb.0:
+; X64-NEXT:    cmpl %esi, %edi
+; X64-NEXT:    ja LBB5_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB5_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: icmp_ule_2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %esi, %edi
+; GISEL-X64-NEXT:    setbe %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB5_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB5_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; X86-LABEL: icmp_ule_2:
+; X86:       ## %bb.0:
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; X86-NEXT:    cmpl {{[0-9]+\(%esp\), %eax|%eax, [0-9]+\(%esp\)}}
+; X86-NEXT:    ja LBB5_1
+; X86-NEXT:  ## %bb.2: ## %bb1
+; X86-NEXT:    xorl %eax, %eax
+; X86-NEXT:    retl
+; X86-NEXT:  LBB5_1: ## %bb2
+; X86-NEXT:    movl $1, %eax
+; X86-NEXT:    retl
+;
+; GISEL-X86-LABEL: icmp_ule_2:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, {{[0-9]+}}(%esp)
+; GISEL-X86-NEXT:    setbe %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB5_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB5_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp ule i32 %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_sgt_2(i32 %x, i32 %y) {
+; X64-LABEL: icmp_sgt_2:
+; X64:       ## %bb.0:
+; X64-NEXT:    cmpl %esi, %edi
+; X64-NEXT:    jle LBB6_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB6_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: icmp_sgt_2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %esi, %edi
+; GISEL-X64-NEXT:    setg %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB6_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB6_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; X86-LABEL: icmp_sgt_2:
+; X86:       ## %bb.0:
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; X86-NEXT:    cmpl {{[0-9]+\(%esp\), %eax|%eax, [0-9]+\(%esp\)}}
+; X86-NEXT:    jle LBB6_1
+; X86-NEXT:  ## %bb.2: ## %bb1
+; X86-NEXT:    xorl %eax, %eax
+; X86-NEXT:    retl
+; X86-NEXT:  LBB6_1: ## %bb2
+; X86-NEXT:    movl $1, %eax
+; X86-NEXT:    retl
+;
+; GISEL-X86-LABEL: icmp_sgt_2:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, {{[0-9]+}}(%esp)
+; GISEL-X86-NEXT:    setg %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB6_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB6_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp sgt i32 %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_sge_2(i32 %x, i32 %y) {
+; X64-LABEL: icmp_sge_2:
+; X64:       ## %bb.0:
+; X64-NEXT:    cmpl %esi, %edi
+; X64-NEXT:    jl LBB7_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB7_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: icmp_sge_2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %esi, %edi
+; GISEL-X64-NEXT:    setge %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB7_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB7_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; X86-LABEL: icmp_sge_2:
+; X86:       ## %bb.0:
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; X86-NEXT:    cmpl {{[0-9]+\(%esp\), %eax|%eax, [0-9]+\(%esp\)}}
+; X86-NEXT:    jl LBB7_1
+; X86-NEXT:  ## %bb.2: ## %bb1
+; X86-NEXT:    xorl %eax, %eax
+; X86-NEXT:    retl
+; X86-NEXT:  LBB7_1: ## %bb2
+; X86-NEXT:    movl $1, %eax
+; X86-NEXT:    retl
+;
+; GISEL-X86-LABEL: icmp_sge_2:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, {{[0-9]+}}(%esp)
+; GISEL-X86-NEXT:    setge %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB7_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB7_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp sge i32 %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_slt_2(i32 %x, i32 %y) {
+; X64-LABEL: icmp_slt_2:
+; X64:       ## %bb.0:
+; X64-NEXT:    cmpl %esi, %edi
+; X64-NEXT:    jge LBB8_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB8_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: icmp_slt_2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %esi, %edi
+; GISEL-X64-NEXT:    setl %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB8_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB8_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; X86-LABEL: icmp_slt_2:
+; X86:       ## %bb.0:
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; X86-NEXT:    cmpl {{[0-9]+\(%esp\), %eax|%eax, [0-9]+\(%esp\)}}
+; X86-NEXT:    jge LBB8_1
+; X86-NEXT:  ## %bb.2: ## %bb1
+; X86-NEXT:    xorl %eax, %eax
+; X86-NEXT:    retl
+; X86-NEXT:  LBB8_1: ## %bb2
+; X86-NEXT:    movl $1, %eax
+; X86-NEXT:    retl
+;
+; GISEL-X86-LABEL: icmp_slt_2:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, {{[0-9]+}}(%esp)
+; GISEL-X86-NEXT:    setl %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB8_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB8_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp slt i32 %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_sle_2(i32 %x, i32 %y) {
+; X64-LABEL: icmp_sle_2:
+; X64:       ## %bb.0:
+; X64-NEXT:    cmpl %esi, %edi
+; X64-NEXT:    jg LBB9_1
+; X64-NEXT:  ## %bb.2: ## %bb1
+; X64-NEXT:    xorl %eax, %eax
+; X64-NEXT:    retq
+; X64-NEXT:  LBB9_1: ## %bb2
+; X64-NEXT:    movl $1, %eax
+; X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: icmp_sle_2:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %esi, %edi
+; GISEL-X64-NEXT:    setle %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB9_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB9_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; X86-LABEL: icmp_sle_2:
+; X86:       ## %bb.0:
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; X86-NEXT:    cmpl {{[0-9]+\(%esp\), %eax|%eax, [0-9]+\(%esp\)}}
+; X86-NEXT:    jg LBB9_1
+; X86-NEXT:  ## %bb.2: ## %bb1
+; X86-NEXT:    xorl %eax, %eax
+; X86-NEXT:    retl
+; X86-NEXT:  LBB9_1: ## %bb2
+; X86-NEXT:    movl $1, %eax
+; X86-NEXT:    retl
+;
+; GISEL-X86-LABEL: icmp_sle_2:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, {{[0-9]+}}(%esp)
+; GISEL-X86-NEXT:    setle %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB9_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB9_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp sle i32 %x, %y
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_eq(i32 %x) {
+; SDAG-LABEL: icmp_eq:
+; SDAG:       ## %bb.0:
+; SDAG-NEXT:    movb $1, %al
+; SDAG-NEXT:    testb %al, %al
+; SDAG-NEXT:    je LBB10_1
+; SDAG-NEXT:  ## %bb.2: ## %bb1
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    ret{{q|l}}
+; SDAG-NEXT:  LBB10_1: ## %bb2
+; SDAG-NEXT:    movl $1, %eax
+; SDAG-NEXT:    ret{{q|l}}
+;
+; FASTISEL-LABEL: icmp_eq:
+; FASTISEL:       ## %bb.0:
+; FASTISEL-NEXT:    xorl %eax, %eax
+; FASTISEL-NEXT:    ret{{q|l}}
+;
+; GISEL-X64-LABEL: icmp_eq:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %edi, %edi
+; GISEL-X64-NEXT:    sete %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB10_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB10_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; GISEL-X86-LABEL: icmp_eq:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, %eax
+; GISEL-X86-NEXT:    sete %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB10_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB10_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp eq i32 %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_ne(i32 %x) {
+; SDAG-LABEL: icmp_ne:
+; SDAG:       ## %bb.0:
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    testb %al, %al
+; SDAG-NEXT:    je LBB11_1
+; SDAG-NEXT:  ## %bb.2: ## %bb1
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    ret{{q|l}}
+; SDAG-NEXT:  LBB11_1: ## %bb2
+; SDAG-NEXT:    movl $1, %eax
+; SDAG-NEXT:    ret{{q|l}}
+;
+; FASTISEL-LABEL: icmp_ne:
+; FASTISEL:       ## %bb.0:
+; FASTISEL-NEXT:    movl $1, %eax
+; FASTISEL-NEXT:    ret{{q|l}}
+;
+; GISEL-X64-LABEL: icmp_ne:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %edi, %edi
+; GISEL-X64-NEXT:    setne %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB11_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB11_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; GISEL-X86-LABEL: icmp_ne:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, %eax
+; GISEL-X86-NEXT:    setne %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB11_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB11_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp ne i32 %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_ugt(i32 %x) {
+; SDAG-LABEL: icmp_ugt:
+; SDAG:       ## %bb.0:
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    testb %al, %al
+; SDAG-NEXT:    je LBB12_1
+; SDAG-NEXT:  ## %bb.2: ## %bb1
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    ret{{q|l}}
+; SDAG-NEXT:  LBB12_1: ## %bb2
+; SDAG-NEXT:    movl $1, %eax
+; SDAG-NEXT:    ret{{q|l}}
+;
+; FASTISEL-LABEL: icmp_ugt:
+; FASTISEL:       ## %bb.0:
+; FASTISEL-NEXT:    movl $1, %eax
+; FASTISEL-NEXT:    ret{{q|l}}
+;
+; GISEL-X64-LABEL: icmp_ugt:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %edi, %edi
+; GISEL-X64-NEXT:    seta %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB12_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB12_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; GISEL-X86-LABEL: icmp_ugt:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, %eax
+; GISEL-X86-NEXT:    seta %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB12_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB12_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp ugt i32 %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_uge(i32 %x) {
+; SDAG-LABEL: icmp_uge:
+; SDAG:       ## %bb.0:
+; SDAG-NEXT:    movb $1, %al
+; SDAG-NEXT:    testb %al, %al
+; SDAG-NEXT:    je LBB13_1
+; SDAG-NEXT:  ## %bb.2: ## %bb1
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    ret{{q|l}}
+; SDAG-NEXT:  LBB13_1: ## %bb2
+; SDAG-NEXT:    movl $1, %eax
+; SDAG-NEXT:    ret{{q|l}}
+;
+; FASTISEL-X64-LABEL: icmp_uge:
+; FASTISEL-X64:       ## %bb.0:
+; FASTISEL-X64-NEXT:    xorl %eax, %eax
+; FASTISEL-X64-NEXT:    retq
+;
+; GISEL-X64-LABEL: icmp_uge:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %edi, %edi
+; GISEL-X64-NEXT:    setae %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB13_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB13_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; GISEL-X86-LABEL: icmp_uge:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, %eax
+; GISEL-X86-NEXT:    setae %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB13_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB13_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp uge i32 %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_ult(i32 %x) {
+; SDAG-LABEL: icmp_ult:
+; SDAG:       ## %bb.0:
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    testb %al, %al
+; SDAG-NEXT:    je LBB14_1
+; SDAG-NEXT:  ## %bb.2: ## %bb1
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    ret{{q|l}}
+; SDAG-NEXT:  LBB14_1: ## %bb2
+; SDAG-NEXT:    movl $1, %eax
+; SDAG-NEXT:    ret{{q|l}}
+;
+; FASTISEL-X64-LABEL: icmp_ult:
+; FASTISEL-X64:       ## %bb.0:
+; FASTISEL-X64-NEXT:    movl $1, %eax
+; FASTISEL-X64-NEXT:    ret{{q|l}}
+;
+; GISEL-X64-LABEL: icmp_ult:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %edi, %edi
+; GISEL-X64-NEXT:    setb %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB14_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB14_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; GISEL-X86-LABEL: icmp_ult:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, %eax
+; GISEL-X86-NEXT:    setb %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB14_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB14_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp ult i32 %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_ule(i32 %x) {
+; SDAG-LABEL: icmp_ule:
+; SDAG:       ## %bb.0:
+; SDAG-NEXT:    movb $1, %al
+; SDAG-NEXT:    testb %al, %al
+; SDAG-NEXT:    je LBB15_1
+; SDAG-NEXT:  ## %bb.2: ## %bb1
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    ret{{q|l}}
+; SDAG-NEXT:  LBB15_1: ## %bb2
+; SDAG-NEXT:    movl $1, %eax
+; SDAG-NEXT:    ret{{q|l}}
+;
+; FASTISEL-LABEL: icmp_ule:
+; FASTISEL:       ## %bb.0:
+; FASTISEL-NEXT:    xorl %eax, %eax
+; FASTISEL-NEXT:    ret{{q|l}}
+;
+; GISEL-X64-LABEL: icmp_ule:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %edi, %edi
+; GISEL-X64-NEXT:    setbe %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB15_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB15_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; GISEL-X86-LABEL: icmp_ule:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, %eax
+; GISEL-X86-NEXT:    setbe %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB15_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB15_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp ule i32 %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_sgt(i32 %x) {
+; SDAG-LABEL: icmp_sgt:
+; SDAG:       ## %bb.0:
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    testb %al, %al
+; SDAG-NEXT:    je LBB16_1
+; SDAG-NEXT:  ## %bb.2: ## %bb1
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    ret{{q|l}}
+; SDAG-NEXT:  LBB16_1: ## %bb2
+; SDAG-NEXT:    movl $1, %eax
+; SDAG-NEXT:    ret{{q|l}}
+;
+; FASTISEL-LABEL: icmp_sgt:
+; FASTISEL:       ## %bb.0:
+; FASTISEL-NEXT:    movl $1, %eax
+; FASTISEL-NEXT:    ret{{q|l}}
+;
+; GISEL-X64-LABEL: icmp_sgt:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %edi, %edi
+; GISEL-X64-NEXT:    setg %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB16_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB16_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; GISEL-X86-LABEL: icmp_sgt:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, %eax
+; GISEL-X86-NEXT:    setg %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB16_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB16_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp sgt i32 %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_sge(i32 %x) {
+; SDAG-LABEL: icmp_sge:
+; SDAG:       ## %bb.0:
+; SDAG-NEXT:    movb $1, %al
+; SDAG-NEXT:    testb %al, %al
+; SDAG-NEXT:    je LBB17_1
+; SDAG-NEXT:  ## %bb.2: ## %bb1
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    ret{{q|l}}
+; SDAG-NEXT:  LBB17_1: ## %bb2
+; SDAG-NEXT:    movl $1, %eax
+; SDAG-NEXT:    ret{{q|l}}
+;
+; FASTISEL-LABEL: icmp_sge:
+; FASTISEL:       ## %bb.0:
+; FASTISEL-NEXT:    xorl %eax, %eax
+; FASTISEL-NEXT:    ret{{q|l}}
+;
+; GISEL-X64-LABEL: icmp_sge:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %edi, %edi
+; GISEL-X64-NEXT:    setge %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB17_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB17_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; GISEL-X86-LABEL: icmp_sge:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, %eax
+; GISEL-X86-NEXT:    setge %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB17_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB17_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp sge i32 %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_slt(i32 %x) {
+; SDAG-LABEL: icmp_slt:
+; SDAG:       ## %bb.0:
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    testb %al, %al
+; SDAG-NEXT:    je LBB18_1
+; SDAG-NEXT:  ## %bb.2: ## %bb1
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    ret{{q|l}}
+; SDAG-NEXT:  LBB18_1: ## %bb2
+; SDAG-NEXT:    movl $1, %eax
+; SDAG-NEXT:    ret{{q|l}}
+;
+; FASTISEL-LABEL: icmp_slt:
+; FASTISEL:       ## %bb.0:
+; FASTISEL-NEXT:    movl $1, %eax
+; FASTISEL-NEXT:    ret{{q|l}}
+;
+; GISEL-X64-LABEL: icmp_slt:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %edi, %edi
+; GISEL-X64-NEXT:    setl %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB18_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB18_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; GISEL-X86-LABEL: icmp_slt:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, %eax
+; GISEL-X86-NEXT:    setl %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB18_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB18_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp slt i32 %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}
+
+define i32 @icmp_sle(i32 %x) {
+; SDAG-LABEL: icmp_sle:
+; SDAG:       ## %bb.0:
+; SDAG-NEXT:    movb $1, %al
+; SDAG-NEXT:    testb %al, %al
+; SDAG-NEXT:    je LBB19_1
+; SDAG-NEXT:  ## %bb.2: ## %bb1
+; SDAG-NEXT:    xorl %eax, %eax
+; SDAG-NEXT:    ret{{q|l}}
+; SDAG-NEXT:  LBB19_1: ## %bb2
+; SDAG-NEXT:    movl $1, %eax
+; SDAG-NEXT:    ret{{q|l}}
+;
+; FASTISEL-LABEL: icmp_sle:
+; FASTISEL:       ## %bb.0:
+; FASTISEL-NEXT:    xorl %eax, %eax
+; FASTISEL-NEXT:    ret{{q|l}}
+;
+; GISEL-X64-LABEL: icmp_sle:
+; GISEL-X64:       ## %bb.0:
+; GISEL-X64-NEXT:    cmpl %edi, %edi
+; GISEL-X64-NEXT:    setle %al
+; GISEL-X64-NEXT:    testb $1, %al
+; GISEL-X64-NEXT:    je LBB19_1
+; GISEL-X64-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X64-NEXT:    xorl %eax, %eax
+; GISEL-X64-NEXT:    retq
+; GISEL-X64-NEXT:  LBB19_1: ## %bb2
+; GISEL-X64-NEXT:    movl $1, %eax
+; GISEL-X64-NEXT:    retq
+;
+; GISEL-X86-LABEL: icmp_sle:
+; GISEL-X86:       ## %bb.0:
+; GISEL-X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; GISEL-X86-NEXT:    cmpl %eax, %eax
+; GISEL-X86-NEXT:    setle %al
+; GISEL-X86-NEXT:    testb $1, %al
+; GISEL-X86-NEXT:    je LBB19_1
+; GISEL-X86-NEXT:  ## %bb.2: ## %bb1
+; GISEL-X86-NEXT:    xorl %eax, %eax
+; GISEL-X86-NEXT:    retl
+; GISEL-X86-NEXT:  LBB19_1: ## %bb2
+; GISEL-X86-NEXT:    movl $1, %eax
+; GISEL-X86-NEXT:    retl
+  %1 = icmp sle i32 %x, %x
+  br i1 %1, label %bb1, label %bb2
+bb2:
+  ret i32 1
+bb1:
+  ret i32 0
+}

From c3220016a6642c95715a073d60213d65f7613613 Mon Sep 17 00:00:00 2001
From: Balázs Kéri <balazs.keri at ericsson.com>
Date: Thu, 8 Feb 2024 11:09:57 +0100
Subject: [PATCH 11/72] [clang][analyzer] Add missing stream related functions
 to StdLibraryFunctionsChecker. (#76979)

Some stream functions were recently added to `StreamChecker` that were
not modeled by `StdLibraryFunctionsChecker`. To ensure consistency,
these functions are added to the other checker too.
Some of the related tests are reorganized.
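
As a rough illustration of what the POSIX-mode `getc`/`fgetc` summary in
this patch encodes, the sketch below splits a return value into the two
modeled cases: a value in `[0, UCHAR_MAX]` on success and `EOF` on failure.
The names `FgetcOutcome` and `classifyFgetcReturn` are hypothetical and are
not part of the analyzer; this is only a model of the ranges, not the
checker's implementation.

```cpp
#include <cassert>
#include <climits>
#include <cstdio>

// Hypothetical model of the two cases in the POSIX-mode getc/fgetc summary.
enum class FgetcOutcome { Success, Failure, NotModeled };

// Classify a return value the way the summary's two .Case() entries do:
//   [0, UCHAR_MAX] -> success (the summary says errno must not be checked)
//   EOF            -> failure (the summary leaves errno unconstrained)
inline FgetcOutcome classifyFgetcReturn(int Ret) {
  if (Ret >= 0 && Ret <= UCHAR_MAX)
    return FgetcOutcome::Success;
  if (Ret == EOF)
    return FgetcOutcome::Failure;
  // Any other value lies outside both ranges described by the summary.
  return FgetcOutcome::NotModeled;
}
```

For example, `classifyFgetcReturn('a')` falls in the success range, while
`classifyFgetcReturn(EOF)` falls in the failure case.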
---
 .../Checkers/StdLibraryFunctionsChecker.cpp   |  79 ++++++++++--
 .../Inputs/std-c-library-functions-POSIX.h    |  15 ++-
 .../Analysis/std-c-library-functions-POSIX.c  |  16 ++-
 clang/test/Analysis/std-c-library-functions.c |   4 +-
 clang/test/Analysis/stream-error.c            |  26 ----
 clang/test/Analysis/stream-noopen.c           | 120 +++++++++++++++---
 clang/test/Analysis/stream.c                  |  25 +++-
 7 files changed, 221 insertions(+), 64 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
index 0c6293e67a86f2..6b8ac2629453d4 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
@@ -2023,13 +2023,6 @@ void StdLibraryFunctionsChecker::initFunctionSummaries(
                                            {{EOFv, EOFv}, {0, UCharRangeMax}},
                                            "an unsigned char value or EOF")));
 
-  // The getc() family of functions that returns either a char or an EOF.
-  addToFunctionSummaryMap(
-      {"getc", "fgetc"}, Signature(ArgTypes{FilePtrTy}, RetType{IntTy}),
-      Summary(NoEvalCall)
-          .Case({ReturnValueCondition(WithinRange,
-                                      {{EOFv, EOFv}, {0, UCharRangeMax}})},
-                ErrnoIrrelevant));
   addToFunctionSummaryMap(
       "getchar", Signature(ArgTypes{}, RetType{IntTy}),
       Summary(NoEvalCall)
@@ -2139,7 +2132,17 @@ void StdLibraryFunctionsChecker::initFunctionSummaries(
         std::move(GetenvSummary));
   }
 
-  if (ModelPOSIX) {
+  if (!ModelPOSIX) {
+    // Without POSIX use of 'errno' is not specified (in these cases).
+    // Add these functions without 'errno' checks.
+    addToFunctionSummaryMap(
+        {"getc", "fgetc"}, Signature(ArgTypes{FilePtrTy}, RetType{IntTy}),
+        Summary(NoEvalCall)
+            .Case({ReturnValueCondition(WithinRange,
+                                        {{EOFv, EOFv}, {0, UCharRangeMax}})},
+                  ErrnoIrrelevant)
+            .ArgConstraint(NotNull(ArgNo(0))));
+  } else {
     const auto ReturnsZeroOrMinusOne =
         ConstraintSet{ReturnValueCondition(WithinRange, Range(-1, 0))};
     const auto ReturnsZero =
@@ -2231,6 +2234,63 @@ void StdLibraryFunctionsChecker::initFunctionSummaries(
             .Case(ReturnsMinusOne, ErrnoNEZeroIrrelevant, GenericFailureMsg)
             .ArgConstraint(NotNull(ArgNo(0))));
 
+    std::optional<QualType> Off_tTy = lookupTy("off_t");
+    std::optional<RangeInt> Off_tMax = getMaxValue(Off_tTy);
+
+    // int fgetc(FILE *stream);
+    // 'getc' is the same as 'fgetc' but may be a macro
+    addToFunctionSummaryMap(
+        {"getc", "fgetc"}, Signature(ArgTypes{FilePtrTy}, RetType{IntTy}),
+        Summary(NoEvalCall)
+            .Case({ReturnValueCondition(WithinRange, {{0, UCharRangeMax}})},
+                  ErrnoMustNotBeChecked, GenericSuccessMsg)
+            .Case({ReturnValueCondition(WithinRange, SingleValue(EOFv))},
+                  ErrnoIrrelevant, GenericFailureMsg)
+            .ArgConstraint(NotNull(ArgNo(0))));
+
+    // int fputc(int c, FILE *stream);
+    // 'putc' is the same as 'fputc' but may be a macro
+    addToFunctionSummaryMap(
+        {"putc", "fputc"},
+        Signature(ArgTypes{IntTy, FilePtrTy}, RetType{IntTy}),
+        Summary(NoEvalCall)
+            .Case({ArgumentCondition(0, WithinRange, Range(0, UCharRangeMax)),
+                   ReturnValueCondition(BO_EQ, ArgNo(0))},
+                  ErrnoMustNotBeChecked, GenericSuccessMsg)
+            .Case({ArgumentCondition(0, OutOfRange, Range(0, UCharRangeMax)),
+                   ReturnValueCondition(WithinRange, Range(0, UCharRangeMax))},
+                  ErrnoMustNotBeChecked, GenericSuccessMsg)
+            .Case({ReturnValueCondition(WithinRange, SingleValue(EOFv))},
+                  ErrnoNEZeroIrrelevant, GenericFailureMsg)
+            .ArgConstraint(NotNull(ArgNo(1))));
+
+    // char *fgets(char *restrict s, int n, FILE *restrict stream);
+    addToFunctionSummaryMap(
+        "fgets",
+        Signature(ArgTypes{CharPtrRestrictTy, IntTy, FilePtrRestrictTy},
+                  RetType{CharPtrTy}),
+        Summary(NoEvalCall)
+            .Case({ReturnValueCondition(BO_EQ, ArgNo(0))},
+                  ErrnoMustNotBeChecked, GenericSuccessMsg)
+            .Case({IsNull(Ret)}, ErrnoIrrelevant, GenericFailureMsg)
+            .ArgConstraint(NotNull(ArgNo(0)))
+            .ArgConstraint(ArgumentCondition(1, WithinRange, Range(0, IntMax)))
+            .ArgConstraint(
+                BufferSize(/*Buffer=*/ArgNo(0), /*BufSize=*/ArgNo(1)))
+            .ArgConstraint(NotNull(ArgNo(2))));
+
+    // int fputs(const char *restrict s, FILE *restrict stream);
+    addToFunctionSummaryMap(
+        "fputs",
+        Signature(ArgTypes{ConstCharPtrRestrictTy, FilePtrRestrictTy},
+                  RetType{IntTy}),
+        Summary(NoEvalCall)
+            .Case(ReturnsNonnegative, ErrnoMustNotBeChecked, GenericSuccessMsg)
+            .Case({ReturnValueCondition(WithinRange, SingleValue(EOFv))},
+                  ErrnoNEZeroIrrelevant, GenericFailureMsg)
+            .ArgConstraint(NotNull(ArgNo(0)))
+            .ArgConstraint(NotNull(ArgNo(1))));
+
     // int ungetc(int c, FILE *stream);
     addToFunctionSummaryMap(
         "ungetc", Signature(ArgTypes{IntTy, FilePtrTy}, RetType{IntTy}),
@@ -2250,9 +2310,6 @@ void StdLibraryFunctionsChecker::initFunctionSummaries(
                 0, WithinRange, {{EOFv, EOFv}, {0, UCharRangeMax}}))
             .ArgConstraint(NotNull(ArgNo(1))));
 
-    std::optional<QualType> Off_tTy = lookupTy("off_t");
-    std::optional<RangeInt> Off_tMax = getMaxValue(Off_tTy);
-
     // int fseek(FILE *stream, long offset, int whence);
     // FIXME: It can be possible to get the 'SEEK_' values (like EOFv) and use
     // these for condition of arg 2.
diff --git a/clang/test/Analysis/Inputs/std-c-library-functions-POSIX.h b/clang/test/Analysis/Inputs/std-c-library-functions-POSIX.h
index 63e22ebdb30602..b146068eedb080 100644
--- a/clang/test/Analysis/Inputs/std-c-library-functions-POSIX.h
+++ b/clang/test/Analysis/Inputs/std-c-library-functions-POSIX.h
@@ -11,6 +11,7 @@ typedef unsigned long int pthread_t;
 typedef unsigned long time_t;
 typedef unsigned long clockid_t;
 typedef __INT64_TYPE__ off64_t;
+typedef __INT64_TYPE__ fpos_t;
 
 typedef struct {
   int a;
@@ -42,9 +43,22 @@ FILE *fopen(const char *restrict pathname, const char *restrict mode);
 FILE *tmpfile(void);
 FILE *freopen(const char *restrict pathname, const char *restrict mode,
               FILE *restrict stream);
+FILE *fdopen(int fd, const char *mode);
 int fclose(FILE *stream);
+int putc(int c, FILE *stream);
+int fputc(int c, FILE *stream);
+char *fgets(char *restrict s, int n, FILE *restrict stream);
+int fputs(const char *restrict s, FILE *restrict stream);
 int fseek(FILE *stream, long offset, int whence);
+int fgetpos(FILE *restrict stream, fpos_t *restrict pos);
+int fsetpos(FILE *stream, const fpos_t *pos);
+int fflush(FILE *stream);
+long ftell(FILE *stream);
 int fileno(FILE *stream);
+void rewind(FILE *stream);
+void clearerr(FILE *stream);
+int feof(FILE *stream);
+int ferror(FILE *stream);
 long a64l(const char *str64);
 char *l64a(long value);
 int open(const char *path, int oflag, ...);
@@ -100,7 +114,6 @@ int pclose(FILE *stream);
 int close(int fildes);
 long fpathconf(int fildes, int name);
 long pathconf(const char *path, int name);
-FILE *fdopen(int fd, const char *mode);
 void rewinddir(DIR *dir);
 void seekdir(DIR *dirp, long loc);
 int rand_r(unsigned int *seedp);
diff --git a/clang/test/Analysis/std-c-library-functions-POSIX.c b/clang/test/Analysis/std-c-library-functions-POSIX.c
index 03aa8e2e00a75d..b53f3132b86877 100644
--- a/clang/test/Analysis/std-c-library-functions-POSIX.c
+++ b/clang/test/Analysis/std-c-library-functions-POSIX.c
@@ -23,10 +23,22 @@
 // CHECK: Loaded summary for: FILE *popen(const char *command, const char *type)
 // CHECK: Loaded summary for: int fclose(FILE *stream)
 // CHECK: Loaded summary for: int pclose(FILE *stream)
+// CHECK: Loaded summary for: int getc(FILE *)
+// CHECK: Loaded summary for: int fgetc(FILE *)
+// CHECK: Loaded summary for: int putc(int c, FILE *stream)
+// CHECK: Loaded summary for: int fputc(int c, FILE *stream)
+// CHECK: Loaded summary for: char *fgets(char *restrict s, int n, FILE *restrict stream)
+// CHECK: Loaded summary for: int fputs(const char *restrict s, FILE *restrict stream)
 // CHECK: Loaded summary for: int fseek(FILE *stream, long offset, int whence)
-// CHECK: Loaded summary for: int fseeko(FILE *stream, off_t offset, int whence)
-// CHECK: Loaded summary for: off_t ftello(FILE *stream)
+// CHECK: Loaded summary for: int fgetpos(FILE *restrict stream, fpos_t *restrict pos)
+// CHECK: Loaded summary for: int fsetpos(FILE *stream, const fpos_t *pos)
+// CHECK: Loaded summary for: int fflush(FILE *stream)
+// CHECK: Loaded summary for: long ftell(FILE *stream)
 // CHECK: Loaded summary for: int fileno(FILE *stream)
+// CHECK: Loaded summary for: void rewind(FILE *stream)
+// CHECK: Loaded summary for: void clearerr(FILE *stream)
+// CHECK: Loaded summary for: int feof(FILE *stream)
+// CHECK: Loaded summary for: int ferror(FILE *stream)
 // CHECK: Loaded summary for: long a64l(const char *str64)
 // CHECK: Loaded summary for: char *l64a(long value)
 // CHECK: Loaded summary for: int open(const char *path, int oflag, ...)
diff --git a/clang/test/Analysis/std-c-library-functions.c b/clang/test/Analysis/std-c-library-functions.c
index b7eb6b284460e5..e6564e2bae7611 100644
--- a/clang/test/Analysis/std-c-library-functions.c
+++ b/clang/test/Analysis/std-c-library-functions.c
@@ -53,8 +53,6 @@
 // CHECK-NEXT: Loaded summary for: int toupper(int)
 // CHECK-NEXT: Loaded summary for: int tolower(int)
 // CHECK-NEXT: Loaded summary for: int toascii(int)
-// CHECK-NEXT: Loaded summary for: int getc(FILE *)
-// CHECK-NEXT: Loaded summary for: int fgetc(FILE *)
 // CHECK-NEXT: Loaded summary for: int getchar(void)
 // CHECK-NEXT: Loaded summary for: unsigned int fread(void *restrict, size_t, size_t, FILE *restrict)
 // CHECK-NEXT: Loaded summary for: unsigned int fwrite(const void *restrict, size_t, size_t, FILE *restrict)
@@ -63,6 +61,8 @@
 // CHECK-NEXT: Loaded summary for: ssize_t getline(char **restrict, size_t *restrict, FILE *restrict)
 // CHECK-NEXT: Loaded summary for: ssize_t getdelim(char **restrict, size_t *restrict, int, FILE *restrict)
 // CHECK-NEXT: Loaded summary for: char *getenv(const char *)
+// CHECK-NEXT: Loaded summary for: int getc(FILE *)
+// CHECK-NEXT: Loaded summary for: int fgetc(FILE *)
 
 #include "Inputs/std-c-library-functions.h"
 
diff --git a/clang/test/Analysis/stream-error.c b/clang/test/Analysis/stream-error.c
index cd4b0093cfcb23..4bab07577ccd53 100644
--- a/clang/test/Analysis/stream-error.c
+++ b/clang/test/Analysis/stream-error.c
@@ -491,32 +491,6 @@ void error_ftello(void) {
   fclose(F);
 }
 
-void error_fflush_after_fclose(void) {
-  FILE *F = tmpfile();
-  int Ret;
-  fflush(NULL);                      // no-warning
-  if (!F)
-    return;
-  if ((Ret = fflush(F)) != 0)
-    clang_analyzer_eval(Ret == EOF); // expected-warning {{TRUE}}
-  fclose(F);
-  fflush(F);                         // expected-warning {{Stream might be already closed}}
-}
-
-void error_fflush_on_open_failed_stream(void) {
-  FILE *F = tmpfile();
-  if (!F) {
-    fflush(F); // no-warning
-    return;
-  }
-  fclose(F);
-}
-
-void error_fflush_on_unknown_stream(FILE *F) {
-  fflush(F);   // no-warning
-  fclose(F);   // no-warning
-}
-
 void error_fflush_on_non_null_stream_clear_error_states(void) {
   FILE *F0 = tmpfile(), *F1 = tmpfile();
   // `fflush` clears a non-EOF stream's error state.
diff --git a/clang/test/Analysis/stream-noopen.c b/clang/test/Analysis/stream-noopen.c
index 8ad101ee1e8c13..8bd01a90cf8596 100644
--- a/clang/test/Analysis/stream-noopen.c
+++ b/clang/test/Analysis/stream-noopen.c
@@ -57,6 +57,95 @@ void test_fwrite(FILE *F) {
   clang_analyzer_eval(ferror(F)); // expected-warning {{UNKNOWN}}
 }
 
+void test_fgetc(FILE *F) {
+  int Ret = fgetc(F);
+  clang_analyzer_eval(F != NULL); // expected-warning {{TRUE}}
+  if (Ret != EOF) {
+    if (errno) {} // expected-warning {{undefined}}
+  } else {
+    clang_analyzer_eval(errno != 0); // expected-warning {{TRUE}}
+                                     // expected-warning@-1 {{FALSE}}
+  }
+  clang_analyzer_eval(feof(F)); // expected-warning {{UNKNOWN}}
+  clang_analyzer_eval(ferror(F)); // expected-warning {{UNKNOWN}}
+}
+
+void test_fputc(FILE *F) {
+  int Ret = fputc('a', F);
+  clang_analyzer_eval(F != NULL); // expected-warning {{TRUE}}
+  if (Ret != EOF) {
+    clang_analyzer_eval(Ret == 'a'); // expected-warning {{TRUE}}
+    if (errno) {} // expected-warning {{undefined}}
+  } else {
+    clang_analyzer_eval(errno != 0); // expected-warning {{TRUE}}
+  }
+  clang_analyzer_eval(feof(F)); // expected-warning {{UNKNOWN}}
+  clang_analyzer_eval(ferror(F)); // expected-warning {{UNKNOWN}}
+}
+
+void test_fgets(char *Buf, int N, FILE *F) {
+  char *Ret = fgets(Buf, N, F);
+  clang_analyzer_eval(F != NULL); // expected-warning {{TRUE}}
+  clang_analyzer_eval(Buf != NULL); // expected-warning {{TRUE}}
+  clang_analyzer_eval(N >= 0); // expected-warning {{TRUE}}
+  if (Ret == Buf) {
+    if (errno) {} // expected-warning {{undefined}}
+  } else {
+    clang_analyzer_eval(Ret == 0); // expected-warning {{TRUE}}
+    clang_analyzer_eval(errno != 0); // expected-warning {{TRUE}}
+                                     // expected-warning@-1 {{FALSE}}
+  }
+  clang_analyzer_eval(feof(F)); // expected-warning {{UNKNOWN}}
+  clang_analyzer_eval(ferror(F)); // expected-warning {{UNKNOWN}}
+
+  char Buf1[10];
+  Ret = fgets(Buf1, 11, F); // expected-warning {{The 1st argument to 'fgets' is a buffer with size 10}}
+}
+
+void test_fgets_bufsize(FILE *F) {
+  char Buf[10];
+  fgets(Buf, 11, F); // expected-warning {{The 1st argument to 'fgets' is a buffer with size 10}}
+}
+
+void test_fputs(char *Buf, FILE *F) {
+  int Ret = fputs(Buf, F);
+  clang_analyzer_eval(F != NULL); // expected-warning {{TRUE}}
+  clang_analyzer_eval(Buf != NULL); // expected-warning {{TRUE}}
+  if (Ret >= 0) {
+    if (errno) {} // expected-warning {{undefined}}
+  } else {
+    clang_analyzer_eval(Ret == EOF); // expected-warning {{TRUE}}
+    clang_analyzer_eval(errno != 0); // expected-warning {{TRUE}}
+  }
+  clang_analyzer_eval(feof(F)); // expected-warning {{UNKNOWN}}
+  clang_analyzer_eval(ferror(F)); // expected-warning {{UNKNOWN}}
+}
+
+void test_ungetc(FILE *F) {
+  int Ret = ungetc('X', F);
+  clang_analyzer_eval(F != NULL); // expected-warning {{TRUE}}
+  if (Ret == 'X') {
+    if (errno) {} // expected-warning {{undefined}}
+  } else {
+    clang_analyzer_eval(Ret == EOF); // expected-warning {{TRUE}}
+    clang_analyzer_eval(errno != 0); // expected-warning {{TRUE}}
+  }
+  clang_analyzer_eval(feof(F)); // expected-warning {{UNKNOWN}}
+  clang_analyzer_eval(ferror(F)); // expected-warning {{UNKNOWN}}
+}
+
+void test_ungetc_EOF(FILE *F, int C) {
+  int Ret = ungetc(EOF, F);
+  clang_analyzer_eval(F != NULL); // expected-warning {{TRUE}}
+  clang_analyzer_eval(Ret == EOF); // expected-warning {{TRUE}}
+  clang_analyzer_eval(errno != 0); // expected-warning {{TRUE}}
+  Ret = ungetc(C, F);
+  if (Ret == EOF) {
+    clang_analyzer_eval(C == EOF); // expected-warning {{TRUE}}
+                                   // expected-warning@-1{{FALSE}}
+  }
+}
+
 void test_fclose(FILE *F) {
   int Ret = fclose(F);
   clang_analyzer_eval(F != NULL); // expected-warning {{TRUE}}
@@ -138,28 +227,17 @@ void test_rewind(FILE *F) {
   rewind(F);
 }
 
-void test_ungetc(FILE *F) {
-  int Ret = ungetc('X', F);
-  clang_analyzer_eval(F != NULL); // expected-warning {{TRUE}}
-  if (Ret == 'X') {
-    if (errno) {} // expected-warning {{undefined}}
-  } else {
-    clang_analyzer_eval(Ret == EOF); // expected-warning {{TRUE}}
-    clang_analyzer_eval(errno != 0); // expected-warning {{TRUE}}
-  }
-  clang_analyzer_eval(feof(F)); // expected-warning {{UNKNOWN}}
-  clang_analyzer_eval(ferror(F)); // expected-warning {{UNKNOWN}}
-}
-
-void test_ungetc_EOF(FILE *F, int C) {
-  int Ret = ungetc(EOF, F);
-  clang_analyzer_eval(F != NULL); // expected-warning {{TRUE}}
-  clang_analyzer_eval(Ret == EOF); // expected-warning {{TRUE}}
-  clang_analyzer_eval(errno != 0); // expected-warning {{TRUE}}
-  Ret = ungetc(C, F);
+void test_fflush(FILE *F) {
+  errno = 0;
+  int Ret = fflush(F);
+  clang_analyzer_eval(F != NULL); // expected-warning{{TRUE}}
+                                  // expected-warning@-1{{FALSE}}
   if (Ret == EOF) {
-    clang_analyzer_eval(C == EOF); // expected-warning {{TRUE}}
-                                   // expected-warning@-1{{FALSE}}
+    clang_analyzer_eval(errno != 0); // expected-warning{{TRUE}}
+  } else {
+    clang_analyzer_eval(Ret == 0); // expected-warning{{TRUE}}
+    clang_analyzer_eval(errno == 0); // expected-warning{{TRUE}}
+                                     // expected-warning@-1{{FALSE}}
   }
 }
 
diff --git a/clang/test/Analysis/stream.c b/clang/test/Analysis/stream.c
index 36a9b4e26b07a2..378c9154f8f6a8 100644
--- a/clang/test/Analysis/stream.c
+++ b/clang/test/Analysis/stream.c
@@ -1,7 +1,9 @@
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,alpha.unix.Stream -verify %s
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,alpha.unix.Stream,debug.ExprInspection -verify %s
 
 #include "Inputs/system-header-simulator.h"
 
+void clang_analyzer_eval(int);
+
 void check_fread(void) {
   FILE *fp = tmpfile();
   fread(0, 0, 0, fp); // expected-warning {{Stream pointer might be NULL}}
@@ -316,3 +318,24 @@ void check_leak_noreturn_2(void) {
 } // expected-warning {{Opened stream never closed. Potential resource leak}}
 // FIXME: This warning should be placed at the `return` above.
 // See https://reviews.llvm.org/D83120 about details.
+
+void fflush_after_fclose(void) {
+  FILE *F = tmpfile();
+  int Ret;
+  fflush(NULL);                      // no-warning
+  if (!F)
+    return;
+  if ((Ret = fflush(F)) != 0)
+    clang_analyzer_eval(Ret == EOF); // expected-warning {{TRUE}}
+  fclose(F);
+  fflush(F);                         // expected-warning {{Stream might be already closed}}
+}
+
+void fflush_on_open_failed_stream(void) {
+  FILE *F = tmpfile();
+  if (!F) {
+    fflush(F); // no-warning
+    return;
+  }
+  fclose(F);
+}

From 2076f244993c186c9c6396399ec06e6e7fa43124 Mon Sep 17 00:00:00 2001
From: Simon Camphausen <simon.camphausen at iml.fraunhofer.de>
Date: Thu, 8 Feb 2024 11:27:08 +0100
Subject: [PATCH 12/72] [mlir][EmitC] Add builders for call_opaque op (#80879)

This allows omitting the default-valued attributes, which makes for
more compact code.
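
The idea can be sketched outside of MLIR as a convenience function that
takes the commonly supplied arguments first, defaults the rest, and forwards
to the full form in its original argument order. Everything below
(`CallOpaqueSketch`, `buildFull`, `buildCompact`, and the use of plain
strings and ints in place of attributes and values) is a hypothetical
stand-in, not the generated EmitC API.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the state of a call_opaque-like op.
struct CallOpaqueSketch {
  std::string Callee;
  std::vector<std::string> Args;         // stand-in for the `args` attribute
  std::vector<std::string> TemplateArgs; // stand-in for `template_args`
  std::vector<int> Operands;             // stand-in for SSA operands
};

// Full form: every attribute has to be spelled out at the call site.
inline CallOpaqueSketch buildFull(std::string Callee,
                                  std::vector<std::string> Args,
                                  std::vector<std::string> TemplateArgs,
                                  std::vector<int> Operands) {
  return {std::move(Callee), std::move(Args), std::move(TemplateArgs),
          std::move(Operands)};
}

// Convenience form: defaulted attributes may be omitted; it reorders the
// arguments and forwards to the full form.
inline CallOpaqueSketch buildCompact(std::string Callee,
                                     std::vector<int> Operands,
                                     std::vector<std::string> Args = {},
                                     std::vector<std::string> TemplateArgs = {}) {
  return buildFull(std::move(Callee), std::move(Args),
                   std::move(TemplateArgs), std::move(Operands));
}
```

With this shape, a caller that needs no extra attributes can write
`buildCompact("foo", {1, 2})` instead of passing two empty lists explicitly.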
---
 mlir/include/mlir/Dialect/EmitC/IR/EmitC.td | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td b/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
index 39cc360cef41d4..c50fdf397a0fec 100644
--- a/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
+++ b/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
@@ -122,6 +122,19 @@ def EmitC_CallOpaqueOp : EmitC_Op<"call_opaque", []> {
     Variadic<AnyType>:$operands
   );
   let results = (outs Variadic<AnyType>);
+  let builders = [
+    OpBuilder<(ins
+      "::mlir::TypeRange":$resultTypes,
+      "::llvm::StringRef":$callee,
+      "::mlir::ValueRange":$operands,
+      CArg<"::mlir::ArrayAttr", "{}">:$args,
+      CArg<"::mlir::ArrayAttr", "{}">:$template_args), [{
+        build($_builder, $_state, resultTypes, callee, args, template_args,
+            operands);
+      }]
+    >
+  ];
+
   let assemblyFormat = [{
     $callee `(` $operands `)` attr-dict `:` functional-type($operands, results)
   }];

From 385b0ff48b827bb2c982b1e99f80e84fbc710bbc Mon Sep 17 00:00:00 2001
From: Jeremy Morse <jeremy.morse at sony.com>
Date: Thu, 8 Feb 2024 10:27:34 +0000
Subject: [PATCH 13/72] [DebugInfo][RemoveDIs] Erase ranges of instructions
 individually (#81007)

The BasicBlock::erase method simply removes a range of instructions from
the instlist by unlinking them. However, now that we're attaching
debug-info directly to instructions, some cleanup is required, so use
eraseFromParent on each instruction instead.
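
The difference between unlinking a whole range and erasing element by
element can be sketched with a standard container. The types `Inst` and
`Block` below are hypothetical stand-ins, and `eraseOne` only imitates the
per-instruction cleanup in spirit; the point is that the next iterator is
taken before the current element is destroyed, which is the same idea as
iterating with an early-increment range.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical stand-in for an instruction that may own attached debug-info.
struct Inst {
  int Id;
  bool HasDebugInfo;
};

struct Block {
  std::list<Inst> Insts;
  int CleanedUp = 0; // number of attached debug-info records released

  // Per-element erase that performs cleanup before unlinking the element.
  void eraseOne(std::list<Inst>::iterator It) {
    if (It->HasDebugInfo)
      ++CleanedUp;
    Insts.erase(It);
  }

  // Erase [From, To) one element at a time. The successor is computed
  // before the current element is erased, so iteration stays valid.
  std::list<Inst>::iterator erase(std::list<Inst>::iterator From,
                                  std::list<Inst>::iterator To) {
    while (From != To) {
      auto Next = std::next(From);
      eraseOne(From);
      From = Next;
    }
    return To;
  }
};
```

A bulk `Insts.erase(From, To)` would remove the same elements but skip the
per-element cleanup step, which is the kind of leak the commit describes.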

This is less efficient, but rare, and seemingly only WASM EH Prepare
uses this method of BasicBlock. Detected via a memory leak check in
asan.

(asan is always the final boss for whatever I do).
---
 llvm/lib/IR/BasicBlock.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/IR/BasicBlock.cpp b/llvm/lib/IR/BasicBlock.cpp
index bb55f48df4b314..fe9d0d08c5fe97 100644
--- a/llvm/lib/IR/BasicBlock.cpp
+++ b/llvm/lib/IR/BasicBlock.cpp
@@ -677,7 +677,9 @@ BasicBlock *BasicBlock::splitBasicBlockBefore(iterator I, const Twine &BBName) {
 
 BasicBlock::iterator BasicBlock::erase(BasicBlock::iterator FromIt,
                                        BasicBlock::iterator ToIt) {
-  return InstList.erase(FromIt, ToIt);
+  for (Instruction &I : make_early_inc_range(make_range(FromIt, ToIt)))
+    I.eraseFromParent();
+  return ToIt;
 }
 
 void BasicBlock::replacePhiUsesWith(BasicBlock *Old, BasicBlock *New) {

From 69829916aafb7f935b7b4ac7ba3a415da968bcab Mon Sep 17 00:00:00 2001
From: Jeremy Morse <jeremy.morse at sony.com>
Date: Thu, 8 Feb 2024 10:44:43 +0000
Subject: [PATCH 14/72] [DebugInfo] Handle dbg.assigns in FastISel (#80734)

There are some rare circumstances where dbg.assign intrinsics can reach
FastISel. They are a more specialised kind of dbg.value intrinsic with
more information about the originating alloca. They only occur during
optimisation, but might reach FastISel through always_inlining an
optimised function into an optnone function.

This is a slight problem as it's not safe (for debug-info accuracy) to
ignore any intrinsics, and for RemoveDIs (the intrinsic-replacement
project) it causes a crash through an unhandled switch case. To get
around this, we can just treat the dbg.assign as a dbg.value (it's an
actual subclass) and use the variable location information from the
dbg.value fields. This loses a small amount of debug-info about stack
locations, but is more accurate than just ignoring the intrinsic.
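
The dispatch pattern described above can be sketched in isolation: the
more specialised case does nothing of its own and falls through to the
handler for the general case. The enum `IntrinsicKind` and the function
`selectDebugIntrinsic` are hypothetical simplifications, and the sketch
assumes C++17 for the `[[fallthrough]]` attribute (the patch itself uses
the `LLVM_FALLTHROUGH` macro).

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified intrinsic kinds for illustration only.
enum class IntrinsicKind { DbgDeclare, DbgValue, DbgAssign, Other };

// Returns a label naming the lowering path that was taken.
inline std::string selectDebugIntrinsic(IntrinsicKind Kind) {
  switch (Kind) {
  case IntrinsicKind::DbgAssign:
    // The extra assignment-tracking information is not used here; the
    // dbg.value-style fields are handled by the next case.
    [[fallthrough]];
  case IntrinsicKind::DbgValue:
    return "lowered-as-dbg-value";
  case IntrinsicKind::DbgDeclare:
    return "lowered-as-dbg-declare";
  case IntrinsicKind::Other:
    break;
  }
  return "not-a-debug-intrinsic";
}
```

In this model, a `DbgAssign` input takes exactly the same path as a
`DbgValue` input instead of hitting an unhandled case.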

(This popped up deep in an LTO build of a large codebase while
testing RemoveDIs; I figured it'd be good to fix it for the
intrinsic form at the same time, just to demonstrate the correct
behaviour.)
---
 llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp  |  7 +++
 llvm/lib/CodeGen/SelectionDAG/FastISel.cpp    | 10 +++-
 .../X86/dont-drop-dbg-assigns-in-isels.ll     | 46 +++++++++++++++++++
 3 files changed, 62 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/DebugInfo/X86/dont-drop-dbg-assigns-in-isels.ll

diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
index dd38317c26bff6..c1d8e890a66edb 100644
--- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -2120,6 +2120,13 @@ bool IRTranslator::translateKnownIntrinsic(const CallInst &CI, Intrinsic::ID ID,
                                                 ListSize, Alignment));
     return true;
   }
+  case Intrinsic::dbg_assign:
+    // A dbg.assign is a dbg.value with more information about stack locations,
+    // typically produced during optimisation of variables with leaked
+    // addresses. We can treat it like a normal dbg_value intrinsic here; to
+    // benefit from the full analysis of stack/SSA locations, GlobalISel would
+    // need to register for and use the AssignmentTrackingAnalysis pass.
+    LLVM_FALLTHROUGH;
   case Intrinsic::dbg_value: {
     // This form of DBG_VALUE is target-independent.
     const DbgValueInst &DI = cast<DbgValueInst>(CI);
diff --git a/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp b/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
index 4df79f474e8d2b..f8756527da87f6 100644
--- a/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
@@ -1197,7 +1197,8 @@ void FastISel::handleDbgInfo(const Instruction *II) {
       V = DPV.getVariableLocationOp(0);
 
     bool Res = false;
-    if (DPV.getType() == DPValue::LocationType::Value) {
+    if (DPV.getType() == DPValue::LocationType::Value ||
+        DPV.getType() == DPValue::LocationType::Assign) {
       Res = lowerDbgValue(V, DPV.getExpression(), DPV.getVariable(),
                           DPV.getDebugLoc());
     } else {
@@ -1393,6 +1394,13 @@ bool FastISel::selectIntrinsicCall(const IntrinsicInst *II) {
 
     return true;
   }
+  case Intrinsic::dbg_assign:
+    // A dbg.assign is a dbg.value with more information, typically produced
+    // during optimisation. If one reaches fastisel then something odd has
+    // happened (such as an optimised function being always-inlined into an
+    // optnone function). We will not be using the extra information in the
+    // dbg.assign in that case, just use its dbg.value fields.
+    LLVM_FALLTHROUGH;
   case Intrinsic::dbg_value: {
     // This form of DBG_VALUE is target-independent.
     const DbgValueInst *DI = cast<DbgValueInst>(II);
diff --git a/llvm/test/DebugInfo/X86/dont-drop-dbg-assigns-in-isels.ll b/llvm/test/DebugInfo/X86/dont-drop-dbg-assigns-in-isels.ll
new file mode 100644
index 00000000000000..77c9aa5764fb84
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/dont-drop-dbg-assigns-in-isels.ll
@@ -0,0 +1,46 @@
+; RUN: llc %s -fast-isel -start-after=codegenprepare -stop-before=finalize-isel -o - | FileCheck %s
+; RUN: llc %s -fast-isel -start-after=codegenprepare -stop-before=finalize-isel -o - --try-experimental-debuginfo-iterators | FileCheck %s
+; RUN: llc %s -global-isel -start-after=codegenprepare -stop-before=finalize-isel -o - | FileCheck %s
+; RUN: llc %s -global-isel -start-after=codegenprepare -stop-before=finalize-isel -o - --try-experimental-debuginfo-iterators | FileCheck %s
+
+target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-unknown"
+
+; CHECK: DBG_VALUE
+
+declare void @llvm.dbg.assign(metadata, metadata, metadata, metadata, metadata, metadata)
+
+define dso_local i32 @foo(i32 %a, i32 %b) local_unnamed_addr !dbg !8 {
+entry:
+  call void @llvm.dbg.assign(metadata !DIArgList(i32 %a, i32 %b), metadata !16, metadata !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus), metadata !21, metadata ptr undef, metadata !DIExpression()), !dbg !17
+  %mul = mul nsw i32 %b, %a, !dbg !18
+  ret i32 %mul, !dbg !18
+}
+
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!3, !4, !5, !19, !6}
+!llvm.ident = !{!7}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, producer: "clang version 11.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: None)
+!1 = !DIFile(filename: "debug_value_list_selectiondag.cpp", directory: "/")
+!2 = !{}
+!3 = !{i32 2, !"CodeView", i32 1}
+!4 = !{i32 2, !"Debug Info Version", i32 3}
+!5 = !{i32 1, !"wchar_size", i32 2}
+!6 = !{i32 7, !"PIC Level", i32 2}
+!7 = !{!"clang version 11.0.0"}
+!8 = distinct !DISubprogram(name: "foo", linkageName: "foo", scope: !9, file: !9, line: 1, type: !10, scopeLine: 1, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !13)
+!9 = !DIFile(filename: ".\\debug_value_list.cpp", directory: "/tmp")
+!10 = !DISubroutineType(types: !11)
+!11 = !{!12, !12, !12}
+!12 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!13 = !{!14, !15, !16}
+!14 = !DILocalVariable(name: "b", arg: 2, scope: !8, file: !9, line: 1, type: !12)
+!15 = !DILocalVariable(name: "a", arg: 1, scope: !8, file: !9, line: 1, type: !12)
+!16 = !DILocalVariable(name: "c", scope: !8, file: !9, line: 2, type: !12)
+!17 = !DILocation(line: 0, scope: !8)
+!18 = !DILocation(line: 3, scope: !8)
+!19 = !{i32 7, !"debug-info-assignment-tracking", i1 true}
+!20 = !DILocalVariable(name: "d", scope: !8, file: !9, line: 2, type: !12)
+!21 = distinct !DIAssignID()

>From 8fa646702fef19d79cbb0bd8858fe1e21b33fa24 Mon Sep 17 00:00:00 2001
From: David Green <david.green at arm.com>
Date: Thu, 8 Feb 2024 11:07:33 +0000
Subject: [PATCH 15/72] [BasicAA] Scalable offset with scalable typesize.
 (#80818)

This patch adds a simple alias analysis check for accesses that are scalable
with an offset between them that is also trivially scalable (there are no other
constant/variable offsets). We essentially divide each side by vscale and are
left needing to check that the offset >= typesize.
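
The arithmetic can be sketched in isolation as follows. This is a simplified
model under stated assumptions, not the code in the patch: the function name
`provesNoAlias` and its parameters are hypothetical, access B is assumed to
sit at `Scale * vscale` bytes from access A, and each size is the known
minimum size in bytes (the multiplier of vscale for a scalable type).

```cpp
#include <cstdint>

// Hypothetical model: address(B) - address(A) == Scale * vscale.
// MinSizeA / MinSizeB are the known-minimum access sizes in bytes.
// Dividing both the distance and a scalable size by vscale leaves a plain
// integer comparison. A fixed-size access is covered too, because
// vscale >= 1 means the real distance is at least |Scale| bytes.
constexpr bool provesNoAlias(int64_t Scale, uint64_t MinSizeA,
                             uint64_t MinSizeB) {
  // Magnitude of the distance, computed in unsigned arithmetic so that
  // negating the most negative value cannot overflow.
  const uint64_t Distance = Scale < 0 ? 0 - static_cast<uint64_t>(Scale)
                                      : static_cast<uint64_t>(Scale);
  // Only the access at the lower address can run into the other one.
  const uint64_t LowerSize = Scale < 0 ? MinSizeB : MinSizeA;
  return Distance >= LowerSize;
}
```

For instance, two `<vscale x 4 x i32>` accesses (16 * vscale bytes each) that
are 16 * vscale bytes apart cannot overlap, while the same accesses only
8 * vscale bytes apart might.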
---
 llvm/lib/Analysis/BasicAliasAnalysis.cpp | 21 +++++++++++++++++++++
 llvm/test/Analysis/BasicAA/vscale.ll     | 22 +++++++++++-----------
 2 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/llvm/lib/Analysis/BasicAliasAnalysis.cpp b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
index 19c4393add6ab9..ae31814bb06735 100644
--- a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
@@ -1170,6 +1170,27 @@ AliasResult BasicAAResult::aliasGEP(
     }
   }
 
+  // VScale Alias Analysis - Given one scalable offset between accesses and a
+  // scalable typesize, we can divide each side by vscale, treating both values
+  // as a constant. We prove that Offset/vscale >= TypeSize/vscale.
+  if (DecompGEP1.VarIndices.size() == 1 && DecompGEP1.VarIndices[0].IsNSW &&
+      DecompGEP1.VarIndices[0].Val.TruncBits == 0 &&
+      DecompGEP1.Offset.isZero() &&
+      PatternMatch::match(DecompGEP1.VarIndices[0].Val.V,
+                          PatternMatch::m_VScale())) {
+    const VariableGEPIndex &ScalableVar = DecompGEP1.VarIndices[0];
+    APInt Scale =
+        ScalableVar.IsNegated ? -ScalableVar.Scale : ScalableVar.Scale;
+    LocationSize VLeftSize = Scale.isNegative() ? V1Size : V2Size;
+
+    // Note that we do not check that the typesize is scalable, as vscale >= 1
+    // so noalias still holds so long as the dependency distance is at least as
+    // big as the typesize.
+    if (VLeftSize.hasValue() &&
+        Scale.uge(VLeftSize.getValue().getKnownMinValue()))
+      return AliasResult::NoAlias;
+  }
+
   // Bail on analysing scalable LocationSize
   if (V1Size.isScalable() || V2Size.isScalable())
     return AliasResult::MayAlias;
diff --git a/llvm/test/Analysis/BasicAA/vscale.ll b/llvm/test/Analysis/BasicAA/vscale.ll
index 1b9118bf3853ab..ce0c6f145d1c88 100644
--- a/llvm/test/Analysis/BasicAA/vscale.ll
+++ b/llvm/test/Analysis/BasicAA/vscale.ll
@@ -339,15 +339,15 @@ define void @vscale_neg_notscalable(ptr %p) {
 }
 
 ; CHECK-LABEL: vscale_neg_scalable
-; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   NoAlias:      <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %p
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16m16
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %vm16, <vscale x 4 x i32>* %vm16m16
-; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16m16
+; CHECK-DAG:   NoAlias:      <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16m16
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %p
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %vm16
-; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %m16pv16
+; CHECK-DAG:   NoAlias:      <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %m16pv16
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %vm16m16
 define void @vscale_neg_scalable(ptr %p) {
   %v = call i64 @llvm.vscale.i64()
@@ -393,15 +393,15 @@ define void @vscale_pos_notscalable(ptr %p) {
 }
 
 ; CHECK-LABEL: vscale_pos_scalable
-; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   NoAlias:      <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %p
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16m16
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %vm16, <vscale x 4 x i32>* %vm16m16
-; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16m16
+; CHECK-DAG:   NoAlias:      <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vm16m16
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %p
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %vm16
-; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %m16pv16
+; CHECK-DAG:   NoAlias:      <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %m16pv16
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16pv16, <vscale x 4 x i32>* %vm16m16
 define void @vscale_pos_scalable(ptr %p) {
   %v = call i64 @llvm.vscale.i64()
@@ -421,9 +421,9 @@ define void @vscale_pos_scalable(ptr %p) {
 
 ; CHECK-LABEL: vscale_v1v2types
 ; CHECK-DAG:   MustAlias:    <4 x i32>* %p, <vscale x 4 x i32>* %p
-; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16
-; CHECK-DAG:   MayAlias:     <4 x i32>* %p, <vscale x 4 x i32>* %vm16
-; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %p, <4 x i32>* %vm16
+; CHECK-DAG:   NoAlias:      <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   NoAlias:      <4 x i32>* %p, <vscale x 4 x i32>* %vm16
+; CHECK-DAG:   NoAlias:      <vscale x 4 x i32>* %p, <4 x i32>* %vm16
 ; CHECK-DAG:   NoAlias:      <4 x i32>* %p, <4 x i32>* %vm16
 ; CHECK-DAG:   MustAlias:    <4 x i32>* %vm16, <vscale x 4 x i32>* %vm16
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %p
@@ -435,8 +435,8 @@ define void @vscale_pos_scalable(ptr %p) {
 ; CHECK-DAG:   MayAlias:     <4 x i32>* %m16, <vscale x 4 x i32>* %vm16
 ; CHECK-DAG:   MayAlias:     <4 x i32>* %m16, <4 x i32>* %vm16
 ; CHECK-DAG:   MustAlias:    <4 x i32>* %m16, <vscale x 4 x i32>* %m16
-; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vp16
-; CHECK-DAG:   MayAlias:     <4 x i32>* %p, <vscale x 4 x i32>* %vp16
+; CHECK-DAG:   NoAlias:      <vscale x 4 x i32>* %p, <vscale x 4 x i32>* %vp16
+; CHECK-DAG:   NoAlias:      <4 x i32>* %p, <vscale x 4 x i32>* %vp16
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %vm16, <vscale x 4 x i32>* %vp16
 ; CHECK-DAG:   MayAlias:     <4 x i32>* %vm16, <vscale x 4 x i32>* %vp16
 ; CHECK-DAG:   MayAlias:     <vscale x 4 x i32>* %m16, <vscale x 4 x i32>* %vp16

>From 381d955e79d7c747d8ec6fba1b317c562aa86135 Mon Sep 17 00:00:00 2001
From: Alex Bradbury <asb at igalia.com>
Date: Thu, 8 Feb 2024 11:07:01 +0000
Subject: [PATCH 16/72] [RISCV][test] Add test coverage for
 RISCVInstrInfo::isCopyInstrImpl

---
 .../Target/RISCV/RISCVInstrInfoTest.cpp       | 63 +++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/llvm/unittests/Target/RISCV/RISCVInstrInfoTest.cpp b/llvm/unittests/Target/RISCV/RISCVInstrInfoTest.cpp
index 5836239bc56fd6..5f3ce53f5d274e 100644
--- a/llvm/unittests/Target/RISCV/RISCVInstrInfoTest.cpp
+++ b/llvm/unittests/Target/RISCV/RISCVInstrInfoTest.cpp
@@ -94,6 +94,69 @@ TEST_P(RISCVInstrInfoTest, IsAddImmediate) {
   }
 }
 
+TEST_P(RISCVInstrInfoTest, IsCopyInstrImpl) {
+  const RISCVInstrInfo *TII = ST->getInstrInfo();
+  DebugLoc DL;
+
+  // ADDI.
+
+  MachineInstr *MI1 = BuildMI(*MF, DL, TII->get(RISCV::ADDI), RISCV::X1)
+                          .addReg(RISCV::X2)
+                          .addImm(-128)
+                          .getInstr();
+  auto MI1Res = TII->isCopyInstrImpl(*MI1);
+  EXPECT_FALSE(MI1Res.has_value());
+
+  MachineInstr *MI2 = BuildMI(*MF, DL, TII->get(RISCV::ADDI), RISCV::X1)
+                          .addReg(RISCV::X2)
+                          .addImm(0)
+                          .getInstr();
+  auto MI2Res = TII->isCopyInstrImpl(*MI2);
+  ASSERT_TRUE(MI2Res.has_value());
+  EXPECT_EQ(MI2Res->Destination->getReg(), RISCV::X1);
+  EXPECT_EQ(MI2Res->Source->getReg(), RISCV::X2);
+
+  // Partial coverage of FSGNJ_* instructions.
+
+  MachineInstr *MI3 = BuildMI(*MF, DL, TII->get(RISCV::FSGNJ_D), RISCV::F1_D)
+                          .addReg(RISCV::F2_D)
+                          .addReg(RISCV::F1_D)
+                          .getInstr();
+  auto MI3Res = TII->isCopyInstrImpl(*MI3);
+  EXPECT_FALSE(MI3Res.has_value());
+
+  MachineInstr *MI4 = BuildMI(*MF, DL, TII->get(RISCV::FSGNJ_D), RISCV::F1_D)
+                          .addReg(RISCV::F2_D)
+                          .addReg(RISCV::F2_D)
+                          .getInstr();
+  auto MI4Res = TII->isCopyInstrImpl(*MI4);
+  ASSERT_TRUE(MI4Res.has_value());
+  EXPECT_EQ(MI4Res->Destination->getReg(), RISCV::F1_D);
+  EXPECT_EQ(MI4Res->Source->getReg(), RISCV::F2_D);
+
+  // ADD. TODO: Should return true for add reg, x0 and add x0, reg.
+  MachineInstr *MI5 = BuildMI(*MF, DL, TII->get(RISCV::ADD), RISCV::X1)
+                          .addReg(RISCV::X2)
+                          .addReg(RISCV::X3)
+                          .getInstr();
+  auto MI5Res = TII->isCopyInstrImpl(*MI5);
+  EXPECT_FALSE(MI5Res.has_value());
+
+  MachineInstr *MI6 = BuildMI(*MF, DL, TII->get(RISCV::ADD), RISCV::X1)
+                          .addReg(RISCV::X0)
+                          .addReg(RISCV::X2)
+                          .getInstr();
+  auto MI6Res = TII->isCopyInstrImpl(*MI6);
+  EXPECT_FALSE(MI6Res.has_value());
+
+  MachineInstr *MI7 = BuildMI(*MF, DL, TII->get(RISCV::ADD), RISCV::X1)
+                          .addReg(RISCV::X2)
+                          .addReg(RISCV::X0)
+                          .getInstr();
+  auto MI7Res = TII->isCopyInstrImpl(*MI7);
+  EXPECT_FALSE(MI7Res.has_value());
+}
+
 TEST_P(RISCVInstrInfoTest, GetMemOperandsWithOffsetWidth) {
   const RISCVInstrInfo *TII = ST->getInstrInfo();
   const TargetRegisterInfo *TRI = ST->getRegisterInfo();

>From f3d0222cba7407ad0f1184c4f845a9accf187546 Mon Sep 17 00:00:00 2001
From: Michael Buch <michaelbuch12 at gmail.com>
Date: Thu, 8 Feb 2024 11:09:45 +0000
Subject: [PATCH 17/72] [lldb][TypeSynthetic][NFC] Make
 SyntheticChildrenFrontend::Update() return an enum (#80167)

This patch changes the return value of
`SyntheticChildrenFrontend::Update` to a scoped enum that aims to
describe what the return value means.
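
A minimal sketch of the idea, outside of LLDB: the enum below copies the
enumerator names and values introduced by this patch, while `fromLegacyBool`
and `mustClearCache` are hypothetical helpers added only to show how the old
`bool` convention maps onto the new type.

```cpp
// Same enumerators as the lldb::ChildCacheState added by this patch,
// redeclared here so the example is self-contained.
enum class ChildCacheState {
  eRefetch = 0, // Children need to be recomputed.
  eReuse = 1,   // Previously computed children are still valid.
};

// Hypothetical helper: the old interface returned true for "cache is good"
// and false for "cache is stale".
constexpr ChildCacheState fromLegacyBool(bool CacheIsGood) {
  return CacheIsGood ? ChildCacheState::eReuse : ChildCacheState::eRefetch;
}

// Hypothetical helper: what a caller of Update() would check before
// discarding its cached children.
constexpr bool mustClearCache(ChildCacheState State) {
  return State == ChildCacheState::eRefetch;
}
```

Because the enum is scoped, a call site has to spell out `eRefetch` or
`eReuse`, so a bare `return false;` no longer compiles and the intent is
visible at each return statement.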
---
 .../lldb/DataFormatters/TypeSynthetic.h       | 27 +++---
 .../lldb/DataFormatters/VectorIterator.h      |  2 +-
 lldb/include/lldb/lldb-enumerations.h         |  9 ++
 .../Core/ValueObjectSyntheticFilter.cpp       |  6 +-
 lldb/source/DataFormatters/TypeSynthetic.cpp  |  8 +-
 lldb/source/DataFormatters/VectorType.cpp     |  4 +-
 .../Language/CPlusPlus/BlockPointer.cpp       |  4 +-
 .../Plugins/Language/CPlusPlus/Coroutines.cpp | 16 ++--
 .../Plugins/Language/CPlusPlus/Coroutines.h   |  2 +-
 .../Language/CPlusPlus/GenericBitset.cpp      |  8 +-
 .../Language/CPlusPlus/GenericOptional.cpp    |  8 +-
 .../Plugins/Language/CPlusPlus/LibCxx.cpp     | 63 +++++++-------
 .../Plugins/Language/CPlusPlus/LibCxx.h       |  8 +-
 .../Language/CPlusPlus/LibCxxAtomic.cpp       |  7 +-
 .../CPlusPlus/LibCxxInitializerList.cpp       | 10 +--
 .../Plugins/Language/CPlusPlus/LibCxxList.cpp | 32 +++----
 .../Plugins/Language/CPlusPlus/LibCxxMap.cpp  |  9 +-
 .../Language/CPlusPlus/LibCxxQueue.cpp        |  8 +-
 .../CPlusPlus/LibCxxRangesRefView.cpp         | 11 +--
 .../Plugins/Language/CPlusPlus/LibCxxSpan.cpp |  9 +-
 .../Language/CPlusPlus/LibCxxTuple.cpp        |  8 +-
 .../Language/CPlusPlus/LibCxxUnorderedMap.cpp | 20 ++---
 .../Language/CPlusPlus/LibCxxVariant.cpp      | 12 +--
 .../Language/CPlusPlus/LibCxxVector.cpp       | 28 ++++---
 .../Plugins/Language/CPlusPlus/LibStdcpp.cpp  | 44 +++++-----
 .../Language/CPlusPlus/LibStdcppTuple.cpp     |  8 +-
 .../CPlusPlus/LibStdcppUniquePointer.cpp      |  8 +-
 lldb/source/Plugins/Language/ObjC/Cocoa.cpp   |  4 +-
 lldb/source/Plugins/Language/ObjC/NSArray.cpp | 45 +++++-----
 .../Plugins/Language/ObjC/NSDictionary.cpp    | 83 ++++++++++---------
 lldb/source/Plugins/Language/ObjC/NSError.cpp | 12 +--
 .../Plugins/Language/ObjC/NSException.cpp     |  9 +-
 .../Plugins/Language/ObjC/NSIndexPath.cpp     | 14 ++--
 lldb/source/Plugins/Language/ObjC/NSSet.cpp   | 46 +++++-----
 34 files changed, 321 insertions(+), 271 deletions(-)

diff --git a/lldb/include/lldb/DataFormatters/TypeSynthetic.h b/lldb/include/lldb/DataFormatters/TypeSynthetic.h
index 41be9b7efda8fd..23cc054b399a67 100644
--- a/lldb/include/lldb/DataFormatters/TypeSynthetic.h
+++ b/lldb/include/lldb/DataFormatters/TypeSynthetic.h
@@ -49,14 +49,15 @@ class SyntheticChildrenFrontEnd {
 
   virtual size_t GetIndexOfChildWithName(ConstString name) = 0;
 
-  // this function is assumed to always succeed and it if fails, the front-end
-  // should know to deal with it in the correct way (most probably, by refusing
-  // to return any children) the return value of Update() should actually be
-  // interpreted as "ValueObjectSyntheticFilter cache is good/bad" if =true,
-  // ValueObjectSyntheticFilter is allowed to use the children it fetched
-  // previously and cached if =false, ValueObjectSyntheticFilter must throw
-  // away its cache, and query again for children
-  virtual bool Update() = 0;
+  /// This function is assumed to always succeed and if it fails, the front-end
+  /// should know to deal with it in the correct way (most probably, by refusing
+  /// to return any children). The return value of \ref Update should actually
+  /// be interpreted as "ValueObjectSyntheticFilter cache is good/bad". If this
+  /// function returns \ref lldb::ChildCacheState::eReuse, \ref
+  /// ValueObjectSyntheticFilter is allowed to use the children it fetched
+  /// previously and cached. Otherwise, \ref ValueObjectSyntheticFilter must
+  /// throw away its cache, and query again for children.
+  virtual lldb::ChildCacheState Update() = 0;
 
   // if this function returns false, then CalculateNumChildren() MUST return 0
   // since UI frontends might validly decide not to inquire for children given
@@ -116,7 +117,9 @@ class SyntheticValueProviderFrontEnd : public SyntheticChildrenFrontEnd {
     return UINT32_MAX;
   }
 
-  bool Update() override { return false; }
+  lldb::ChildCacheState Update() override {
+    return lldb::ChildCacheState::eRefetch;
+  }
 
   bool MightHaveChildren() override { return false; }
 
@@ -328,7 +331,9 @@ class TypeFilterImpl : public SyntheticChildren {
           filter->GetExpressionPathAtIndex(idx), true);
     }
 
-    bool Update() override { return false; }
+    lldb::ChildCacheState Update() override {
+      return lldb::ChildCacheState::eRefetch;
+    }
 
     bool MightHaveChildren() override { return filter->GetCount() > 0; }
 
@@ -427,7 +432,7 @@ class ScriptedSyntheticChildren : public SyntheticChildren {
 
     lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-    bool Update() override;
+    lldb::ChildCacheState Update() override;
 
     bool MightHaveChildren() override;
 
diff --git a/lldb/include/lldb/DataFormatters/VectorIterator.h b/lldb/include/lldb/DataFormatters/VectorIterator.h
index 3414298f255b6a..5f774bb72c3a3a 100644
--- a/lldb/include/lldb/DataFormatters/VectorIterator.h
+++ b/lldb/include/lldb/DataFormatters/VectorIterator.h
@@ -28,7 +28,7 @@ class VectorIteratorSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
diff --git a/lldb/include/lldb/lldb-enumerations.h b/lldb/include/lldb/lldb-enumerations.h
index 392d333c23a447..7e9b538aa8372b 100644
--- a/lldb/include/lldb/lldb-enumerations.h
+++ b/lldb/include/lldb/lldb-enumerations.h
@@ -1305,6 +1305,15 @@ enum CompletionType {
   eTerminatorCompletion = (1ul << 27)
 };
 
+/// Specifies if children need to be re-computed
+/// after a call to \ref SyntheticChildrenFrontEnd::Update.
+enum class ChildCacheState {
+  eRefetch = 0, ///< Children need to be recomputed dynamically.
+
+  eReuse = 1, ///< Children did not change and don't need to be recomputed;
+              ///< re-use what we computed the last time we called Update.
+};
+
 } // namespace lldb
 
 #endif // LLDB_LLDB_ENUMERATIONS_H
diff --git a/lldb/source/Core/ValueObjectSyntheticFilter.cpp b/lldb/source/Core/ValueObjectSyntheticFilter.cpp
index 43bc532c4a0410..e8b4b02d11a0bb 100644
--- a/lldb/source/Core/ValueObjectSyntheticFilter.cpp
+++ b/lldb/source/Core/ValueObjectSyntheticFilter.cpp
@@ -43,7 +43,9 @@ class DummySyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   bool MightHaveChildren() override { return m_backend.MightHaveChildren(); }
 
-  bool Update() override { return false; }
+  lldb::ChildCacheState Update() override {
+    return lldb::ChildCacheState::eRefetch;
+  }
 };
 
 ValueObjectSynthetic::ValueObjectSynthetic(ValueObject &parent,
@@ -177,7 +179,7 @@ bool ValueObjectSynthetic::UpdateValue() {
   }
 
   // let our backend do its update
-  if (!m_synth_filter_up->Update()) {
+  if (m_synth_filter_up->Update() == lldb::ChildCacheState::eRefetch) {
     LLDB_LOGF(log,
               "[ValueObjectSynthetic::UpdateValue] name=%s, synthetic "
               "filter said caches are stale - clearing",
diff --git a/lldb/source/DataFormatters/TypeSynthetic.cpp b/lldb/source/DataFormatters/TypeSynthetic.cpp
index de042e474903e5..8a6f132a39577a 100644
--- a/lldb/source/DataFormatters/TypeSynthetic.cpp
+++ b/lldb/source/DataFormatters/TypeSynthetic.cpp
@@ -190,11 +190,13 @@ size_t ScriptedSyntheticChildren::FrontEnd::CalculateNumChildren(uint32_t max) {
   return m_interpreter->CalculateNumChildren(m_wrapper_sp, max);
 }
 
-bool ScriptedSyntheticChildren::FrontEnd::Update() {
+lldb::ChildCacheState ScriptedSyntheticChildren::FrontEnd::Update() {
   if (!m_wrapper_sp || m_interpreter == nullptr)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
-  return m_interpreter->UpdateSynthProviderInstance(m_wrapper_sp);
+  return m_interpreter->UpdateSynthProviderInstance(m_wrapper_sp)
+             ? lldb::ChildCacheState::eReuse
+             : lldb::ChildCacheState::eRefetch;
 }
 
 bool ScriptedSyntheticChildren::FrontEnd::MightHaveChildren() {
diff --git a/lldb/source/DataFormatters/VectorType.cpp b/lldb/source/DataFormatters/VectorType.cpp
index 57dae0b2c71f0f..c94ca68319ee2c 100644
--- a/lldb/source/DataFormatters/VectorType.cpp
+++ b/lldb/source/DataFormatters/VectorType.cpp
@@ -245,7 +245,7 @@ class VectorTypeSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
     return child_sp;
   }
 
-  bool Update() override {
+  lldb::ChildCacheState Update() override {
     m_parent_format = m_backend.GetFormat();
     CompilerType parent_type(m_backend.GetCompilerType());
     CompilerType element_type;
@@ -258,7 +258,7 @@ class VectorTypeSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
         ::CalculateNumChildren(element_type, num_elements, m_child_type)
             .value_or(0);
     m_item_format = GetItemFormatForFormat(m_parent_format, m_child_type);
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   }
 
   bool MightHaveChildren() override { return true; }
diff --git a/lldb/source/Plugins/Language/CPlusPlus/BlockPointer.cpp b/lldb/source/Plugins/Language/CPlusPlus/BlockPointer.cpp
index 314a4aca8d2663..2e43aa3fa1d8bf 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/BlockPointer.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/BlockPointer.cpp
@@ -136,7 +136,9 @@ class BlockPointerSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   // return true if this object is now safe to use forever without ever
   // updating again; the typical (and tested) answer here is 'false'
-  bool Update() override { return false; }
+  lldb::ChildCacheState Update() override {
+    return lldb::ChildCacheState::eRefetch;
+  }
 
   // maybe return false if the block pointer is, say, null
   bool MightHaveChildren() override { return true; }
diff --git a/lldb/source/Plugins/Language/CPlusPlus/Coroutines.cpp b/lldb/source/Plugins/Language/CPlusPlus/Coroutines.cpp
index 6aeae97667c168..742017438bcf4a 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/Coroutines.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/Coroutines.cpp
@@ -125,24 +125,24 @@ lldb::ValueObjectSP lldb_private::formatters::
   return lldb::ValueObjectSP();
 }
 
-bool lldb_private::formatters::StdlibCoroutineHandleSyntheticFrontEnd::
-    Update() {
+lldb::ChildCacheState
+lldb_private::formatters::StdlibCoroutineHandleSyntheticFrontEnd::Update() {
   m_resume_ptr_sp.reset();
   m_destroy_ptr_sp.reset();
   m_promise_ptr_sp.reset();
 
   ValueObjectSP valobj_sp = m_backend.GetNonSyntheticValue();
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   lldb::addr_t frame_ptr_addr = GetCoroFramePtrFromHandle(valobj_sp);
   if (frame_ptr_addr == 0 || frame_ptr_addr == LLDB_INVALID_ADDRESS)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   auto ts = valobj_sp->GetCompilerType().GetTypeSystem();
   auto ast_ctx = ts.dyn_cast_or_null<TypeSystemClang>();
   if (!ast_ctx)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   // Create the `resume` and `destroy` children.
   lldb::TargetSP target_sp = m_backend.GetTargetSP();
@@ -165,7 +165,7 @@ bool lldb_private::formatters::StdlibCoroutineHandleSyntheticFrontEnd::
   CompilerType promise_type(
       valobj_sp->GetCompilerType().GetTypeTemplateArgument(0));
   if (!promise_type)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   // Try to infer the promise_type if it was type-erased
   if (promise_type.IsVoidType()) {
@@ -180,7 +180,7 @@ bool lldb_private::formatters::StdlibCoroutineHandleSyntheticFrontEnd::
   // If we don't know the promise type, we don't display the `promise` member.
   // `CreateValueObjectFromAddress` below would fail for `void` types.
   if (promise_type.IsVoidType()) {
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   }
 
   // Add the `promise` member. We intentionally add `promise` as a pointer type
@@ -194,7 +194,7 @@ bool lldb_private::formatters::StdlibCoroutineHandleSyntheticFrontEnd::
   if (error.Success())
     m_promise_ptr_sp = promisePtr->Clone(ConstString("promise"));
 
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::StdlibCoroutineHandleSyntheticFrontEnd::
diff --git a/lldb/source/Plugins/Language/CPlusPlus/Coroutines.h b/lldb/source/Plugins/Language/CPlusPlus/Coroutines.h
index b26cc9ed6132d4..d38c7ecefa6e13 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/Coroutines.h
+++ b/lldb/source/Plugins/Language/CPlusPlus/Coroutines.h
@@ -38,7 +38,7 @@ class StdlibCoroutineHandleSyntheticFrontEnd
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
diff --git a/lldb/source/Plugins/Language/CPlusPlus/GenericBitset.cpp b/lldb/source/Plugins/Language/CPlusPlus/GenericBitset.cpp
index 2876efc5c41a55..ac316638523584 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/GenericBitset.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/GenericBitset.cpp
@@ -33,7 +33,7 @@ class GenericBitsetFrontEnd : public SyntheticChildrenFrontEnd {
   }
 
   bool MightHaveChildren() override { return true; }
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
   size_t CalculateNumChildren() override { return m_elements.size(); }
   ValueObjectSP GetChildAtIndex(size_t idx) override;
 
@@ -78,13 +78,13 @@ llvm::StringRef GenericBitsetFrontEnd::GetDataContainerMemberName() {
   llvm_unreachable("Unknown StdLib enum");
 }
 
-bool GenericBitsetFrontEnd::Update() {
+lldb::ChildCacheState GenericBitsetFrontEnd::Update() {
   m_elements.clear();
   m_first = nullptr;
 
   TargetSP target_sp = m_backend.GetTargetSP();
   if (!target_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   size_t size = 0;
 
@@ -94,7 +94,7 @@ bool GenericBitsetFrontEnd::Update() {
   m_elements.assign(size, ValueObjectSP());
   m_first =
       m_backend.GetChildMemberWithName(GetDataContainerMemberName()).get();
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 ValueObjectSP GenericBitsetFrontEnd::GetChildAtIndex(size_t idx) {
diff --git a/lldb/source/Plugins/Language/CPlusPlus/GenericOptional.cpp b/lldb/source/Plugins/Language/CPlusPlus/GenericOptional.cpp
index 7415e915844fcd..57331eaa986890 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/GenericOptional.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/GenericOptional.cpp
@@ -44,7 +44,7 @@ class GenericOptionalFrontend : public SyntheticChildrenFrontEnd {
   size_t CalculateNumChildren() override { return m_has_value ? 1U : 0U; }
 
   ValueObjectSP GetChildAtIndex(size_t idx) override;
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
 private:
   bool m_has_value = false;
@@ -61,7 +61,7 @@ GenericOptionalFrontend::GenericOptionalFrontend(ValueObject &valobj,
   }
 }
 
-bool GenericOptionalFrontend::Update() {
+lldb::ChildCacheState GenericOptionalFrontend::Update() {
   ValueObjectSP engaged_sp;
 
   if (m_stdlib == StdLib::LibCxx)
@@ -71,14 +71,14 @@ bool GenericOptionalFrontend::Update() {
                      ->GetChildMemberWithName("_M_engaged");
 
   if (!engaged_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   // _M_engaged/__engaged is a bool flag and is true if the optional contains a
   // value. Converting it to unsigned gives us a size of 1 if it contains a
   // value and 0 if not.
   m_has_value = engaged_sp->GetValueAsUnsigned(0) != 0;
 
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 ValueObjectSP GenericOptionalFrontend::GetChildAtIndex(size_t _idx) {
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp
index d0bdbe1fd4d91a..a7d7066bb2c11d 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp
@@ -231,21 +231,22 @@ lldb_private::formatters::LibCxxMapIteratorSyntheticFrontEnd::
     Update();
 }
 
-bool lldb_private::formatters::LibCxxMapIteratorSyntheticFrontEnd::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::LibCxxMapIteratorSyntheticFrontEnd::Update() {
   m_pair_sp.reset();
   m_pair_ptr = nullptr;
 
   ValueObjectSP valobj_sp = m_backend.GetSP();
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   TargetSP target_sp(valobj_sp->GetTargetSP());
 
   if (!target_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   // this must be a ValueObject* because it is a child of the ValueObject we
   // are producing children for it if were a ValueObjectSP, we would end up
@@ -278,7 +279,7 @@ bool lldb_private::formatters::LibCxxMapIteratorSyntheticFrontEnd::Update() {
       auto __i_(valobj_sp->GetChildMemberWithName("__i_"));
       if (!__i_) {
         m_pair_ptr = nullptr;
-        return false;
+        return lldb::ChildCacheState::eRefetch;
       }
       CompilerType pair_type(
           __i_->GetCompilerType().GetTypeTemplateArgument(0));
@@ -290,7 +291,7 @@ bool lldb_private::formatters::LibCxxMapIteratorSyntheticFrontEnd::Update() {
           0, name, &bit_offset_ptr, &bitfield_bit_size_ptr, &is_bitfield_ptr);
       if (!pair_type) {
         m_pair_ptr = nullptr;
-        return false;
+        return lldb::ChildCacheState::eRefetch;
       }
 
       auto addr(m_pair_ptr->GetValueAsUnsigned(LLDB_INVALID_ADDRESS));
@@ -299,7 +300,7 @@ bool lldb_private::formatters::LibCxxMapIteratorSyntheticFrontEnd::Update() {
         auto ts = pair_type.GetTypeSystem();
         auto ast_ctx = ts.dyn_cast_or_null<TypeSystemClang>();
         if (!ast_ctx)
-          return false;
+          return lldb::ChildCacheState::eRefetch;
 
         // Mimick layout of std::__tree_iterator::__ptr_ and read it in
         // from process memory.
@@ -328,14 +329,14 @@ bool lldb_private::formatters::LibCxxMapIteratorSyntheticFrontEnd::Update() {
              {"payload", pair_type}});
         std::optional<uint64_t> size = tree_node_type.GetByteSize(nullptr);
         if (!size)
-          return false;
+          return lldb::ChildCacheState::eRefetch;
         WritableDataBufferSP buffer_sp(new DataBufferHeap(*size, 0));
         ProcessSP process_sp(target_sp->GetProcessSP());
         Status error;
         process_sp->ReadMemory(addr, buffer_sp->GetBytes(),
                                buffer_sp->GetByteSize(), error);
         if (error.Fail())
-          return false;
+          return lldb::ChildCacheState::eRefetch;
         DataExtractor extractor(buffer_sp, process_sp->GetByteOrder(),
                                 process_sp->GetAddressByteSize());
         auto pair_sp = CreateValueObjectFromData(
@@ -347,7 +348,7 @@ bool lldb_private::formatters::LibCxxMapIteratorSyntheticFrontEnd::Update() {
     }
   }
 
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 size_t lldb_private::formatters::LibCxxMapIteratorSyntheticFrontEnd::
@@ -399,22 +400,22 @@ lldb_private::formatters::LibCxxUnorderedMapIteratorSyntheticFrontEnd::
     Update();
 }
 
-bool lldb_private::formatters::LibCxxUnorderedMapIteratorSyntheticFrontEnd::
-    Update() {
+lldb::ChildCacheState lldb_private::formatters::
+    LibCxxUnorderedMapIteratorSyntheticFrontEnd::Update() {
   m_pair_sp.reset();
   m_iter_ptr = nullptr;
 
   ValueObjectSP valobj_sp = m_backend.GetSP();
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   TargetSP target_sp(valobj_sp->GetTargetSP());
 
   if (!target_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   auto exprPathOptions = ValueObject::GetValueForExpressionPathOptions()
                              .DontCheckDotVsArrowSyntax()
@@ -437,7 +438,7 @@ bool lldb_private::formatters::LibCxxUnorderedMapIteratorSyntheticFrontEnd::
     auto iter_child(valobj_sp->GetChildMemberWithName("__i_"));
     if (!iter_child) {
       m_iter_ptr = nullptr;
-      return false;
+      return lldb::ChildCacheState::eRefetch;
     }
 
     CompilerType node_type(iter_child->GetCompilerType()
@@ -455,19 +456,19 @@ bool lldb_private::formatters::LibCxxUnorderedMapIteratorSyntheticFrontEnd::
         0, name, &bit_offset_ptr, &bitfield_bit_size_ptr, &is_bitfield_ptr);
     if (!pair_type) {
       m_iter_ptr = nullptr;
-      return false;
+      return lldb::ChildCacheState::eRefetch;
     }
 
     uint64_t addr = m_iter_ptr->GetValueAsUnsigned(LLDB_INVALID_ADDRESS);
     m_iter_ptr = nullptr;
 
     if (addr == 0 || addr == LLDB_INVALID_ADDRESS)
-      return false;
+      return lldb::ChildCacheState::eRefetch;
 
     auto ts = pair_type.GetTypeSystem();
     auto ast_ctx = ts.dyn_cast_or_null<TypeSystemClang>();
     if (!ast_ctx)
-      return false;
+      return lldb::ChildCacheState::eRefetch;
 
     // Mimick layout of std::__hash_iterator::__node_ and read it in
     // from process memory.
@@ -489,14 +490,14 @@ bool lldb_private::formatters::LibCxxUnorderedMapIteratorSyntheticFrontEnd::
          {"__value_", pair_type}});
     std::optional<uint64_t> size = tree_node_type.GetByteSize(nullptr);
     if (!size)
-      return false;
+      return lldb::ChildCacheState::eRefetch;
     WritableDataBufferSP buffer_sp(new DataBufferHeap(*size, 0));
     ProcessSP process_sp(target_sp->GetProcessSP());
     Status error;
     process_sp->ReadMemory(addr, buffer_sp->GetBytes(),
                            buffer_sp->GetByteSize(), error);
     if (error.Fail())
-      return false;
+      return lldb::ChildCacheState::eRefetch;
     DataExtractor extractor(buffer_sp, process_sp->GetByteOrder(),
                             process_sp->GetAddressByteSize());
     auto pair_sp = CreateValueObjectFromData(
@@ -505,7 +506,7 @@ bool lldb_private::formatters::LibCxxUnorderedMapIteratorSyntheticFrontEnd::
       m_pair_sp = pair_sp->GetChildAtIndex(2);
   }
 
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 size_t lldb_private::formatters::LibCxxUnorderedMapIteratorSyntheticFrontEnd::
@@ -600,22 +601,23 @@ lldb_private::formatters::LibcxxSharedPtrSyntheticFrontEnd::GetChildAtIndex(
   return lldb::ValueObjectSP();
 }
 
-bool lldb_private::formatters::LibcxxSharedPtrSyntheticFrontEnd::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::LibcxxSharedPtrSyntheticFrontEnd::Update() {
   m_cntrl = nullptr;
 
   ValueObjectSP valobj_sp = m_backend.GetSP();
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   TargetSP target_sp(valobj_sp->GetTargetSP());
   if (!target_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   lldb::ValueObjectSP cntrl_sp(valobj_sp->GetChildMemberWithName("__cntrl_"));
 
   m_cntrl = cntrl_sp.get(); // need to store the raw pointer to avoid a circular
                             // dependency
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::LibcxxSharedPtrSyntheticFrontEnd::
@@ -689,14 +691,15 @@ lldb_private::formatters::LibcxxUniquePtrSyntheticFrontEnd::GetChildAtIndex(
   return lldb::ValueObjectSP();
 }
 
-bool lldb_private::formatters::LibcxxUniquePtrSyntheticFrontEnd::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::LibcxxUniquePtrSyntheticFrontEnd::Update() {
   ValueObjectSP valobj_sp = m_backend.GetSP();
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   ValueObjectSP ptr_sp(valobj_sp->GetChildMemberWithName("__ptr_"));
   if (!ptr_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   // Retrieve the actual pointer and the deleter, and clone them to give them
   // user-friendly names.
@@ -708,7 +711,7 @@ bool lldb_private::formatters::LibcxxUniquePtrSyntheticFrontEnd::Update() {
   if (deleter_sp)
     m_deleter_sp = deleter_sp->Clone(ConstString("deleter"));
 
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::LibcxxUniquePtrSyntheticFrontEnd::
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxx.h b/lldb/source/Plugins/Language/CPlusPlus/LibCxx.h
index 72da6b2426efec..cc8e13d10d39ce 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxx.h
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxx.h
@@ -91,7 +91,7 @@ class LibCxxMapIteratorSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -139,7 +139,7 @@ class LibCxxUnorderedMapIteratorSyntheticFrontEnd
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -170,7 +170,7 @@ class LibcxxSharedPtrSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -190,7 +190,7 @@ class LibcxxUniquePtrSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxxAtomic.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibCxxAtomic.cpp
index eacc60886c6eba..c81b1e8012f6a9 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxxAtomic.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxxAtomic.cpp
@@ -94,7 +94,7 @@ class LibcxxStdAtomicSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -110,12 +110,13 @@ lldb_private::formatters::LibcxxStdAtomicSyntheticFrontEnd::
     LibcxxStdAtomicSyntheticFrontEnd(lldb::ValueObjectSP valobj_sp)
     : SyntheticChildrenFrontEnd(*valobj_sp) {}
 
-bool lldb_private::formatters::LibcxxStdAtomicSyntheticFrontEnd::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::LibcxxStdAtomicSyntheticFrontEnd::Update() {
   ValueObjectSP atomic_value = GetLibCxxAtomicValue(m_backend);
   if (atomic_value)
     m_real_child = GetLibCxxAtomicValue(m_backend).get();
 
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::LibcxxStdAtomicSyntheticFrontEnd::
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxxInitializerList.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibCxxInitializerList.cpp
index bfd7b881a7288a..3c33f94f923734 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxxInitializerList.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxxInitializerList.cpp
@@ -30,7 +30,7 @@ class LibcxxInitializerListSyntheticFrontEnd
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -82,13 +82,13 @@ lldb::ValueObjectSP lldb_private::formatters::
                                       m_element_type);
 }
 
-bool lldb_private::formatters::LibcxxInitializerListSyntheticFrontEnd::
-    Update() {
+lldb::ChildCacheState
+lldb_private::formatters::LibcxxInitializerListSyntheticFrontEnd::Update() {
   m_start = nullptr;
   m_num_elements = 0;
   m_element_type = m_backend.GetCompilerType().GetTypeTemplateArgument(0);
   if (!m_element_type.IsValid())
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   if (std::optional<uint64_t> size = m_element_type.GetByteSize(nullptr)) {
     m_element_size = *size;
@@ -96,7 +96,7 @@ bool lldb_private::formatters::LibcxxInitializerListSyntheticFrontEnd::
     m_start = m_backend.GetChildMemberWithName("__begin_").get();
   }
 
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::LibcxxInitializerListSyntheticFrontEnd::
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxxList.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibCxxList.cpp
index 2e2e2a8b0515a9..e28ef818b10faf 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxxList.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxxList.cpp
@@ -109,7 +109,7 @@ class AbstractListFrontEnd : public SyntheticChildrenFrontEnd {
     return ExtractIndexFromString(name.GetCString());
   }
   bool MightHaveChildren() override { return true; }
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
 protected:
   AbstractListFrontEnd(ValueObject &valobj)
@@ -138,7 +138,7 @@ class ForwardListFrontEnd : public AbstractListFrontEnd {
 
   size_t CalculateNumChildren() override;
   ValueObjectSP GetChildAtIndex(size_t idx) override;
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 };
 
 class ListFrontEnd : public AbstractListFrontEnd {
@@ -151,7 +151,7 @@ class ListFrontEnd : public AbstractListFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
 private:
   lldb::addr_t m_node_address = 0;
@@ -160,7 +160,7 @@ class ListFrontEnd : public AbstractListFrontEnd {
 
 } // end anonymous namespace
 
-bool AbstractListFrontEnd::Update() {
+lldb::ChildCacheState AbstractListFrontEnd::Update() {
   m_loop_detected = 0;
   m_count = UINT32_MAX;
   m_head = nullptr;
@@ -180,10 +180,10 @@ bool AbstractListFrontEnd::Update() {
     list_type = list_type.GetNonReferenceType();
 
   if (list_type.GetNumTemplateArguments() == 0)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_element_type = list_type.GetTypeTemplateArgument(0);
 
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool AbstractListFrontEnd::HasLoop(size_t count) {
@@ -284,22 +284,22 @@ ValueObjectSP ForwardListFrontEnd::GetChildAtIndex(size_t idx) {
                                    m_element_type);
 }
 
-bool ForwardListFrontEnd::Update() {
+lldb::ChildCacheState ForwardListFrontEnd::Update() {
   AbstractListFrontEnd::Update();
 
   Status err;
   ValueObjectSP backend_addr(m_backend.AddressOf(err));
   if (err.Fail() || !backend_addr)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   ValueObjectSP impl_sp(m_backend.GetChildMemberWithName("__before_begin_"));
   if (!impl_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   impl_sp = GetFirstValueOfLibCXXCompressedPair(*impl_sp);
   if (!impl_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_head = impl_sp->GetChildMemberWithName("__next_").get();
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 ListFrontEnd::ListFrontEnd(lldb::ValueObjectSP valobj_sp)
@@ -394,7 +394,7 @@ lldb::ValueObjectSP ListFrontEnd::GetChildAtIndex(size_t idx) {
                                    m_element_type);
 }
 
-bool ListFrontEnd::Update() {
+lldb::ChildCacheState ListFrontEnd::Update() {
   AbstractListFrontEnd::Update();
   m_tail = nullptr;
   m_node_address = 0;
@@ -402,16 +402,16 @@ bool ListFrontEnd::Update() {
   Status err;
   ValueObjectSP backend_addr(m_backend.AddressOf(err));
   if (err.Fail() || !backend_addr)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_node_address = backend_addr->GetValueAsUnsigned(0);
   if (!m_node_address || m_node_address == LLDB_INVALID_ADDRESS)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   ValueObjectSP impl_sp(m_backend.GetChildMemberWithName("__end_"));
   if (!impl_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_head = impl_sp->GetChildMemberWithName("__next_").get();
   m_tail = impl_sp->GetChildMemberWithName("__prev_").get();
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 SyntheticChildrenFrontEnd *formatters::LibcxxStdListSyntheticFrontEndCreator(
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxxMap.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibCxxMap.cpp
index d3ee63a35e107c..d208acfc9da47e 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxxMap.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxxMap.cpp
@@ -181,7 +181,7 @@ class LibcxxStdMapSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -405,15 +405,16 @@ lldb_private::formatters::LibcxxStdMapSyntheticFrontEnd::GetChildAtIndex(
   return potential_child_sp;
 }
 
-bool lldb_private::formatters::LibcxxStdMapSyntheticFrontEnd::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::LibcxxStdMapSyntheticFrontEnd::Update() {
   m_count = UINT32_MAX;
   m_tree = m_root_node = nullptr;
   m_iterators.clear();
   m_tree = m_backend.GetChildMemberWithName("__tree_").get();
   if (!m_tree)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_root_node = m_tree->GetChildMemberWithName("__begin_node_").get();
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::LibcxxStdMapSyntheticFrontEnd::
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxxQueue.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibCxxQueue.cpp
index c31940af08813b..83f93b16fc9a2d 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxxQueue.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxxQueue.cpp
@@ -26,7 +26,7 @@ class QueueFrontEnd : public SyntheticChildrenFrontEnd {
   }
 
   bool MightHaveChildren() override { return true; }
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   size_t CalculateNumChildren() override {
     return m_container_sp ? m_container_sp->GetNumChildren() : 0;
@@ -47,13 +47,13 @@ class QueueFrontEnd : public SyntheticChildrenFrontEnd {
 };
 } // namespace
 
-bool QueueFrontEnd::Update() {
+lldb::ChildCacheState QueueFrontEnd::Update() {
   m_container_sp = nullptr;
   ValueObjectSP c_sp = m_backend.GetChildMemberWithName("c");
   if (!c_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_container_sp = c_sp->GetSyntheticValue().get();
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 SyntheticChildrenFrontEnd *
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxxRangesRefView.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibCxxRangesRefView.cpp
index 6aeb557a95ff38..c032d67c66cb47 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxxRangesRefView.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxxRangesRefView.cpp
@@ -38,7 +38,7 @@ class LibcxxStdRangesRefViewSyntheticFrontEnd
     return m_range_sp;
   }
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override { return true; }
 
@@ -59,17 +59,18 @@ lldb_private::formatters::LibcxxStdRangesRefViewSyntheticFrontEnd::
     Update();
 }
 
-bool lldb_private::formatters::LibcxxStdRangesRefViewSyntheticFrontEnd::
-    Update() {
+lldb::ChildCacheState
+lldb_private::formatters::LibcxxStdRangesRefViewSyntheticFrontEnd::Update() {
   ValueObjectSP range_ptr =
       GetChildMemberWithName(m_backend, {ConstString("__range_")});
   if (!range_ptr)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   lldb_private::Status error;
   m_range_sp = range_ptr->Dereference(error);
 
-  return error.Success();
+  return error.Success() ? lldb::ChildCacheState::eReuse
+                         : lldb::ChildCacheState::eRefetch;
 }
 
 lldb_private::SyntheticChildrenFrontEnd *
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxxSpan.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibCxxSpan.cpp
index ec062ed21ee405..4ddfaef9c0ad54 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxxSpan.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxxSpan.cpp
@@ -53,7 +53,7 @@ class LibcxxStdSpanSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
   // This function checks for a '__size' member to determine the number
   // of elements in the span. If no such member exists, we get the size
   // from the only other place it can be: the template argument.
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -93,12 +93,13 @@ lldb_private::formatters::LibcxxStdSpanSyntheticFrontEnd::GetChildAtIndex(
                                       m_element_type);
 }
 
-bool lldb_private::formatters::LibcxxStdSpanSyntheticFrontEnd::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::LibcxxStdSpanSyntheticFrontEnd::Update() {
   // Get element type.
   ValueObjectSP data_type_finder_sp = GetChildMemberWithName(
       m_backend, {ConstString("__data_"), ConstString("__data")});
   if (!data_type_finder_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   m_element_type = data_type_finder_sp->GetCompilerType().GetPointeeType();
 
@@ -122,7 +123,7 @@ bool lldb_private::formatters::LibcxxStdSpanSyntheticFrontEnd::Update() {
     }
   }
 
-  return true;
+  return lldb::ChildCacheState::eReuse;
 }
 
 bool lldb_private::formatters::LibcxxStdSpanSyntheticFrontEnd::
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxxTuple.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibCxxTuple.cpp
index 9024ed4dba45fd..546871012d2b38 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxxTuple.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxxTuple.cpp
@@ -25,7 +25,7 @@ class TupleFrontEnd: public SyntheticChildrenFrontEnd {
   }
 
   bool MightHaveChildren() override { return true; }
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
   size_t CalculateNumChildren() override { return m_elements.size(); }
   ValueObjectSP GetChildAtIndex(size_t idx) override;
 
@@ -40,7 +40,7 @@ class TupleFrontEnd: public SyntheticChildrenFrontEnd {
 };
 }
 
-bool TupleFrontEnd::Update() {
+lldb::ChildCacheState TupleFrontEnd::Update() {
   m_elements.clear();
   m_base = nullptr;
 
@@ -51,11 +51,11 @@ bool TupleFrontEnd::Update() {
     base_sp = m_backend.GetChildMemberWithName("base_");
   }
   if (!base_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_base = base_sp.get();
   m_elements.assign(base_sp->GetCompilerType().GetNumDirectBaseClasses(),
                     nullptr);
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 ValueObjectSP TupleFrontEnd::GetChildAtIndex(size_t idx) {
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxxUnorderedMap.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibCxxUnorderedMap.cpp
index 1a85d37ebf0cca..4cac52f235a19a 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxxUnorderedMap.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxxUnorderedMap.cpp
@@ -37,7 +37,7 @@ class LibcxxStdUnorderedMapSyntheticFrontEnd
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -193,41 +193,41 @@ lldb::ValueObjectSP lldb_private::formatters::
                                    m_element_type);
 }
 
-bool lldb_private::formatters::LibcxxStdUnorderedMapSyntheticFrontEnd::
-    Update() {
+lldb::ChildCacheState
+lldb_private::formatters::LibcxxStdUnorderedMapSyntheticFrontEnd::Update() {
   m_num_elements = 0;
   m_next_element = nullptr;
   m_elements_cache.clear();
   ValueObjectSP table_sp = m_backend.GetChildMemberWithName("__table_");
   if (!table_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   ValueObjectSP p2_sp = table_sp->GetChildMemberWithName("__p2_");
   if (!p2_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   ValueObjectSP num_elements_sp = GetFirstValueOfLibCXXCompressedPair(*p2_sp);
   if (!num_elements_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   ValueObjectSP p1_sp = table_sp->GetChildMemberWithName("__p1_");
   if (!p1_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   ValueObjectSP value_sp = GetFirstValueOfLibCXXCompressedPair(*p1_sp);
   if (!value_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   m_tree = value_sp->GetChildMemberWithName("__next_").get();
   if (m_tree == nullptr)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   m_num_elements = num_elements_sp->GetValueAsUnsigned(0);
 
   if (m_num_elements > 0)
     m_next_element = m_tree;
 
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::LibcxxStdUnorderedMapSyntheticFrontEnd::
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxxVariant.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibCxxVariant.cpp
index e863ccca2be839..ecbb7cf0ca2b46 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxxVariant.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxxVariant.cpp
@@ -204,7 +204,7 @@ class VariantFrontEnd : public SyntheticChildrenFrontEnd {
   }
 
   bool MightHaveChildren() override { return true; }
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
   size_t CalculateNumChildren() override { return m_size; }
   ValueObjectSP GetChildAtIndex(size_t idx) override;
 
@@ -213,24 +213,24 @@ class VariantFrontEnd : public SyntheticChildrenFrontEnd {
 };
 } // namespace
 
-bool VariantFrontEnd::Update() {
+lldb::ChildCacheState VariantFrontEnd::Update() {
   m_size = 0;
   ValueObjectSP impl_sp = formatters::GetChildMemberWithName(
       m_backend, {ConstString("__impl_"), ConstString("__impl")});
   if (!impl_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   LibcxxVariantIndexValidity validity = LibcxxVariantGetIndexValidity(impl_sp);
 
   if (validity == LibcxxVariantIndexValidity::Invalid)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   if (validity == LibcxxVariantIndexValidity::NPos)
-    return true;
+    return lldb::ChildCacheState::eReuse;
 
   m_size = 1;
 
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 ValueObjectSP VariantFrontEnd::GetChildAtIndex(size_t idx) {
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxxVector.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibCxxVector.cpp
index 9d88fcf9953092..0c3c3f02b60c7b 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxxVector.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxxVector.cpp
@@ -29,7 +29,7 @@ class LibcxxStdVectorSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -50,7 +50,7 @@ class LibcxxVectorBoolSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override { return true; }
 
@@ -116,17 +116,18 @@ lldb_private::formatters::LibcxxStdVectorSyntheticFrontEnd::GetChildAtIndex(
                                       m_element_type);
 }
 
-bool lldb_private::formatters::LibcxxStdVectorSyntheticFrontEnd::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::LibcxxStdVectorSyntheticFrontEnd::Update() {
   m_start = m_finish = nullptr;
   ValueObjectSP data_type_finder_sp(
       m_backend.GetChildMemberWithName("__end_cap_"));
   if (!data_type_finder_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   data_type_finder_sp =
       GetFirstValueOfLibCXXCompressedPair(*data_type_finder_sp);
   if (!data_type_finder_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   m_element_type = data_type_finder_sp->GetCompilerType().GetPointeeType();
   if (std::optional<uint64_t> size = m_element_type.GetByteSize(nullptr)) {
@@ -138,7 +139,7 @@ bool lldb_private::formatters::LibcxxStdVectorSyntheticFrontEnd::Update() {
       m_finish = m_backend.GetChildMemberWithName("__end_").get();
     }
   }
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::LibcxxStdVectorSyntheticFrontEnd::
@@ -226,29 +227,30 @@ lldb_private::formatters::LibcxxVectorBoolSyntheticFrontEnd::GetChildAtIndex(
  }
  }*/
 
-bool lldb_private::formatters::LibcxxVectorBoolSyntheticFrontEnd::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::LibcxxVectorBoolSyntheticFrontEnd::Update() {
   m_children.clear();
   ValueObjectSP valobj_sp = m_backend.GetSP();
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_exe_ctx_ref = valobj_sp->GetExecutionContextRef();
   ValueObjectSP size_sp(valobj_sp->GetChildMemberWithName("__size_"));
   if (!size_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_count = size_sp->GetValueAsUnsigned(0);
   if (!m_count)
-    return true;
+    return lldb::ChildCacheState::eReuse;
   ValueObjectSP begin_sp(valobj_sp->GetChildMemberWithName("__begin_"));
   if (!begin_sp) {
     m_count = 0;
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   }
   m_base_data_address = begin_sp->GetValueAsUnsigned(0);
   if (!m_base_data_address) {
     m_count = 0;
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   }
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 size_t lldb_private::formatters::LibcxxVectorBoolSyntheticFrontEnd::
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibStdcpp.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibStdcpp.cpp
index 23af50fdb7124e..411551839e1e61 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibStdcpp.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibStdcpp.cpp
@@ -47,7 +47,7 @@ class LibstdcppMapIteratorSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -68,7 +68,7 @@ class LibStdcppSharedPtrSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -94,29 +94,29 @@ LibstdcppMapIteratorSyntheticFrontEnd::LibstdcppMapIteratorSyntheticFrontEnd(
     Update();
 }
 
-bool LibstdcppMapIteratorSyntheticFrontEnd::Update() {
+lldb::ChildCacheState LibstdcppMapIteratorSyntheticFrontEnd::Update() {
   ValueObjectSP valobj_sp = m_backend.GetSP();
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   TargetSP target_sp(valobj_sp->GetTargetSP());
 
   if (!target_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   bool is_64bit = (target_sp->GetArchitecture().GetAddressByteSize() == 8);
 
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_exe_ctx_ref = valobj_sp->GetExecutionContextRef();
 
   ValueObjectSP _M_node_sp(valobj_sp->GetChildMemberWithName("_M_node"));
   if (!_M_node_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   m_pair_address = _M_node_sp->GetValueAsUnsigned(0);
   if (m_pair_address == 0)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   m_pair_address += (is_64bit ? 32 : 16);
 
@@ -124,12 +124,12 @@ bool LibstdcppMapIteratorSyntheticFrontEnd::Update() {
   if (my_type.GetNumTemplateArguments() >= 1) {
     CompilerType pair_type = my_type.GetTypeTemplateArgument(0);
     if (!pair_type)
-      return false;
+      return lldb::ChildCacheState::eRefetch;
     m_pair_type = pair_type;
   } else
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
-  return true;
+  return lldb::ChildCacheState::eReuse;
 }
 
 size_t LibstdcppMapIteratorSyntheticFrontEnd::CalculateNumChildren() {
@@ -193,22 +193,22 @@ lldb_private::formatters::VectorIteratorSyntheticFrontEnd::
     Update();
 }
 
-bool VectorIteratorSyntheticFrontEnd::Update() {
+lldb::ChildCacheState VectorIteratorSyntheticFrontEnd::Update() {
   m_item_sp.reset();
 
   ValueObjectSP valobj_sp = m_backend.GetSP();
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   ValueObjectSP item_ptr =
       formatters::GetChildMemberWithName(*valobj_sp, m_item_names);
   if (!item_ptr)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   if (item_ptr->GetValueAsUnsigned(0) == 0)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   Status err;
   m_exe_ctx_ref = valobj_sp->GetExecutionContextRef();
   m_item_sp = CreateValueObjectFromAddress(
@@ -216,7 +216,7 @@ bool VectorIteratorSyntheticFrontEnd::Update() {
       item_ptr->GetCompilerType().GetPointeeType());
   if (err.Fail())
     m_item_sp.reset();
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 size_t VectorIteratorSyntheticFrontEnd::CalculateNumChildren() { return 1; }
@@ -390,23 +390,23 @@ LibStdcppSharedPtrSyntheticFrontEnd::GetChildAtIndex(size_t idx) {
   return lldb::ValueObjectSP();
 }
 
-bool LibStdcppSharedPtrSyntheticFrontEnd::Update() {
+lldb::ChildCacheState LibStdcppSharedPtrSyntheticFrontEnd::Update() {
   auto backend = m_backend.GetSP();
   if (!backend)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   auto valobj_sp = backend->GetNonSyntheticValue();
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   auto ptr_obj_sp = valobj_sp->GetChildMemberWithName("_M_ptr");
   if (!ptr_obj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   m_ptr_obj = ptr_obj_sp->Clone(ConstString("pointer")).get();
   m_obj_obj = nullptr;
 
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool LibStdcppSharedPtrSyntheticFrontEnd::MightHaveChildren() { return true; }
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibStdcppTuple.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibStdcppTuple.cpp
index f1bfeae5099b7c..189f9561e52a1b 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibStdcppTuple.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibStdcppTuple.cpp
@@ -30,7 +30,7 @@ class LibStdcppTupleSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -53,12 +53,12 @@ LibStdcppTupleSyntheticFrontEnd::LibStdcppTupleSyntheticFrontEnd(
   Update();
 }
 
-bool LibStdcppTupleSyntheticFrontEnd::Update() {
+lldb::ChildCacheState LibStdcppTupleSyntheticFrontEnd::Update() {
   m_members.clear();
 
   ValueObjectSP valobj_backend_sp = m_backend.GetSP();
   if (!valobj_backend_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   ValueObjectSP next_child_sp = valobj_backend_sp->GetNonSyntheticValue();
   while (next_child_sp != nullptr) {
@@ -83,7 +83,7 @@ bool LibStdcppTupleSyntheticFrontEnd::Update() {
     }
   }
 
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool LibStdcppTupleSyntheticFrontEnd::MightHaveChildren() { return true; }
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibStdcppUniquePointer.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibStdcppUniquePointer.cpp
index a84d641b57bc47..3b0f6329d0e3ff 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibStdcppUniquePointer.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibStdcppUniquePointer.cpp
@@ -30,7 +30,7 @@ class LibStdcppUniquePtrSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -84,11 +84,11 @@ ValueObjectSP LibStdcppUniquePtrSyntheticFrontEnd::GetTuple() {
   return obj_child_sp;
 }
 
-bool LibStdcppUniquePtrSyntheticFrontEnd::Update() {
+lldb::ChildCacheState LibStdcppUniquePtrSyntheticFrontEnd::Update() {
   ValueObjectSP tuple_sp = GetTuple();
 
   if (!tuple_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
 
   std::unique_ptr<SyntheticChildrenFrontEnd> tuple_frontend(
       LibStdcppTupleSyntheticFrontEndCreator(nullptr, tuple_sp));
@@ -110,7 +110,7 @@ bool LibStdcppUniquePtrSyntheticFrontEnd::Update() {
   }
   m_obj_obj = nullptr;
 
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool LibStdcppUniquePtrSyntheticFrontEnd::MightHaveChildren() { return true; }
diff --git a/lldb/source/Plugins/Language/ObjC/Cocoa.cpp b/lldb/source/Plugins/Language/ObjC/Cocoa.cpp
index f1a7e04bc9d1bf..64047dc53236bf 100644
--- a/lldb/source/Plugins/Language/ObjC/Cocoa.cpp
+++ b/lldb/source/Plugins/Language/ObjC/Cocoa.cpp
@@ -1044,7 +1044,9 @@ class ObjCClassSyntheticChildrenFrontEnd : public SyntheticChildrenFrontEnd {
     return lldb::ValueObjectSP();
   }
 
-  bool Update() override { return false; }
+  lldb::ChildCacheState Update() override {
+    return lldb::ChildCacheState::eRefetch;
+  }
 
   bool MightHaveChildren() override { return false; }
 
diff --git a/lldb/source/Plugins/Language/ObjC/NSArray.cpp b/lldb/source/Plugins/Language/ObjC/NSArray.cpp
index 7d0004c572ed6b..09bf7a23d6097e 100644
--- a/lldb/source/Plugins/Language/ObjC/NSArray.cpp
+++ b/lldb/source/Plugins/Language/ObjC/NSArray.cpp
@@ -54,7 +54,7 @@ class NSArrayMSyntheticFrontEndBase : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override = 0;
+  lldb::ChildCacheState Update() override = 0;
 
   bool MightHaveChildren() override;
 
@@ -81,7 +81,7 @@ class GenericNSArrayMSyntheticFrontEnd : public NSArrayMSyntheticFrontEndBase {
 
   ~GenericNSArrayMSyntheticFrontEnd() override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
 protected:
   lldb::addr_t GetDataAddress() override;
@@ -218,7 +218,7 @@ class GenericNSArrayISyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -306,7 +306,7 @@ class NSArray0SyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -323,7 +323,7 @@ class NSArray1SyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -500,9 +500,8 @@ lldb_private::formatters::NSArrayMSyntheticFrontEndBase::GetChildAtIndex(
 }
 
 template <typename D32, typename D64>
-bool
-lldb_private::formatters::
-  GenericNSArrayMSyntheticFrontEnd<D32, D64>::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::GenericNSArrayMSyntheticFrontEnd<D32, D64>::Update() {
   ValueObjectSP valobj_sp = m_backend.GetSP();
   m_ptr_size = 0;
   delete m_data_32;
@@ -510,13 +509,13 @@ lldb_private::formatters::
   delete m_data_64;
   m_data_64 = nullptr;
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_exe_ctx_ref = valobj_sp->GetExecutionContextRef();
   Status error;
   error.Clear();
   lldb::ProcessSP process_sp(valobj_sp->GetProcessSP());
   if (!process_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_ptr_size = process_sp->GetAddressByteSize();
   uint64_t data_location = valobj_sp->GetValueAsUnsigned(0) + m_ptr_size;
   if (m_ptr_size == 4) {
@@ -529,7 +528,8 @@ lldb_private::formatters::
                            error);
   }
 
-  return error.Success();
+  return error.Success() ? lldb::ChildCacheState::eReuse
+                         : lldb::ChildCacheState::eRefetch;
 }
 
 bool
@@ -641,9 +641,9 @@ lldb_private::formatters::GenericNSArrayISyntheticFrontEnd<D32, D64, Inline>::
 }
 
 template <typename D32, typename D64, bool Inline>
-bool
-lldb_private::formatters::GenericNSArrayISyntheticFrontEnd<D32, D64, Inline>::
-  Update() {
+lldb::ChildCacheState
+lldb_private::formatters::GenericNSArrayISyntheticFrontEnd<D32, D64,
+                                                           Inline>::Update() {
   ValueObjectSP valobj_sp = m_backend.GetSP();
   m_ptr_size = 0;
   delete m_data_32;
@@ -651,13 +651,13 @@ lldb_private::formatters::GenericNSArrayISyntheticFrontEnd<D32, D64, Inline>::
   delete m_data_64;
   m_data_64 = nullptr;
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_exe_ctx_ref = valobj_sp->GetExecutionContextRef();
   Status error;
   error.Clear();
   lldb::ProcessSP process_sp(valobj_sp->GetProcessSP());
   if (!process_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_ptr_size = process_sp->GetAddressByteSize();
   uint64_t data_location = valobj_sp->GetValueAsUnsigned(0) + m_ptr_size;
   if (m_ptr_size == 4) {
@@ -670,7 +670,8 @@ lldb_private::formatters::GenericNSArrayISyntheticFrontEnd<D32, D64, Inline>::
                            error);
   }
 
-  return error.Success();
+  return error.Success() ? lldb::ChildCacheState::eReuse
+                         : lldb::ChildCacheState::eRefetch;
 }
 
 template <typename D32, typename D64, bool Inline>
@@ -723,8 +724,9 @@ lldb_private::formatters::NSArray0SyntheticFrontEnd::CalculateNumChildren() {
   return 0;
 }
 
-bool lldb_private::formatters::NSArray0SyntheticFrontEnd::Update() {
-  return false;
+lldb::ChildCacheState
+lldb_private::formatters::NSArray0SyntheticFrontEnd::Update() {
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::NSArray0SyntheticFrontEnd::MightHaveChildren() {
@@ -757,8 +759,9 @@ lldb_private::formatters::NSArray1SyntheticFrontEnd::CalculateNumChildren() {
   return 1;
 }
 
-bool lldb_private::formatters::NSArray1SyntheticFrontEnd::Update() {
-  return false;
+lldb::ChildCacheState
+lldb_private::formatters::NSArray1SyntheticFrontEnd::Update() {
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::NSArray1SyntheticFrontEnd::MightHaveChildren() {
diff --git a/lldb/source/Plugins/Language/ObjC/NSDictionary.cpp b/lldb/source/Plugins/Language/ObjC/NSDictionary.cpp
index d377ee74ccc05d..9c252a98de8357 100644
--- a/lldb/source/Plugins/Language/ObjC/NSDictionary.cpp
+++ b/lldb/source/Plugins/Language/ObjC/NSDictionary.cpp
@@ -107,7 +107,7 @@ class NSDictionaryISyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -148,7 +148,7 @@ class NSConstantDictionarySyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -180,7 +180,7 @@ class NSCFDictionarySyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -213,7 +213,7 @@ class NSDictionary1SyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -234,7 +234,7 @@ class GenericNSDictionaryMSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -266,9 +266,9 @@ namespace Foundation1100 {
     size_t CalculateNumChildren() override;
     
     lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
-    
-    bool Update() override;
-    
+
+    lldb::ChildCacheState Update() override;
+
     bool MightHaveChildren() override;
     
     size_t GetIndexOfChildWithName(ConstString name) override;
@@ -613,7 +613,8 @@ size_t lldb_private::formatters::NSDictionaryISyntheticFrontEnd::
   return (m_data_32 ? m_data_32->_used : m_data_64->_used);
 }
 
-bool lldb_private::formatters::NSDictionaryISyntheticFrontEnd::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::NSDictionaryISyntheticFrontEnd::Update() {
   m_children.clear();
   delete m_data_32;
   m_data_32 = nullptr;
@@ -622,13 +623,13 @@ bool lldb_private::formatters::NSDictionaryISyntheticFrontEnd::Update() {
   m_ptr_size = 0;
   ValueObjectSP valobj_sp = m_backend.GetSP();
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_exe_ctx_ref = valobj_sp->GetExecutionContextRef();
   Status error;
   error.Clear();
   lldb::ProcessSP process_sp(valobj_sp->GetProcessSP());
   if (!process_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_ptr_size = process_sp->GetAddressByteSize();
   m_order = process_sp->GetByteOrder();
   uint64_t data_location = valobj_sp->GetValueAsUnsigned(0) + m_ptr_size;
@@ -642,9 +643,9 @@ bool lldb_private::formatters::NSDictionaryISyntheticFrontEnd::Update() {
                            error);
   }
   if (error.Fail())
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_data_ptr = data_location + m_ptr_size;
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::NSDictionaryISyntheticFrontEnd::
@@ -750,20 +751,23 @@ size_t lldb_private::formatters::NSCFDictionarySyntheticFrontEnd::
   return m_hashtable.GetCount();
 }
 
-bool lldb_private::formatters::NSCFDictionarySyntheticFrontEnd::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::NSCFDictionarySyntheticFrontEnd::Update() {
   m_children.clear();
   ValueObjectSP valobj_sp = m_backend.GetSP();
   m_ptr_size = 0;
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_exe_ctx_ref = valobj_sp->GetExecutionContextRef();
 
   lldb::ProcessSP process_sp(valobj_sp->GetProcessSP());
   if (!process_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_ptr_size = process_sp->GetAddressByteSize();
   m_order = process_sp->GetByteOrder();
-  return m_hashtable.Update(valobj_sp->GetValueAsUnsigned(0), m_exe_ctx_ref);
+  return m_hashtable.Update(valobj_sp->GetValueAsUnsigned(0), m_exe_ctx_ref)
+             ? lldb::ChildCacheState::eReuse
+             : lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::NSCFDictionarySyntheticFrontEnd::
@@ -881,30 +885,33 @@ size_t lldb_private::formatters::NSConstantDictionarySyntheticFrontEnd::
   return m_size;
 }
 
-bool lldb_private::formatters::NSConstantDictionarySyntheticFrontEnd::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::NSConstantDictionarySyntheticFrontEnd::Update() {
   ValueObjectSP valobj_sp = m_backend.GetSP();
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_exe_ctx_ref = valobj_sp->GetExecutionContextRef();
   Status error;
   error.Clear();
   lldb::ProcessSP process_sp(valobj_sp->GetProcessSP());
   if (!process_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_ptr_size = process_sp->GetAddressByteSize();
   m_order = process_sp->GetByteOrder();
   uint64_t valobj_addr = valobj_sp->GetValueAsUnsigned(0);
   m_size = process_sp->ReadUnsignedIntegerFromMemory(
       valobj_addr + 2 * m_ptr_size, m_ptr_size, 0, error);
   if (error.Fail())
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_keys_ptr =
       process_sp->ReadPointerFromMemory(valobj_addr + 3 * m_ptr_size, error);
   if (error.Fail())
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_objects_ptr =
       process_sp->ReadPointerFromMemory(valobj_addr + 4 * m_ptr_size, error);
-  return !error.Fail();
+
+  return error.Success() ? lldb::ChildCacheState::eReuse
+                         : lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::NSConstantDictionarySyntheticFrontEnd::
@@ -992,9 +999,10 @@ size_t lldb_private::formatters::NSDictionary1SyntheticFrontEnd::
   return 1;
 }
 
-bool lldb_private::formatters::NSDictionary1SyntheticFrontEnd::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::NSDictionary1SyntheticFrontEnd::Update() {
   m_pair.reset();
-  return false;
+  return lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::NSDictionary1SyntheticFrontEnd::
@@ -1087,9 +1095,9 @@ lldb_private::formatters::GenericNSDictionaryMSyntheticFrontEnd<D32,D64>::Calcul
 }
 
 template <typename D32, typename D64>
-bool
-lldb_private::formatters::GenericNSDictionaryMSyntheticFrontEnd<D32,D64>::
-  Update() {
+lldb::ChildCacheState
+lldb_private::formatters::GenericNSDictionaryMSyntheticFrontEnd<D32,
+                                                                D64>::Update() {
   m_children.clear();
   ValueObjectSP valobj_sp = m_backend.GetSP();
   m_ptr_size = 0;
@@ -1098,13 +1106,13 @@ lldb_private::formatters::GenericNSDictionaryMSyntheticFrontEnd<D32,D64>::
   delete m_data_64;
   m_data_64 = nullptr;
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_exe_ctx_ref = valobj_sp->GetExecutionContextRef();
   Status error;
   error.Clear();
   lldb::ProcessSP process_sp(valobj_sp->GetProcessSP());
   if (!process_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_ptr_size = process_sp->GetAddressByteSize();
   m_order = process_sp->GetByteOrder();
   uint64_t data_location = valobj_sp->GetValueAsUnsigned(0) + m_ptr_size;
@@ -1118,7 +1126,8 @@ lldb_private::formatters::GenericNSDictionaryMSyntheticFrontEnd<D32,D64>::
                            error);
   }
 
-  return error.Success();
+  return error.Success() ? lldb::ChildCacheState::eReuse
+                         : lldb::ChildCacheState::eRefetch;
 }
 
 template <typename D32, typename D64>
@@ -1249,9 +1258,8 @@ lldb_private::formatters::Foundation1100::
   return (m_data_32 ? m_data_32->_used : m_data_64->_used);
 }
 
-bool
-lldb_private::formatters::Foundation1100::
-  NSDictionaryMSyntheticFrontEnd::Update() {
+lldb::ChildCacheState lldb_private::formatters::Foundation1100::
+    NSDictionaryMSyntheticFrontEnd::Update() {
   m_children.clear();
   ValueObjectSP valobj_sp = m_backend.GetSP();
   m_ptr_size = 0;
@@ -1260,13 +1268,13 @@ lldb_private::formatters::Foundation1100::
   delete m_data_64;
   m_data_64 = nullptr;
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_exe_ctx_ref = valobj_sp->GetExecutionContextRef();
   Status error;
   error.Clear();
   lldb::ProcessSP process_sp(valobj_sp->GetProcessSP());
   if (!process_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_ptr_size = process_sp->GetAddressByteSize();
   m_order = process_sp->GetByteOrder();
   uint64_t data_location = valobj_sp->GetValueAsUnsigned(0) + m_ptr_size;
@@ -1280,7 +1288,8 @@ lldb_private::formatters::Foundation1100::
                            error);
   }
 
-  return error.Success();
+  return error.Success() ? lldb::ChildCacheState::eReuse
+                         : lldb::ChildCacheState::eRefetch;
 }
 
 bool
diff --git a/lldb/source/Plugins/Language/ObjC/NSError.cpp b/lldb/source/Plugins/Language/ObjC/NSError.cpp
index 99eeb2d5092f26..ce52ae542a50cb 100644
--- a/lldb/source/Plugins/Language/ObjC/NSError.cpp
+++ b/lldb/source/Plugins/Language/ObjC/NSError.cpp
@@ -133,17 +133,17 @@ class NSErrorSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
     return m_child_sp;
   }
 
-  bool Update() override {
+  lldb::ChildCacheState Update() override {
     m_child_ptr = nullptr;
     m_child_sp.reset();
 
     ProcessSP process_sp(m_backend.GetProcessSP());
     if (!process_sp)
-      return false;
+      return lldb::ChildCacheState::eRefetch;
 
     lldb::addr_t userinfo_location = DerefToNSErrorPointer(m_backend);
     if (userinfo_location == LLDB_INVALID_ADDRESS)
-      return false;
+      return lldb::ChildCacheState::eRefetch;
 
     size_t ptr_size = process_sp->GetAddressByteSize();
 
@@ -152,17 +152,17 @@ class NSErrorSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
     lldb::addr_t userinfo =
         process_sp->ReadPointerFromMemory(userinfo_location, error);
     if (userinfo == LLDB_INVALID_ADDRESS || error.Fail())
-      return false;
+      return lldb::ChildCacheState::eRefetch;
     InferiorSizedWord isw(userinfo, *process_sp);
     TypeSystemClangSP scratch_ts_sp =
         ScratchTypeSystemClang::GetForTarget(process_sp->GetTarget());
     if (!scratch_ts_sp)
-      return false;
+      return lldb::ChildCacheState::eRefetch;
     m_child_sp = CreateValueObjectFromData(
         "_userInfo", isw.GetAsData(process_sp->GetByteOrder()),
         m_backend.GetExecutionContextRef(),
         scratch_ts_sp->GetBasicType(lldb::eBasicTypeObjCID));
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   }
 
   bool MightHaveChildren() override { return true; }
diff --git a/lldb/source/Plugins/Language/ObjC/NSException.cpp b/lldb/source/Plugins/Language/ObjC/NSException.cpp
index 29805bb2d5fe86..e8011e5d2ca0be 100644
--- a/lldb/source/Plugins/Language/ObjC/NSException.cpp
+++ b/lldb/source/Plugins/Language/ObjC/NSException.cpp
@@ -137,14 +137,17 @@ class NSExceptionSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
     return lldb::ValueObjectSP();
   }
 
-  bool Update() override {
+  lldb::ChildCacheState Update() override {
     m_name_sp.reset();
     m_reason_sp.reset();
     m_userinfo_sp.reset();
     m_reserved_sp.reset();
 
-    return ExtractFields(m_backend, &m_name_sp, &m_reason_sp, &m_userinfo_sp,
-                         &m_reserved_sp);
+    const auto ret = ExtractFields(m_backend, &m_name_sp, &m_reason_sp,
+                                   &m_userinfo_sp, &m_reserved_sp);
+
+    return ret ? lldb::ChildCacheState::eReuse
+               : lldb::ChildCacheState::eRefetch;
   }
 
   bool MightHaveChildren() override { return true; }
diff --git a/lldb/source/Plugins/Language/ObjC/NSIndexPath.cpp b/lldb/source/Plugins/Language/ObjC/NSIndexPath.cpp
index 2a4ce80224e9e9..69e6ab1055d8c6 100644
--- a/lldb/source/Plugins/Language/ObjC/NSIndexPath.cpp
+++ b/lldb/source/Plugins/Language/ObjC/NSIndexPath.cpp
@@ -46,17 +46,17 @@ class NSIndexPathSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
     return m_impl.GetIndexAtIndex(idx, m_uint_star_type);
   }
 
-  bool Update() override {
+  lldb::ChildCacheState Update() override {
     m_impl.Clear();
 
     auto type_system = m_backend.GetCompilerType().GetTypeSystem();
     if (!type_system)
-      return false;
+      return lldb::ChildCacheState::eRefetch;
 
     auto ast = ScratchTypeSystemClang::GetForTarget(
         *m_backend.GetExecutionContextRef().GetTargetSP());
     if (!ast)
-      return false;
+      return lldb::ChildCacheState::eRefetch;
 
     m_uint_star_type = ast->GetPointerSizedIntType(false);
 
@@ -65,18 +65,18 @@ class NSIndexPathSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
     ProcessSP process_sp = m_backend.GetProcessSP();
     if (!process_sp)
-      return false;
+      return lldb::ChildCacheState::eRefetch;
 
     ObjCLanguageRuntime *runtime = ObjCLanguageRuntime::Get(*process_sp);
 
     if (!runtime)
-      return false;
+      return lldb::ChildCacheState::eRefetch;
 
     ObjCLanguageRuntime::ClassDescriptorSP descriptor(
         runtime->GetClassDescriptor(m_backend));
 
     if (!descriptor.get() || !descriptor->IsValid())
-      return false;
+      return lldb::ChildCacheState::eRefetch;
 
     uint64_t info_bits(0), value_bits(0), payload(0);
 
@@ -119,7 +119,7 @@ class NSIndexPathSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
         }
       }
     }
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   }
 
   bool MightHaveChildren() override { return m_impl.m_mode != Mode::Invalid; }
diff --git a/lldb/source/Plugins/Language/ObjC/NSSet.cpp b/lldb/source/Plugins/Language/ObjC/NSSet.cpp
index ed1751cc128ca2..ede64852d9a879 100644
--- a/lldb/source/Plugins/Language/ObjC/NSSet.cpp
+++ b/lldb/source/Plugins/Language/ObjC/NSSet.cpp
@@ -50,7 +50,7 @@ class NSSetISyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -88,7 +88,7 @@ class NSCFSetSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -121,7 +121,7 @@ class GenericNSSetMSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -237,7 +237,7 @@ class NSSetCodeRunningSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
   lldb::ValueObjectSP GetChildAtIndex(size_t idx) override;
 
-  bool Update() override;
+  lldb::ChildCacheState Update() override;
 
   bool MightHaveChildren() override;
 
@@ -426,7 +426,8 @@ lldb_private::formatters::NSSetISyntheticFrontEnd::CalculateNumChildren() {
   return (m_data_32 ? m_data_32->_used : m_data_64->_used);
 }
 
-bool lldb_private::formatters::NSSetISyntheticFrontEnd::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::NSSetISyntheticFrontEnd::Update() {
   m_children.clear();
   delete m_data_32;
   m_data_32 = nullptr;
@@ -435,13 +436,13 @@ bool lldb_private::formatters::NSSetISyntheticFrontEnd::Update() {
   m_ptr_size = 0;
   ValueObjectSP valobj_sp = m_backend.GetSP();
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_exe_ctx_ref = valobj_sp->GetExecutionContextRef();
   lldb::ProcessSP process_sp(valobj_sp->GetProcessSP());
   if (!process_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_ptr_size = process_sp->GetAddressByteSize();
   uint64_t data_location = valobj_sp->GetValueAsUnsigned(0) + m_ptr_size;
   Status error;
@@ -455,9 +456,9 @@ bool lldb_private::formatters::NSSetISyntheticFrontEnd::Update() {
                            error);
   }
   if (error.Fail())
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_data_ptr = data_location + m_ptr_size;
-  return true;
+  return lldb::ChildCacheState::eReuse;
 }
 
 bool lldb_private::formatters::NSSetISyntheticFrontEnd::MightHaveChildren() {
@@ -561,20 +562,23 @@ lldb_private::formatters::NSCFSetSyntheticFrontEnd::CalculateNumChildren() {
   return m_hashtable.GetCount();
 }
 
-bool lldb_private::formatters::NSCFSetSyntheticFrontEnd::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::NSCFSetSyntheticFrontEnd::Update() {
   m_children.clear();
   ValueObjectSP valobj_sp = m_backend.GetSP();
   m_ptr_size = 0;
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_exe_ctx_ref = valobj_sp->GetExecutionContextRef();
 
   lldb::ProcessSP process_sp(valobj_sp->GetProcessSP());
   if (!process_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_ptr_size = process_sp->GetAddressByteSize();
   m_order = process_sp->GetByteOrder();
-  return m_hashtable.Update(valobj_sp->GetValueAsUnsigned(0), m_exe_ctx_ref);
+  return m_hashtable.Update(valobj_sp->GetValueAsUnsigned(0), m_exe_ctx_ref)
+             ? lldb::ChildCacheState::eReuse
+             : lldb::ChildCacheState::eRefetch;
 }
 
 bool lldb_private::formatters::NSCFSetSyntheticFrontEnd::MightHaveChildren() {
@@ -701,9 +705,8 @@ lldb_private::formatters::
 }
 
 template <typename D32, typename D64>
-bool
-lldb_private::formatters::
-  GenericNSSetMSyntheticFrontEnd<D32, D64>::Update() {
+lldb::ChildCacheState
+lldb_private::formatters::GenericNSSetMSyntheticFrontEnd<D32, D64>::Update() {
   m_children.clear();
   ValueObjectSP valobj_sp = m_backend.GetSP();
   m_ptr_size = 0;
@@ -712,13 +715,13 @@ lldb_private::formatters::
   delete m_data_64;
   m_data_64 = nullptr;
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   if (!valobj_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_exe_ctx_ref = valobj_sp->GetExecutionContextRef();
   lldb::ProcessSP process_sp(valobj_sp->GetProcessSP());
   if (!process_sp)
-    return false;
+    return lldb::ChildCacheState::eRefetch;
   m_ptr_size = process_sp->GetAddressByteSize();
   uint64_t data_location = valobj_sp->GetValueAsUnsigned(0) + m_ptr_size;
   Status error;
@@ -731,7 +734,8 @@ lldb_private::formatters::
     process_sp->ReadMemory(data_location, m_data_64, sizeof(D64),
                            error);
   }
-  return error.Success();
+  return error.Success() ? lldb::ChildCacheState::eReuse
+                         : lldb::ChildCacheState::eRefetch;
 }
 
 template <typename D32, typename D64>
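
Reviewer note (illustrative, not part of the patch): the hunks above replace each bool return in Update() with a lldb::ChildCacheState value, mapping false to eRefetch and true to eReuse. A minimal stand-alone sketch of that mapping follows. The enum is a local stand-in that mirrors only the two enumerator names visible in the diff, and the helper ToChildCacheState and the FakeFrontEnd type are hypothetical names that do not appear in the patch.

```cpp
#include <cassert>

// Local stand-in for lldb::ChildCacheState. Only the enumerator names come
// from the diff; their underlying values here are an assumption.
enum class ChildCacheState { eRefetch, eReuse };

// Hypothetical helper: converts the old bool convention to the new enum,
// following the mapping used throughout the diff (false -> eRefetch,
// true -> eReuse).
constexpr ChildCacheState ToChildCacheState(bool old_result) {
  return old_result ? ChildCacheState::eReuse : ChildCacheState::eRefetch;
}

// Hypothetical front end whose Update() previously ended in
// `return error.Success();`. read_ok plays the role of that status.
struct FakeFrontEnd {
  bool read_ok;
  ChildCacheState Update() const { return ToChildCacheState(read_ok); }
};
```

The patch itself spells out the ternary at each call site rather than using a helper, so the sketch only shows the convention, not the code that was committed.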

>From 80de4289d3794054ad074d9db604b9eee7426faa Mon Sep 17 00:00:00 2001
From: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: Thu, 8 Feb 2024 11:43:29 +0000
Subject: [PATCH 18/72] [DAG] tryToFoldExtendOfConstant - share the same SDLoc
 argument instead of recreating it over and over again.

---
 llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 61 +++++++++----------
 1 file changed, 29 insertions(+), 32 deletions(-)
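
Reviewer note (illustrative, not part of the patch): the commit message describes passing one location object into the helper instead of rebuilding it in the helper and at each use. The stand-alone sketch below shows that pattern with hypothetical stand-ins: Loc, g_loc_constructions, foldWithLoc and visit are invented for illustration and are not LLVM APIs.

```cpp
#include <cassert>

// Counts how many location objects get built, so sharing is observable.
int g_loc_constructions = 0;

// Hypothetical stand-in for a source-location handle such as SDLoc.
struct Loc {
  int line;
  explicit Loc(int l) : line(l) { ++g_loc_constructions; }
};

// The helper takes the caller's location by const reference instead of
// constructing its own copy.
int foldWithLoc(int value, const Loc &DL) { return value + DL.line; }

// The caller builds one Loc up front and reuses it for every fold.
int visit(int node_line) {
  Loc DL(node_line);
  int a = foldWithLoc(1, DL);
  int b = foldWithLoc(2, DL);
  return a + b;
}
```

In this sketch a single call to visit constructs exactly one Loc regardless of how many folds it attempts, which is the property the commit message is after.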

diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 4adea020011b82..d3cd9b1671e1b9 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -12739,12 +12739,12 @@ static SDValue tryToFoldExtendSelectLoad(SDNode *N, const TargetLowering &TLI,
 /// dag nodes (see for example method DAGCombiner::visitSIGN_EXTEND).
 /// Vector extends are not folded if operations are legal; this is to
 /// avoid introducing illegal build_vector dag nodes.
-static SDValue tryToFoldExtendOfConstant(SDNode *N, const TargetLowering &TLI,
+static SDValue tryToFoldExtendOfConstant(SDNode *N, const SDLoc &DL,
+                                         const TargetLowering &TLI,
                                          SelectionDAG &DAG, bool LegalTypes) {
   unsigned Opcode = N->getOpcode();
   SDValue N0 = N->getOperand(0);
   EVT VT = N->getValueType(0);
-  SDLoc DL(N);
 
   assert((ISD::isExtOpcode(Opcode) || ISD::isExtVecInRegOpcode(Opcode)) &&
          "Expected EXTEND dag node in input!");
@@ -13400,7 +13400,7 @@ SDValue DAGCombiner::visitSIGN_EXTEND(SDNode *N) {
   if (N0.isUndef())
     return DAG.getConstant(0, DL, VT);
 
-  if (SDValue Res = tryToFoldExtendOfConstant(N, TLI, DAG, LegalTypes))
+  if (SDValue Res = tryToFoldExtendOfConstant(N, DL, TLI, DAG, LegalTypes))
     return Res;
 
   // fold (sext (sext x)) -> (sext x)
@@ -13669,7 +13669,7 @@ SDValue DAGCombiner::visitZERO_EXTEND(SDNode *N) {
   if (N0.isUndef())
     return DAG.getConstant(0, DL, VT);
 
-  if (SDValue Res = tryToFoldExtendOfConstant(N, TLI, DAG, LegalTypes))
+  if (SDValue Res = tryToFoldExtendOfConstant(N, DL, TLI, DAG, LegalTypes))
     return Res;
 
   // fold (zext (zext x)) -> (zext x)
@@ -13937,12 +13937,13 @@ SDValue DAGCombiner::visitZERO_EXTEND(SDNode *N) {
 SDValue DAGCombiner::visitANY_EXTEND(SDNode *N) {
   SDValue N0 = N->getOperand(0);
   EVT VT = N->getValueType(0);
+  SDLoc DL(N);
 
   // aext(undef) = undef
   if (N0.isUndef())
     return DAG.getUNDEF(VT);
 
-  if (SDValue Res = tryToFoldExtendOfConstant(N, TLI, DAG, LegalTypes))
+  if (SDValue Res = tryToFoldExtendOfConstant(N, DL, TLI, DAG, LegalTypes))
     return Res;
 
   // fold (aext (aext x)) -> (aext x)
@@ -13951,7 +13952,7 @@ SDValue DAGCombiner::visitANY_EXTEND(SDNode *N) {
   if (N0.getOpcode() == ISD::ANY_EXTEND  ||
       N0.getOpcode() == ISD::ZERO_EXTEND ||
       N0.getOpcode() == ISD::SIGN_EXTEND)
-    return DAG.getNode(N0.getOpcode(), SDLoc(N), VT, N0.getOperand(0));
+    return DAG.getNode(N0.getOpcode(), DL, VT, N0.getOperand(0));
 
   // fold (aext (aext_extend_vector_inreg x)) -> (aext_extend_vector_inreg x)
   // fold (aext (zext_extend_vector_inreg x)) -> (zext_extend_vector_inreg x)
@@ -13959,7 +13960,7 @@ SDValue DAGCombiner::visitANY_EXTEND(SDNode *N) {
   if (N0.getOpcode() == ISD::ANY_EXTEND_VECTOR_INREG ||
       N0.getOpcode() == ISD::ZERO_EXTEND_VECTOR_INREG ||
       N0.getOpcode() == ISD::SIGN_EXTEND_VECTOR_INREG)
-    return DAG.getNode(N0.getOpcode(), SDLoc(N), VT, N0.getOperand(0));
+    return DAG.getNode(N0.getOpcode(), DL, VT, N0.getOperand(0));
 
   // fold (aext (truncate (load x))) -> (aext (smaller load x))
   // fold (aext (truncate (srl (load x), c))) -> (aext (small load (x+c/n)))
@@ -13977,7 +13978,7 @@ SDValue DAGCombiner::visitANY_EXTEND(SDNode *N) {
 
   // fold (aext (truncate x))
   if (N0.getOpcode() == ISD::TRUNCATE)
-    return DAG.getAnyExtOrTrunc(N0.getOperand(0), SDLoc(N), VT);
+    return DAG.getAnyExtOrTrunc(N0.getOperand(0), DL, VT);
 
   // Fold (aext (and (trunc x), cst)) -> (and x, cst)
   // if the trunc is not free.
@@ -13985,7 +13986,6 @@ SDValue DAGCombiner::visitANY_EXTEND(SDNode *N) {
       N0.getOperand(0).getOpcode() == ISD::TRUNCATE &&
       N0.getOperand(1).getOpcode() == ISD::Constant &&
       !TLI.isTruncateFree(N0.getOperand(0).getOperand(0), N0.getValueType())) {
-    SDLoc DL(N);
     SDValue X = DAG.getAnyExtOrTrunc(N0.getOperand(0).getOperand(0), DL, VT);
     SDValue Y = DAG.getNode(ISD::ANY_EXTEND, DL, VT, N0.getOperand(1));
     assert(isa<ConstantSDNode>(Y) && "Expected constant to be folded!");
@@ -14011,9 +14011,9 @@ SDValue DAGCombiner::visitANY_EXTEND(SDNode *N) {
           ExtendUsesToFormExtLoad(VT, N, N0, ISD::ANY_EXTEND, SetCCs, TLI);
     if (DoXform) {
       LoadSDNode *LN0 = cast<LoadSDNode>(N0);
-      SDValue ExtLoad = DAG.getExtLoad(ISD::EXTLOAD, SDLoc(N), VT,
-                                       LN0->getChain(), LN0->getBasePtr(),
-                                       N0.getValueType(), LN0->getMemOperand());
+      SDValue ExtLoad = DAG.getExtLoad(ISD::EXTLOAD, DL, VT, LN0->getChain(),
+                                       LN0->getBasePtr(), N0.getValueType(),
+                                       LN0->getMemOperand());
       ExtendSetCCUses(SetCCs, N0, ExtLoad, ISD::ANY_EXTEND);
       // If the load value is used only by N, replace it via CombineTo N.
       bool NoReplaceTrunc = N0.hasOneUse();
@@ -14039,9 +14039,9 @@ SDValue DAGCombiner::visitANY_EXTEND(SDNode *N) {
     ISD::LoadExtType ExtType = LN0->getExtensionType();
     EVT MemVT = LN0->getMemoryVT();
     if (!LegalOperations || TLI.isLoadExtLegal(ExtType, VT, MemVT)) {
-      SDValue ExtLoad = DAG.getExtLoad(ExtType, SDLoc(N),
-                                       VT, LN0->getChain(), LN0->getBasePtr(),
-                                       MemVT, LN0->getMemOperand());
+      SDValue ExtLoad =
+          DAG.getExtLoad(ExtType, DL, VT, LN0->getChain(), LN0->getBasePtr(),
+                         MemVT, LN0->getMemOperand());
       CombineTo(N, ExtLoad);
       DAG.ReplaceAllUsesOfValueWith(SDValue(LN0, 1), ExtLoad.getValue(1));
       recursivelyDeleteUnusedNodes(LN0);
@@ -14069,23 +14069,20 @@ SDValue DAGCombiner::visitANY_EXTEND(SDNode *N) {
       // we know that the element size of the sext'd result matches the
       // element size of the compare operands.
       if (VT.getSizeInBits() == N00VT.getSizeInBits())
-        return DAG.getSetCC(SDLoc(N), VT, N0.getOperand(0),
-                             N0.getOperand(1),
-                             cast<CondCodeSDNode>(N0.getOperand(2))->get());
+        return DAG.getSetCC(DL, VT, N0.getOperand(0), N0.getOperand(1),
+                            cast<CondCodeSDNode>(N0.getOperand(2))->get());
 
       // If the desired elements are smaller or larger than the source
       // elements we can use a matching integer vector type and then
       // truncate/any extend
       EVT MatchingVectorType = N00VT.changeVectorElementTypeToInteger();
-      SDValue VsetCC =
-        DAG.getSetCC(SDLoc(N), MatchingVectorType, N0.getOperand(0),
-                      N0.getOperand(1),
-                      cast<CondCodeSDNode>(N0.getOperand(2))->get());
-      return DAG.getAnyExtOrTrunc(VsetCC, SDLoc(N), VT);
+      SDValue VsetCC = DAG.getSetCC(
+          DL, MatchingVectorType, N0.getOperand(0), N0.getOperand(1),
+          cast<CondCodeSDNode>(N0.getOperand(2))->get());
+      return DAG.getAnyExtOrTrunc(VsetCC, DL, VT);
     }
 
     // aext(setcc x,y,cc) -> select_cc x, y, 1, 0, cc
-    SDLoc DL(N);
     if (SDValue SCC = SimplifySelectCC(
             DL, N0.getOperand(0), N0.getOperand(1), DAG.getConstant(1, DL, VT),
             DAG.getConstant(0, DL, VT),
@@ -14637,10 +14634,9 @@ SDValue DAGCombiner::visitSIGN_EXTEND_INREG(SDNode *N) {
   return SDValue();
 }
 
-static SDValue
-foldExtendVectorInregToExtendOfSubvector(SDNode *N, const TargetLowering &TLI,
-                                         SelectionDAG &DAG,
-                                         bool LegalOperations) {
+static SDValue foldExtendVectorInregToExtendOfSubvector(
+    SDNode *N, const SDLoc &DL, const TargetLowering &TLI, SelectionDAG &DAG,
+    bool LegalOperations) {
   unsigned InregOpcode = N->getOpcode();
   unsigned Opcode = DAG.getOpcode_EXTEND(InregOpcode);
 
@@ -14667,28 +14663,29 @@ foldExtendVectorInregToExtendOfSubvector(SDNode *N, const TargetLowering &TLI,
   if (LegalOperations && !TLI.isOperationLegal(Opcode, VT))
     return SDValue();
 
-  return DAG.getNode(Opcode, SDLoc(N), VT, Src);
+  return DAG.getNode(Opcode, DL, VT, Src);
 }
 
 SDValue DAGCombiner::visitEXTEND_VECTOR_INREG(SDNode *N) {
   SDValue N0 = N->getOperand(0);
   EVT VT = N->getValueType(0);
+  SDLoc DL(N);
 
   if (N0.isUndef()) {
     // aext_vector_inreg(undef) = undef because the top bits are undefined.
     // {s/z}ext_vector_inreg(undef) = 0 because the top bits must be the same.
     return N->getOpcode() == ISD::ANY_EXTEND_VECTOR_INREG
                ? DAG.getUNDEF(VT)
-               : DAG.getConstant(0, SDLoc(N), VT);
+               : DAG.getConstant(0, DL, VT);
   }
 
-  if (SDValue Res = tryToFoldExtendOfConstant(N, TLI, DAG, LegalTypes))
+  if (SDValue Res = tryToFoldExtendOfConstant(N, DL, TLI, DAG, LegalTypes))
     return Res;
 
   if (SimplifyDemandedVectorElts(SDValue(N, 0)))
     return SDValue(N, 0);
 
-  if (SDValue R = foldExtendVectorInregToExtendOfSubvector(N, TLI, DAG,
+  if (SDValue R = foldExtendVectorInregToExtendOfSubvector(N, DL, TLI, DAG,
                                                            LegalOperations))
     return R;
 

>From 61131b9d94c6c19a1c26cfa31bb17a34602bafa0 Mon Sep 17 00:00:00 2001
From: Jeremy Morse <jeremy.morse at sony.com>
Date: Thu, 8 Feb 2024 11:49:04 +0000
Subject: [PATCH 19/72] [DebugInfo][RemoveDIs] Final omnibus test fixing for
 RemoveDIs (#81125)

With this, I get a clean test suite running under RemoveDIs, the
non-intrinsic representation of debug-info, including under asan. We've
previously established that we generate identical binaries for some
large projects, so this is just edge-case cleanup. The changes:
* CodeGenPrepare fixups need to apply to dbg.assigns as well as
dbg.values (a dbg.assign is a dbg.value).
* Pin a test for constant-deletion to intrinsic debug-info: this very
rare scenario uses a different kill-location sigil in dbg.value mode to
RemoveDIs mode, which generates spurious test differences.
* Suppress a memory leak in a unit test: the code for dealing with
trailing debug-info in a block is necessarily fiddly, leading to this
leak when testing it. Developer-facing interfaces for moving
instructions around always deal with this behind the scenes.
* SROA, when replacing some vector-loads, needs to insert the
replacement loads ahead of any debug-info records so that their values
remain dominated by a definition. Set the head-bit indicating our
insertion should come before debug-info.
---
 llvm/lib/CodeGen/CodeGenPrepare.cpp                        | 3 ++-
 llvm/lib/Transforms/Scalar/SROA.cpp                        | 7 ++++++-
 .../assignment-tracking/codegenprepare/sunk-addr.ll        | 5 +++++
 .../Transforms/GlobalOpt/localize-constexpr-debuginfo.ll   | 7 ++++++-
 llvm/test/Transforms/SROA/vector-promotion.ll              | 4 ++++
 llvm/unittests/IR/BasicBlockDbgInfoTest.cpp                | 1 +
 6 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index 5383b15c1c7f5d..09c4922d8822cc 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -8455,7 +8455,8 @@ bool CodeGenPrepare::fixupDPValuesOnInst(Instruction &I) {
 // FIXME: should updating debug-info really cause the "changed" flag to fire,
 // which can cause a function to be reprocessed?
 bool CodeGenPrepare::fixupDPValue(DPValue &DPV) {
-  if (DPV.Type != DPValue::LocationType::Value)
+  if (DPV.Type != DPValue::LocationType::Value &&
+      DPV.Type != DPValue::LocationType::Assign)
     return false;
 
   // Does this DPValue refer to a sunk address calculation?
diff --git a/llvm/lib/Transforms/Scalar/SROA.cpp b/llvm/lib/Transforms/Scalar/SROA.cpp
index bdbaf4f55c96d0..e92e2459ead551 100644
--- a/llvm/lib/Transforms/Scalar/SROA.cpp
+++ b/llvm/lib/Transforms/Scalar/SROA.cpp
@@ -2956,7 +2956,12 @@ class AllocaSliceRewriter : public InstVisitor<AllocaSliceRewriter, bool> {
       assert(DL.typeSizeEqualsStoreSize(LI.getType()) &&
              "Non-byte-multiple bit width");
       // Move the insertion point just past the load so that we can refer to it.
-      IRB.SetInsertPoint(&*std::next(BasicBlock::iterator(&LI)));
+      BasicBlock::iterator LIIt = std::next(LI.getIterator());
+      // Ensure the insertion point comes before any debug-info immediately
+      // after the load, so that variable values referring to the load are
+      // dominated by it.
+      LIIt.setHeadBit(true);
+      IRB.SetInsertPoint(LI.getParent(), LIIt);
       // Create a placeholder value with the same type as LI to use as the
       // basis for the new value. This allows us to replace the uses of LI with
       // the computed value, and then replace the placeholder with LI, leaving
diff --git a/llvm/test/DebugInfo/Generic/assignment-tracking/codegenprepare/sunk-addr.ll b/llvm/test/DebugInfo/Generic/assignment-tracking/codegenprepare/sunk-addr.ll
index 70548465828079..8b226aa6633060 100644
--- a/llvm/test/DebugInfo/Generic/assignment-tracking/codegenprepare/sunk-addr.ll
+++ b/llvm/test/DebugInfo/Generic/assignment-tracking/codegenprepare/sunk-addr.ll
@@ -3,6 +3,11 @@
 ; RUN:   -mtriple=x86_64-unknown-unknown %s -o - \
 ; RUN: | FileCheck %s --implicit-check-not="call void @llvm.dbg."
 
+;; Test with RemoveDIs non-intrinsic debug-info too.
+; RUN: llc -start-before=codegenprepare -stop-after=codegenprepare \
+; RUN:   -mtriple=x86_64-unknown-unknown %s -o - --try-experimental-debuginfo-iterators \
+; RUN: | FileCheck %s --implicit-check-not="call void @llvm.dbg."
+
 ;; Check that when CodeGenPrepare moves an address computation to a block it's
 ;; used in its dbg.assign uses are updated.
 ;;
diff --git a/llvm/test/Transforms/GlobalOpt/localize-constexpr-debuginfo.ll b/llvm/test/Transforms/GlobalOpt/localize-constexpr-debuginfo.ll
index 18dc038fce66af..5d6cc7db5a41f3 100644
--- a/llvm/test/Transforms/GlobalOpt/localize-constexpr-debuginfo.ll
+++ b/llvm/test/Transforms/GlobalOpt/localize-constexpr-debuginfo.ll
@@ -1,4 +1,9 @@
-; RUN: opt -S < %s -passes=globalopt | FileCheck %s
+; RUN: opt -S < %s -passes=globalopt --experimental-debuginfo-iterators=false | FileCheck %s
+;; FIXME: this test is pinned to not use RemoveDIs non-intrinsic debug-info.
+;; Constant-deletion takes a slightly different path and (correctly) replaces
+;; the operand of the debug-info record with poison instead of a null pointer.
+;; This is a spurious test difference that we'll suppress for turning RemoveDIs
+;; on.
 
 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"
diff --git a/llvm/test/Transforms/SROA/vector-promotion.ll b/llvm/test/Transforms/SROA/vector-promotion.ll
index e2aa1e2ee1c708..e48dd5bb392082 100644
--- a/llvm/test/Transforms/SROA/vector-promotion.ll
+++ b/llvm/test/Transforms/SROA/vector-promotion.ll
@@ -2,6 +2,10 @@
 ; RUN: opt < %s -passes='sroa<preserve-cfg>' -S | FileCheck %s --check-prefixes=CHECK,CHECK-PRESERVE-CFG
 ; RUN: opt < %s -passes='sroa<modify-cfg>' -S | FileCheck %s --check-prefixes=CHECK,CHECK-MODIFY-CFG
 ; RUN: opt < %s -passes=debugify,sroa -S | FileCheck %s --check-prefix=DEBUG
+;;  Ensure that these work with non-intrinsic variable locations.
+; RUN: opt < %s -passes='sroa<preserve-cfg>' -S --try-experimental-debuginfo-iterators | FileCheck %s --check-prefixes=CHECK,CHECK-PRESERVE-CFG
+; RUN: opt < %s -passes='sroa<modify-cfg>' -S --try-experimental-debuginfo-iterators | FileCheck %s --check-prefixes=CHECK,CHECK-MODIFY-CFG
+; RUN: opt < %s -passes=debugify,sroa -S --try-experimental-debuginfo-iterators | FileCheck %s --check-prefix=DEBUG
 target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n8:16:32:64"
 
 %S1 = type { i64, [42 x float] }
diff --git a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
index 827b4a9c0cc323..ef2b288d859a7a 100644
--- a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
+++ b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
@@ -1476,6 +1476,7 @@ TEST(BasicBlockDbgInfoTest, DbgSpliceToEmpty2) {
   // ... except for some dangling DPValues.
   EXPECT_NE(Exit.getTrailingDPValues(), nullptr);
   EXPECT_FALSE(Exit.getTrailingDPValues()->empty());
+  Exit.getTrailingDPValues()->eraseFromParent();
   Exit.deleteTrailingDPValues();
 
   UseNewDbgInfoFormat = false;

>From 64ef5c8b0f429c44603c104d74ebf58cff942d73 Mon Sep 17 00:00:00 2001
From: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: Thu, 8 Feb 2024 11:49:17 +0000
Subject: [PATCH 20/72] [X86] LowerBUILD_VECTOR - share the same SDLoc argument
 instead of recreating it over and over again.

---
 llvm/lib/Target/X86/X86ISelLowering.cpp | 87 ++++++++++++-------------
 1 file changed, 42 insertions(+), 45 deletions(-)

diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index b5b76c66c2e49e..f310010ee87ed8 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -7135,6 +7135,7 @@ static bool isFoldableUseOfShuffle(SDNode *N) {
 /// The VBROADCAST node is returned when a pattern is found,
 /// or SDValue() otherwise.
 static SDValue lowerBuildVectorAsBroadcast(BuildVectorSDNode *BVOp,
+                                           const SDLoc &dl,
                                            const X86Subtarget &Subtarget,
                                            SelectionDAG &DAG) {
   // VBROADCAST requires AVX.
@@ -7145,8 +7146,6 @@ static SDValue lowerBuildVectorAsBroadcast(BuildVectorSDNode *BVOp,
 
   MVT VT = BVOp->getSimpleValueType(0);
   unsigned NumElts = VT.getVectorNumElements();
-  SDLoc dl(BVOp);
-
   assert((VT.is128BitVector() || VT.is256BitVector() || VT.is512BitVector()) &&
          "Unsupported vector type for broadcast.");
 
@@ -7492,14 +7491,13 @@ static SDValue LowerBUILD_VECTORvXbf16(SDValue Op, SelectionDAG &DAG,
 }
 
 // Lower BUILD_VECTOR operation for v8i1 and v16i1 types.
-static SDValue LowerBUILD_VECTORvXi1(SDValue Op, SelectionDAG &DAG,
+static SDValue LowerBUILD_VECTORvXi1(SDValue Op, const SDLoc &dl,
+                                     SelectionDAG &DAG,
                                      const X86Subtarget &Subtarget) {
 
   MVT VT = Op.getSimpleValueType();
   assert((VT.getVectorElementType() == MVT::i1) &&
          "Unexpected type in LowerBUILD_VECTORvXi1!");
-
-  SDLoc dl(Op);
   if (ISD::isBuildVectorAllZeros(Op.getNode()) ||
       ISD::isBuildVectorAllOnes(Op.getNode()))
     return Op;
@@ -7618,7 +7616,7 @@ LLVM_ATTRIBUTE_UNUSED static bool isHorizOp(unsigned Opcode) {
 /// See the corrected implementation in isHopBuildVector(). Can we reduce this
 /// code because it is only used for partial h-op matching now?
 static bool isHorizontalBinOpPart(const BuildVectorSDNode *N, unsigned Opcode,
-                                  SelectionDAG &DAG,
+                                  const SDLoc &DL, SelectionDAG &DAG,
                                   unsigned BaseIdx, unsigned LastIdx,
                                   SDValue &V0, SDValue &V1) {
   EVT VT = N->getValueType(0);
@@ -7928,6 +7926,7 @@ static bool isFMAddSubOrFMSubAdd(const X86Subtarget &Subtarget,
 /// 'fsubadd' operation accordingly to X86ISD::ADDSUB or X86ISD::FMADDSUB or
 /// X86ISD::FMSUBADD node.
 static SDValue lowerToAddSubOrFMAddSub(const BuildVectorSDNode *BV,
+                                       const SDLoc &DL,
                                        const X86Subtarget &Subtarget,
                                        SelectionDAG &DAG) {
   SDValue Opnd0, Opnd1;
@@ -7938,7 +7937,6 @@ static SDValue lowerToAddSubOrFMAddSub(const BuildVectorSDNode *BV,
     return SDValue();
 
   MVT VT = BV->getSimpleValueType(0);
-  SDLoc DL(BV);
 
   // Try to generate X86ISD::FMADDSUB node here.
   SDValue Opnd2;
@@ -8057,22 +8055,22 @@ static bool isHopBuildVector(const BuildVectorSDNode *BV, SelectionDAG &DAG,
 }
 
 static SDValue getHopForBuildVector(const BuildVectorSDNode *BV,
-                                    SelectionDAG &DAG, unsigned HOpcode,
-                                    SDValue V0, SDValue V1) {
+                                    const SDLoc &DL, SelectionDAG &DAG,
+                                    unsigned HOpcode, SDValue V0, SDValue V1) {
   // If either input vector is not the same size as the build vector,
   // extract/insert the low bits to the correct size.
   // This is free (examples: zmm --> xmm, xmm --> ymm).
   MVT VT = BV->getSimpleValueType(0);
   unsigned Width = VT.getSizeInBits();
   if (V0.getValueSizeInBits() > Width)
-    V0 = extractSubVector(V0, 0, DAG, SDLoc(BV), Width);
+    V0 = extractSubVector(V0, 0, DAG, DL, Width);
   else if (V0.getValueSizeInBits() < Width)
-    V0 = insertSubVector(DAG.getUNDEF(VT), V0, 0, DAG, SDLoc(BV), Width);
+    V0 = insertSubVector(DAG.getUNDEF(VT), V0, 0, DAG, DL, Width);
 
   if (V1.getValueSizeInBits() > Width)
-    V1 = extractSubVector(V1, 0, DAG, SDLoc(BV), Width);
+    V1 = extractSubVector(V1, 0, DAG, DL, Width);
   else if (V1.getValueSizeInBits() < Width)
-    V1 = insertSubVector(DAG.getUNDEF(VT), V1, 0, DAG, SDLoc(BV), Width);
+    V1 = insertSubVector(DAG.getUNDEF(VT), V1, 0, DAG, DL, Width);
 
   unsigned NumElts = VT.getVectorNumElements();
   APInt DemandedElts = APInt::getAllOnes(NumElts);
@@ -8084,17 +8082,17 @@ static SDValue getHopForBuildVector(const BuildVectorSDNode *BV,
   unsigned HalfNumElts = NumElts / 2;
   if (VT.is256BitVector() && DemandedElts.lshr(HalfNumElts) == 0) {
     MVT HalfVT = VT.getHalfNumVectorElementsVT();
-    V0 = extractSubVector(V0, 0, DAG, SDLoc(BV), 128);
-    V1 = extractSubVector(V1, 0, DAG, SDLoc(BV), 128);
-    SDValue Half = DAG.getNode(HOpcode, SDLoc(BV), HalfVT, V0, V1);
-    return insertSubVector(DAG.getUNDEF(VT), Half, 0, DAG, SDLoc(BV), 256);
+    V0 = extractSubVector(V0, 0, DAG, DL, 128);
+    V1 = extractSubVector(V1, 0, DAG, DL, 128);
+    SDValue Half = DAG.getNode(HOpcode, DL, HalfVT, V0, V1);
+    return insertSubVector(DAG.getUNDEF(VT), Half, 0, DAG, DL, 256);
   }
 
-  return DAG.getNode(HOpcode, SDLoc(BV), VT, V0, V1);
+  return DAG.getNode(HOpcode, DL, VT, V0, V1);
 }
 
 /// Lower BUILD_VECTOR to a horizontal add/sub operation if possible.
-static SDValue LowerToHorizontalOp(const BuildVectorSDNode *BV,
+static SDValue LowerToHorizontalOp(const BuildVectorSDNode *BV, const SDLoc &DL,
                                    const X86Subtarget &Subtarget,
                                    SelectionDAG &DAG) {
   // We need at least 2 non-undef elements to make this worthwhile by default.
@@ -8114,7 +8112,7 @@ static SDValue LowerToHorizontalOp(const BuildVectorSDNode *BV,
     unsigned HOpcode;
     SDValue V0, V1;
     if (isHopBuildVector(BV, DAG, HOpcode, V0, V1))
-      return getHopForBuildVector(BV, DAG, HOpcode, V0, V1);
+      return getHopForBuildVector(BV, DL, DAG, HOpcode, V0, V1);
   }
 
   // Try harder to match 256-bit ops by using extract/concat.
@@ -8134,22 +8132,21 @@ static SDValue LowerToHorizontalOp(const BuildVectorSDNode *BV,
     if (BV->getOperand(i)->isUndef())
       NumUndefsHI++;
 
-  SDLoc DL(BV);
   SDValue InVec0, InVec1;
   if (VT == MVT::v8i32 || VT == MVT::v16i16) {
     SDValue InVec2, InVec3;
     unsigned X86Opcode;
     bool CanFold = true;
 
-    if (isHorizontalBinOpPart(BV, ISD::ADD, DAG, 0, Half, InVec0, InVec1) &&
-        isHorizontalBinOpPart(BV, ISD::ADD, DAG, Half, NumElts, InVec2,
+    if (isHorizontalBinOpPart(BV, ISD::ADD, DL, DAG, 0, Half, InVec0, InVec1) &&
+        isHorizontalBinOpPart(BV, ISD::ADD, DL, DAG, Half, NumElts, InVec2,
                               InVec3) &&
         ((InVec0.isUndef() || InVec2.isUndef()) || InVec0 == InVec2) &&
         ((InVec1.isUndef() || InVec3.isUndef()) || InVec1 == InVec3))
       X86Opcode = X86ISD::HADD;
-    else if (isHorizontalBinOpPart(BV, ISD::SUB, DAG, 0, Half, InVec0,
+    else if (isHorizontalBinOpPart(BV, ISD::SUB, DL, DAG, 0, Half, InVec0,
                                    InVec1) &&
-             isHorizontalBinOpPart(BV, ISD::SUB, DAG, Half, NumElts, InVec2,
+             isHorizontalBinOpPart(BV, ISD::SUB, DL, DAG, Half, NumElts, InVec2,
                                    InVec3) &&
              ((InVec0.isUndef() || InVec2.isUndef()) || InVec0 == InVec2) &&
              ((InVec1.isUndef() || InVec3.isUndef()) || InVec1 == InVec3))
@@ -8179,15 +8176,16 @@ static SDValue LowerToHorizontalOp(const BuildVectorSDNode *BV,
   if (VT == MVT::v8f32 || VT == MVT::v4f64 || VT == MVT::v8i32 ||
       VT == MVT::v16i16) {
     unsigned X86Opcode;
-    if (isHorizontalBinOpPart(BV, ISD::ADD, DAG, 0, NumElts, InVec0, InVec1))
+    if (isHorizontalBinOpPart(BV, ISD::ADD, DL, DAG, 0, NumElts, InVec0,
+                              InVec1))
       X86Opcode = X86ISD::HADD;
-    else if (isHorizontalBinOpPart(BV, ISD::SUB, DAG, 0, NumElts, InVec0,
+    else if (isHorizontalBinOpPart(BV, ISD::SUB, DL, DAG, 0, NumElts, InVec0,
                                    InVec1))
       X86Opcode = X86ISD::HSUB;
-    else if (isHorizontalBinOpPart(BV, ISD::FADD, DAG, 0, NumElts, InVec0,
+    else if (isHorizontalBinOpPart(BV, ISD::FADD, DL, DAG, 0, NumElts, InVec0,
                                    InVec1))
       X86Opcode = X86ISD::FHADD;
-    else if (isHorizontalBinOpPart(BV, ISD::FSUB, DAG, 0, NumElts, InVec0,
+    else if (isHorizontalBinOpPart(BV, ISD::FSUB, DL, DAG, 0, NumElts, InVec0,
                                    InVec1))
       X86Opcode = X86ISD::FHSUB;
     else
@@ -8218,10 +8216,9 @@ static SDValue LowerShift(SDValue Op, const X86Subtarget &Subtarget,
 /// NOTE: Its not in our interest to start make a general purpose vectorizer
 /// from this, but enough scalar bit operations are created from the later
 /// legalization + scalarization stages to need basic support.
-static SDValue lowerBuildVectorToBitOp(BuildVectorSDNode *Op,
+static SDValue lowerBuildVectorToBitOp(BuildVectorSDNode *Op, const SDLoc &DL,
                                        const X86Subtarget &Subtarget,
                                        SelectionDAG &DAG) {
-  SDLoc DL(Op);
   MVT VT = Op->getSimpleValueType(0);
   unsigned NumElems = VT.getVectorNumElements();
   const TargetLowering &TLI = DAG.getTargetLoweringInfo();
@@ -8296,9 +8293,9 @@ static SDValue lowerBuildVectorToBitOp(BuildVectorSDNode *Op,
 /// Create a vector constant without a load. SSE/AVX provide the bare minimum
 /// functionality to do this, so it's all zeros, all ones, or some derivation
 /// that is cheap to calculate.
-static SDValue materializeVectorConstant(SDValue Op, SelectionDAG &DAG,
+static SDValue materializeVectorConstant(SDValue Op, const SDLoc &DL,
+                                         SelectionDAG &DAG,
                                          const X86Subtarget &Subtarget) {
-  SDLoc DL(Op);
   MVT VT = Op.getSimpleValueType();
 
   // Vectors containing all zeros can be matched by pxor and xorps.
@@ -8322,7 +8319,7 @@ static SDValue materializeVectorConstant(SDValue Op, SelectionDAG &DAG,
 /// from a vector of source values and a vector of extraction indices.
 /// The vectors might be manipulated to match the type of the permute op.
 static SDValue createVariablePermute(MVT VT, SDValue SrcVec, SDValue IndicesVec,
-                                     SDLoc &DL, SelectionDAG &DAG,
+                                     const SDLoc &DL, SelectionDAG &DAG,
                                      const X86Subtarget &Subtarget) {
   MVT ShuffleVT = VT;
   EVT IndicesVT = EVT(VT).changeVectorElementTypeToInteger();
@@ -8590,7 +8587,8 @@ static SDValue createVariablePermute(MVT VT, SDValue SrcVec, SDValue IndicesVec,
 // TODO: Utilize pshufb and zero mask blending to support more efficient
 // construction of vectors with constant-0 elements.
 static SDValue
-LowerBUILD_VECTORAsVariablePermute(SDValue V, SelectionDAG &DAG,
+LowerBUILD_VECTORAsVariablePermute(SDValue V, const SDLoc &DL,
+                                   SelectionDAG &DAG,
                                    const X86Subtarget &Subtarget) {
   SDValue SrcVec, IndicesVec;
   // Check for a match of the permute source vector and permute index elements.
@@ -8629,7 +8627,6 @@ LowerBUILD_VECTORAsVariablePermute(SDValue V, SelectionDAG &DAG,
       return SDValue();
   }
 
-  SDLoc DL(V);
   MVT VT = V.getSimpleValueType();
   return createVariablePermute(VT, SrcVec, IndicesVec, DL, DAG, Subtarget);
 }
@@ -8645,14 +8642,14 @@ X86TargetLowering::LowerBUILD_VECTOR(SDValue Op, SelectionDAG &DAG) const {
 
   // Generate vectors for predicate vectors.
   if (VT.getVectorElementType() == MVT::i1 && Subtarget.hasAVX512())
-    return LowerBUILD_VECTORvXi1(Op, DAG, Subtarget);
+    return LowerBUILD_VECTORvXi1(Op, dl, DAG, Subtarget);
 
   if (VT.getVectorElementType() == MVT::bf16 &&
       (Subtarget.hasAVXNECONVERT() || Subtarget.hasBF16()))
     return LowerBUILD_VECTORvXbf16(Op, DAG, Subtarget);
 
-  if (SDValue VectorConstant = materializeVectorConstant(Op, DAG, Subtarget))
-    return VectorConstant;
+  if (SDValue VectorCst = materializeVectorConstant(Op, dl, DAG, Subtarget))
+    return VectorCst;
 
   unsigned EVTBits = EltVT.getSizeInBits();
   APInt UndefMask = APInt::getZero(NumElems);
@@ -8747,13 +8744,13 @@ X86TargetLowering::LowerBUILD_VECTOR(SDValue Op, SelectionDAG &DAG) const {
     }
   }
 
-  if (SDValue AddSub = lowerToAddSubOrFMAddSub(BV, Subtarget, DAG))
+  if (SDValue AddSub = lowerToAddSubOrFMAddSub(BV, dl, Subtarget, DAG))
     return AddSub;
-  if (SDValue HorizontalOp = LowerToHorizontalOp(BV, Subtarget, DAG))
+  if (SDValue HorizontalOp = LowerToHorizontalOp(BV, dl, Subtarget, DAG))
     return HorizontalOp;
-  if (SDValue Broadcast = lowerBuildVectorAsBroadcast(BV, Subtarget, DAG))
+  if (SDValue Broadcast = lowerBuildVectorAsBroadcast(BV, dl, Subtarget, DAG))
     return Broadcast;
-  if (SDValue BitOp = lowerBuildVectorToBitOp(BV, Subtarget, DAG))
+  if (SDValue BitOp = lowerBuildVectorToBitOp(BV, dl, Subtarget, DAG))
     return BitOp;
 
   unsigned NumZero = ZeroMask.popcount();
@@ -8901,8 +8898,8 @@ X86TargetLowering::LowerBUILD_VECTOR(SDValue Op, SelectionDAG &DAG) const {
   if (IsAllConstants)
     return SDValue();
 
-  if (SDValue V = LowerBUILD_VECTORAsVariablePermute(Op, DAG, Subtarget))
-      return V;
+  if (SDValue V = LowerBUILD_VECTORAsVariablePermute(Op, dl, DAG, Subtarget))
+    return V;
 
   // See if we can use a vector load to get all of the elements.
   {

>From ff0bfcfb4b4f85f2d05322efbebf3cfaf4c4107c Mon Sep 17 00:00:00 2001
From: Sergio Afonso <safonsof at amd.com>
Date: Thu, 8 Feb 2024 12:33:43 +0000
Subject: [PATCH 21/72] [Flang][Lower] NFC: Update target-features/target-cpu
 tests (#80984)

Previously, some of these lowering tests inadvertently relied on a
default triple not introducing any target features. This caused failures
when compiling on a ppc64le-linux-unknown-gnu system.

This patch updates these lowering tests to always explicitly set the
target triple and check that the -target-cpu and -target-features
compiler options are processed as expected.
---
 flang/test/Lower/target-features-amdgcn.f90 | 23 +++++++++++----------
 flang/test/Lower/target-features-x86_64.f90 | 16 +++++++-------
 2 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/flang/test/Lower/target-features-amdgcn.f90 b/flang/test/Lower/target-features-amdgcn.f90
index 1f0439bba80a68..382230d7353dc2 100644
--- a/flang/test/Lower/target-features-amdgcn.f90
+++ b/flang/test/Lower/target-features-amdgcn.f90
@@ -1,21 +1,22 @@
 ! REQUIRES: amdgpu-registered-target
-! RUN: %flang_fc1 -emit-fir %s -o - | FileCheck %s --check-prefixes=ALL,NONE
-! RUN: %flang_fc1 -emit-fir -triple amdgcn-amd-amdhsa %s -o - | FileCheck %s --check-prefixes=ALL,TRIPLE
-! RUN: %flang_fc1 -emit-fir -target-cpu gfx90a %s -o - | FileCheck %s --check-prefixes=ALL,CPU
-! RUN: %flang_fc1 -emit-fir -triple amdgcn-amd-amdhsa -target-cpu gfx90a %s -o - | FileCheck %s --check-prefixes=ALL,BOTH
+! RUN: %flang_fc1 -emit-fir -triple amdgcn-amd-amdhsa -target-cpu gfx90a %s -o - | FileCheck %s --check-prefixes=ALL,CPU
+! RUN: %flang_fc1 -emit-fir -triple amdgcn-amd-amdhsa -target-feature +sse %s -o - | FileCheck %s --check-prefixes=ALL,FEATURE
+! RUN: %flang_fc1 -emit-fir -triple amdgcn-amd-amdhsa -target-cpu gfx90a -target-feature +sse %s -o - | FileCheck %s --check-prefixes=ALL,BOTH
 
 ! ALL: module attributes {
 
-! NONE-NOT: fir.target_cpu
-! NONE-NOT: fir.target_features
-
-! TRIPLE-SAME: fir.target_cpu = "generic-hsa"
-! TRIPLE-NOT: fir.target_features
-
 ! CPU-SAME: fir.target_cpu = "gfx90a"
-! CPU-NOT: fir.target_features
+! CPU-SAME: fir.target_features = #llvm.target_features<[
+! CPU-SAME: "+gfx90a-insts"
+! CPU-SAME: ]>
+
+! FEATURE-SAME: fir.target_features = #llvm.target_features<[
+! FEATURE-NOT:  "+gfx90a-insts"
+! FEATURE-SAME: "+sse"
+! FEATURE-SAME: ]>
 
 ! BOTH-SAME: fir.target_cpu = "gfx90a"
 ! BOTH-SAME: fir.target_features = #llvm.target_features<[
 ! BOTH-SAME: "+gfx90a-insts"
+! BOTH-SAME: "+sse"
 ! BOTH-SAME: ]>
diff --git a/flang/test/Lower/target-features-x86_64.f90 b/flang/test/Lower/target-features-x86_64.f90
index 1b628b6b5b9c85..282c47923808de 100644
--- a/flang/test/Lower/target-features-x86_64.f90
+++ b/flang/test/Lower/target-features-x86_64.f90
@@ -1,19 +1,17 @@
 ! REQUIRES: x86-registered-target
-! RUN: %flang_fc1 -emit-fir -triple x86_64-unknown-linux-gnu %s -o - | FileCheck %s --check-prefixes=ALL,NONE
 ! RUN: %flang_fc1 -emit-fir -triple x86_64-unknown-linux-gnu -target-cpu x86-64 %s -o - | FileCheck %s --check-prefixes=ALL,CPU
 ! RUN: %flang_fc1 -emit-fir -triple x86_64-unknown-linux-gnu -target-feature +sse %s -o - | FileCheck %s --check-prefixes=ALL,FEATURE
 ! RUN: %flang_fc1 -emit-fir -triple x86_64-unknown-linux-gnu -target-cpu x86-64 -target-feature +sse %s -o - | FileCheck %s --check-prefixes=ALL,BOTH
 
 ! ALL: module attributes {
 
-! NONE-NOT: fir.target_cpu
-! NONE-NOT: fir.target_features
+! CPU-SAME:     fir.target_cpu = "x86-64"
 
-! CPU-SAME: fir.target_cpu = "x86-64"
-! CPU-NOT: fir.target_features
-
-! FEATURE-NOT: fir.target_cpu
-! FEATURE-SAME: fir.target_features = #llvm.target_features<["+sse"]>
+! FEATURE-SAME: fir.target_features = #llvm.target_features<[
+! FEATURE-SAME: "+sse"
+! FEATURE-SAME: ]>
 
 ! BOTH-SAME: fir.target_cpu = "x86-64"
-! BOTH-SAME: fir.target_features = #llvm.target_features<["+sse"]>
+! BOTH-SAME: fir.target_features = #llvm.target_features<[
+! BOTH-SAME: "+sse"
+! BOTH-SAME: ]>

>From 9f91c381ecd1245feeb0c0826b5259d3f5788871 Mon Sep 17 00:00:00 2001
From: Zain Jaffal <zain at jjaffal.com>
Date: Tue, 2 Jan 2024 16:52:59 +0000
Subject: [PATCH 22/72] [InstCombine] Add tests for x / sqrt(y / z) with
 fast-math

---
 llvm/test/Transforms/InstCombine/fdiv-sqrt.ll | 85 +++++++++++++++++++
 1 file changed, 85 insertions(+)
 create mode 100644 llvm/test/Transforms/InstCombine/fdiv-sqrt.ll

diff --git a/llvm/test/Transforms/InstCombine/fdiv-sqrt.ll b/llvm/test/Transforms/InstCombine/fdiv-sqrt.ll
new file mode 100644
index 00000000000000..a8d4b6d5d622d4
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/fdiv-sqrt.ll
@@ -0,0 +1,85 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -S -passes=instcombine < %s | FileCheck %s
+
+declare double @llvm.sqrt.f64(double)
+
+define double @sqrt_div_fast(double %x, double %y, double %z) {
+; CHECK-LABEL: @sqrt_div_fast(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv fast double [[Y:%.*]], [[Z:%.*]]
+; CHECK-NEXT:    [[SQRT:%.*]] = call fast double @llvm.sqrt.f64(double [[DIV]])
+; CHECK-NEXT:    [[DIV1:%.*]] = fdiv fast double [[X:%.*]], [[SQRT]]
+; CHECK-NEXT:    ret double [[DIV1]]
+;
+entry:
+  %div = fdiv fast double %y, %z
+  %sqrt = call fast double @llvm.sqrt.f64(double %div)
+  %div1 = fdiv fast double %x, %sqrt
+  ret double %div1
+}
+
+define double @sqrt_div(double %x, double %y, double %z) {
+; CHECK-LABEL: @sqrt_div(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv double [[Y:%.*]], [[Z:%.*]]
+; CHECK-NEXT:    [[SQRT:%.*]] = call double @llvm.sqrt.f64(double [[DIV]])
+; CHECK-NEXT:    [[DIV1:%.*]] = fdiv double [[X:%.*]], [[SQRT]]
+; CHECK-NEXT:    ret double [[DIV1]]
+;
+entry:
+  %div = fdiv double %y, %z
+  %sqrt = call double @llvm.sqrt.f64(double %div)
+  %div1 = fdiv double %x, %sqrt
+  ret double %div1
+}
+
+define double @sqrt_div_reassoc_arcp(double %x, double %y, double %z) {
+; CHECK-LABEL: @sqrt_div_reassoc_arcp(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv reassoc arcp double [[Y:%.*]], [[Z:%.*]]
+; CHECK-NEXT:    [[SQRT:%.*]] = call reassoc arcp double @llvm.sqrt.f64(double [[DIV]])
+; CHECK-NEXT:    [[DIV1:%.*]] = fdiv reassoc arcp double [[X:%.*]], [[SQRT]]
+; CHECK-NEXT:    ret double [[DIV1]]
+;
+entry:
+  %div = fdiv reassoc arcp double %y, %z
+  %sqrt = call reassoc arcp double @llvm.sqrt.f64(double %div)
+  %div1 = fdiv reassoc arcp double %x, %sqrt
+  ret double %div1
+}
+
+declare void @use(double)
+define double @sqrt_div_fast_multiple_uses_1(double %x, double %y, double %z) {
+; CHECK-LABEL: @sqrt_div_fast_multiple_uses_1(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv fast double [[Y:%.*]], [[Z:%.*]]
+; CHECK-NEXT:    call void @use(double [[DIV]])
+; CHECK-NEXT:    [[SQRT:%.*]] = call fast double @llvm.sqrt.f64(double [[DIV]])
+; CHECK-NEXT:    [[DIV1:%.*]] = fdiv fast double [[X:%.*]], [[SQRT]]
+; CHECK-NEXT:    ret double [[DIV1]]
+;
+entry:
+  %div = fdiv fast double %y, %z
+  call void @use(double %div)
+  %sqrt = call fast double @llvm.sqrt.f64(double %div)
+  %div1 = fdiv fast double %x, %sqrt
+  ret double %div1
+}
+
+define double @sqrt_div_fast_multiple_uses_2(double %x, double %y, double %z) {
+; CHECK-LABEL: @sqrt_div_fast_multiple_uses_2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv fast double [[Y:%.*]], [[Z:%.*]]
+; CHECK-NEXT:    [[SQRT:%.*]] = call fast double @llvm.sqrt.f64(double [[DIV]])
+; CHECK-NEXT:    call void @use(double [[SQRT]])
+; CHECK-NEXT:    [[DIV1:%.*]] = fdiv fast double [[X:%.*]], [[SQRT]]
+; CHECK-NEXT:    ret double [[DIV1]]
+;
+entry:
+  %div = fdiv fast double %y, %z
+  %sqrt = call fast double @llvm.sqrt.f64(double %div)
+  call void @use(double %sqrt)
+  %div1 = fdiv fast double %x, %sqrt
+  ret double %div1
+}
+

>From d18df36a45d4016df153f5985097867182cf5958 Mon Sep 17 00:00:00 2001
From: Zain Jaffal <zain at jjaffal.com>
Date: Sat, 6 Jan 2024 17:31:48 +0000
Subject: [PATCH 23/72] [InstCombine] Add additional tests for fdiv-sqrt

Add more tests where some of the instructions have missing flags.
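
As a hedged aside that this patch does not state: the pattern under test,
x / sqrt(y / z), is algebraically equal to x * sqrt(z / y) for positive y and
z, which is the kind of rewrite a fast-math fold could target once the
required flags are present on every instruction. The sketch below only checks
that the two forms agree numerically; all function names are hypothetical and
nothing here is InstCombine code.

```cpp
#include <cassert>
#include <cmath>

// Original form exercised by the tests: x / sqrt(y / z).
inline double divBySqrtOfQuotient(double x, double y, double z) {
  return x / std::sqrt(y / z);
}

// Algebraically equivalent form (assumed fold target): x * sqrt(z / y).
inline double mulBySqrtOfInverse(double x, double y, double z) {
  return x * std::sqrt(z / y);
}

// Relative comparison; the two forms may differ in the last bits, which is
// why such a rewrite needs fast-math permission in the first place.
inline bool agreeWithin(double a, double b, double relTol) {
  assert(relTol >= 0.0 && "tolerance must be non-negative");
  return std::fabs(a - b) <= relTol * std::fmax(std::fabs(a), std::fabs(b));
}
```

With x = 3, y = 2, z = 8 both forms evaluate to 6, since sqrt(2 / 8) = 0.5 and
sqrt(8 / 2) = 2.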
---
 llvm/test/Transforms/InstCombine/fdiv-sqrt.ll | 96 ++++++++++++++++++-
 1 file changed, 93 insertions(+), 3 deletions(-)

diff --git a/llvm/test/Transforms/InstCombine/fdiv-sqrt.ll b/llvm/test/Transforms/InstCombine/fdiv-sqrt.ll
index a8d4b6d5d622d4..346271be7da761 100644
--- a/llvm/test/Transforms/InstCombine/fdiv-sqrt.ll
+++ b/llvm/test/Transforms/InstCombine/fdiv-sqrt.ll
@@ -42,9 +42,99 @@ define double @sqrt_div_reassoc_arcp(double %x, double %y, double %z) {
 ; CHECK-NEXT:    ret double [[DIV1]]
 ;
 entry:
-  %div = fdiv reassoc arcp double %y, %z
-  %sqrt = call reassoc arcp double @llvm.sqrt.f64(double %div)
-  %div1 = fdiv reassoc arcp double %x, %sqrt
+  %div = fdiv arcp reassoc double %y, %z
+  %sqrt = call arcp reassoc double @llvm.sqrt.f64(double %div)
+  %div1 = fdiv arcp reassoc double %x, %sqrt
+  ret double %div1
+}
+
+define double @sqrt_div_reassoc_missing(double %x, double %y, double %z) {
+; CHECK-LABEL: @sqrt_div_reassoc_missing(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv arcp double [[Y:%.*]], [[Z:%.*]]
+; CHECK-NEXT:    [[SQRT:%.*]] = call reassoc arcp double @llvm.sqrt.f64(double [[DIV]])
+; CHECK-NEXT:    [[DIV1:%.*]] = fdiv reassoc arcp double [[X:%.*]], [[SQRT]]
+; CHECK-NEXT:    ret double [[DIV1]]
+;
+entry:
+  %div = fdiv arcp double %y, %z
+  %sqrt = call arcp reassoc double @llvm.sqrt.f64(double %div)
+  %div1 = fdiv arcp reassoc double %x, %sqrt
+  ret double %div1
+}
+
+define double @sqrt_div_reassoc_missing2(double %x, double %y, double %z) {
+; CHECK-LABEL: @sqrt_div_reassoc_missing2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv reassoc arcp double [[Y:%.*]], [[Z:%.*]]
+; CHECK-NEXT:    [[SQRT:%.*]] = call arcp double @llvm.sqrt.f64(double [[DIV]])
+; CHECK-NEXT:    [[DIV1:%.*]] = fdiv reassoc arcp double [[X:%.*]], [[SQRT]]
+; CHECK-NEXT:    ret double [[DIV1]]
+;
+entry:
+  %div = fdiv arcp reassoc double %y, %z
+  %sqrt = call arcp double @llvm.sqrt.f64(double %div)
+  %div1 = fdiv arcp reassoc double %x, %sqrt
+  ret double %div1
+}
+
+define double @sqrt_div_reassoc_missing3(double %x, double %y, double %z) {
+; CHECK-LABEL: @sqrt_div_reassoc_missing3(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv reassoc arcp double [[Y:%.*]], [[Z:%.*]]
+; CHECK-NEXT:    [[SQRT:%.*]] = call reassoc arcp double @llvm.sqrt.f64(double [[DIV]])
+; CHECK-NEXT:    [[DIV1:%.*]] = fdiv arcp double [[X:%.*]], [[SQRT]]
+; CHECK-NEXT:    ret double [[DIV1]]
+;
+entry:
+  %div = fdiv arcp reassoc double %y, %z
+  %sqrt = call arcp reassoc double @llvm.sqrt.f64(double %div)
+  %div1 = fdiv arcp double %x, %sqrt
+  ret double %div1
+}
+
+define double @sqrt_div_arcp_missing(double %x, double %y, double %z) {
+; CHECK-LABEL: @sqrt_div_arcp_missing(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv reassoc double [[Y:%.*]], [[Z:%.*]]
+; CHECK-NEXT:    [[SQRT:%.*]] = call reassoc arcp double @llvm.sqrt.f64(double [[DIV]])
+; CHECK-NEXT:    [[DIV1:%.*]] = fdiv reassoc arcp double [[X:%.*]], [[SQRT]]
+; CHECK-NEXT:    ret double [[DIV1]]
+;
+entry:
+  %div = fdiv reassoc double %y, %z
+  %sqrt = call arcp reassoc double @llvm.sqrt.f64(double %div)
+  %div1 = fdiv arcp reassoc double %x, %sqrt
+  ret double %div1
+}
+
+define double @sqrt_div_arcp_missing2(double %x, double %y, double %z) {
+; CHECK-LABEL: @sqrt_div_arcp_missing2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv reassoc arcp double [[Y:%.*]], [[Z:%.*]]
+; CHECK-NEXT:    [[SQRT:%.*]] = call reassoc double @llvm.sqrt.f64(double [[DIV]])
+; CHECK-NEXT:    [[DIV1:%.*]] = fdiv reassoc arcp double [[X:%.*]], [[SQRT]]
+; CHECK-NEXT:    ret double [[DIV1]]
+;
+entry:
+  %div = fdiv arcp reassoc double %y, %z
+  %sqrt = call reassoc double @llvm.sqrt.f64(double %div)
+  %div1 = fdiv arcp reassoc double %x, %sqrt
+  ret double %div1
+}
+
+define double @sqrt_div_arcp_missing3(double %x, double %y, double %z) {
+; CHECK-LABEL: @sqrt_div_arcp_missing3(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv reassoc arcp double [[Y:%.*]], [[Z:%.*]]
+; CHECK-NEXT:    [[SQRT:%.*]] = call reassoc arcp double @llvm.sqrt.f64(double [[DIV]])
+; CHECK-NEXT:    [[DIV1:%.*]] = fdiv reassoc double [[X:%.*]], [[SQRT]]
+; CHECK-NEXT:    ret double [[DIV1]]
+;
+entry:
+  %div = fdiv arcp reassoc double %y, %z
+  %sqrt = call arcp reassoc double @llvm.sqrt.f64(double %div)
+  %div1 = fdiv reassoc double %x, %sqrt
   ret double %div1
 }
 

>From 219deac3d3f0a1ca9a8b58855562d7a94d571421 Mon Sep 17 00:00:00 2001
From: whisperity <whisperity at gmail.com>
Date: Thu, 8 Feb 2024 13:37:55 +0100
Subject: [PATCH 24/72] [clang][Sema] Subclass `-Wshorten-64-to-32` under
 `-Wimplicit-int-conversion` (#80814)

Although "implicit int conversions" is supposed to be a superset
containing the more specific "64-to-32" case, the two groups were previously
disjoint and were only enabled together by the much larger `-Wconversion`.
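
For illustration only, the kind of conversion the `-Wshorten-64-to-32` group
covers can be sketched as below. This is a stand-alone example, not part of
the patch; the helper names `narrowTo32`, `losesPrecision` and `checkedNarrow`
are hypothetical. The narrowing is written explicitly here; the implicit
spelling (`int f(long v) { return v; }`, as in the new test file) is what the
patch expects to be diagnosed under `-Wimplicit-int-conversion`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: narrows a 64-bit value to 32 bits explicitly.
// Out-of-range values wrap on mainstream compilers (guaranteed since C++20).
inline std::int32_t narrowTo32(std::int64_t v) {
  return static_cast<std::int32_t>(v);
}

// Hypothetical helper: true if the value does not survive the round trip
// through 32 bits, i.e. the conversion "loses integer precision".
inline bool losesPrecision(std::int64_t v) {
  return static_cast<std::int64_t>(narrowTo32(v)) != v;
}

// Hypothetical helper: narrows only values that are known to fit.
inline std::int32_t checkedNarrow(std::int64_t v) {
  assert(!losesPrecision(v) && "value does not fit in 32 bits");
  return narrowTo32(v);
}
```

A value such as 42 passes through unchanged, while 1 << 32 (as a 64-bit
value) is reported by `losesPrecision`.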
---
 clang/docs/ReleaseNotes.rst                   |  4 ++++
 clang/include/clang/Basic/DiagnosticGroups.td |  6 +++---
 clang/test/Sema/conversion-64-32.c            |  6 +++++-
 ...onversion-implicit-int-includes-64-to-32.c | 21 +++++++++++++++++++
 4 files changed, 33 insertions(+), 4 deletions(-)
 create mode 100644 clang/test/Sema/conversion-implicit-int-includes-64-to-32.c

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 46a03b7c91220d..57c46c36955a8a 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -150,9 +150,13 @@ Improvements to Clang's diagnostics
 
 - Clang now diagnoses member template declarations with multiple declarators.
 - Clang now diagnoses use of the ``template`` keyword after declarative nested name specifiers.
+
 - Clang now diagnoses constexpr constructor for not initializing atleast one member of union
 - Fixes(`#46689 Constexpr constructor not initializing a union member is not diagnosed`)
 
+- The ``-Wshorten-64-to-32`` diagnostic is now grouped under ``-Wimplicit-int-conversion`` instead
+   of ``-Wconversion``. Fixes `#69444 <https://github.com/llvm/llvm-project/issues/69444>`_.
+
 Improvements to Clang's time-trace
 ----------------------------------
 
diff --git a/clang/include/clang/Basic/DiagnosticGroups.td b/clang/include/clang/Basic/DiagnosticGroups.td
index 6765721ae7002c..975eca0ad9b642 100644
--- a/clang/include/clang/Basic/DiagnosticGroups.td
+++ b/clang/include/clang/Basic/DiagnosticGroups.td
@@ -108,8 +108,10 @@ def EnumConversion : DiagGroup<"enum-conversion",
                                 EnumCompareConditional]>;
 def ObjCSignedCharBoolImplicitIntConversion :
   DiagGroup<"objc-signed-char-bool-implicit-int-conversion">;
+def Shorten64To32 : DiagGroup<"shorten-64-to-32">;
 def ImplicitIntConversion : DiagGroup<"implicit-int-conversion",
-                                     [ObjCSignedCharBoolImplicitIntConversion]>;
+                                     [Shorten64To32,
+                                      ObjCSignedCharBoolImplicitIntConversion]>;
 def ImplicitConstIntFloatConversion : DiagGroup<"implicit-const-int-float-conversion">;
 def ImplicitIntFloatConversion : DiagGroup<"implicit-int-float-conversion",
  [ImplicitConstIntFloatConversion]>;
@@ -631,7 +633,6 @@ def Shadow : DiagGroup<"shadow", [ShadowFieldInConstructorModified,
 def ShadowAll : DiagGroup<"shadow-all", [Shadow, ShadowFieldInConstructor,
                                          ShadowUncapturedLocal, ShadowField]>;
 
-def Shorten64To32 : DiagGroup<"shorten-64-to-32">;
 def : DiagGroup<"sign-promo">;
 def SignCompare : DiagGroup<"sign-compare">;
 def SwitchDefault  : DiagGroup<"switch-default">;
@@ -942,7 +943,6 @@ def Conversion : DiagGroup<"conversion",
                             EnumConversion,
                             BitFieldEnumConversion,
                             FloatConversion,
-                            Shorten64To32,
                             IntConversion,
                             ImplicitIntConversion,
                             ImplicitFloatConversion,
diff --git a/clang/test/Sema/conversion-64-32.c b/clang/test/Sema/conversion-64-32.c
index dc417edcbc2168..c172dd109f3be2 100644
--- a/clang/test/Sema/conversion-64-32.c
+++ b/clang/test/Sema/conversion-64-32.c
@@ -9,9 +9,13 @@ typedef long long long2 __attribute__((__vector_size__(16)));
 
 int4 test1(long2 a) {
   int4  v127 = a;  // no warning.
-  return v127; 
+  return v127;
 }
 
 int test2(long v) {
   return v / 2; // expected-warning {{implicit conversion loses integer precision: 'long' to 'int'}}
 }
+
+char test3(short s) {
+  return s * 2; // no warning.
+}
diff --git a/clang/test/Sema/conversion-implicit-int-includes-64-to-32.c b/clang/test/Sema/conversion-implicit-int-includes-64-to-32.c
new file mode 100644
index 00000000000000..e22ccbe821f65c
--- /dev/null
+++ b/clang/test/Sema/conversion-implicit-int-includes-64-to-32.c
@@ -0,0 +1,21 @@
+// RUN: %clang_cc1 -fsyntax-only -verify -Wimplicit-int-conversion -triple x86_64-apple-darwin %s
+
+int test0(long v) {
+  return v; // expected-warning {{implicit conversion loses integer precision}}
+}
+
+typedef int  int4  __attribute__ ((vector_size(16)));
+typedef long long long2 __attribute__((__vector_size__(16)));
+
+int4 test1(long2 a) {
+  int4  v127 = a;  // no warning.
+  return v127;
+}
+
+int test2(long v) {
+  return v / 2; // expected-warning {{implicit conversion loses integer precision: 'long' to 'int'}}
+}
+
+char test3(short s) {
+  return s * 2; // expected-warning {{implicit conversion loses integer precision: 'int' to 'char'}}
+}

>From 3086cd5df9c3748b8e07d0300d39e6714fdd381c Mon Sep 17 00:00:00 2001
From: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: Thu, 8 Feb 2024 12:34:50 +0000
Subject: [PATCH 25/72] [X86] Add X86::getVectorRegisterWidth helper. NFC.

Replaces internal helper used by addConstantComments to allow reuse in a future patch.
---
 llvm/lib/Target/X86/X86InstrInfo.cpp   | 12 ++++++++++++
 llvm/lib/Target/X86/X86InstrInfo.h     |  3 +++
 llvm/lib/Target/X86/X86MCInstLower.cpp | 24 ++++++------------------
 3 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/llvm/lib/Target/X86/X86InstrInfo.cpp b/llvm/lib/Target/X86/X86InstrInfo.cpp
index 0d30a31377727a..0f21880f6df90c 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.cpp
+++ b/llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -3423,6 +3423,18 @@ unsigned X86::getSwappedVCMPImm(unsigned Imm) {
   return Imm;
 }
 
+unsigned X86::getVectorRegisterWidth(const MCOperandInfo &Info) {
+  if (Info.RegClass == X86::VR128RegClassID ||
+      Info.RegClass == X86::VR128XRegClassID)
+    return 128;
+  if (Info.RegClass == X86::VR256RegClassID ||
+      Info.RegClass == X86::VR256XRegClassID)
+    return 256;
+  if (Info.RegClass == X86::VR512RegClassID)
+    return 512;
+  llvm_unreachable("Unknown register class!");
+}
+
 /// Return true if the Reg is X87 register.
 static bool isX87Reg(unsigned Reg) {
   return (Reg == X86::FPCW || Reg == X86::FPSW ||
diff --git a/llvm/lib/Target/X86/X86InstrInfo.h b/llvm/lib/Target/X86/X86InstrInfo.h
index ee0d2d059df8d7..996a24d9e8a944 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.h
+++ b/llvm/lib/Target/X86/X86InstrInfo.h
@@ -77,6 +77,9 @@ unsigned getSwappedVPCOMImm(unsigned Imm);
 /// Get the VCMP immediate if the opcodes are swapped.
 unsigned getSwappedVCMPImm(unsigned Imm);
 
+/// Get the width of the vector register operand.
+unsigned getVectorRegisterWidth(const MCOperandInfo &Info);
+
 /// Check if the instruction is X87 instruction.
 bool isX87Instruction(MachineInstr &MI);
 
diff --git a/llvm/lib/Target/X86/X86MCInstLower.cpp b/llvm/lib/Target/X86/X86MCInstLower.cpp
index b336ba3ea34404..d3b7d97a83caf0 100644
--- a/llvm/lib/Target/X86/X86MCInstLower.cpp
+++ b/llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -1388,18 +1388,6 @@ PrevCrossBBInst(MachineBasicBlock::const_iterator MBBI) {
   return MBBI;
 }
 
-static unsigned getRegisterWidth(const MCOperandInfo &Info) {
-  if (Info.RegClass == X86::VR128RegClassID ||
-      Info.RegClass == X86::VR128XRegClassID)
-    return 128;
-  if (Info.RegClass == X86::VR256RegClassID ||
-      Info.RegClass == X86::VR256XRegClassID)
-    return 256;
-  if (Info.RegClass == X86::VR512RegClassID)
-    return 512;
-  llvm_unreachable("Unknown register class!");
-}
-
 static unsigned getSrcIdx(const MachineInstr* MI, unsigned SrcIdx) {
   if (X86II::isKMasked(MI->getDesc().TSFlags)) {
     // Skip mask operand.
@@ -1648,7 +1636,7 @@ static void printZeroExtend(const MachineInstr *MI, MCStreamer &OutStreamer,
   CS << " = ";
 
   SmallVector<int> Mask;
-  unsigned Width = getRegisterWidth(MI->getDesc().operands()[0]);
+  unsigned Width = X86::getVectorRegisterWidth(MI->getDesc().operands()[0]);
   assert((Width % DstEltBits) == 0 && (DstEltBits % SrcEltBits) == 0 &&
          "Illegal extension ratio");
   DecodeZeroExtendMask(SrcEltBits, DstEltBits, Width / DstEltBits, false, Mask);
@@ -1753,7 +1741,7 @@ static void addConstantComments(const MachineInstr *MI,
   case X86::VPSHUFBZrmkz: {
     unsigned SrcIdx = getSrcIdx(MI, 1);
     if (auto *C = X86::getConstantFromPool(*MI, SrcIdx + 1)) {
-      unsigned Width = getRegisterWidth(MI->getDesc().operands()[0]);
+      unsigned Width = X86::getVectorRegisterWidth(MI->getDesc().operands()[0]);
       SmallVector<int, 64> Mask;
       DecodePSHUFBMask(C, Width, Mask);
       if (!Mask.empty())
@@ -1775,7 +1763,7 @@ static void addConstantComments(const MachineInstr *MI,
   case X86::VPERMILPSZrmkz: {
     unsigned SrcIdx = getSrcIdx(MI, 1);
     if (auto *C = X86::getConstantFromPool(*MI, SrcIdx + 1)) {
-      unsigned Width = getRegisterWidth(MI->getDesc().operands()[0]);
+      unsigned Width = X86::getVectorRegisterWidth(MI->getDesc().operands()[0]);
       SmallVector<int, 16> Mask;
       DecodeVPERMILPMask(C, 32, Width, Mask);
       if (!Mask.empty())
@@ -1796,7 +1784,7 @@ static void addConstantComments(const MachineInstr *MI,
   case X86::VPERMILPDZrmkz: {
     unsigned SrcIdx = getSrcIdx(MI, 1);
     if (auto *C = X86::getConstantFromPool(*MI, SrcIdx + 1)) {
-      unsigned Width = getRegisterWidth(MI->getDesc().operands()[0]);
+      unsigned Width = X86::getVectorRegisterWidth(MI->getDesc().operands()[0]);
       SmallVector<int, 16> Mask;
       DecodeVPERMILPMask(C, 64, Width, Mask);
       if (!Mask.empty())
@@ -1824,7 +1812,7 @@ static void addConstantComments(const MachineInstr *MI,
     }
 
     if (auto *C = X86::getConstantFromPool(*MI, 3)) {
-      unsigned Width = getRegisterWidth(MI->getDesc().operands()[0]);
+      unsigned Width = X86::getVectorRegisterWidth(MI->getDesc().operands()[0]);
       SmallVector<int, 16> Mask;
       DecodeVPERMIL2PMask(C, (unsigned)CtrlOp.getImm(), ElSize, Width, Mask);
       if (!Mask.empty())
@@ -1835,7 +1823,7 @@ static void addConstantComments(const MachineInstr *MI,
 
   case X86::VPPERMrrm: {
     if (auto *C = X86::getConstantFromPool(*MI, 3)) {
-      unsigned Width = getRegisterWidth(MI->getDesc().operands()[0]);
+      unsigned Width = X86::getVectorRegisterWidth(MI->getDesc().operands()[0]);
       SmallVector<int, 16> Mask;
       DecodeVPPERMMask(C, Width, Mask);
       if (!Mask.empty())

>From 6f3941ffce4f4aabd0cb8b8d4c0e943579a9a4f6 Mon Sep 17 00:00:00 2001
From: Jeremy Morse <jeremy.morse at sony.com>
Date: Thu, 8 Feb 2024 12:41:55 +0000
Subject: [PATCH 26/72] [NFCI][RemoveDIs] Build LLVM with RemoveDIs iterators

This commit flips a bit to make LLVM build with "debuginfo iterators",
causing BasicBlock::iterator to contain a bit that's used for debug-info
purposes. More about this can be read on Discourse [0], but the runtime
impact of this should be negligible (iterators usually end up being
inlined), and there should be no change to LLVM's behaviour as a result of
this commit.

What this does mean though, is that roughly 400 debug-info tests where
we've added "--try-experimental-debuginfo-iterators" to RUNlines are going
to start operating in RemoveDIs mode. These are already tested on the
new-debug-iterators buildbot [1], and I've even tested with asan, so I'm
not _expecting_ any turbulence.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
[1] https://lab.llvm.org/buildbot/#/builders/275
---
 llvm/CMakeLists.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/CMakeLists.txt b/llvm/CMakeLists.txt
index 485c76b8bb936d..c31980a47f39b7 100644
--- a/llvm/CMakeLists.txt
+++ b/llvm/CMakeLists.txt
@@ -654,7 +654,7 @@ option(LLVM_EXTERNALIZE_DEBUGINFO
   "Generate dSYM files and strip executables and libraries (Darwin Only)" OFF)
 
 option(LLVM_EXPERIMENTAL_DEBUGINFO_ITERATORS
-  "Add extra Booleans to ilist_iterators to communicate facts for debug-info" OFF)
+  "Add extra Booleans to ilist_iterators to communicate facts for debug-info" ON)
 
 set(LLVM_CODESIGNING_IDENTITY "" CACHE STRING
   "Sign executables and dylibs with the given identity or skip if empty (Darwin Only)")

>From a0703cec4efb81ebc41dcbd109188a95526e6a6d Mon Sep 17 00:00:00 2001
From: agozillon <Andrew.Gozillon at amd.com>
Date: Thu, 8 Feb 2024 14:03:39 +0100
Subject: [PATCH 27/72] [Flang][bbc] Prevent bbc -emit-fir command invoking
 OpenMP passes twice (#80927)

Currently, when the bbc tool is invoked with -emit-fir, the pass pipeline is run twice for verification, which causes the previously added OpenMP pass pipeline to be run multiple times.

This change prevents that by running the OpenMP passes in a separate pass manager, immediately at the point where they are needed.
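
The effect can be sketched with a toy model. Everything below is a
hypothetical stand-in (`ToyModule`, `ToyPassManager`, `runShared`,
`runSeparate`), not the MLIR API; it only assumes, as described above, that
the main pipeline may be run more than once.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a module; it only counts how often passes ran.
struct ToyModule {
  int openMPRuns = 0;
  int defaultRuns = 0;
};

// Hypothetical stand-in for a pass manager: an ordered list of callbacks.
class ToyPassManager {
public:
  void addPass(std::function<void(ToyModule &)> pass) {
    passes.push_back(std::move(pass));
  }
  void run(ToyModule &m) const {
    for (const auto &pass : passes)
      pass(m);
  }

private:
  std::vector<std::function<void(ToyModule &)>> passes;
};

// Old shape: the OpenMP pass lives in the main pipeline, so it runs every
// time the main pipeline runs.
inline ToyModule runShared(int mainRuns) {
  assert(mainRuns >= 0);
  ToyModule m;
  ToyPassManager pm;
  pm.addPass([](ToyModule &mod) { ++mod.openMPRuns; });
  pm.addPass([](ToyModule &mod) { ++mod.defaultRuns; });
  for (int i = 0; i < mainRuns; ++i)
    pm.run(m);
  return m;
}

// New shape: the OpenMP pass gets its own manager and runs exactly once,
// before the main pipeline.
inline ToyModule runSeparate(int mainRuns) {
  assert(mainRuns >= 0);
  ToyModule m;
  ToyPassManager openMP;
  openMP.addPass([](ToyModule &mod) { ++mod.openMPRuns; });
  openMP.run(m);

  ToyPassManager pm;
  pm.addPass([](ToyModule &mod) { ++mod.defaultRuns; });
  for (int i = 0; i < mainRuns; ++i)
    pm.run(m);
  return m;
}
```

With two runs of the main pipeline, the shared layout applies the OpenMP pass
twice and the separate layout applies it once.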
---
 flang/tools/bbc/bbc.cpp | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/flang/tools/bbc/bbc.cpp b/flang/tools/bbc/bbc.cpp
index 9d5caf5c6804ea..c9358c83e795c4 100644
--- a/flang/tools/bbc/bbc.cpp
+++ b/flang/tools/bbc/bbc.cpp
@@ -256,6 +256,22 @@ createTargetMachine(llvm::StringRef targetTriple, std::string &error) {
                                      /*Reloc::Model=*/std::nullopt)};
 }
 
+/// Build and execute the OpenMPFIRPassPipeline with its own instance
+/// of the pass manager, allowing it to be invoked as soon as it's
+/// required without impacting the main pass pipeline that may be invoked
+/// more than once for verification.
+static mlir::LogicalResult runOpenMPPasses(mlir::ModuleOp mlirModule) {
+  mlir::PassManager pm(mlirModule->getName(),
+                       mlir::OpPassManager::Nesting::Implicit);
+  fir::createOpenMPFIRPassPipeline(pm, enableOpenMPDevice);
+  (void)mlir::applyPassManagerCLOptions(pm);
+  if (mlir::failed(pm.run(mlirModule))) {
+    llvm::errs() << "FATAL: failed to correctly apply OpenMP pass pipeline";
+    return mlir::failure();
+  }
+  return mlir::success();
+}
+
 //===----------------------------------------------------------------------===//
 // Translate Fortran input to FIR, a dialect of MLIR.
 //===----------------------------------------------------------------------===//
@@ -369,14 +385,16 @@ static mlir::LogicalResult convertFortranSourceToMLIR(
                            "could not open output file ")
            << outputName;
 
+  // WARNING: This pipeline must be run immediately after the lowering to
+  // ensure that the FIR is correct with respect to OpenMP operations/
+  // attributes.
+  if (enableOpenMP)
+    if (mlir::failed(runOpenMPPasses(mlirModule)))
+      return mlir::failure();
+
   // Otherwise run the default passes.
   mlir::PassManager pm(mlirModule->getName(),
                        mlir::OpPassManager::Nesting::Implicit);
-  if (enableOpenMP)
-    // WARNING: This pipeline must be run immediately after the lowering to
-    // ensure that the FIR is correct with respect to OpenMP operations/
-    // attributes.
-    fir::createOpenMPFIRPassPipeline(pm, enableOpenMPDevice);
   pm.enableVerifier(/*verifyPasses=*/true);
   (void)mlir::applyPassManagerCLOptions(pm);
   if (passPipeline.hasAnyOccurrences()) {

>From 2226bf207e1e4aa58fc39a5623e108188c8bf137 Mon Sep 17 00:00:00 2001
From: Martin Storsjö <martin at martin.st>
Date: Thu, 8 Feb 2024 15:28:46 +0200
Subject: [PATCH 28/72] [OpenMP] [cmake] Don't use -fno-semantic-interposition
 on Windows (#81113)

This was added in 4b7beab4187ab0766c3d7b272511d5751431a8da. When the
flag was added implicitly elsewhere, it was added via
llvm/cmake/modules/HandleLLVMOptions.cmake, where it wasn't added on
Windows/Cygwin targets.

This avoids one warning per object file in OpenMP.
---
 openmp/cmake/HandleOpenMPOptions.cmake | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/openmp/cmake/HandleOpenMPOptions.cmake b/openmp/cmake/HandleOpenMPOptions.cmake
index 71346201129b68..9387d9b3b0ff75 100644
--- a/openmp/cmake/HandleOpenMPOptions.cmake
+++ b/openmp/cmake/HandleOpenMPOptions.cmake
@@ -46,7 +46,11 @@ append_if(OPENMP_HAVE_WEXTRA_FLAG "-Wno-extra" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 append_if(OPENMP_HAVE_WPEDANTIC_FLAG "-Wno-pedantic" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 append_if(OPENMP_HAVE_WMAYBE_UNINITIALIZED_FLAG "-Wno-maybe-uninitialized" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 
-append_if(OPENMP_HAVE_NO_SEMANTIC_INTERPOSITION "-fno-semantic-interposition" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
+if (NOT (WIN32 OR CYGWIN))
+  # This flag is not relevant on Windows; the flag is accepted, but produces warnings
+  # about argument unused during compilation.
+  append_if(OPENMP_HAVE_NO_SEMANTIC_INTERPOSITION "-fno-semantic-interposition" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
+endif()
 append_if(OPENMP_HAVE_FUNCTION_SECTIONS "-ffunction-section" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 append_if(OPENMP_HAVE_DATA_SECTIONS "-fdata-sections" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 

>From 708b6b70600b7d01b65e65c634834eacb932918b Mon Sep 17 00:00:00 2001
From: Mariya Podchishchaeva <mariya.podchishchaeva at intel.com>
Date: Thu, 8 Feb 2024 16:31:57 +0300
Subject: [PATCH 29/72] [clang] Use CPlusPlus language option instead of Bool
 (#80975)

As was pointed out in
https://github.com/llvm/llvm-project/pull/80724, we should not be
checking `getLangOpts().Bool` when determining something related to
logical operators, since it only indicates that the bool keyword is present,
not which semantics the logical operators have. As a side effect, a missing
`-Wpointer-bool-conversion` in OpenCL C was restored, since, like C23,
OpenCL C has the bool keyword but its logical operators still return int.
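
The change in the guard can be sketched as follows. `ToyLangOpts` is a
hypothetical stand-in for clang's language options, and the per-language
values are assumptions taken from the description above (C23 and OpenCL C
have the bool keyword, older C does not); this is not clang code.

```cpp
#include <cassert> // for the checks mentioned in the note below

// Hypothetical stand-in for the three language options involved.
struct ToyLangOpts {
  bool Bool;      // the bool keyword is available
  bool C23;       // compiling as C23
  bool CPlusPlus; // compiling as C++
};

// Old guard: the bool-like conversion checks were skipped when this held.
inline bool skipsCheckOld(const ToyLangOpts &lo) { return lo.Bool && !lo.C23; }

// New guard: the checks are skipped only for C++.
inline bool skipsCheckNew(const ToyLangOpts &lo) { return lo.CPlusPlus; }

// Assumed option values per language mode.
inline ToyLangOpts olderC() { return {false, false, false}; }
inline ToyLangOpts c23() { return {true, true, false}; }
inline ToyLangOpts cxx() { return {true, false, true}; }
inline ToyLangOpts openCLC() { return {true, false, false}; }
```

Under these assumptions only OpenCL C changes behaviour: the old guard skips
the checks for it and the new one does not, which matches the restored
warning in `operators.cl`.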
---
 clang/lib/Sema/SemaChecking.cpp    | 8 ++++----
 clang/test/SemaOpenCL/operators.cl | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index c775ff207ba837..f8b73c7923baba 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -16129,10 +16129,10 @@ static void CheckConditionalOperator(Sema &S, AbstractConditionalOperator *E,
 /// Check conversion of given expression to boolean.
 /// Input argument E is a logical expression.
 static void CheckBoolLikeConversion(Sema &S, Expr *E, SourceLocation CC) {
-  // While C23 does have bool as a keyword, we still need to run the bool-like
-  // conversion checks as bools are still not used as the return type from
-  // "boolean" operators or as the input type for conditional operators.
-  if (S.getLangOpts().Bool && !S.getLangOpts().C23)
+  // Run the bool-like conversion checks only for C since there bools are
+  // still not used as the return type from "boolean" operators or as the input
+  // type for conditional operators.
+  if (S.getLangOpts().CPlusPlus)
     return;
   if (E->IgnoreParenImpCasts()->getType()->isAtomicType())
     return;
diff --git a/clang/test/SemaOpenCL/operators.cl b/clang/test/SemaOpenCL/operators.cl
index cf359acd5acb97..76a7692a7105c8 100644
--- a/clang/test/SemaOpenCL/operators.cl
+++ b/clang/test/SemaOpenCL/operators.cl
@@ -118,6 +118,6 @@ kernel void pointer_ops(){
   bool b = !p;
   b = p==0;
   int i;
-  b = !&i;
+  b = !&i; // expected-warning {{address of 'i' will always evaluate to 'true'}}
   b = &i==(int *)1;
 }

>From 941a757a624a67357b668e47bbcdf63242ff2ae6 Mon Sep 17 00:00:00 2001
From: Uday Bondhugula <uday at polymagelabs.com>
Date: Thu, 8 Feb 2024 19:16:29 +0530
Subject: [PATCH 30/72] [MLIR] Fix crash in AffineMap::replace for zero result
 maps (#80930)

Fix an obvious bug in AffineMap::replace for the case of zero-result maps.
Extend inferFromExprList to work with empty expression lists.
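
A rough sketch of inference that tolerates empty lists, under loud
assumptions: each "expression" is modelled as nothing more than the dim
position it references, and `inferNumDims` is a hypothetical name. It is not
the MLIR implementation; it only illustrates that the dim count can default
to zero instead of depending on a first element that may not exist.

```cpp
#include <algorithm>
#include <cassert> // for the checks mentioned in the note below
#include <vector>

// Hypothetical model: one inner vector per result list, each entry being
// the position of the dim that expression refers to.
inline unsigned
inferNumDims(const std::vector<std::vector<unsigned>> &exprsList) {
  unsigned numDims = 0; // empty input, or only empty lists, yields zero dims
  for (const auto &exprs : exprsList)
    for (unsigned pos : exprs)
      numDims = std::max(numDims, pos + 1);
  return numDims;
}
```

An empty outer list and a list of empty lists both yield 0, while lists
referencing dims 0 and 3 yield 4.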
---
 .../mlir/Dialect/Affine/IR/AffineOps.td       |  3 ++-
 .../mlir/Dialect/Utils/StructuredOpsUtils.h   |  4 +++-
 mlir/include/mlir/IR/AffineMap.h              |  6 +++--
 .../Conversion/TosaToLinalg/TosaToLinalg.cpp  | 14 +++++++----
 .../Conversion/VectorToGPU/VectorToGPU.cpp    |  8 +++++--
 mlir/lib/Dialect/Affine/IR/AffineOps.cpp      |  8 +++++--
 mlir/lib/Dialect/Linalg/Transforms/Split.cpp  |  4 +++-
 mlir/lib/Dialect/Linalg/Utils/Utils.cpp       | 17 ++++++-------
 .../Transforms/SparseGPUCodegen.cpp           |  4 +++-
 mlir/lib/Dialect/Vector/IR/VectorOps.cpp      |  7 +++---
 .../Vector/Transforms/LowerVectorContract.cpp |  4 +++-
 .../VectorTransferSplitRewritePatterns.cpp    |  2 +-
 .../Vector/Transforms/VectorTransforms.cpp    |  9 ++++---
 mlir/lib/IR/AffineMap.cpp                     | 24 ++++++++++++-------
 mlir/lib/IR/BuiltinTypes.cpp                  |  2 +-
 mlir/unittests/IR/AffineMapTest.cpp           | 23 ++++++++++++++++++
 mlir/unittests/IR/CMakeLists.txt              |  1 +
 17 files changed, 99 insertions(+), 41 deletions(-)
 create mode 100644 mlir/unittests/IR/AffineMapTest.cpp

diff --git a/mlir/include/mlir/Dialect/Affine/IR/AffineOps.td b/mlir/include/mlir/Dialect/Affine/IR/AffineOps.td
index 225e4d3194e230..edcfcfd830c443 100644
--- a/mlir/include/mlir/Dialect/Affine/IR/AffineOps.td
+++ b/mlir/include/mlir/Dialect/Affine/IR/AffineOps.td
@@ -67,7 +67,8 @@ def AffineApplyOp : Affine_Op<"apply", [Pure]> {
     OpBuilder<(ins "ArrayRef<AffineExpr> ":$exprList,"ValueRange":$mapOperands),
     [{
       build($_builder, $_state, $_builder.getIndexType(),
-            AffineMap::inferFromExprList(exprList).front(), mapOperands);
+            AffineMap::inferFromExprList(exprList, $_builder.getContext())
+                                        .front(), mapOperands);
     }]>
   ];
 
diff --git a/mlir/include/mlir/Dialect/Utils/StructuredOpsUtils.h b/mlir/include/mlir/Dialect/Utils/StructuredOpsUtils.h
index 134c5569fbb2f3..929a2a7d396496 100644
--- a/mlir/include/mlir/Dialect/Utils/StructuredOpsUtils.h
+++ b/mlir/include/mlir/Dialect/Utils/StructuredOpsUtils.h
@@ -121,7 +121,9 @@ class StructuredGenerator {
   }
 
   bool layout(MapList l) {
-    auto infer = [](MapList m) { return AffineMap::inferFromExprList(m); };
+    auto infer = [&](MapList m) {
+      return AffineMap::inferFromExprList(m, ctx);
+    };
     return maps == infer(l);
   }
 
diff --git a/mlir/include/mlir/IR/AffineMap.h b/mlir/include/mlir/IR/AffineMap.h
index cd751af5bb2558..cce141253989e5 100644
--- a/mlir/include/mlir/IR/AffineMap.h
+++ b/mlir/include/mlir/IR/AffineMap.h
@@ -122,9 +122,11 @@ class AffineMap {
   /// `exprs.size()`, as many dims as the largest dim in `exprs` and as many
   /// symbols as the largest symbol in `exprs`.
   static SmallVector<AffineMap, 4>
-  inferFromExprList(ArrayRef<ArrayRef<AffineExpr>> exprsList);
+  inferFromExprList(ArrayRef<ArrayRef<AffineExpr>> exprsList,
+                    MLIRContext *context);
   static SmallVector<AffineMap, 4>
-  inferFromExprList(ArrayRef<SmallVector<AffineExpr, 4>> exprsList);
+  inferFromExprList(ArrayRef<SmallVector<AffineExpr, 4>> exprsList,
+                    MLIRContext *context);
 
   MLIRContext *getContext() const;
 
diff --git a/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp b/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
index 1eb5678b417552..f4f6dadfb37166 100644
--- a/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
+++ b/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
@@ -2010,7 +2010,8 @@ class ArgMaxConverter : public OpRewritePattern<tosa::ArgMaxOp> {
     }
 
     bool didEncounterError = false;
-    auto maps = AffineMap::inferFromExprList({srcExprs, dstExprs, dstExprs});
+    auto maps = AffineMap::inferFromExprList({srcExprs, dstExprs, dstExprs},
+                                             rewriter.getContext());
     auto linalgOp = rewriter.create<linalg::GenericOp>(
         loc, ArrayRef<Type>({resultTy, resultMaxTy}), input,
         ValueRange({filledTensorIdx, filledTensorMax}), maps, iteratorTypes,
@@ -2351,9 +2352,11 @@ struct RFFT2dConverter final : public OpRewritePattern<RFFT2dOp> {
         createZeroTensor(rewriter, loc, outputType, dynamicSizes)};
 
     // Indexing maps for input and output tensors
-    auto indexingMaps = AffineMap::inferFromExprList(llvm::ArrayRef{
-        affineDimsExpr(rewriter, 0, 3, 4), affineDimsExpr(rewriter, 0, 1, 2),
-        affineDimsExpr(rewriter, 0, 1, 2)});
+    auto indexingMaps = AffineMap::inferFromExprList(
+        llvm::ArrayRef{affineDimsExpr(rewriter, 0, 3, 4),
+                       affineDimsExpr(rewriter, 0, 1, 2),
+                       affineDimsExpr(rewriter, 0, 1, 2)},
+        rewriter.getContext());
 
     // Width and height dimensions of the original input.
     auto dimH = rewriter.createOrFold<tensor::DimOp>(loc, input, 1);
@@ -2463,7 +2466,8 @@ struct FFT2dConverter final : OpRewritePattern<FFT2dOp> {
         ArrayRef{RFFT2dConverter::affineDimsExpr(rewriter, 0, 3, 4),
                  RFFT2dConverter::affineDimsExpr(rewriter, 0, 3, 4),
                  RFFT2dConverter::affineDimsExpr(rewriter, 0, 1, 2),
-                 RFFT2dConverter::affineDimsExpr(rewriter, 0, 1, 2)});
+                 RFFT2dConverter::affineDimsExpr(rewriter, 0, 1, 2)},
+        rewriter.getContext());
 
     // Width and height dimensions of the original input.
     auto dimH = rewriter.createOrFold<tensor::DimOp>(loc, input_real, 1);
diff --git a/mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp b/mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp
index b63baf330c8645..85fb8a539912f7 100644
--- a/mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp
+++ b/mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp
@@ -77,7 +77,9 @@ static void getXferIndices(RewriterBase &rewriter, TransferOpType xferOp,
 static bool contractSupportsMMAMatrixType(vector::ContractionOp contract,
                                           bool useNvGpu) {
   using MapList = ArrayRef<ArrayRef<AffineExpr>>;
-  auto infer = [](MapList m) { return AffineMap::inferFromExprList(m); };
+  auto infer = [&](MapList m) {
+    return AffineMap::inferFromExprList(m, contract.getContext());
+  };
   AffineExpr m, n, k;
   bindDims(contract.getContext(), m, n, k);
   auto iteratorTypes = contract.getIteratorTypes().getValue();
@@ -394,7 +396,9 @@ struct PrepareContractToGPUMMA
 
     // Set up the parallel/reduction structure in right form.
     using MapList = ArrayRef<ArrayRef<AffineExpr>>;
-    auto infer = [](MapList m) { return AffineMap::inferFromExprList(m); };
+    auto infer = [&](MapList m) {
+      return AffineMap::inferFromExprList(m, op.getContext());
+    };
     AffineExpr m, n, k;
     bindDims(rewriter.getContext(), m, n, k);
     static constexpr std::array<int64_t, 2> perm = {1, 0};
diff --git a/mlir/lib/Dialect/Affine/IR/AffineOps.cpp b/mlir/lib/Dialect/Affine/IR/AffineOps.cpp
index adb56ab36438bf..c4b13193f4e773 100644
--- a/mlir/lib/Dialect/Affine/IR/AffineOps.cpp
+++ b/mlir/lib/Dialect/Affine/IR/AffineOps.cpp
@@ -1145,7 +1145,9 @@ AffineApplyOp
 mlir::affine::makeComposedAffineApply(OpBuilder &b, Location loc, AffineExpr e,
                                       ArrayRef<OpFoldResult> operands) {
   return makeComposedAffineApply(
-      b, loc, AffineMap::inferFromExprList(ArrayRef<AffineExpr>{e}).front(),
+      b, loc,
+      AffineMap::inferFromExprList(ArrayRef<AffineExpr>{e}, b.getContext())
+          .front(),
       operands);
 }
 
@@ -1220,7 +1222,9 @@ mlir::affine::makeComposedFoldedAffineApply(OpBuilder &b, Location loc,
                                             AffineExpr expr,
                                             ArrayRef<OpFoldResult> operands) {
   return makeComposedFoldedAffineApply(
-      b, loc, AffineMap::inferFromExprList(ArrayRef<AffineExpr>{expr}).front(),
+      b, loc,
+      AffineMap::inferFromExprList(ArrayRef<AffineExpr>{expr}, b.getContext())
+          .front(),
       operands);
 }
 
diff --git a/mlir/lib/Dialect/Linalg/Transforms/Split.cpp b/mlir/lib/Dialect/Linalg/Transforms/Split.cpp
index 0174db45a83db2..47b5fcd4014a04 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Split.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Split.cpp
@@ -83,7 +83,9 @@ linalg::splitOp(RewriterBase &rewriter, TilingInterface op, unsigned dimension,
   bindDims(rewriter.getContext(), d0, d1, d2);
   OpFoldResult minSplitPoint = affine::makeComposedFoldedAffineMin(
       rewriter, op.getLoc(),
-      AffineMap::inferFromExprList(ArrayRef<AffineExpr>{d0, d1 + d2}).front(),
+      AffineMap::inferFromExprList(ArrayRef<AffineExpr>{d0, d1 + d2},
+                                   rewriter.getContext())
+          .front(),
       {splitPoint, offsets[dimension], sizes[dimension]});
 
   // Compute the size of the second part. Return early if the second part would
diff --git a/mlir/lib/Dialect/Linalg/Utils/Utils.cpp b/mlir/lib/Dialect/Linalg/Utils/Utils.cpp
index 986b5f3e1fb604..5d220c6cdd7e58 100644
--- a/mlir/lib/Dialect/Linalg/Utils/Utils.cpp
+++ b/mlir/lib/Dialect/Linalg/Utils/Utils.cpp
@@ -670,7 +670,8 @@ computeSliceParameters(OpBuilder &builder, Location loc, Value valueToTile,
                               << ": make sure in bound with affine.min\n");
 
       AffineExpr dim0, dim1, dim2;
-      bindDims(builder.getContext(), dim0, dim1, dim2);
+      MLIRContext *context = builder.getContext();
+      bindDims(context, dim0, dim1, dim2);
 
       // Get the dimension size for this dimension. We need to first calculate
       // the max index and then plus one. This is important because for
@@ -678,12 +679,12 @@ computeSliceParameters(OpBuilder &builder, Location loc, Value valueToTile,
       // form `(d0 * s0 + d1)`, where `d0`/`d1 is an output/filter window
       // dimension and `s0` is stride. Directly use the dimension size of
       // output/filer window dimensions will cause incorrect calculation.
-      AffineMap minusOneMap =
-          AffineMap::inferFromExprList({ArrayRef<AffineExpr>{dim0 - 1}})
-              .front();
-      AffineMap plusOneMap =
-          AffineMap::inferFromExprList({ArrayRef<AffineExpr>{dim0 + 1}})
-              .front();
+      AffineMap minusOneMap = AffineMap::inferFromExprList(
+                                  {ArrayRef<AffineExpr>{dim0 - 1}}, context)
+                                  .front();
+      AffineMap plusOneMap = AffineMap::inferFromExprList(
+                                 {ArrayRef<AffineExpr>{dim0 + 1}}, context)
+                                 .front();
       SmallVector<OpFoldResult> maxIndices =
           llvm::to_vector(llvm::map_range(ubs, [&](OpFoldResult ub) {
             return makeComposedFoldedAffineApply(rewriter, loc, minusOneMap,
@@ -696,7 +697,7 @@ computeSliceParameters(OpBuilder &builder, Location loc, Value valueToTile,
 
       // Compute min(dim - offset, size) to avoid out-of-bounds accesses.
       AffineMap minMap = AffineMap::inferFromExprList(
-                             {ArrayRef<AffineExpr>{dim1 - dim2, dim0}})
+                             {ArrayRef<AffineExpr>{dim1 - dim2, dim0}}, context)
                              .front();
       size =
           makeComposedFoldedAffineMin(rewriter, loc, minMap, {size, d, offset});
diff --git a/mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp b/mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp
index 87a37a7926e9e5..dd3af9d8354123 100644
--- a/mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp
+++ b/mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp
@@ -1263,7 +1263,9 @@ struct LinalgOpRewriter : public OpRewritePattern<linalg::GenericOp> {
     SmallVector<AffineMap, 4> maps = op.getIndexingMapsArray();
 
     using MapList = ArrayRef<ArrayRef<AffineExpr>>;
-    auto infer = [](MapList m) { return AffineMap::inferFromExprList(m); };
+    auto infer = [&](MapList m) {
+      return AffineMap::inferFromExprList(m, op.getContext());
+    };
     AffineExpr i, j, k;
     bindDims(getContext(), i, j, k);
 
diff --git a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
index 452354413e8833..5be6a628904cdf 100644
--- a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
+++ b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
@@ -675,9 +675,10 @@ void vector::ContractionOp::build(OpBuilder &builder, OperationState &result,
                                   ArrayRef<IteratorType> iteratorTypes) {
   result.addOperands({lhs, rhs, acc});
   result.addTypes(acc.getType());
-  result.addAttribute(getIndexingMapsAttrName(result.name),
-                      builder.getAffineMapArrayAttr(
-                          AffineMap::inferFromExprList(indexingExprs)));
+  result.addAttribute(
+      getIndexingMapsAttrName(result.name),
+      builder.getAffineMapArrayAttr(
+          AffineMap::inferFromExprList(indexingExprs, builder.getContext())));
   result.addAttribute(
       getIteratorTypesAttrName(result.name),
       builder.getArrayAttr(llvm::to_vector(llvm::map_range(
diff --git a/mlir/lib/Dialect/Vector/Transforms/LowerVectorContract.cpp b/mlir/lib/Dialect/Vector/Transforms/LowerVectorContract.cpp
index 446eb853d2e92d..0eaf9f71a37d21 100644
--- a/mlir/lib/Dialect/Vector/Transforms/LowerVectorContract.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/LowerVectorContract.cpp
@@ -695,7 +695,9 @@ ContractionOpToDotLowering::matchAndRewrite(vector::ContractionOp op,
   Value lhs = op.getLhs(), rhs = op.getRhs();
 
   using MapList = ArrayRef<ArrayRef<AffineExpr>>;
-  auto infer = [](MapList m) { return AffineMap::inferFromExprList(m); };
+  auto infer = [&](MapList m) {
+    return AffineMap::inferFromExprList(m, op.getContext());
+  };
   AffineExpr m, n, k;
   bindDims(rewriter.getContext(), m, n, k);
   SmallVector<AffineMap> maps = op.getIndexingMapsArray();
diff --git a/mlir/lib/Dialect/Vector/Transforms/VectorTransferSplitRewritePatterns.cpp b/mlir/lib/Dialect/Vector/Transforms/VectorTransferSplitRewritePatterns.cpp
index f1a27168bd4e54..b844c2bfa837ce 100644
--- a/mlir/lib/Dialect/Vector/Transforms/VectorTransferSplitRewritePatterns.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/VectorTransferSplitRewritePatterns.cpp
@@ -209,7 +209,7 @@ createSubViewIntersection(RewriterBase &b, VectorTransferOpInterface xferOp,
     AffineExpr i, j, k;
     bindDims(xferOp.getContext(), i, j, k);
     SmallVector<AffineMap, 4> maps =
-        AffineMap::inferFromExprList(MapList{{i - j, k}});
+        AffineMap::inferFromExprList(MapList{{i - j, k}}, b.getContext());
     // affine_min(%dimMemRef - %index, %dimAlloc)
     Value affineMin = b.create<affine::AffineMinOp>(
         loc, index.getType(), maps[0], ValueRange{dimMemRef, index, dimAlloc});
diff --git a/mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp b/mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
index 4034dc40685a4b..53ae138d1e43a0 100644
--- a/mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
@@ -160,8 +160,9 @@ struct MultiReduceToContract
         iteratorTypes.push_back(vector::IteratorType::reduction);
       }
     }
-    auto dstMap = AffineMap::get(/*dimCount=*/reductionMask.size(),
-                                 /*symCount=*/0, exprs, reduceOp.getContext());
+    auto dstMap =
+        AffineMap::get(/*dimCount=*/reductionMask.size(),
+                       /*symbolCount=*/0, exprs, reduceOp.getContext());
     rewriter.replaceOpWithNewOp<mlir::vector::ContractionOp>(
         reduceOp, mulOp->getOperand(0), mulOp->getOperand(1), reduceOp.getAcc(),
         rewriter.getAffineMapArrayAttr({srcMap, srcMap, dstMap}),
@@ -1399,7 +1400,9 @@ struct CanonicalizeContractMatmulToMMT final
 
     // Set up the parallel/reduction structure in right form.
     using MapList = ArrayRef<ArrayRef<AffineExpr>>;
-    auto infer = [](MapList m) { return AffineMap::inferFromExprList(m); };
+    auto infer = [&](MapList m) {
+      return AffineMap::inferFromExprList(m, op.getContext());
+    };
     AffineExpr m;
     AffineExpr n;
     AffineExpr k;
diff --git a/mlir/lib/IR/AffineMap.cpp b/mlir/lib/IR/AffineMap.cpp
index c2804626635947..4aa0d4f34a09fa 100644
--- a/mlir/lib/IR/AffineMap.cpp
+++ b/mlir/lib/IR/AffineMap.cpp
@@ -272,12 +272,16 @@ AffineMap AffineMap::getMultiDimMapWithTargets(unsigned numDims,
   return result;
 }
 
+/// Creates one affine map for each list of AffineExprs in `exprsList`
+/// while inferring the right number of dimensional and symbolic inputs needed
+/// based on the maximum dimensional and symbolic identifier appearing in the
+/// expressions.
 template <typename AffineExprContainer>
 static SmallVector<AffineMap, 4>
-inferFromExprList(ArrayRef<AffineExprContainer> exprsList) {
-  assert(!exprsList.empty());
-  assert(!exprsList[0].empty());
-  auto context = exprsList[0][0].getContext();
+inferFromExprList(ArrayRef<AffineExprContainer> exprsList,
+                  MLIRContext *context) {
+  if (exprsList.empty())
+    return {};
   int64_t maxDim = -1, maxSym = -1;
   getMaxDimAndSymbol(exprsList, maxDim, maxSym);
   SmallVector<AffineMap, 4> maps;
@@ -289,13 +293,15 @@ inferFromExprList(ArrayRef<AffineExprContainer> exprsList) {
 }
 
 SmallVector<AffineMap, 4>
-AffineMap::inferFromExprList(ArrayRef<ArrayRef<AffineExpr>> exprsList) {
-  return ::inferFromExprList(exprsList);
+AffineMap::inferFromExprList(ArrayRef<ArrayRef<AffineExpr>> exprsList,
+                             MLIRContext *context) {
+  return ::inferFromExprList(exprsList, context);
 }
 
 SmallVector<AffineMap, 4>
-AffineMap::inferFromExprList(ArrayRef<SmallVector<AffineExpr, 4>> exprsList) {
-  return ::inferFromExprList(exprsList);
+AffineMap::inferFromExprList(ArrayRef<SmallVector<AffineExpr, 4>> exprsList,
+                             MLIRContext *context) {
+  return ::inferFromExprList(exprsList, context);
 }
 
 uint64_t AffineMap::getLargestKnownDivisorOfMapExprs() {
@@ -521,7 +527,7 @@ AffineMap::replace(const DenseMap<AffineExpr, AffineExpr> &map) const {
   newResults.reserve(getNumResults());
   for (AffineExpr e : getResults())
     newResults.push_back(e.replace(map));
-  return AffineMap::inferFromExprList(newResults).front();
+  return AffineMap::inferFromExprList(newResults, getContext()).front();
 }
 
 AffineMap AffineMap::dropResults(const llvm::SmallBitVector &positions) const {
diff --git a/mlir/lib/IR/BuiltinTypes.cpp b/mlir/lib/IR/BuiltinTypes.cpp
index 9b8ee3d4528035..1794b38478a72d 100644
--- a/mlir/lib/IR/BuiltinTypes.cpp
+++ b/mlir/lib/IR/BuiltinTypes.cpp
@@ -921,7 +921,7 @@ AffineExpr mlir::makeCanonicalStridedLayoutExpr(ArrayRef<int64_t> sizes,
     return getAffineConstantExpr(0, context);
 
   assert(!exprs.empty() && "expected exprs");
-  auto maps = AffineMap::inferFromExprList(exprs);
+  auto maps = AffineMap::inferFromExprList(exprs, context);
   assert(!maps.empty() && "Expected one non-empty map");
   unsigned numDims = maps[0].getNumDims(), nSymbols = maps[0].getNumSymbols();
 
diff --git a/mlir/unittests/IR/AffineMapTest.cpp b/mlir/unittests/IR/AffineMapTest.cpp
new file mode 100644
index 00000000000000..081afadd632f1b
--- /dev/null
+++ b/mlir/unittests/IR/AffineMapTest.cpp
@@ -0,0 +1,23 @@
+//===- AffineMapTest.cpp - unit tests for affine map API ------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "mlir/IR/AffineMap.h"
+#include "mlir/IR/Builders.h"
+#include "gtest/gtest.h"
+
+using namespace mlir;
+
+// Test AffineMap replace API for the zero result case.
+TEST(AffineMapTest, inferMapFromAffineExprs) {
+  MLIRContext ctx;
+  OpBuilder b(&ctx);
+  AffineMap map = b.getEmptyAffineMap();
+  DenseMap<AffineExpr, AffineExpr> replacements;
+  map.replace(replacements);
+  EXPECT_EQ(map, map);
+}
diff --git a/mlir/unittests/IR/CMakeLists.txt b/mlir/unittests/IR/CMakeLists.txt
index 1ed46869c2c8a9..e7e9c3b5651693 100644
--- a/mlir/unittests/IR/CMakeLists.txt
+++ b/mlir/unittests/IR/CMakeLists.txt
@@ -1,5 +1,6 @@
 add_mlir_unittest(MLIRIRTests
   AdaptorTest.cpp
+  AffineMapTest.cpp
   AttributeTest.cpp
   DialectTest.cpp
   InterfaceTest.cpp

>From 17be95d65d5e70055d5a1e8e1e66febea7f61e90 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timm=20B=C3=A4der?= <tbaeder at redhat.com>
Date: Thu, 8 Feb 2024 15:29:44 +0100
Subject: [PATCH 31/72] [clang][ExprConst] Remove unnecessary cast

`FD` is a `FunctionDecl`, so there is no need to cast a `FunctionDecl` to a
`CXXMethodDecl` just to assign it back to a `FunctionDecl`. The type check is
kept as an assertion.
---
 clang/lib/AST/ExprConstant.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 089bc2094567f7..02e153ff10737c 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -8006,7 +8006,8 @@ class ExprEvaluatorBase
           assert(CorrespondingCallOpSpecialization &&
                  "We must always have a function call operator specialization "
                  "that corresponds to our static invoker specialization");
-          FD = cast<CXXMethodDecl>(CorrespondingCallOpSpecialization);
+          assert(isa<CXXMethodDecl>(CorrespondingCallOpSpecialization));
+          FD = CorrespondingCallOpSpecialization;
         } else
           FD = LambdaCallOp;
       } else if (FD->isReplaceableGlobalAllocationFunction()) {

>From d39b4796e493bc8c5d8277539d4e8ccbd2f0f332 Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Thu, 8 Feb 2024 15:29:32 +0100
Subject: [PATCH 32/72] [PatternMatch] Add m_PtrAdd() matcher (NFC)

This matches a getelementptr i8 instruction or constant expression,
with a given pointer operand and index.
---
 llvm/include/llvm/IR/PatternMatch.h | 22 ++++++++++++++++++++++
 llvm/unittests/IR/PatternMatch.cpp  | 22 ++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/llvm/include/llvm/IR/PatternMatch.h b/llvm/include/llvm/IR/PatternMatch.h
index 3155e7dc38b64a..fed552414298ad 100644
--- a/llvm/include/llvm/IR/PatternMatch.h
+++ b/llvm/include/llvm/IR/PatternMatch.h
@@ -1614,6 +1614,21 @@ struct m_SplatOrUndefMask {
   }
 };
 
+template <typename PointerOpTy, typename OffsetOpTy> struct PtrAdd_match {
+  PointerOpTy PointerOp;
+  OffsetOpTy OffsetOp;
+
+  PtrAdd_match(const PointerOpTy &PointerOp, const OffsetOpTy &OffsetOp)
+      : PointerOp(PointerOp), OffsetOp(OffsetOp) {}
+
+  template <typename OpTy> bool match(OpTy *V) {
+    auto *GEP = dyn_cast<GEPOperator>(V);
+    return GEP && GEP->getSourceElementType()->isIntegerTy(8) &&
+           PointerOp.match(GEP->getPointerOperand()) &&
+           OffsetOp.match(GEP->idx_begin()->get());
+  }
+};
+
 /// Matches ShuffleVectorInst independently of mask value.
 template <typename V1_t, typename V2_t>
 inline TwoOps_match<V1_t, V2_t, Instruction::ShuffleVector>
@@ -1647,6 +1662,13 @@ inline auto m_GEP(const OperandTypes &...Ops) {
   return AnyOps_match<Instruction::GetElementPtr, OperandTypes...>(Ops...);
 }
 
+/// Matches GEP with i8 source element type
+template <typename PointerOpTy, typename OffsetOpTy>
+inline PtrAdd_match<PointerOpTy, OffsetOpTy>
+m_PtrAdd(const PointerOpTy &PointerOp, const OffsetOpTy &OffsetOp) {
+  return PtrAdd_match<PointerOpTy, OffsetOpTy>(PointerOp, OffsetOp);
+}
+
 //===----------------------------------------------------------------------===//
 // Matchers for CastInst classes
 //
diff --git a/llvm/unittests/IR/PatternMatch.cpp b/llvm/unittests/IR/PatternMatch.cpp
index 885b1346cde1eb..883149c686b42a 100644
--- a/llvm/unittests/IR/PatternMatch.cpp
+++ b/llvm/unittests/IR/PatternMatch.cpp
@@ -1889,4 +1889,26 @@ TEST_F(PatternMatchTest, ConstExpr) {
   EXPECT_TRUE(match(V, m_ConstantExpr()));
 }
 
+TEST_F(PatternMatchTest, PtrAdd) {
+  Type *PtrTy = PointerType::getUnqual(Ctx);
+  Type *IdxTy = Type::getInt64Ty(Ctx);
+  Constant *Null = Constant::getNullValue(PtrTy);
+  Constant *Offset = ConstantInt::get(IdxTy, 42);
+  Value *PtrAdd = IRB.CreatePtrAdd(Null, Offset);
+  Value *OtherGEP = IRB.CreateGEP(IdxTy, Null, Offset);
+  Value *PtrAddConst =
+      ConstantExpr::getGetElementPtr(Type::getInt8Ty(Ctx), Null, Offset);
+
+  Value *A, *B;
+  EXPECT_TRUE(match(PtrAdd, m_PtrAdd(m_Value(A), m_Value(B))));
+  EXPECT_EQ(A, Null);
+  EXPECT_EQ(B, Offset);
+
+  EXPECT_TRUE(match(PtrAddConst, m_PtrAdd(m_Value(A), m_Value(B))));
+  EXPECT_EQ(A, Null);
+  EXPECT_EQ(B, Offset);
+
+  EXPECT_FALSE(match(OtherGEP, m_PtrAdd(m_Value(A), m_Value(B))));
+}
+
 } // anonymous namespace.

>From e95d319698c9170eceff74ce303f05cac83368c4 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng <dtcxzyw2333 at gmail.com>
Date: Thu, 8 Feb 2024 22:34:52 +0800
Subject: [PATCH 33/72] [ConstantRange] Improve ConstantRange::binaryXor
 (#80146)

`ConstantRange::binaryXor` gives imprecise results because it currently
depends only on `KnownBits::operator^`.
Since `sub A, B` is canonicalized into `xor A, B` when the set bits of `B` are
a subset of those of `A`, this patch undoes that transform inside
`ConstantRange::binaryXor` and intersects the result with the range of the
subtraction, which gives a tighter range.
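
The identity behind this change can be checked by brute force. The sketch
below is only an illustration, not part of the patch: `isBitSubset`,
`xorEqualsSubWhenSubset`, `Bounds`, and `xorBounds` are hypothetical helper
names, and plain `unsigned` values stand in for `APInt`. It verifies, for all
8-bit pairs, that `a ^ b == a - b` whenever the set bits of `b` are a subset
of those of `a`, and it computes the bounds of `x ^ mask` over a closed
interval.

```cpp
#include <cassert>

// Returns true if every bit set in `b` is also set in `a`.
constexpr bool isBitSubset(unsigned a, unsigned b) { return (a & b) == b; }

// Exhaustively checks all 8-bit pairs: when the bits of `b` are a subset of
// the bits of `a`, xor and subtraction must agree (and `a - b` cannot wrap,
// because a subset of bits implies b <= a).
bool xorEqualsSubWhenSubset() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      if (isBitSubset(a, b) && (a ^ b) != (a - b))
        return false;
  return true;
}

// Closed-interval bounds of `x ^ mask` for x in [lo, hi], found by brute force.
struct Bounds {
  unsigned min;
  unsigned max;
};

Bounds xorBounds(unsigned lo, unsigned hi, unsigned mask) {
  assert(lo <= hi && "expected a non-empty interval");
  Bounds r{~0u, 0u};
  for (unsigned x = lo; x <= hi; ++x) {
    unsigned v = x ^ mask;
    if (v < r.min)
      r.min = v;
    if (v > r.max)
      r.max = v;
  }
  return r;
}
```

For the `ctlz` case in the regression test, `xorBounds(0, 50, 63)` yields the
closed interval [13, 63], which corresponds to the half-open
`ConstantRange(13, 64)` expected by the unit test.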

Alive2: https://alive2.llvm.org/ce/z/bmTMV9
Fixes #79696.
---
 llvm/lib/IR/ConstantRange.cpp           | 17 +++++++-
 llvm/test/Transforms/SCCP/pr79696.ll    | 55 +++++++++++++++++++++++++
 llvm/unittests/IR/ConstantRangeTest.cpp |  6 +++
 3 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Transforms/SCCP/pr79696.ll

diff --git a/llvm/lib/IR/ConstantRange.cpp b/llvm/lib/IR/ConstantRange.cpp
index cbb64b299e648e..3394a1ec8dc476 100644
--- a/llvm/lib/IR/ConstantRange.cpp
+++ b/llvm/lib/IR/ConstantRange.cpp
@@ -1467,7 +1467,22 @@ ConstantRange ConstantRange::binaryXor(const ConstantRange &Other) const {
   if (isSingleElement() && getSingleElement()->isAllOnes())
     return Other.binaryNot();
 
-  return fromKnownBits(toKnownBits() ^ Other.toKnownBits(), /*IsSigned*/false);
+  KnownBits LHSKnown = toKnownBits();
+  KnownBits RHSKnown = Other.toKnownBits();
+  KnownBits Known = LHSKnown ^ RHSKnown;
+  ConstantRange CR = fromKnownBits(Known, /*IsSigned*/ false);
+  // Typically the following code doesn't improve the result if BW = 1.
+  if (getBitWidth() == 1)
+    return CR;
+
+  // If LHS is known to be the subset of RHS, treat LHS ^ RHS as RHS -nuw/nsw
+  // LHS. If RHS is known to be the subset of LHS, treat LHS ^ RHS as LHS
+  // -nuw/nsw RHS.
+  if ((~LHSKnown.Zero).isSubsetOf(RHSKnown.One))
+    CR = CR.intersectWith(Other.sub(*this), PreferredRangeType::Unsigned);
+  else if ((~RHSKnown.Zero).isSubsetOf(LHSKnown.One))
+    CR = CR.intersectWith(this->sub(Other), PreferredRangeType::Unsigned);
+  return CR;
 }
 
 ConstantRange
diff --git a/llvm/test/Transforms/SCCP/pr79696.ll b/llvm/test/Transforms/SCCP/pr79696.ll
new file mode 100644
index 00000000000000..a860112d5ef36f
--- /dev/null
+++ b/llvm/test/Transforms/SCCP/pr79696.ll
@@ -0,0 +1,55 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -passes=ipsccp -S | FileCheck %s
+
+; Tests from PR79696
+
+define i1 @constant_range_xor(i64 %a) {
+; CHECK-LABEL: define i1 @constant_range_xor(
+; CHECK-SAME: i64 [[A:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ugt i64 [[A]], 8192
+; CHECK-NEXT:    br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:       then:
+; CHECK-NEXT:    [[CTLZ:%.*]] = call i64 @llvm.ctlz.i64(i64 [[A]], i1 true)
+; CHECK-NEXT:    [[CONV:%.*]] = xor i64 [[CTLZ]], 63
+; CHECK-NEXT:    ret i1 false
+; CHECK:       else:
+; CHECK-NEXT:    ret i1 false
+;
+entry:
+  %cmp = icmp ugt i64 %a, 8192
+  br i1 %cmp, label %then, label %else
+then:
+  %ctlz = call i64 @llvm.ctlz.i64(i64 %a, i1 true) ;[0, 50]
+  %conv = xor i64 %ctlz, 63                        ;[13, 63]
+  %cmp1 = icmp ult i64 %conv, 13
+  ret i1 %cmp1
+else:
+  ret i1 false
+}
+
+define i1 @constant_range_xor_negative(i64 %a) {
+; CHECK-LABEL: define i1 @constant_range_xor_negative(
+; CHECK-SAME: i64 [[A:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ugt i64 [[A]], 8192
+; CHECK-NEXT:    br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:       then:
+; CHECK-NEXT:    [[CTLZ:%.*]] = call i64 @llvm.ctlz.i64(i64 [[A]], i1 true)
+; CHECK-NEXT:    [[CONV:%.*]] = xor i64 [[CTLZ]], 62
+; CHECK-NEXT:    [[CMP1:%.*]] = icmp ult i64 [[CONV]], 13
+; CHECK-NEXT:    ret i1 [[CMP1]]
+; CHECK:       else:
+; CHECK-NEXT:    ret i1 false
+;
+entry:
+  %cmp = icmp ugt i64 %a, 8192
+  br i1 %cmp, label %then, label %else
+then:
+  %ctlz = call i64 @llvm.ctlz.i64(i64 %a, i1 true) ;[0, 50]
+  %conv = xor i64 %ctlz, 62                        ;[12, 63]
+  %cmp1 = icmp ult i64 %conv, 13
+  ret i1 %cmp1
+else:
+  ret i1 false
+}
diff --git a/llvm/unittests/IR/ConstantRangeTest.cpp b/llvm/unittests/IR/ConstantRangeTest.cpp
index e505af5d3275ef..34a162a5514e95 100644
--- a/llvm/unittests/IR/ConstantRangeTest.cpp
+++ b/llvm/unittests/IR/ConstantRangeTest.cpp
@@ -2565,6 +2565,12 @@ TEST_F(ConstantRangeTest, binaryXor) {
   EXPECT_EQ(R16_35.binaryXor(R0_99), ConstantRange(APInt(8, 0), APInt(8, 128)));
   EXPECT_EQ(R0_99.binaryXor(R16_35), ConstantRange(APInt(8, 0), APInt(8, 128)));
 
+  // Treat xor A, B as sub nsw nuw A, B
+  ConstantRange R0_51(APInt(8, 0), APInt(8, 51));
+  ConstantRange R63(APInt(8, 63));
+  EXPECT_EQ(R0_51.binaryXor(R63), ConstantRange(APInt(8, 13), APInt(8, 64)));
+  EXPECT_EQ(R63.binaryXor(R0_51), ConstantRange(APInt(8, 13), APInt(8, 64)));
+
   TestBinaryOpExhaustive(
       [](const ConstantRange &CR1, const ConstantRange &CR2) {
         return CR1.binaryXor(CR2);

>From 017d971966307eec9f1706df3fae6acdbe063a20 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timm=20B=C3=A4der?= <tbaeder at redhat.com>
Date: Thu, 8 Feb 2024 09:55:07 +0100
Subject: [PATCH 34/72] [clang][Interp] Handle CXXInheritedCtorInitExprs

We need to forward all arguments of the current function and
call the ctor function.
---
 clang/lib/AST/Interp/ByteCodeExprGen.cpp | 31 ++++++++++++
 clang/lib/AST/Interp/ByteCodeExprGen.h   |  1 +
 clang/test/AST/Interp/records.cpp        | 60 ++++++++++++++++++++++++
 3 files changed, 92 insertions(+)

diff --git a/clang/lib/AST/Interp/ByteCodeExprGen.cpp b/clang/lib/AST/Interp/ByteCodeExprGen.cpp
index 59fddfc2da1957..21bc29ff8ee2e5 100644
--- a/clang/lib/AST/Interp/ByteCodeExprGen.cpp
+++ b/clang/lib/AST/Interp/ByteCodeExprGen.cpp
@@ -2020,6 +2020,37 @@ bool ByteCodeExprGen<Emitter>::VisitObjCBoolLiteralExpr(
   return this->emitConst(E->getValue(), E);
 }
 
+template <class Emitter>
+bool ByteCodeExprGen<Emitter>::VisitCXXInheritedCtorInitExpr(
+    const CXXInheritedCtorInitExpr *E) {
+  const CXXConstructorDecl *Ctor = E->getConstructor();
+  assert(!Ctor->isTrivial() &&
+         "Trivial CXXInheritedCtorInitExpr, implement. (possible?)");
+  const Function *F = this->getFunction(Ctor);
+  assert(F);
+  assert(!F->hasRVO());
+  assert(F->hasThisPointer());
+
+  if (!this->emitDupPtr(SourceInfo{}))
+    return false;
+
+  // Forward all arguments of the current function (which should be a
+  // constructor itself) to the inherited ctor.
+  // This is necessary because the calling code has pushed the pointer
+  // of the correct base for us already, but the arguments need
+  // to come after.
+  unsigned Offset = align(primSize(PT_Ptr)); // instance pointer.
+  for (const ParmVarDecl *PD : Ctor->parameters()) {
+    PrimType PT = this->classify(PD->getType()).value_or(PT_Ptr);
+
+    if (!this->emitGetParam(PT, Offset, E))
+      return false;
+    Offset += align(primSize(PT));
+  }
+
+  return this->emitCall(F, E);
+}
+
 template <class Emitter> bool ByteCodeExprGen<Emitter>::discard(const Expr *E) {
   if (E->containsErrors())
     return false;
diff --git a/clang/lib/AST/Interp/ByteCodeExprGen.h b/clang/lib/AST/Interp/ByteCodeExprGen.h
index 2c9cca5082b121..c908a9bf1ef834 100644
--- a/clang/lib/AST/Interp/ByteCodeExprGen.h
+++ b/clang/lib/AST/Interp/ByteCodeExprGen.h
@@ -111,6 +111,7 @@ class ByteCodeExprGen : public ConstStmtVisitor<ByteCodeExprGen<Emitter>, bool>,
   bool VisitGenericSelectionExpr(const GenericSelectionExpr *E);
   bool VisitChooseExpr(const ChooseExpr *E);
   bool VisitObjCBoolLiteralExpr(const ObjCBoolLiteralExpr *E);
+  bool VisitCXXInheritedCtorInitExpr(const CXXInheritedCtorInitExpr *E);
 
 protected:
   bool visitExpr(const Expr *E) override;
diff --git a/clang/test/AST/Interp/records.cpp b/clang/test/AST/Interp/records.cpp
index 5ce1e6e09a0b74..1ef13f558b753c 100644
--- a/clang/test/AST/Interp/records.cpp
+++ b/clang/test/AST/Interp/records.cpp
@@ -1223,3 +1223,63 @@ namespace IndirectFieldInit {
 
 #endif
 }
+
+namespace InheritedConstructor {
+  namespace PR47555 {
+    struct A {
+      int c;
+      int d;
+      constexpr A(int c, int d) : c(c), d(d){}
+    };
+    struct B : A { using A::A; };
+
+    constexpr B b = {13, 1};
+    static_assert(b.c == 13, "");
+    static_assert(b.d == 1, "");
+  }
+
+  namespace PR47555_2 {
+    struct A {
+      int c;
+      int d;
+      double e;
+      constexpr A(int c, int &d, double e) : c(c), d(++d), e(e){}
+    };
+    struct B : A { using A::A; };
+
+    constexpr int f() {
+      int a = 10;
+      B b = {10, a, 40.0};
+      return a;
+    }
+    static_assert(f() == 11, "");
+  }
+
+  namespace AaronsTest {
+    struct T {
+      constexpr T(float) {}
+    };
+
+    struct Base {
+      constexpr Base(T t = 1.0f) {}
+      constexpr Base(float) {}
+    };
+
+    struct FirstMiddle : Base {
+      using Base::Base;
+      constexpr FirstMiddle() : Base(2.0f) {}
+    };
+
+    struct SecondMiddle : Base {
+      constexpr SecondMiddle() : Base(3.0f) {}
+      constexpr SecondMiddle(T t) : Base(t) {}
+    };
+
+    struct S : FirstMiddle, SecondMiddle {
+      using FirstMiddle::FirstMiddle;
+      constexpr S(int i) : S(4.0f) {}
+    };
+
+    constexpr S s(1);
+  }
+}

>From be9eebdce4b378a8eb9b6cb4b29ac342c81b1719 Mon Sep 17 00:00:00 2001
From: Shilei Tian <i at tianshilei.me>
Date: Thu, 8 Feb 2024 09:44:42 -0500
Subject: [PATCH 35/72] [Clang] Fix a non-effective assertion (#81083)

`PTy` here is literally `FTy->getParamType(i)`, so the assertion compares the
parameter type with itself and can never fire. Check the argument's type
against `PTy` instead.
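
A minimal sketch of why the old check was ineffective, assuming a simplified
model: `Ty`, `canLosslesslyBitCast`, `oldCheck`, and `newCheck` are
hypothetical stand-ins and not LLVM APIs, and "lossless" is reduced to "same
bit width" purely for illustration.

```cpp
// Hypothetical stand-in for a type; only the bit width matters here.
struct Ty {
  unsigned bits;
};

// Hypothetical stand-in for a lossless-bitcast query: equal widths only.
constexpr bool canLosslesslyBitCast(Ty from, Ty to) {
  return from.bits == to.bits;
}

// Shape of the old assertion: the parameter type is compared with itself,
// so the result is true no matter what the argument looks like.
constexpr bool oldCheck(Ty /*arg*/, Ty param) {
  return canLosslesslyBitCast(param, param);
}

// Shape of the new assertion: the argument type is compared with the
// parameter type, so a mismatch is actually detected.
constexpr bool newCheck(Ty arg, Ty param) {
  return canLosslesslyBitCast(arg, param);
}
```

With a 32-bit argument and a 64-bit parameter, `oldCheck` still passes while
`newCheck` reports the mismatch.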
---
 clang/lib/CodeGen/CGBuiltin.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index e051cbc6486353..a7a410dab1a018 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -5908,7 +5908,7 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
           }
         }
 
-        assert(PTy->canLosslesslyBitCastTo(FTy->getParamType(i)) &&
+        assert(ArgValue->getType()->canLosslesslyBitCastTo(PTy) &&
                "Must be able to losslessly bit cast to param");
         // Cast vector type (e.g., v256i32) to x86_amx, this only happen
         // in amx intrinsics.

>From 3ebe9a32050c83460bb1741140e86ec09f2840af Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timm=20B=C3=A4der?= <tbaeder at redhat.com>
Date: Thu, 8 Feb 2024 10:49:14 +0100
Subject: [PATCH 36/72] [clang][Interp][NFC] Convert records test to
 verify=expected,both style

---
 clang/test/AST/Interp/records.cpp | 187 +++++++++++-------------------
 1 file changed, 66 insertions(+), 121 deletions(-)

diff --git a/clang/test/AST/Interp/records.cpp b/clang/test/AST/Interp/records.cpp
index 1ef13f558b753c..fb50d1c6c5833a 100644
--- a/clang/test/AST/Interp/records.cpp
+++ b/clang/test/AST/Interp/records.cpp
@@ -1,11 +1,11 @@
-// RUN: %clang_cc1 -fexperimental-new-constant-interpreter -verify %s
-// RUN: %clang_cc1 -fexperimental-new-constant-interpreter -std=c++14 -verify %s
-// RUN: %clang_cc1 -fexperimental-new-constant-interpreter -std=c++20 -verify %s
-// RUN: %clang_cc1 -fexperimental-new-constant-interpreter -triple i686 -verify %s
-// RUN: %clang_cc1 -verify=ref %s
-// RUN: %clang_cc1 -verify=ref -std=c++14 %s
-// RUN: %clang_cc1 -verify=ref -std=c++20 %s
-// RUN: %clang_cc1 -verify=ref -triple i686 %s
+// RUN: %clang_cc1 -fexperimental-new-constant-interpreter -verify=expected,both %s
+// RUN: %clang_cc1 -fexperimental-new-constant-interpreter -std=c++14 -verify=expected,both %s
+// RUN: %clang_cc1 -fexperimental-new-constant-interpreter -std=c++20 -verify=expected,both %s
+// RUN: %clang_cc1 -fexperimental-new-constant-interpreter -triple i686 -verify=expected,both %s
+// RUN: %clang_cc1 -verify=ref,both %s
+// RUN: %clang_cc1 -verify=ref,both -std=c++14 %s
+// RUN: %clang_cc1 -verify=ref,both -std=c++20 %s
+// RUN: %clang_cc1 -verify=ref,both -triple i686 %s
 
 /// Used to crash.
 struct Empty {};
@@ -90,9 +90,8 @@ struct Ints2 {
   int a = 10;
   int b;
 };
-constexpr Ints2 ints22; // expected-error {{without a user-provided default constructor}} \
-                        // expected-error {{must be initialized by a constant expression}} \
-                        // ref-error {{without a user-provided default constructor}}
+constexpr Ints2 ints22; // both-error {{without a user-provided default constructor}} \
+                        // expected-error {{must be initialized by a constant expression}}
 
 constexpr Ints2 I2 = Ints2{12, 25};
 static_assert(I2.a == 12, "");
@@ -164,17 +163,13 @@ constexpr C RVOAndParams(int a) {
 }
 constexpr C RVOAndParamsResult2 = RVOAndParams(12);
 
-class Bar { // expected-note {{definition of 'Bar' is not complete}} \
-            // ref-note {{definition of 'Bar' is not complete}}
+class Bar { // both-note {{definition of 'Bar' is not complete}}
 public:
   constexpr Bar(){}
-  constexpr Bar b; // expected-error {{cannot be constexpr}} \
-                   // expected-error {{has incomplete type 'const Bar'}} \
-                   // ref-error {{cannot be constexpr}} \
-                   // ref-error {{has incomplete type 'const Bar'}}
+  constexpr Bar b; // both-error {{cannot be constexpr}} \
+                   // both-error {{has incomplete type 'const Bar'}}
 };
-constexpr Bar B; // expected-error {{must be initialized by a constant expression}} \
-                 // ref-error {{must be initialized by a constant expression}}
+constexpr Bar B; // both-error {{must be initialized by a constant expression}}
 constexpr Bar *pb = nullptr;
 
 constexpr int locals() {
@@ -198,17 +193,13 @@ namespace thisPointer {
     constexpr int get12() { return 12; }
   };
 
-  constexpr int foo() { // ref-error {{never produces a constant expression}} \
-                        // expected-error {{never produces a constant expression}}
+  constexpr int foo() { // both-error {{never produces a constant expression}}
     S *s = nullptr;
-    return s->get12(); // ref-note 2{{member call on dereferenced null pointer}} \
-                       // expected-note 2{{member call on dereferenced null pointer}}
+    return s->get12(); // both-note 2{{member call on dereferenced null pointer}}
 
   }
-  static_assert(foo() == 12, ""); // ref-error {{not an integral constant expression}} \
-                                  // ref-note {{in call to 'foo()'}} \
-                                  // expected-error {{not an integral constant expression}} \
-                                  // expected-note {{in call to 'foo()'}}
+  static_assert(foo() == 12, ""); // both-error {{not an integral constant expression}} \
+                                  // both-note {{in call to 'foo()'}}
 };
 
 struct FourBoolPairs {
@@ -244,20 +235,16 @@ constexpr A a{};
 static_assert(a.i == 100, "");
 constexpr A a2{12};
 static_assert(a2.i == 12, "");
-static_assert(a2.i == 200, ""); // ref-error {{static assertion failed}} \
-                                // ref-note {{evaluates to '12 == 200'}} \
-                                // expected-error {{static assertion failed}} \
-                                // expected-note {{evaluates to '12 == 200'}}
+static_assert(a2.i == 200, ""); // both-error {{static assertion failed}} \
+                                // both-note {{evaluates to '12 == 200'}}
 
 
 struct S {
   int a = 0;
   constexpr int get5() const { return 5; }
   constexpr void fo() const {
-    this; // expected-warning {{expression result unused}} \
-          // ref-warning {{expression result unused}}
-    this->a; // expected-warning {{expression result unused}} \
-             // ref-warning {{expression result unused}}
+    this; // both-warning {{expression result unused}}
+    this->a; // both-warning {{expression result unused}}
     get5();
     getInts();
   }
@@ -342,12 +329,9 @@ namespace InitializerTemporaries {
   // Invalid destructor.
   struct S {
     constexpr S() {}
-    constexpr ~S() noexcept(false) { throw 12; } // expected-error {{cannot use 'throw'}} \
-                                                 // expected-error {{never produces a constant expression}} \
-                                                 // expected-note 2{{subexpression not valid}} \
-                                                 // ref-error {{cannot use 'throw'}} \
-                                                 // ref-error {{never produces a constant expression}} \
-                                                 // ref-note 2{{subexpression not valid}}
+    constexpr ~S() noexcept(false) { throw 12; } // both-error {{cannot use 'throw'}} \
+                                                 // both-error {{never produces a constant expression}} \
+                                                 // both-note 2{{subexpression not valid}}
   };
 
   constexpr int f() {
@@ -355,10 +339,8 @@ namespace InitializerTemporaries {
     /// FIXME: Wrong source location below.
     return 12; // expected-note {{in call to '&S{}->~S()'}}
   }
-  static_assert(f() == 12); // expected-error {{not an integral constant expression}} \
-                            // expected-note {{in call to 'f()'}} \
-                            // ref-error {{not an integral constant expression}} \
-                            // ref-note {{in call to 'f()'}}
+  static_assert(f() == 12); // both-error {{not an integral constant expression}} \
+                            // both-note {{in call to 'f()'}}
 
 
 #endif
@@ -423,7 +405,8 @@ namespace MI {
 
 namespace DeriveFailures {
 #if __cplusplus < 202002L
-  struct Base { // ref-note 2{{declared here}} expected-note {{declared here}}
+  struct Base { // both-note {{declared here}} \
+                // ref-note {{declared here}}
     int Val;
   };
 
@@ -431,35 +414,29 @@ namespace DeriveFailures {
     int OtherVal;
 
     constexpr Derived(int i) : OtherVal(i) {} // ref-error {{never produces a constant expression}} \
-                                              // ref-note 2{{non-constexpr constructor 'Base' cannot be used in a constant expression}} \
-                                              // expected-note {{non-constexpr constructor 'Base' cannot be used in a constant expression}}
+                                              // both-note {{non-constexpr constructor 'Base' cannot be used in a constant expression}} \
+                                              // ref-note {{non-constexpr constructor 'Base' cannot be used in a constant expression}} 
   };
 
-  constexpr Derived D(12); // ref-error {{must be initialized by a constant expression}} \
-                           // ref-note {{in call to 'Derived(12)'}} \
-                           // ref-note {{declared here}} \
-                           // expected-error {{must be initialized by a constant expression}} \
-                           // expected-note {{in call to 'Derived(12)'}}
+  constexpr Derived D(12); // both-error {{must be initialized by a constant expression}} \
+                           // both-note {{in call to 'Derived(12)'}} \
+                           // ref-note {{declared here}}
 
-  static_assert(D.Val == 0, ""); // ref-error {{not an integral constant expression}} \
+  static_assert(D.Val == 0, ""); // both-error {{not an integral constant expression}} \
                                  // ref-note {{initializer of 'D' is not a constant expression}} \
-                                 // expected-error {{not an integral constant expression}} \
                                  // expected-note {{read of uninitialized object}}
 #endif
 
   struct AnotherBase {
     int Val;
-    constexpr AnotherBase(int i) : Val(12 / i) {} //ref-note {{division by zero}} \
-                                                  //expected-note {{division by zero}}
+    constexpr AnotherBase(int i) : Val(12 / i) {} // both-note {{division by zero}}
   };
 
   struct AnotherDerived : AnotherBase {
     constexpr AnotherDerived(int i) : AnotherBase(i) {}
   };
-  constexpr AnotherBase Derp(0); // ref-error {{must be initialized by a constant expression}} \
-                                 // ref-note {{in call to 'AnotherBase(0)'}} \
-                                 // expected-error {{must be initialized by a constant expression}} \
-                                 // expected-note {{in call to 'AnotherBase(0)'}}
+  constexpr AnotherBase Derp(0); // both-error {{must be initialized by a constant expression}} \
+                                 // both-note {{in call to 'AnotherBase(0)'}}
 
   struct YetAnotherBase {
     int Val;
@@ -467,17 +444,14 @@ namespace DeriveFailures {
   };
 
   struct YetAnotherDerived : YetAnotherBase {
-    using YetAnotherBase::YetAnotherBase; // ref-note {{declared here}} \
-                                          // expected-note {{declared here}}
+    using YetAnotherBase::YetAnotherBase; // both-note {{declared here}}
     int OtherVal;
 
     constexpr bool doit() const { return Val == OtherVal; }
   };
 
-  constexpr YetAnotherDerived Oops(0); // ref-error {{must be initialized by a constant expression}} \
-                                       // ref-note {{constructor inherited from base class 'YetAnotherBase' cannot be used in a constant expression}} \
-                                       // expected-error {{must be initialized by a constant expression}} \
-                                       // expected-note {{constructor inherited from base class 'YetAnotherBase' cannot be used in a constant expression}}
+  constexpr YetAnotherDerived Oops(0); // both-error {{must be initialized by a constant expression}} \
+                                       // both-note {{constructor inherited from base class 'YetAnotherBase' cannot be used in a constant expression}}
 };
 
 namespace EmptyCtor {
@@ -543,18 +517,10 @@ namespace PointerArith {
   constexpr B *b1 = &b + 1;
   constexpr B *b2 = &b + 0;
 
-#if 0
-  constexpr A *a2 = &b + 1; // expected-error {{must be initialized by a constant expression}} \
-                            // expected-note {{cannot access base class of pointer past the end of object}} \
-                            // ref-error {{must be initialized by a constant expression}} \
-                            // ref-note {{cannot access base class of pointer past the end of object}}
-
-#endif
-  constexpr const int *pn = &(&b + 1)->n; // expected-error {{must be initialized by a constant expression}} \
-                                          // expected-note {{cannot access field of pointer past the end of object}} \
-                                          // ref-error {{must be initialized by a constant expression}} \
-                                          // ref-note {{cannot access field of pointer past the end of object}}
-
+  constexpr A *a2 = &b + 1; // both-error {{must be initialized by a constant expression}} \
+                            // both-note {{cannot access base class of pointer past the end of object}}
+  constexpr const int *pn = &(&b + 1)->n; // both-error {{must be initialized by a constant expression}} \
+                                          // both-note {{cannot access field of pointer past the end of object}}
 }
 
 #if __cplusplus >= 202002L
@@ -632,12 +598,9 @@ namespace Destructors {
 
   struct S {
     constexpr S() {}
-    constexpr ~S() { // expected-error {{never produces a constant expression}} \
-                     // ref-error {{never produces a constant expression}}
-      int i = 1 / 0; // expected-warning {{division by zero}} \
-                     // expected-note 2{{division by zero}} \
-                     // ref-warning {{division by zero}} \
-                     // ref-note 2{{division by zero}}
+    constexpr ~S() { // both-error {{never produces a constant expression}}
+      int i = 1 / 0; // both-warning {{division by zero}} \
+                     // both-note 2{{division by zero}}
     }
   };
   constexpr int testS() {
@@ -645,10 +608,8 @@ namespace Destructors {
     return 1; // expected-note {{in call to '&S{}->~S()'}}
               // FIXME: ^ Wrong line
   }
-  static_assert(testS() == 1); // expected-error {{not an integral constant expression}} \
-                               // expected-note {{in call to 'testS()'}} \
-                               // ref-error {{not an integral constant expression}} \
-                               // ref-note {{in call to 'testS()'}}
+  static_assert(testS() == 1); // both-error {{not an integral constant expression}} \
+                               // both-note {{in call to 'testS()'}}
 }
 
 namespace BaseToDerived {
@@ -657,10 +618,8 @@ namespace A {
   struct B : A { int n; };
   struct C : B {};
   C c = {};
-  constexpr C *pb = (C*)((A*)&c + 1); // expected-error {{must be initialized by a constant expression}} \
-                                      // expected-note {{cannot access derived class of pointer past the end of object}} \
-                                      // ref-error {{must be initialized by a constant expression}} \
-                                      // ref-note {{cannot access derived class of pointer past the end of object}}
+  constexpr C *pb = (C*)((A*)&c + 1); // both-error {{must be initialized by a constant expression}} \
+                                      // both-note {{cannot access derived class of pointer past the end of object}}
 }
 namespace B {
   struct A {};
@@ -894,10 +853,8 @@ namespace VirtualFromBase {
   // Virtual f(), not OK.
   constexpr X<X<S2>> xxs2;
   constexpr X<S2> *q = const_cast<X<X<S2>>*>(&xxs2);
-  static_assert(q->f() == sizeof(X<S2>), ""); // ref-error {{not an integral constant expression}} \
-                                              // ref-note {{cannot evaluate call to virtual function}} \
-                                              // expected-error {{not an integral constant expression}} \
-                                              // expected-note {{cannot evaluate call to virtual function}}
+  static_assert(q->f() == sizeof(X<S2>), ""); // both-error {{not an integral constant expression}} \
+                                              // both-note {{cannot evaluate call to virtual function}}
 }
 #endif
 
@@ -1070,14 +1027,10 @@ namespace ParenInit {
 
   /// Not constexpr!
   O o1(0);
-  constinit O o2(0); // ref-error {{variable does not have a constant initializer}} \
-                     // ref-note {{required by 'constinit' specifier}} \
-                     // ref-note {{reference to temporary is not a constant expression}} \
-                     // ref-note {{temporary created here}} \
-                     // expected-error {{variable does not have a constant initializer}} \
-                     // expected-note {{required by 'constinit' specifier}} \
-                     // expected-note {{reference to temporary is not a constant expression}} \
-                     // expected-note {{temporary created here}}
+  constinit O o2(0); // both-error {{variable does not have a constant initializer}} \
+                     // both-note {{required by 'constinit' specifier}} \
+                     // both-note {{reference to temporary is not a constant expression}} \
+                     // both-note {{temporary created here}}
 }
 #endif
 
@@ -1109,32 +1062,24 @@ namespace AccessOnNullptr {
     int a;
   };
 
-  constexpr int a() { // expected-error {{never produces a constant expression}} \
-                      // ref-error {{never produces a constant expression}}
+  constexpr int a() { // both-error {{never produces a constant expression}}
     F *f = nullptr;
 
-    f->a = 0; // expected-note 2{{cannot access field of null pointer}} \
-              // ref-note 2{{cannot access field of null pointer}}
+    f->a = 0; // both-note 2{{cannot access field of null pointer}}
     return f->a;
   }
-  static_assert(a() == 0, ""); // expected-error {{not an integral constant expression}} \
-                               // expected-note {{in call to 'a()'}} \
-                               // ref-error {{not an integral constant expression}} \
-                               // ref-note {{in call to 'a()'}}
+  static_assert(a() == 0, ""); // both-error {{not an integral constant expression}} \
+                               // both-note {{in call to 'a()'}}
 
-  constexpr int a2() { // expected-error {{never produces a constant expression}} \
-                      // ref-error {{never produces a constant expression}}
+  constexpr int a2() { // both-error {{never produces a constant expression}}
     F *f = nullptr;
 
 
-    const int *a = &(f->a); // expected-note 2{{cannot access field of null pointer}} \
-                            // ref-note 2{{cannot access field of null pointer}}
+    const int *a = &(f->a); // both-note 2{{cannot access field of null pointer}}
     return f->a;
   }
-  static_assert(a2() == 0, ""); // expected-error {{not an integral constant expression}} \
-                               // expected-note {{in call to 'a2()'}} \
-                               // ref-error {{not an integral constant expression}} \
-                               // ref-note {{in call to 'a2()'}}
+  static_assert(a2() == 0, ""); // both-error {{not an integral constant expression}} \
+                                // both-note {{in call to 'a2()'}}
 }
 
 namespace IndirectFieldInit {

>From d781a49e588803b1250af0e12c5e6498ceb55d42 Mon Sep 17 00:00:00 2001
From: Tarun Prabhu <tarun at lanl.gov>
Date: Thu, 8 Feb 2024 07:56:16 -0700
Subject: [PATCH 37/72] [flang][docs] Update flang documentation regarding the
 test suite (#80755)

Remove redundant reference to flang not being able to generate code. Add
a reference to the gfortran tests that are part of the LLVM Test Suite.
---
 flang/docs/FortranLLVMTestSuite.md | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/flang/docs/FortranLLVMTestSuite.md b/flang/docs/FortranLLVMTestSuite.md
index 62459e6a7b7020..f07d415520a874 100644
--- a/flang/docs/FortranLLVMTestSuite.md
+++ b/flang/docs/FortranLLVMTestSuite.md
@@ -12,12 +12,6 @@ first-time users read through [LLVM Test Suite
 Guide](https://llvm.org/docs/TestSuiteGuide.html) which describes the
 organizational structure of the test suite and how to run it.
 
-Although the Flang driver is unable to generate code at this time, we
-are neverthelesss incrementally adding Fortran tests into the LLVM
-Test Suite. We are currently testing against GFortran while we make
-progress towards completing the new Flang driver with full
-code-generation capabilities.
-
 ## Running the LLVM test-suite with Fortran
 
 Fortran support can be enabled by setting the following CMake variables:
@@ -63,3 +57,12 @@ cmake -G "Ninja" -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ \
     -DTEST_SUITE_FORTRAN:STRING=ON \
     -DTEST_SUITE_SPEC2017_ROOT=<path to SPEC directory>  ..
 ```
+
+## Running the gfortran tests
+
+Tests from the gfortran test suite have been imported into the LLVM Test Suite.
+The tests will be run automatically if the test suite is built following the
+instructions described [above](#running-the-LLVM-test-suite-with-fortran).
+There are additional configure-time options that can be used with the gfortran 
+tests. More details about those options and their purpose can be found in 
+[`Fortran/gfortran/README.md`](https://github.com/llvm/llvm-test-suite/tree/main/Fortran/gfortran/README.md)`.

>From cd8f7d2e0dac8566307b4e7c5162c508cd927888 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timm=20B=C3=A4der?= <tbaeder at redhat.com>
Date: Thu, 8 Feb 2024 15:15:14 +0100
Subject: [PATCH 38/72] [clang][Interp] Fix handling of generic lambdas

When compiling their static invoker, we need to get the
right specialization.
---
 clang/lib/AST/Interp/ByteCodeEmitter.cpp | 30 +++++++++++++++++++++++-
 clang/test/AST/Interp/lambda.cpp         | 13 ++++++++++
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/clang/lib/AST/Interp/ByteCodeEmitter.cpp b/clang/lib/AST/Interp/ByteCodeEmitter.cpp
index 8bbfa928bd6457..e697e24fb341d2 100644
--- a/clang/lib/AST/Interp/ByteCodeEmitter.cpp
+++ b/clang/lib/AST/Interp/ByteCodeEmitter.cpp
@@ -23,6 +23,34 @@ using namespace clang;
 using namespace clang::interp;
 
 Function *ByteCodeEmitter::compileFunc(const FunctionDecl *FuncDecl) {
+  bool IsLambdaStaticInvoker = false;
+  if (const auto *MD = dyn_cast<CXXMethodDecl>(FuncDecl);
+      MD && MD->isLambdaStaticInvoker()) {
+    // For a lambda static invoker, we might have to pick a specialized
+    // version if the lambda is generic. In that case, the picked function
+    // will *NOT* be a static invoker anymore. However, it will still
+    // be a non-static member function, this (usually) requiring an
+    // instance pointer. We suppress that later in this function.
+    IsLambdaStaticInvoker = true;
+
+    const CXXRecordDecl *ClosureClass = MD->getParent();
+    assert(ClosureClass->captures_begin() == ClosureClass->captures_end());
+    if (ClosureClass->isGenericLambda()) {
+      const CXXMethodDecl *LambdaCallOp = ClosureClass->getLambdaCallOperator();
+      assert(MD->isFunctionTemplateSpecialization() &&
+             "A generic lambda's static-invoker function must be a "
+             "template specialization");
+      const TemplateArgumentList *TAL = MD->getTemplateSpecializationArgs();
+      FunctionTemplateDecl *CallOpTemplate =
+          LambdaCallOp->getDescribedFunctionTemplate();
+      void *InsertPos = nullptr;
+      const FunctionDecl *CorrespondingCallOpSpecialization =
+          CallOpTemplate->findSpecialization(TAL->asArray(), InsertPos);
+      assert(CorrespondingCallOpSpecialization);
+      FuncDecl = cast<CXXMethodDecl>(CorrespondingCallOpSpecialization);
+    }
+  }
+
   // Set up argument indices.
   unsigned ParamOffset = 0;
   SmallVector<PrimType, 8> ParamTypes;
@@ -46,7 +74,7 @@ Function *ByteCodeEmitter::compileFunc(const FunctionDecl *FuncDecl) {
   // InterpStack when calling the function.
   bool HasThisPointer = false;
   if (const auto *MD = dyn_cast<CXXMethodDecl>(FuncDecl)) {
-    if (MD->isImplicitObjectMemberFunction()) {
+    if (MD->isImplicitObjectMemberFunction() && !IsLambdaStaticInvoker) {
       HasThisPointer = true;
       ParamTypes.push_back(PT_Ptr);
       ParamOffsets.push_back(ParamOffset);
diff --git a/clang/test/AST/Interp/lambda.cpp b/clang/test/AST/Interp/lambda.cpp
index f8400898acc0c0..a433e5666e4f4c 100644
--- a/clang/test/AST/Interp/lambda.cpp
+++ b/clang/test/AST/Interp/lambda.cpp
@@ -155,6 +155,19 @@ namespace StaticInvoker {
     return fp(i).a;
   }
   static_assert(sv6(12) == 12);
+
+
+  /// A generic lambda.
+  auto GL = [](auto a) { return a; };
+  constexpr char (*fp2)(char) = GL;
+  static_assert(fp2('3') == '3', "");
+
+  struct GLS {
+    int a;
+  };
+  auto GL2 = [](auto a) { return GLS{a}; };
+  constexpr GLS (*fp3)(char) = GL2;
+  static_assert(fp3('3').a == '3', "");
 }
 
 namespace LambdasAsParams {
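
A small standalone sketch of the language feature patch 38 deals with: converting a capture-less generic lambda to a function pointer selects a specialization of the call operator based on the target pointer type. The names `Twice`, `TwiceInt`, `TwiceDouble`, and `TwiceIntCE` are hypothetical and do not come from the patch. The constant-evaluated part assumes C++17 or later, which is why it is guarded.

```cpp
#include <cassert>

// A capture-less generic lambda; its call operator is a template.
auto Twice = [](auto x) { return x + x; };

// Each conversion picks a different call-operator specialization,
// determined by the function pointer type being initialized.
int (*TwiceInt)(int) = Twice;
double (*TwiceDouble)(double) = Twice;

#if __cplusplus >= 201703L
// From C++17 on, the conversion and the call can also take place in a
// constant expression, which is the situation the new test cases cover.
constexpr int (*TwiceIntCE)(int) = Twice;
static_assert(TwiceIntCE(21) == 42, "");
#endif
```

The pointer types here were chosen so that the deduced return type of `x + x` matches the pointer's return type; with `char`, the sum would promote to `int` and the conversion to `char (*)(char)` would not be viable.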

>From ebab50a0402b7d1d59fd623f62d6b8227c4d70e9 Mon Sep 17 00:00:00 2001
From: Louis Dionne <ldionne.2 at gmail.com>
Date: Thu, 8 Feb 2024 10:11:39 -0500
Subject: [PATCH 39/72] [libc++][NFC] Reformat a few files that had gotten
 mis-formatted

Those appear to be oversights when committing patches
in the last few months.
---
 libcxx/include/ostream          | 36 +++++++++++++++------------------
 libcxx/include/scoped_allocator |  4 ++--
 libcxx/include/shared_mutex     |  6 +++---
 libcxx/include/string           | 16 ++++++++-------
 libcxx/include/valarray         |  4 ++--
 libcxx/include/vector           |  4 ++--
 6 files changed, 34 insertions(+), 36 deletions(-)

diff --git a/libcxx/include/ostream b/libcxx/include/ostream
index 180adda201d830..2e2607340a5de1 100644
--- a/libcxx/include/ostream
+++ b/libcxx/include/ostream
@@ -1090,11 +1090,10 @@ _LIBCPP_EXPORTED_FROM_ABI FILE* __get_ostream_file(ostream& __os);
 
 #  ifndef _LIBCPP_HAS_NO_UNICODE
 template <class = void> // TODO PRINT template or availability markup fires too eagerly (http://llvm.org/PR61563).
-_LIBCPP_HIDE_FROM_ABI void
-__vprint_unicode(ostream& __os, string_view __fmt, format_args __args, bool __write_nl) {
-#if _LIBCPP_AVAILABILITY_HAS_PRINT == 0
+_LIBCPP_HIDE_FROM_ABI void __vprint_unicode(ostream& __os, string_view __fmt, format_args __args, bool __write_nl) {
+#    if _LIBCPP_AVAILABILITY_HAS_PRINT == 0
   return std::__vprint_nonunicode(__os, __fmt, __args, __write_nl);
-#else
+#    else
   FILE* __file = std::__get_ostream_file(__os);
   if (!__file || !__print::__is_terminal(__file))
     return std::__vprint_nonunicode(__os, __fmt, __args, __write_nl);
@@ -1110,38 +1109,36 @@ __vprint_unicode(ostream& __os, string_view __fmt, format_args __args, bool __wr
   // This is the path for the native API, start with flushing.
   __os.flush();
 
-#    ifndef _LIBCPP_HAS_NO_EXCEPTIONS
+#      ifndef _LIBCPP_HAS_NO_EXCEPTIONS
   try {
-#    endif // _LIBCPP_HAS_NO_EXCEPTIONS
+#      endif // _LIBCPP_HAS_NO_EXCEPTIONS
     ostream::sentry __s(__os);
     if (__s) {
-#    ifndef _LIBCPP_WIN32API
+#      ifndef _LIBCPP_WIN32API
       __print::__vprint_unicode_posix(__file, __fmt, __args, __write_nl, true);
-#    elif !defined(_LIBCPP_HAS_NO_WIDE_CHARACTERS)
+#      elif !defined(_LIBCPP_HAS_NO_WIDE_CHARACTERS)
     __print::__vprint_unicode_windows(__file, __fmt, __args, __write_nl, true);
-#    else
-#      error "Windows builds with wchar_t disabled are not supported."
-#    endif
+#      else
+#        error "Windows builds with wchar_t disabled are not supported."
+#      endif
     }
 
-#    ifndef _LIBCPP_HAS_NO_EXCEPTIONS
+#      ifndef _LIBCPP_HAS_NO_EXCEPTIONS
   } catch (...) {
     __os.__set_badbit_and_consider_rethrow();
   }
-#    endif // _LIBCPP_HAS_NO_EXCEPTIONS
-#endif // _LIBCPP_AVAILABILITY_HAS_PRINT
+#      endif // _LIBCPP_HAS_NO_EXCEPTIONS
+#    endif   // _LIBCPP_AVAILABILITY_HAS_PRINT
 }
 
 template <class = void> // TODO PRINT template or availability markup fires too eagerly (http://llvm.org/PR61563).
-_LIBCPP_HIDE_FROM_ABI inline void
-vprint_unicode(ostream& __os, string_view __fmt, format_args __args) {
+_LIBCPP_HIDE_FROM_ABI inline void vprint_unicode(ostream& __os, string_view __fmt, format_args __args) {
   std::__vprint_unicode(__os, __fmt, __args, false);
 }
 #  endif // _LIBCPP_HAS_NO_UNICODE
 
 template <class... _Args>
-_LIBCPP_HIDE_FROM_ABI void
-print(ostream& __os, format_string<_Args...> __fmt, _Args&&... __args) {
+_LIBCPP_HIDE_FROM_ABI void print(ostream& __os, format_string<_Args...> __fmt, _Args&&... __args) {
 #  ifndef _LIBCPP_HAS_NO_UNICODE
   if constexpr (__print::__use_unicode_execution_charset)
     std::__vprint_unicode(__os, __fmt.get(), std::make_format_args(__args...), false);
@@ -1153,8 +1150,7 @@ print(ostream& __os, format_string<_Args...> __fmt, _Args&&... __args) {
 }
 
 template <class... _Args>
-_LIBCPP_HIDE_FROM_ABI void
-println(ostream& __os, format_string<_Args...> __fmt, _Args&&... __args) {
+_LIBCPP_HIDE_FROM_ABI void println(ostream& __os, format_string<_Args...> __fmt, _Args&&... __args) {
 #  ifndef _LIBCPP_HAS_NO_UNICODE
   // Note the wording in the Standard is inefficient. The output of
   // std::format is a std::string which is then copied. This solution
diff --git a/libcxx/include/scoped_allocator b/libcxx/include/scoped_allocator
index 1626453a698ff4..eff6fbdf6edd80 100644
--- a/libcxx/include/scoped_allocator
+++ b/libcxx/include/scoped_allocator
@@ -476,8 +476,8 @@ public:
   }
 
 private:
-  _LIBCPP_HIDE_FROM_ABI explicit scoped_allocator_adaptor(outer_allocator_type&& __o, inner_allocator_type&& __i) _NOEXCEPT
-      : base(std::move(__o), std::move(__i)) {}
+  _LIBCPP_HIDE_FROM_ABI explicit scoped_allocator_adaptor(
+      outer_allocator_type&& __o, inner_allocator_type&& __i) _NOEXCEPT : base(std::move(__o), std::move(__i)) {}
 
   template <class _Tp, class... _Args>
   _LIBCPP_HIDE_FROM_ABI void __construct(integral_constant<int, 0>, _Tp* __p, _Args&&... __args) {
diff --git a/libcxx/include/shared_mutex b/libcxx/include/shared_mutex
index ac66b3a568bf2d..57f385b5435eb2 100644
--- a/libcxx/include/shared_mutex
+++ b/libcxx/include/shared_mutex
@@ -124,9 +124,9 @@ template <class Mutex>
 
 #include <__config>
 
-#  ifdef _LIBCPP_HAS_NO_THREADS
-#    error "<shared_mutex> is not supported since libc++ has been configured without support for threads."
-#  endif
+#ifdef _LIBCPP_HAS_NO_THREADS
+#  error "<shared_mutex> is not supported since libc++ has been configured without support for threads."
+#endif
 
 #include <__assert> // all public C++ headers provide the assertion handler
 #include <__availability>
diff --git a/libcxx/include/string b/libcxx/include/string
index ed4fdbe6864c20..530a2233860434 100644
--- a/libcxx/include/string
+++ b/libcxx/include/string
@@ -938,7 +938,11 @@ public:
       // Turning off ASan instrumentation for variable initialization with _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS
       // does not work consistently during initialization of __r_, so we instead unpoison __str's memory manually first.
       // __str's memory needs to be unpoisoned only in the case where it's a short string.
-      : __r_([](basic_string &__s) -> decltype(__s.__r_)&& { if(!__s.__is_long()) __s.__annotate_delete(); return std::move(__s.__r_); }(__str)) {
+      : __r_([](basic_string& __s) -> decltype(__s.__r_)&& {
+          if (!__s.__is_long())
+            __s.__annotate_delete();
+          return std::move(__s.__r_);
+        }(__str)) {
     __str.__r_.first() = __rep();
     __str.__annotate_new(0);
     if (!__is_long())
@@ -1918,7 +1922,7 @@ private:
   }
 
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void __annotate_new(size_type __current_size) const _NOEXCEPT {
-    (void) __current_size;
+    (void)__current_size;
 #if !defined(_LIBCPP_HAS_NO_ASAN) && defined(_LIBCPP_INSTRUMENTED_WITH_ASAN)
     if (!__libcpp_is_constant_evaluated() && (__asan_short_string_is_annotated() || __is_long()))
       __annotate_contiguous_container(data() + capacity() + 1, data() + __current_size + 1);
@@ -1933,7 +1937,7 @@ private:
   }
 
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void __annotate_increase(size_type __n) const _NOEXCEPT {
-    (void) __n;
+    (void)__n;
 #if !defined(_LIBCPP_HAS_NO_ASAN) && defined(_LIBCPP_INSTRUMENTED_WITH_ASAN)
     if (!__libcpp_is_constant_evaluated() && (__asan_short_string_is_annotated() || __is_long()))
       __annotate_contiguous_container(data() + size() + 1, data() + size() + 1 + __n);
@@ -1941,7 +1945,7 @@ private:
   }
 
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void __annotate_shrink(size_type __old_size) const _NOEXCEPT {
-    (void) __old_size;
+    (void)__old_size;
 #if !defined(_LIBCPP_HAS_NO_ASAN) && defined(_LIBCPP_INSTRUMENTED_WITH_ASAN)
     if (!__libcpp_is_constant_evaluated() && (__asan_short_string_is_annotated() || __is_long()))
       __annotate_contiguous_container(data() + __old_size + 1, data() + size() + 1);
@@ -1952,9 +1956,7 @@ private:
   static _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 size_type __align_it(size_type __s) _NOEXCEPT {
     return (__s + (__a - 1)) & ~(__a - 1);
   }
-  enum {
-    __alignment = 8
-  };
+  enum { __alignment = 8 };
   static _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 size_type __recommend(size_type __s) _NOEXCEPT {
     if (__s < __min_cap) {
       return static_cast<size_type>(__min_cap) - 1;
diff --git a/libcxx/include/valarray b/libcxx/include/valarray
index 44adcd71ec6167..88b161eccd332f 100644
--- a/libcxx/include/valarray
+++ b/libcxx/include/valarray
@@ -2435,7 +2435,7 @@ template <class _Expr, __enable_if_t<__is_val_expr<_Expr>::value, int> >
 inline valarray<_Tp>& valarray<_Tp>::operator*=(const _Expr& __v) {
   size_t __i = 0;
   for (value_type* __t = __begin_; __t != __end_; ++__t, ++__i)
-    *__t *= std::__get(__v,__i);
+    *__t *= std::__get(__v, __i);
   return *this;
 }
 
@@ -2444,7 +2444,7 @@ template <class _Expr, __enable_if_t<__is_val_expr<_Expr>::value, int> >
 inline valarray<_Tp>& valarray<_Tp>::operator/=(const _Expr& __v) {
   size_t __i = 0;
   for (value_type* __t = __begin_; __t != __end_; ++__t, ++__i)
-    *__t /= std::__get(__v,__i);
+    *__t /= std::__get(__v, __i);
   return *this;
 }
 
diff --git a/libcxx/include/vector b/libcxx/include/vector
index e9615ab4c9a30f..3934361e98cf69 100644
--- a/libcxx/include/vector
+++ b/libcxx/include/vector
@@ -831,8 +831,8 @@ private:
   // For more details, see the "Using libc++" documentation page or
   // the documentation for __sanitizer_annotate_contiguous_container.
 
-  _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void __annotate_contiguous_container(
-      const void* __old_mid, const void* __new_mid) const {
+  _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void
+  __annotate_contiguous_container(const void* __old_mid, const void* __new_mid) const {
     (void)__old_mid;
     (void)__new_mid;
 #ifndef _LIBCPP_HAS_NO_ASAN

>From 2a0fdd1dd4f026b3169df6ed47e07174e8629d28 Mon Sep 17 00:00:00 2001
From: ostannard <oliver.stannard at arm.com>
Date: Thu, 8 Feb 2024 15:31:54 +0000
Subject: [PATCH 40/72] [AArch64] Indirect tail-calls cannot use x16 with
 pac-ret+pc (#81020)

When using -mbranch-protection=pac-ret+pc, x16 is used in the function
epilogue to hold the address of the signing instruction. This is used by
a HINT instruction which can only use x16, so we can't change this. This
means that we can't use it to hold the function pointer for an indirect
tail-call.

There is existing code to force indirect tail-calls to use x16 or x17
when BTI is enabled, so there are now 4 combinations:

bti  pac-ret+pc  Valid function pointer registers
off  off         Any non callee-saved register
on   off         x16 or x17
off  on          Any non callee-saved register except x16
on   on          x17
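
As an illustration only, the table above can be written as a small
standalone C++ function. This is a sketch, not the code in this patch:
the enum `TailCallRegs` and the helpers `selectTailCallRegs` and
`describe` are hypothetical names, and the two booleans stand in for the
per-function BTI and pac-ret+pc (PAuthLR) settings.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the four register sets listed in the table.
enum class TailCallRegs { Any, X16X17, NotX16, X17Only };

// Map the two per-function settings to the set of registers that may hold
// the function pointer of an indirect tail-call.
constexpr TailCallRegs selectTailCallRegs(bool bti, bool pauthLR) {
  if (bti)
    return pauthLR ? TailCallRegs::X17Only : TailCallRegs::X16X17;
  return pauthLR ? TailCallRegs::NotX16 : TailCallRegs::Any;
}

// Human-readable form of each row's last column.
inline std::string describe(TailCallRegs regs) {
  switch (regs) {
  case TailCallRegs::Any:
    return "any non callee-saved register";
  case TailCallRegs::X16X17:
    return "x16 or x17";
  case TailCallRegs::NotX16:
    return "any non callee-saved register except x16";
  case TailCallRegs::X17Only:
    return "x17";
  }
  assert(false && "unhandled TailCallRegs value");
  return "";
}
```

Checking BTI first keeps the x16 exclusion in one place per branch: with
BTI on it narrows {x16, x17} to x17, and with BTI off it removes x16
from the general set.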
---
 llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp |  4 +-
 .../Target/AArch64/AArch64ISelLowering.cpp    |  4 +-
 llvm/lib/Target/AArch64/AArch64InstrInfo.cpp  |  4 +-
 llvm/lib/Target/AArch64/AArch64InstrInfo.td   | 47 +++++++++++---
 .../lib/Target/AArch64/AArch64RegisterInfo.td | 15 +++--
 .../AArch64/GISel/AArch64CallLowering.cpp     | 15 +++--
 .../AArch64/GISel/AArch64RegisterBankInfo.cpp |  4 +-
 ...ranch-target-enforcement-indirect-calls.ll | 65 +++++++++++++++++++
 llvm/test/CodeGen/AArch64/kcfi-bti.ll         |  4 +-
 9 files changed, 138 insertions(+), 24 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index de247253eb18a5..5b5ffd7b2feb06 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -1602,7 +1602,9 @@ void AArch64AsmPrinter::emitInstruction(const MachineInstr *MI) {
   // attributes (isCall, isReturn, etc.). We lower them to the real
   // instruction here.
   case AArch64::TCRETURNri:
-  case AArch64::TCRETURNriBTI:
+  case AArch64::TCRETURNrix16x17:
+  case AArch64::TCRETURNrix17:
+  case AArch64::TCRETURNrinotx16:
   case AArch64::TCRETURNriALL: {
     MCInst TmpInst;
     TmpInst.setOpcode(AArch64::BR);
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 8573939b04389f..20290c958a70e9 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -25700,7 +25700,9 @@ AArch64TargetLowering::EmitKCFICheck(MachineBasicBlock &MBB,
   case AArch64::BLR:
   case AArch64::BLRNoIP:
   case AArch64::TCRETURNri:
-  case AArch64::TCRETURNriBTI:
+  case AArch64::TCRETURNrix16x17:
+  case AArch64::TCRETURNrix17:
+  case AArch64::TCRETURNrinotx16:
     break;
   default:
     llvm_unreachable("Unexpected CFI call opcode");
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 9add7d87017a73..39c96092f10319 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -2503,7 +2503,9 @@ bool AArch64InstrInfo::isTailCallReturnInst(const MachineInstr &MI) {
     return false;
   case AArch64::TCRETURNdi:
   case AArch64::TCRETURNri:
-  case AArch64::TCRETURNriBTI:
+  case AArch64::TCRETURNrix16x17:
+  case AArch64::TCRETURNrix17:
+  case AArch64::TCRETURNrinotx16:
   case AArch64::TCRETURNriALL:
     return true;
   }
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index 77fdb688d0422e..9c3a6927d043ba 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -928,8 +928,25 @@ let RecomputePerFunction = 1 in {
   // Avoid generating STRQro if it is slow, unless we're optimizing for code size.
   def UseSTRQro : Predicate<"!Subtarget->isSTRQroSlow() || shouldOptForSize(MF)">;
 
-  def UseBTI : Predicate<[{ MF->getInfo<AArch64FunctionInfo>()->branchTargetEnforcement() }]>;
-  def NotUseBTI : Predicate<[{ !MF->getInfo<AArch64FunctionInfo>()->branchTargetEnforcement() }]>;
+  // Register restrictions for indirect tail-calls:
+  // - If branch target enforcement is enabled, indirect calls must use x16 or
+  //   x17, because these are the only registers which can target the BTI C
+  //   instruction.
+  // - If PAuthLR is enabled, x16 is used in the epilogue to hold the address
+  //   of the signing instruction. This can't be changed because it is used by a
+  //   HINT instruction which only accepts x16. We can't load anything from the
+  //   stack after this because the authentication instruction checks that SP is
+  //   the same as it was at function entry, so we can't have anything on the
+  //   stack.
+
+  // BTI on, PAuthLR off: x16 or x17
+  def TailCallX16X17 : Predicate<[{  MF->getInfo<AArch64FunctionInfo>()->branchTargetEnforcement() && !MF->getInfo<AArch64FunctionInfo>()->branchProtectionPAuthLR() }]>;
+  // BTI on, PAuthLR on: x17 only
+  def TailCallX17 : Predicate<[{ MF->getInfo<AArch64FunctionInfo>()->branchTargetEnforcement() && MF->getInfo<AArch64FunctionInfo>()->branchProtectionPAuthLR() }]>;
+  // BTI off, PAuthLR on: Any non-callee-saved register except x16
+  def TailCallNotX16 : Predicate<[{ !MF->getInfo<AArch64FunctionInfo>()->branchTargetEnforcement() && MF->getInfo<AArch64FunctionInfo>()->branchProtectionPAuthLR() }]>;
+  // BTI off, PAuthLR off: Any non-callee-saved register
+  def TailCallAny : Predicate<[{ !MF->getInfo<AArch64FunctionInfo>()->branchTargetEnforcement() && !MF->getInfo<AArch64FunctionInfo>()->branchProtectionPAuthLR() }]>;
 
   def SLSBLRMitigation : Predicate<[{ MF->getSubtarget<AArch64Subtarget>().hardenSlsBlr() }]>;
   def NoSLSBLRMitigation : Predicate<[{ !MF->getSubtarget<AArch64Subtarget>().hardenSlsBlr() }]>;
@@ -9121,18 +9138,30 @@ let isCall = 1, isTerminator = 1, isReturn = 1, isBarrier = 1, Uses = [SP] in {
   // some verifier checks for outlined functions.
   def TCRETURNriALL : Pseudo<(outs), (ins GPR64:$dst, i32imm:$FPDiff), []>,
                       Sched<[WriteBrReg]>;
-  // Indirect tail-call limited to only use registers (x16 and x17) which are
-  // allowed to tail-call a "BTI c" instruction.
-  def TCRETURNriBTI : Pseudo<(outs), (ins rtcGPR64:$dst, i32imm:$FPDiff), []>,
+
+  // Indirect tail-calls with reduced register classes, needed for BTI and
+  // PAuthLR.
+  def TCRETURNrix16x17 : Pseudo<(outs), (ins tcGPRx16x17:$dst, i32imm:$FPDiff), []>,
+                      Sched<[WriteBrReg]>;
+  def TCRETURNrix17 : Pseudo<(outs), (ins tcGPRx17:$dst, i32imm:$FPDiff), []>,
+                      Sched<[WriteBrReg]>;
+  def TCRETURNrinotx16 : Pseudo<(outs), (ins tcGPRnotx16:$dst, i32imm:$FPDiff), []>,
                       Sched<[WriteBrReg]>;
 }
 
 def : Pat<(AArch64tcret tcGPR64:$dst, (i32 timm:$FPDiff)),
           (TCRETURNri tcGPR64:$dst, imm:$FPDiff)>,
-      Requires<[NotUseBTI]>;
-def : Pat<(AArch64tcret rtcGPR64:$dst, (i32 timm:$FPDiff)),
-          (TCRETURNriBTI rtcGPR64:$dst, imm:$FPDiff)>,
-      Requires<[UseBTI]>;
+      Requires<[TailCallAny]>;
+def : Pat<(AArch64tcret tcGPRx16x17:$dst, (i32 timm:$FPDiff)),
+          (TCRETURNrix16x17 tcGPRx16x17:$dst, imm:$FPDiff)>,
+      Requires<[TailCallX16X17]>;
+def : Pat<(AArch64tcret tcGPRx17:$dst, (i32 timm:$FPDiff)),
+          (TCRETURNrix17 tcGPRx17:$dst, imm:$FPDiff)>,
+      Requires<[TailCallX17]>;
+def : Pat<(AArch64tcret tcGPRnotx16:$dst, (i32 timm:$FPDiff)),
+          (TCRETURNrinotx16 tcGPRnotx16:$dst, imm:$FPDiff)>,
+      Requires<[TailCallNotX16]>;
+
 def : Pat<(AArch64tcret tglobaladdr:$dst, (i32 timm:$FPDiff)),
           (TCRETURNdi texternalsym:$dst, imm:$FPDiff)>;
 def : Pat<(AArch64tcret texternalsym:$dst, (i32 timm:$FPDiff)),
diff --git a/llvm/lib/Target/AArch64/AArch64RegisterInfo.td b/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
index b70ab856888478..569944e0e660b7 100644
--- a/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
@@ -217,11 +217,16 @@ def tcGPR64 : RegisterClass<"AArch64", [i64], 64, (sub GPR64common, X19, X20, X2
                                                      X22, X23, X24, X25, X26,
                                                      X27, X28, FP, LR)>;
 
-// Restricted set of tail call registers, for use when branch target
-// enforcement is enabled. These are the only registers which can be used to
-// indirectly branch (not call) to the "BTI c" instruction at the start of a
-// BTI-protected function.
-def rtcGPR64 : RegisterClass<"AArch64", [i64], 64, (add X16, X17)>;
+// Restricted sets of tail call registers, for use when branch target
+// enforcement or PAuthLR are enabled.
+// For BTI, x16 and x17 are the only registers which can be used to indirectly
+// branch (not call) to the "BTI c" instruction at the start of a BTI-protected
+// function.
+// For PAuthLR, x16 must be used in the function epilogue for other purposes,
+// so cannot hold the function pointer.
+def tcGPRx17 : RegisterClass<"AArch64", [i64], 64, (add X17)>;
+def tcGPRx16x17 : RegisterClass<"AArch64", [i64], 64, (add X16, X17)>;
+def tcGPRnotx16 : RegisterClass<"AArch64", [i64], 64, (sub tcGPR64, X16)>;
 
 // Register set that excludes registers that are reserved for procedure calls.
 // This is used for pseudo-instructions that are actually implemented using a
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp b/llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
index 55cad848393aed..3dc3d31a34e84e 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
@@ -1012,16 +1012,23 @@ bool AArch64CallLowering::isEligibleForTailCallOptimization(
 
 static unsigned getCallOpcode(const MachineFunction &CallerF, bool IsIndirect,
                               bool IsTailCall) {
+  const AArch64FunctionInfo *FuncInfo = CallerF.getInfo<AArch64FunctionInfo>();
+
   if (!IsTailCall)
     return IsIndirect ? getBLRCallOpcode(CallerF) : (unsigned)AArch64::BL;
 
   if (!IsIndirect)
     return AArch64::TCRETURNdi;
 
-  // When BTI is enabled, we need to use TCRETURNriBTI to make sure that we use
-  // x16 or x17.
-  if (CallerF.getInfo<AArch64FunctionInfo>()->branchTargetEnforcement())
-    return AArch64::TCRETURNriBTI;
+  // When BTI or PAuthLR are enabled, there are restrictions on using x16 and
+  // x17 to hold the function pointer.
+  if (FuncInfo->branchTargetEnforcement()) {
+    if (FuncInfo->branchProtectionPAuthLR())
+      return AArch64::TCRETURNrix17;
+    else
+      return AArch64::TCRETURNrix16x17;
+  } else if (FuncInfo->branchProtectionPAuthLR())
+    return AArch64::TCRETURNrinotx16;
 
   return AArch64::TCRETURNri;
 }
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
index b8e5e7bbdaba77..0fc4d7f1991061 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
@@ -273,7 +273,9 @@ AArch64RegisterBankInfo::getRegBankFromRegClass(const TargetRegisterClass &RC,
   case AArch64::GPR64common_and_GPR64noipRegClassID:
   case AArch64::GPR64noip_and_tcGPR64RegClassID:
   case AArch64::tcGPR64RegClassID:
-  case AArch64::rtcGPR64RegClassID:
+  case AArch64::tcGPRx16x17RegClassID:
+  case AArch64::tcGPRx17RegClassID:
+  case AArch64::tcGPRnotx16RegClassID:
   case AArch64::WSeqPairsClassRegClassID:
   case AArch64::XSeqPairsClassRegClassID:
   case AArch64::MatrixIndexGPR32_8_11RegClassID:
diff --git a/llvm/test/CodeGen/AArch64/branch-target-enforcement-indirect-calls.ll b/llvm/test/CodeGen/AArch64/branch-target-enforcement-indirect-calls.ll
index de543f4e4d9424..833a6d5b1d1da0 100644
--- a/llvm/test/CodeGen/AArch64/branch-target-enforcement-indirect-calls.ll
+++ b/llvm/test/CodeGen/AArch64/branch-target-enforcement-indirect-calls.ll
@@ -26,3 +26,68 @@ entry:
 ; CHECK: br {{x16|x17}}
   ret void
 }
+define void @bti_enabled_force_x10(ptr %p) "branch-target-enforcement"="true" {
+entry:
+  %p_x10 = tail call ptr asm "", "={x10},{x10},~{lr}"(ptr %p)
+  tail call void %p_x10()
+; CHECK: br {{x16|x17}}
+  ret void
+}
+
+; sign-return-address places no further restrictions on the tail-call register.
+
+define void @bti_enabled_pac_enabled(ptr %p) "branch-target-enforcement"="true" "sign-return-address"="all" {
+entry:
+  tail call void %p()
+; CHECK: br {{x16|x17}}
+  ret void
+}
+define void @bti_enabled_pac_enabled_force_x10(ptr %p) "branch-target-enforcement"="true" "sign-return-address"="all" {
+entry:
+  %p_x10 = tail call ptr asm "", "={x10},{x10},~{lr}"(ptr %p)
+  tail call void %p_x10()
+; CHECK: br {{x16|x17}}
+  ret void
+}
+
+; PAuthLR needs to use x16 to hold the address of the signing instruction. That
+; can't be changed because the hint instruction only uses that register, so the
+; only choice for the tail-call function pointer is x17.
+
+define void @bti_enabled_pac_pc_enabled(ptr %p) "branch-target-enforcement"="true" "sign-return-address"="all" "branch-protection-pauth-lr"="true" {
+entry:
+  tail call void %p()
+; CHECK: br x17
+  ret void
+}
+define void @bti_enabled_pac_pc_enabled_force_x16(ptr %p) "branch-target-enforcement"="true" "sign-return-address"="all" "branch-protection-pauth-lr"="true" {
+entry:
+  %p_x16 = tail call ptr asm "", "={x16},{x16},~{lr}"(ptr %p)
+  tail call void %p_x16()
+; CHECK: br x17
+  ret void
+}
+
+; PAuthLR by itself prevents x16 from being used, but any other
+; non-callee-saved register can be used.
+
+define void @pac_pc_enabled(ptr %p) "sign-return-address"="all" "branch-protection-pauth-lr"="true" {
+entry:
+  tail call void %p()
+; CHECK: br {{(x[0-9]|x1[0-578])$}}
+  ret void
+}
+define void @pac_pc_enabled_force_x16(ptr %p) "sign-return-address"="all" "branch-protection-pauth-lr"="true" {
+entry:
+  %p_x16 = tail call ptr asm "", "={x16},{x16},~{lr}"(ptr %p)
+  tail call void %p_x16()
+; CHECK: br {{(x[0-9]|x1[0-578])$}}
+  ret void
+}
+define void @pac_pc_enabled_force_x17(ptr %p) "sign-return-address"="all" "branch-protection-pauth-lr"="true" {
+entry:
+  %p_x17 = tail call ptr asm "", "={x17},{x17},~{lr}"(ptr %p)
+  tail call void %p_x17()
+; CHECK: br x17
+  ret void
+}
diff --git a/llvm/test/CodeGen/AArch64/kcfi-bti.ll b/llvm/test/CodeGen/AArch64/kcfi-bti.ll
index 12cde4371e15b1..d3febb536824e3 100644
--- a/llvm/test/CodeGen/AArch64/kcfi-bti.ll
+++ b/llvm/test/CodeGen/AArch64/kcfi-bti.ll
@@ -73,11 +73,11 @@ define void @f3(ptr noundef %x) {
 ; MIR-LABEL: name: f3
 ; MIR: body:
 
-; ISEL: TCRETURNriBTI %1, 0, csr_aarch64_aapcs, implicit $sp, cfi-type 12345678
+; ISEL: TCRETURNrix16x17 %1, 0, csr_aarch64_aapcs, implicit $sp, cfi-type 12345678
 
 ; KCFI:       BUNDLE{{.*}} {
 ; KCFI-NEXT:    KCFI_CHECK $x16, 12345678, implicit-def $x9, implicit-def $x16, implicit-def $x17, implicit-def $nzcv
-; KCFI-NEXT:    TCRETURNriBTI internal killed $x16, 0, csr_aarch64_aapcs, implicit $sp
+; KCFI-NEXT:    TCRETURNrix16x17 internal killed $x16, 0, csr_aarch64_aapcs, implicit $sp
 ; KCFI-NEXT:  }
 
   tail call void %x() [ "kcfi"(i32 12345678) ]

>From bc1c831d93eec9bfdef05401836316948849e8a1 Mon Sep 17 00:00:00 2001
From: Daniel Chen <cdchen at ca.ibm.com>
Date: Thu, 8 Feb 2024 10:38:50 -0500
Subject: [PATCH 41/72] [Flang] Update the fix of PR 80738 to cover generic
 interface inside modules (#81087)

The following test case crashes. The problem is that the fix for PR
https://github.com/llvm/llvm-project/pull/80738 is not quite complete.
It should call `GetUltimate()` on the `interface_` symbol before
checking whether it is generic.


```
  MODULE M

    CONTAINS

    FUNCTION Int(Arg)
    INTEGER :: Int, Arg
      Int = Arg
    END FUNCTION

    FUNCTION Int8(Arg)
    INTEGER(8) :: Int8, Arg
      Int8 = 8_8
    END FUNCTION

  END MODULE

  MODULE M1
  USE M

    INTERFACE Int8
      MODULE PROCEDURE  Int
      MODULE PROCEDURE  Int8
    END INTERFACE

  END MODULE

  PROGRAM PtrAssignGen
  USE M
  USE M1
  IMPLICIT NONE

  INTERFACE Int
    MODULE PROCEDURE  Int
    MODULE PROCEDURE  Int8
  END INTERFACE

  PROCEDURE(Int8),   POINTER :: PtrInt8

  PtrInt8 => Int8
  IF ( PtrInt8(100_8) .NE. 8_8 ) ERROR STOP 12

  END
  ```
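
Why resolving the ultimate symbol first matters can be sketched with a
toy model. Everything below is hypothetical and is not Flang's real
symbol API: `ToySymbol`, `ultimate`, `interfaceBeforeFix` and
`interfaceAfterFix` are illustrative names, and a USE-associated name is
modelled as an alias that points at the symbol it was imported from.

```cpp
#include <cassert>
#include <string>

// Toy model of a symbol table entry; NOT Flang's real classes.
struct ToySymbol {
  std::string name;
  const ToySymbol *useTarget = nullptr; // non-null for a USE-associated alias
  bool isGeneric = false;               // true for a generic interface
  const ToySymbol *specific = nullptr;  // same-named specific, if generic

  // Follow the alias chain back to the originally declared symbol.
  const ToySymbol &ultimate() const {
    const ToySymbol *s = this;
    while (s->useTarget)
      s = s->useTarget;
    return *s;
  }
};

// Before the fix: the alias itself is asked whether it is generic. An
// alias never is, so the alias is returned as the procedure interface.
inline const ToySymbol *interfaceBeforeFix(const ToySymbol &sym) {
  return sym.isGeneric ? sym.specific : &sym;
}

// After the fix: resolve the alias first, then look for the specific.
inline const ToySymbol *interfaceAfterFix(const ToySymbol &sym) {
  const ToySymbol &ult = sym.ultimate();
  // Toy-model invariant only: every generic here carries a specific.
  assert(!ult.isGeneric || ult.specific != nullptr);
  return ult.isGeneric ? ult.specific : &ult;
}
```

In this model, a name that reaches the program through `USE M1` is an
alias of the generic `Int8`, so only the second helper ends up at the
specific function.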
---
 flang/lib/Semantics/resolve-names.cpp | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp
index 36deab969456d0..2a42c79161468a 100644
--- a/flang/lib/Semantics/resolve-names.cpp
+++ b/flang/lib/Semantics/resolve-names.cpp
@@ -5648,9 +5648,10 @@ void DeclarationVisitor::Post(const parser::ProcDecl &x) {
   const auto &name{std::get<parser::Name>(x.t)};
   const Symbol *procInterface{nullptr};
   if (interfaceName_) {
-    procInterface = interfaceName_->symbol->has<GenericDetails>()
-        ? interfaceName_->symbol->get<GenericDetails>().specific()
-        : interfaceName_->symbol;
+    Symbol *ultimate{&interfaceName_->symbol->GetUltimate()};
+    procInterface = ultimate->has<GenericDetails>()
+        ? ultimate->get<GenericDetails>().specific()
+        : ultimate;
   }
   auto attrs{HandleSaveName(name.source, GetAttrs())};
   DerivedTypeDetails *dtDetails{nullptr};

>From bf6695827fe76f56fecb47fdf9ef089e7d4f4ce2 Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Thu, 8 Feb 2024 16:41:02 +0100
Subject: [PATCH 42/72] [InstCombine] Add tests for #77108 (NFC)

---
 .../Transforms/InstCombine/dependent-ivs.ll   | 374 ++++++++++++++++++
 1 file changed, 374 insertions(+)
 create mode 100644 llvm/test/Transforms/InstCombine/dependent-ivs.ll

diff --git a/llvm/test/Transforms/InstCombine/dependent-ivs.ll b/llvm/test/Transforms/InstCombine/dependent-ivs.ll
new file mode 100644
index 00000000000000..bd6679121dcac6
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/dependent-ivs.ll
@@ -0,0 +1,374 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes=instcombine < %s | FileCheck %s
+
+define void @int_iv_nuw(i64 %base, i64 %end) {
+; CHECK-LABEL: define void @int_iv_nuw(
+; CHECK-SAME: i64 [[BASE:%.*]], i64 [[END:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV2:%.*]] = phi i64 [ [[IV2_NEXT:%.*]], [[LOOP]] ], [ [[BASE]], [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[LOOP]] ], [ 0, [[ENTRY]] ]
+; CHECK-NEXT:    call void @use.i64(i64 [[IV2]])
+; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], 4
+; CHECK-NEXT:    [[IV2_NEXT]] = add nuw i64 [[IV_NEXT]], [[BASE]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i64 [[IV_NEXT]], [[END]]
+; CHECK-NEXT:    br i1 [[CMP]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv2 = phi i64 [ %iv2.next, %loop ], [ %base, %entry ]
+  %iv = phi i64 [ %iv.next, %loop ], [ 0, %entry ]
+  call void @use.i64(i64 %iv2)
+  %iv.next = add nuw nsw i64 %iv, 4
+  %iv2.next = add nuw i64 %iv.next, %base
+  %cmp = icmp eq i64 %iv.next, %end
+  br i1 %cmp, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define void @int_iv_nsw(i64 %base, i64 %end) {
+; CHECK-LABEL: define void @int_iv_nsw(
+; CHECK-SAME: i64 [[BASE:%.*]], i64 [[END:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV2:%.*]] = phi i64 [ [[IV2_NEXT:%.*]], [[LOOP]] ], [ [[BASE]], [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[LOOP]] ], [ 0, [[ENTRY]] ]
+; CHECK-NEXT:    call void @use.i64(i64 [[IV2]])
+; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], 4
+; CHECK-NEXT:    [[IV2_NEXT]] = add nsw i64 [[IV_NEXT]], [[BASE]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i64 [[IV_NEXT]], [[END]]
+; CHECK-NEXT:    br i1 [[CMP]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv2 = phi i64 [ %iv2.next, %loop ], [ %base, %entry ]
+  %iv = phi i64 [ %iv.next, %loop ], [ 0, %entry ]
+  call void @use.i64(i64 %iv2)
+  %iv.next = add nuw nsw i64 %iv, 4
+  %iv2.next = add nsw i64 %iv.next, %base
+  %cmp = icmp eq i64 %iv.next, %end
+  br i1 %cmp, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define void @int_iv_commuted(i64 %base, i64 %end) {
+; CHECK-LABEL: define void @int_iv_commuted(
+; CHECK-SAME: i64 [[BASE:%.*]], i64 [[END:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[BASE2:%.*]] = mul i64 [[BASE]], 42
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV2:%.*]] = phi i64 [ [[IV2_NEXT:%.*]], [[LOOP]] ], [ [[BASE2]], [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[LOOP]] ], [ 0, [[ENTRY]] ]
+; CHECK-NEXT:    call void @use.i64(i64 [[IV2]])
+; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], 4
+; CHECK-NEXT:    [[IV2_NEXT]] = add i64 [[BASE2]], [[IV_NEXT]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i64 [[IV_NEXT]], [[END]]
+; CHECK-NEXT:    br i1 [[CMP]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  %base2 = mul i64 %base, 42 ; thwart complexity-based canonicalization
+  br label %loop
+
+loop:
+  %iv2 = phi i64 [ %iv2.next, %loop ], [ %base2, %entry ]
+  %iv = phi i64 [ %iv.next, %loop ], [ 0, %entry ]
+  call void @use.i64(i64 %iv2)
+  %iv.next = add nuw nsw i64 %iv, 4
+  %iv2.next = add i64 %base2, %iv.next
+  %cmp = icmp eq i64 %iv.next, %end
+  br i1 %cmp, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define void @int_iv_vector(<2 x i64> %base) {
+; CHECK-LABEL: define void @int_iv_vector(
+; CHECK-SAME: <2 x i64> [[BASE:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV2:%.*]] = phi <2 x i64> [ [[IV2_NEXT:%.*]], [[LOOP]] ], [ [[BASE]], [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[IV:%.*]] = phi <2 x i64> [ [[IV_NEXT:%.*]], [[LOOP]] ], [ zeroinitializer, [[ENTRY]] ]
+; CHECK-NEXT:    call void @use.v2i64(<2 x i64> [[IV2]])
+; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw <2 x i64> [[IV]], <i64 4, i64 4>
+; CHECK-NEXT:    [[IV2_NEXT]] = add <2 x i64> [[IV_NEXT]], [[BASE]]
+; CHECK-NEXT:    [[CMP:%.*]] = call i1 @get.i1()
+; CHECK-NEXT:    br i1 [[CMP]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv2 = phi <2 x i64> [ %iv2.next, %loop ], [ %base, %entry ]
+  %iv = phi <2 x i64> [ %iv.next, %loop ], [ zeroinitializer, %entry ]
+  call void @use.v2i64(<2 x i64> %iv2)
+  %iv.next = add nuw nsw <2 x i64> %iv, <i64 4, i64 4>
+  %iv2.next = add <2 x i64> %iv.next, %base
+  %cmp = call i1 @get.i1()
+  br i1 %cmp, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define void @int_iv_loop_variant_step(i64 %base, i64 %end) {
+; CHECK-LABEL: define void @int_iv_loop_variant_step(
+; CHECK-SAME: i64 [[BASE:%.*]], i64 [[END:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV2:%.*]] = phi i64 [ [[IV2_NEXT:%.*]], [[LOOP]] ], [ [[BASE]], [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[LOOP]] ], [ 0, [[ENTRY]] ]
+; CHECK-NEXT:    call void @use.i64(i64 [[IV2]])
+; CHECK-NEXT:    [[STEP:%.*]] = call i64 @get.i64()
+; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], [[STEP]]
+; CHECK-NEXT:    [[IV2_NEXT]] = add nuw i64 [[IV_NEXT]], [[BASE]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i64 [[IV_NEXT]], [[END]]
+; CHECK-NEXT:    br i1 [[CMP]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv2 = phi i64 [ %iv2.next, %loop ], [ %base, %entry ]
+  %iv = phi i64 [ %iv.next, %loop ], [ 0, %entry ]
+  call void @use.i64(i64 %iv2)
+  %step = call i64 @get.i64()
+  %iv.next = add nuw nsw i64 %iv, %step
+  %iv2.next = add nuw i64 %iv.next, %base
+  %cmp = icmp eq i64 %iv.next, %end
+  br i1 %cmp, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define void @ptr_iv_inbounds(ptr %base, i64 %end) {
+; CHECK-LABEL: define void @ptr_iv_inbounds(
+; CHECK-SAME: ptr [[BASE:%.*]], i64 [[END:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV_PTR:%.*]] = phi ptr [ [[IV_PTR_NEXT:%.*]], [[LOOP]] ], [ [[BASE]], [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[LOOP]] ], [ 0, [[ENTRY]] ]
+; CHECK-NEXT:    call void @use.p0(ptr [[IV_PTR]])
+; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], 4
+; CHECK-NEXT:    [[IV_PTR_NEXT]] = getelementptr inbounds i8, ptr [[BASE]], i64 [[IV_NEXT]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i64 [[IV_NEXT]], [[END]]
+; CHECK-NEXT:    br i1 [[CMP]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv.ptr = phi ptr [ %iv.ptr.next, %loop ], [ %base, %entry ]
+  %iv = phi i64 [ %iv.next, %loop ], [ 0, %entry ]
+  call void @use.p0(ptr %iv.ptr)
+  %iv.next = add nuw nsw i64 %iv, 4
+  %iv.ptr.next = getelementptr inbounds i8, ptr %base, i64 %iv.next
+  %cmp = icmp eq i64 %iv.next, %end
+  br i1 %cmp, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define void @ptr_iv_no_inbounds(ptr %base, i64 %end) {
+; CHECK-LABEL: define void @ptr_iv_no_inbounds(
+; CHECK-SAME: ptr [[BASE:%.*]], i64 [[END:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV_PTR:%.*]] = phi ptr [ [[IV_PTR_NEXT:%.*]], [[LOOP]] ], [ [[BASE]], [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[LOOP]] ], [ 0, [[ENTRY]] ]
+; CHECK-NEXT:    call void @use.p0(ptr [[IV_PTR]])
+; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], 4
+; CHECK-NEXT:    [[IV_PTR_NEXT]] = getelementptr i8, ptr [[BASE]], i64 [[IV_NEXT]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i64 [[IV_NEXT]], [[END]]
+; CHECK-NEXT:    br i1 [[CMP]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv.ptr = phi ptr [ %iv.ptr.next, %loop ], [ %base, %entry ]
+  %iv = phi i64 [ %iv.next, %loop ], [ 0, %entry ]
+  call void @use.p0(ptr %iv.ptr)
+  %iv.next = add nuw nsw i64 %iv, 4
+  %iv.ptr.next = getelementptr i8, ptr %base, i64 %iv.next
+  %cmp = icmp eq i64 %iv.next, %end
+  br i1 %cmp, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define void @ptr_iv_vector(<2 x ptr> %base, i64 %end) {
+; CHECK-LABEL: define void @ptr_iv_vector(
+; CHECK-SAME: <2 x ptr> [[BASE:%.*]], i64 [[END:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV_PTR:%.*]] = phi <2 x ptr> [ [[IV_PTR_NEXT:%.*]], [[LOOP]] ], [ [[BASE]], [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[LOOP]] ], [ 0, [[ENTRY]] ]
+; CHECK-NEXT:    call void @use.v2p0(<2 x ptr> [[IV_PTR]])
+; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], 4
+; CHECK-NEXT:    [[IV_PTR_NEXT]] = getelementptr inbounds i8, <2 x ptr> [[BASE]], i64 [[IV_NEXT]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i64 [[IV_NEXT]], [[END]]
+; CHECK-NEXT:    br i1 [[CMP]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv.ptr = phi <2 x ptr> [ %iv.ptr.next, %loop ], [ %base, %entry ]
+  %iv = phi i64 [ %iv.next, %loop ], [ 0, %entry ]
+  call void @use.v2p0(<2 x ptr> %iv.ptr)
+  %iv.next = add nuw nsw i64 %iv, 4
+  %iv.ptr.next = getelementptr inbounds i8, <2 x ptr> %base, i64 %iv.next
+  %cmp = icmp eq i64 %iv.next, %end
+  br i1 %cmp, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define void @ptr_iv_vector2(<2 x ptr> %base) {
+; CHECK-LABEL: define void @ptr_iv_vector2(
+; CHECK-SAME: <2 x ptr> [[BASE:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV_PTR:%.*]] = phi <2 x ptr> [ [[IV_PTR_NEXT:%.*]], [[LOOP]] ], [ [[BASE]], [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[IV:%.*]] = phi <2 x i64> [ [[IV_NEXT:%.*]], [[LOOP]] ], [ zeroinitializer, [[ENTRY]] ]
+; CHECK-NEXT:    call void @use.v2p0(<2 x ptr> [[IV_PTR]])
+; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw <2 x i64> [[IV]], <i64 4, i64 4>
+; CHECK-NEXT:    [[IV_PTR_NEXT]] = getelementptr i8, <2 x ptr> [[BASE]], <2 x i64> [[IV_NEXT]]
+; CHECK-NEXT:    [[CMP:%.*]] = call i1 @get.i1()
+; CHECK-NEXT:    br i1 [[CMP]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv.ptr = phi <2 x ptr> [ %iv.ptr.next, %loop ], [ %base, %entry ]
+  %iv = phi <2 x i64> [ %iv.next, %loop ], [ zeroinitializer, %entry ]
+  call void @use.v2p0(<2 x ptr> %iv.ptr)
+  %iv.next = add nuw nsw <2 x i64> %iv, <i64 4, i64 4>
+  %iv.ptr.next = getelementptr i8, <2 x ptr> %base, <2 x i64> %iv.next
+  %cmp = call i1 @get.i1()
+  br i1 %cmp, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define void @wrong_start_value(i64 %base, i64 %end) {
+; CHECK-LABEL: define void @wrong_start_value(
+; CHECK-SAME: i64 [[BASE:%.*]], i64 [[END:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV2:%.*]] = phi i64 [ [[IV2_NEXT:%.*]], [[LOOP]] ], [ [[BASE]], [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[LOOP]] ], [ 1, [[ENTRY]] ]
+; CHECK-NEXT:    call void @use.i64(i64 [[IV2]])
+; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], 4
+; CHECK-NEXT:    [[IV2_NEXT]] = add i64 [[IV_NEXT]], [[BASE]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i64 [[IV_NEXT]], [[END]]
+; CHECK-NEXT:    br i1 [[CMP]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv2 = phi i64 [ %iv2.next, %loop ], [ %base, %entry ]
+  %iv = phi i64 [ %iv.next, %loop ], [ 1, %entry ]
+  call void @use.i64(i64 %iv2)
+  %iv.next = add nuw nsw i64 %iv, 4
+  %iv2.next = add i64 %base, %iv.next
+  %cmp = icmp eq i64 %iv.next, %end
+  br i1 %cmp, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define void @different_loops(i64 %base) {
+; CHECK-LABEL: define void @different_loops(
+; CHECK-SAME: i64 [[BASE:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP1:%.*]]
+; CHECK:       loop1:
+; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[LOOP1]] ], [ 0, [[ENTRY:%.*]] ]
+; CHECK-NEXT:    call void @use.i64(i64 [[IV]])
+; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], 4
+; CHECK-NEXT:    [[CMP:%.*]] = call i1 @get.i1()
+; CHECK-NEXT:    br i1 [[CMP]], label [[LOOP2:%.*]], label [[LOOP1]]
+; CHECK:       loop2:
+; CHECK-NEXT:    [[IV2:%.*]] = phi i64 [ [[IV2_NEXT:%.*]], [[LOOP2]] ], [ [[BASE]], [[LOOP1]] ]
+; CHECK-NEXT:    call void @use.i64(i64 [[IV2]])
+; CHECK-NEXT:    [[IV2_NEXT]] = add nuw i64 [[IV_NEXT]], [[BASE]]
+; CHECK-NEXT:    [[CMP2:%.*]] = call i1 @get.i1()
+; CHECK-NEXT:    br i1 [[CMP2]], label [[EXIT:%.*]], label [[LOOP2]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop1
+
+loop1:
+  %iv = phi i64 [ %iv.next, %loop1 ], [ 0, %entry ]
+  call void @use.i64(i64 %iv)
+  %iv.next = add nuw nsw i64 %iv, 4
+  %cmp = call i1 @get.i1()
+  br i1 %cmp, label %loop2, label %loop1
+
+loop2:
+  %iv2 = phi i64 [ %iv2.next, %loop2 ], [ %base, %loop1 ]
+  call void @use.i64(i64 %iv2)
+  %iv2.next = add nuw i64 %base, %iv.next
+  %cmp2 = call i1 @get.i1()
+  br i1 %cmp2, label %exit, label %loop2
+
+exit:
+  ret void
+}
+
+declare void @use.p0(ptr)
+declare void @use.v2p0(<2 x ptr>)
+declare void @use.i64(i64)
+declare void @use.v2i64(<2 x i64>)
+declare i1 @get.i1()
+declare i64 @get.i64()

From ffd821f6c7d5ed5ccb6ce20c3f83a5c1163bc176 Mon Sep 17 00:00:00 2001
From: Francesco Petrogalli <francesco.petrogalli at apple.com>
Date: Thu, 8 Feb 2024 16:54:12 +0100
Subject: [PATCH 43/72] [CodeGen] Add ValueType v3i8 (NFCI). (#80826)

---
 llvm/include/llvm/CodeGen/ValueTypes.td       | 363 +++++++++---------
 llvm/lib/CodeGen/ValueTypes.cpp               |   2 +
 llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp |   3 +
 3 files changed, 187 insertions(+), 181 deletions(-)
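
The large diffstat for ValueTypes.td follows from how the table is numbered: each entry carries an explicit, consecutive value, so adding v3i8 after v2i8 moves every later entry up by one. The following is only a minimal sketch of that effect, not LLVM code. `ValueTypeTable`, `valueOf`, `insertAfter`, and `makeExcerptBeforePatch` are hypothetical names introduced for illustration, and the table holds just a five-entry excerpt of the i8 vector types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the numbered list in ValueTypes.td: an entry's
// value is the value of the first entry plus its position in the list.
struct ValueTypeTable {
  int firstValue;                  // value assigned to names[0]
  std::vector<std::string> names;  // entry names in declaration order

  // Returns the numeric value of `name`, or -1 if it is not in the table.
  int valueOf(const std::string &name) const {
    for (std::size_t i = 0; i < names.size(); ++i)
      if (names[i] == name)
        return firstValue + static_cast<int>(i);
    return -1;
  }

  // Inserts `newName` directly after `after`; all later entries shift by one.
  void insertAfter(const std::string &after, const std::string &newName) {
    for (std::size_t i = 0; i < names.size(); ++i) {
      if (names[i] == after) {
        names.insert(names.begin() + static_cast<std::ptrdiff_t>(i) + 1,
                     newName);
        return;
      }
    }
    assert(false && "anchor entry not found");
  }
};

// Excerpt of the i8 vector types with the values they had before the patch.
inline ValueTypeTable makeExcerptBeforePatch() {
  return {33, {"v1i8", "v2i8", "v4i8", "v8i8", "v16i8"}};
}
```

In this model, calling `insertAfter("v2i8", "v3i8")` gives v3i8 the value 35 and moves v4i8 from 35 to 36, which matches the renumbering visible in the hunk below.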

diff --git a/llvm/include/llvm/CodeGen/ValueTypes.td b/llvm/include/llvm/CodeGen/ValueTypes.td
index 55baaf867e7325..10547383065309 100644
--- a/llvm/include/llvm/CodeGen/ValueTypes.td
+++ b/llvm/include/llvm/CodeGen/ValueTypes.td
@@ -97,192 +97,193 @@ def v128i4  : VTVec<128,  i4, 32>;   //  128 x i4 vector value
 
 def v1i8    : VTVec<1,    i8, 33>;  //    1 x i8 vector value
 def v2i8    : VTVec<2,    i8, 34>;  //    2 x i8 vector value
-def v4i8    : VTVec<4,    i8, 35>;  //    4 x i8 vector value
-def v8i8    : VTVec<8,    i8, 36>;  //    8 x i8 vector value
-def v16i8   : VTVec<16,   i8, 37>;  //   16 x i8 vector value
-def v32i8   : VTVec<32,   i8, 38>;  //   32 x i8 vector value
-def v64i8   : VTVec<64,   i8, 39>;  //   64 x i8 vector value
-def v128i8  : VTVec<128,  i8, 40>;  //  128 x i8 vector value
-def v256i8  : VTVec<256,  i8, 41>;  //  256 x i8 vector value
-def v512i8  : VTVec<512,  i8, 42>;  //  512 x i8 vector value
-def v1024i8 : VTVec<1024, i8, 43>;  // 1024 x i8 vector value
-
-def v1i16   : VTVec<1,   i16, 44>;  //   1 x i16 vector value
-def v2i16   : VTVec<2,   i16, 45>;  //   2 x i16 vector value
-def v3i16   : VTVec<3,   i16, 46>;  //   3 x i16 vector value
-def v4i16   : VTVec<4,   i16, 47>;  //   4 x i16 vector value
-def v8i16   : VTVec<8,   i16, 48>;  //   8 x i16 vector value
-def v16i16  : VTVec<16,  i16, 49>;  //  16 x i16 vector value
-def v32i16  : VTVec<32,  i16, 50>;  //  32 x i16 vector value
-def v64i16  : VTVec<64,  i16, 51>;  //  64 x i16 vector value
-def v128i16 : VTVec<128, i16, 52>;  // 128 x i16 vector value
-def v256i16 : VTVec<256, i16, 53>;  // 256 x i16 vector value
-def v512i16 : VTVec<512, i16, 54>;  // 512 x i16 vector value
-
-def v1i32    : VTVec<1,    i32, 55>;  //    1 x i32 vector value
-def v2i32    : VTVec<2,    i32, 56>;  //    2 x i32 vector value
-def v3i32    : VTVec<3,    i32, 57>;  //    3 x i32 vector value
-def v4i32    : VTVec<4,    i32, 58>;  //    4 x i32 vector value
-def v5i32    : VTVec<5,    i32, 59>;  //    5 x i32 vector value
-def v6i32    : VTVec<6,    i32, 60>;  //    6 x f32 vector value
-def v7i32    : VTVec<7,    i32, 61>;  //    7 x f32 vector value
-def v8i32    : VTVec<8,    i32, 62>;  //    8 x i32 vector value
-def v9i32    : VTVec<9,    i32, 63>;  //    9 x i32 vector value
-def v10i32   : VTVec<10,   i32, 64>;  //   10 x i32 vector value
-def v11i32   : VTVec<11,   i32, 65>;  //   11 x i32 vector value
-def v12i32   : VTVec<12,   i32, 66>;  //   12 x i32 vector value
-def v16i32   : VTVec<16,   i32, 67>;  //   16 x i32 vector value
-def v32i32   : VTVec<32,   i32, 68>;  //   32 x i32 vector value
-def v64i32   : VTVec<64,   i32, 69>;  //   64 x i32 vector value
-def v128i32  : VTVec<128,  i32, 70>;  //  128 x i32 vector value
-def v256i32  : VTVec<256,  i32, 71>;  //  256 x i32 vector value
-def v512i32  : VTVec<512,  i32, 72>;  //  512 x i32 vector value
-def v1024i32 : VTVec<1024, i32, 73>;  // 1024 x i32 vector value
-def v2048i32 : VTVec<2048, i32, 74>;  // 2048 x i32 vector value
-
-def v1i64   : VTVec<1,   i64, 75>;  //   1 x i64 vector value
-def v2i64   : VTVec<2,   i64, 76>;  //   2 x i64 vector value
-def v3i64   : VTVec<3,   i64, 77>;  //   3 x i64 vector value
-def v4i64   : VTVec<4,   i64, 78>;  //   4 x i64 vector value
-def v8i64   : VTVec<8,   i64, 79>;  //   8 x i64 vector value
-def v16i64  : VTVec<16,  i64, 80>;  //  16 x i64 vector value
-def v32i64  : VTVec<32,  i64, 81>;  //  32 x i64 vector value
-def v64i64  : VTVec<64,  i64, 82>;  //  64 x i64 vector value
-def v128i64 : VTVec<128, i64, 83>;  // 128 x i64 vector value
-def v256i64 : VTVec<256, i64, 84>;  // 256 x i64 vector value
-
-def v1i128  : VTVec<1,  i128, 85>;  //  1 x i128 vector value
-
-def v1f16    : VTVec<1,    f16,  86>;  //    1 x f16 vector value
-def v2f16    : VTVec<2,    f16,  87>;  //    2 x f16 vector value
-def v3f16    : VTVec<3,    f16,  88>;  //    3 x f16 vector value
-def v4f16    : VTVec<4,    f16,  89>;  //    4 x f16 vector value
-def v8f16    : VTVec<8,    f16,  90>;  //    8 x f16 vector value
-def v16f16   : VTVec<16,   f16,  91>;  //   16 x f16 vector value
-def v32f16   : VTVec<32,   f16,  92>;  //   32 x f16 vector value
-def v64f16   : VTVec<64,   f16,  93>;  //   64 x f16 vector value
-def v128f16  : VTVec<128,  f16,  94>;  //  128 x f16 vector value
-def v256f16  : VTVec<256,  f16,  95>;  //  256 x f16 vector value
-def v512f16  : VTVec<512,  f16,  96>;  //  512 x f16 vector value
-
-def v2bf16   : VTVec<2,   bf16,  97>;  //    2 x bf16 vector value
-def v3bf16   : VTVec<3,   bf16,  98>;  //    3 x bf16 vector value
-def v4bf16   : VTVec<4,   bf16,  99>;  //    4 x bf16 vector value
-def v8bf16   : VTVec<8,   bf16, 100>;  //    8 x bf16 vector value
-def v16bf16  : VTVec<16,  bf16, 101>;  //   16 x bf16 vector value
-def v32bf16  : VTVec<32,  bf16, 102>;  //   32 x bf16 vector value
-def v64bf16  : VTVec<64,  bf16, 103>;  //   64 x bf16 vector value
-def v128bf16 : VTVec<128, bf16, 104>;  //  128 x bf16 vector value
-
-def v1f32    : VTVec<1,    f32, 105>;  //    1 x f32 vector value
-def v2f32    : VTVec<2,    f32, 106>;  //    2 x f32 vector value
-def v3f32    : VTVec<3,    f32, 107>;  //    3 x f32 vector value
-def v4f32    : VTVec<4,    f32, 108>;  //    4 x f32 vector value
-def v5f32    : VTVec<5,    f32, 109>;  //    5 x f32 vector value
-def v6f32    : VTVec<6,    f32, 110>;  //    6 x f32 vector value
-def v7f32    : VTVec<7,    f32, 111>;  //    7 x f32 vector value
-def v8f32    : VTVec<8,    f32, 112>;  //    8 x f32 vector value
-def v9f32    : VTVec<9,    f32, 113>;  //    9 x f32 vector value
-def v10f32   : VTVec<10,   f32, 114>;  //   10 x f32 vector value
-def v11f32   : VTVec<11,   f32, 115>;  //   11 x f32 vector value
-def v12f32   : VTVec<12,   f32, 116>;  //   12 x f32 vector value
-def v16f32   : VTVec<16,   f32, 117>;  //   16 x f32 vector value
-def v32f32   : VTVec<32,   f32, 118>;  //   32 x f32 vector value
-def v64f32   : VTVec<64,   f32, 119>;  //   64 x f32 vector value
-def v128f32  : VTVec<128,  f32, 120>;  //  128 x f32 vector value
-def v256f32  : VTVec<256,  f32, 121>;  //  256 x f32 vector value
-def v512f32  : VTVec<512,  f32, 122>;  //  512 x f32 vector value
-def v1024f32 : VTVec<1024, f32, 123>;  // 1024 x f32 vector value
-def v2048f32 : VTVec<2048, f32, 124>;  // 2048 x f32 vector value
-
-def v1f64    : VTVec<1,    f64, 125>;  //    1 x f64 vector value
-def v2f64    : VTVec<2,    f64, 126>;  //    2 x f64 vector value
-def v3f64    : VTVec<3,    f64, 127>;  //    3 x f64 vector value
-def v4f64    : VTVec<4,    f64, 128>;  //    4 x f64 vector value
-def v8f64    : VTVec<8,    f64, 129>;  //    8 x f64 vector value
-def v16f64   : VTVec<16,   f64, 130>;  //   16 x f64 vector value
-def v32f64   : VTVec<32,   f64, 131>;  //   32 x f64 vector value
-def v64f64   : VTVec<64,   f64, 132>;  //   64 x f64 vector value
-def v128f64  : VTVec<128,  f64, 133>;  //  128 x f64 vector value
-def v256f64  : VTVec<256,  f64, 134>;  //  256 x f64 vector value
-
-def nxv1i1  : VTScalableVec<1,  i1, 135>;  // n x  1 x i1  vector value
-def nxv2i1  : VTScalableVec<2,  i1, 136>;  // n x  2 x i1  vector value
-def nxv4i1  : VTScalableVec<4,  i1, 137>;  // n x  4 x i1  vector value
-def nxv8i1  : VTScalableVec<8,  i1, 138>;  // n x  8 x i1  vector value
-def nxv16i1 : VTScalableVec<16, i1, 139>;  // n x 16 x i1  vector value
-def nxv32i1 : VTScalableVec<32, i1, 140>;  // n x 32 x i1  vector value
-def nxv64i1 : VTScalableVec<64, i1, 141>;  // n x 64 x i1  vector value
-
-def nxv1i8  : VTScalableVec<1,  i8, 142>;  // n x  1 x i8  vector value
-def nxv2i8  : VTScalableVec<2,  i8, 143>;  // n x  2 x i8  vector value
-def nxv4i8  : VTScalableVec<4,  i8, 144>;  // n x  4 x i8  vector value
-def nxv8i8  : VTScalableVec<8,  i8, 145>;  // n x  8 x i8  vector value
-def nxv16i8 : VTScalableVec<16, i8, 146>;  // n x 16 x i8  vector value
-def nxv32i8 : VTScalableVec<32, i8, 147>;  // n x 32 x i8  vector value
-def nxv64i8 : VTScalableVec<64, i8, 148>;  // n x 64 x i8  vector value
-
-def nxv1i16  : VTScalableVec<1,  i16, 149>;  // n x  1 x i16 vector value
-def nxv2i16  : VTScalableVec<2,  i16, 150>;  // n x  2 x i16 vector value
-def nxv4i16  : VTScalableVec<4,  i16, 151>;  // n x  4 x i16 vector value
-def nxv8i16  : VTScalableVec<8,  i16, 152>;  // n x  8 x i16 vector value
-def nxv16i16 : VTScalableVec<16, i16, 153>;  // n x 16 x i16 vector value
-def nxv32i16 : VTScalableVec<32, i16, 154>;  // n x 32 x i16 vector value
-
-def nxv1i32  : VTScalableVec<1,  i32, 155>;  // n x  1 x i32 vector value
-def nxv2i32  : VTScalableVec<2,  i32, 156>;  // n x  2 x i32 vector value
-def nxv4i32  : VTScalableVec<4,  i32, 157>;  // n x  4 x i32 vector value
-def nxv8i32  : VTScalableVec<8,  i32, 158>;  // n x  8 x i32 vector value
-def nxv16i32 : VTScalableVec<16, i32, 159>;  // n x 16 x i32 vector value
-def nxv32i32 : VTScalableVec<32, i32, 160>;  // n x 32 x i32 vector value
-
-def nxv1i64  : VTScalableVec<1,  i64, 161>;  // n x  1 x i64 vector value
-def nxv2i64  : VTScalableVec<2,  i64, 162>;  // n x  2 x i64 vector value
-def nxv4i64  : VTScalableVec<4,  i64, 163>;  // n x  4 x i64 vector value
-def nxv8i64  : VTScalableVec<8,  i64, 164>;  // n x  8 x i64 vector value
-def nxv16i64 : VTScalableVec<16, i64, 165>;  // n x 16 x i64 vector value
-def nxv32i64 : VTScalableVec<32, i64, 166>;  // n x 32 x i64 vector value
-
-def nxv1f16  : VTScalableVec<1,  f16, 167>;  // n x  1 x  f16 vector value
-def nxv2f16  : VTScalableVec<2,  f16, 168>;  // n x  2 x  f16 vector value
-def nxv4f16  : VTScalableVec<4,  f16, 169>;  // n x  4 x  f16 vector value
-def nxv8f16  : VTScalableVec<8,  f16, 170>;  // n x  8 x  f16 vector value
-def nxv16f16 : VTScalableVec<16, f16, 171>;  // n x 16 x  f16 vector value
-def nxv32f16 : VTScalableVec<32, f16, 172>;  // n x 32 x  f16 vector value
-
-def nxv1bf16  : VTScalableVec<1,  bf16, 173>;  // n x  1 x bf16 vector value
-def nxv2bf16  : VTScalableVec<2,  bf16, 174>;  // n x  2 x bf16 vector value
-def nxv4bf16  : VTScalableVec<4,  bf16, 175>;  // n x  4 x bf16 vector value
-def nxv8bf16  : VTScalableVec<8,  bf16, 176>;  // n x  8 x bf16 vector value
-def nxv16bf16 : VTScalableVec<16, bf16, 177>;  // n x 16 x bf16 vector value
-def nxv32bf16 : VTScalableVec<32, bf16, 178>;  // n x 32 x bf16 vector value
-
-def nxv1f32  : VTScalableVec<1,  f32, 179>;  // n x  1 x  f32 vector value
-def nxv2f32  : VTScalableVec<2,  f32, 180>;  // n x  2 x  f32 vector value
-def nxv4f32  : VTScalableVec<4,  f32, 181>;  // n x  4 x  f32 vector value
-def nxv8f32  : VTScalableVec<8,  f32, 182>;  // n x  8 x  f32 vector value
-def nxv16f32 : VTScalableVec<16, f32, 183>;  // n x 16 x  f32 vector value
-
-def nxv1f64  : VTScalableVec<1,  f64, 184>;  // n x  1 x  f64 vector value
-def nxv2f64  : VTScalableVec<2,  f64, 185>;  // n x  2 x  f64 vector value
-def nxv4f64  : VTScalableVec<4,  f64, 186>;  // n x  4 x  f64 vector value
-def nxv8f64  : VTScalableVec<8,  f64, 187>;  // n x  8 x  f64 vector value
-
-def x86mmx    : ValueType<64,   188>;  // X86 MMX value
-def FlagVT    : ValueType<0,    189> { // Pre-RA sched glue
+def v3i8    : VTVec<3,    i8, 35>;  //    3 x i8 vector value
+def v4i8    : VTVec<4,    i8, 36>;  //    4 x i8 vector value
+def v8i8    : VTVec<8,    i8, 37>;  //    8 x i8 vector value
+def v16i8   : VTVec<16,   i8, 38>;  //   16 x i8 vector value
+def v32i8   : VTVec<32,   i8, 39>;  //   32 x i8 vector value
+def v64i8   : VTVec<64,   i8, 40>;  //   64 x i8 vector value
+def v128i8  : VTVec<128,  i8, 41>;  //  128 x i8 vector value
+def v256i8  : VTVec<256,  i8, 42>;  //  256 x i8 vector value
+def v512i8  : VTVec<512,  i8, 43>;  //  512 x i8 vector value
+def v1024i8 : VTVec<1024, i8, 44>;  // 1024 x i8 vector value
+
+def v1i16   : VTVec<1,   i16, 45>;  //   1 x i16 vector value
+def v2i16   : VTVec<2,   i16, 46>;  //   2 x i16 vector value
+def v3i16   : VTVec<3,   i16, 47>;  //   3 x i16 vector value
+def v4i16   : VTVec<4,   i16, 48>;  //   4 x i16 vector value
+def v8i16   : VTVec<8,   i16, 49>;  //   8 x i16 vector value
+def v16i16  : VTVec<16,  i16, 50>;  //  16 x i16 vector value
+def v32i16  : VTVec<32,  i16, 51>;  //  32 x i16 vector value
+def v64i16  : VTVec<64,  i16, 52>;  //  64 x i16 vector value
+def v128i16 : VTVec<128, i16, 53>;  // 128 x i16 vector value
+def v256i16 : VTVec<256, i16, 54>;  // 256 x i16 vector value
+def v512i16 : VTVec<512, i16, 55>;  // 512 x i16 vector value
+
+def v1i32    : VTVec<1,    i32, 56>;  //    1 x i32 vector value
+def v2i32    : VTVec<2,    i32, 57>;  //    2 x i32 vector value
+def v3i32    : VTVec<3,    i32, 58>;  //    3 x i32 vector value
+def v4i32    : VTVec<4,    i32, 59>;  //    4 x i32 vector value
+def v5i32    : VTVec<5,    i32, 60>;  //    5 x i32 vector value
+def v6i32    : VTVec<6,    i32, 61>;  //    6 x f32 vector value
+def v7i32    : VTVec<7,    i32, 62>;  //    7 x f32 vector value
+def v8i32    : VTVec<8,    i32, 63>;  //    8 x i32 vector value
+def v9i32    : VTVec<9,    i32, 64>;  //    9 x i32 vector value
+def v10i32   : VTVec<10,   i32, 65>;  //   10 x i32 vector value
+def v11i32   : VTVec<11,   i32, 66>;  //   11 x i32 vector value
+def v12i32   : VTVec<12,   i32, 67>;  //   12 x i32 vector value
+def v16i32   : VTVec<16,   i32, 68>;  //   16 x i32 vector value
+def v32i32   : VTVec<32,   i32, 69>;  //   32 x i32 vector value
+def v64i32   : VTVec<64,   i32, 70>;  //   64 x i32 vector value
+def v128i32  : VTVec<128,  i32, 71>;  //  128 x i32 vector value
+def v256i32  : VTVec<256,  i32, 72>;  //  256 x i32 vector value
+def v512i32  : VTVec<512,  i32, 73>;  //  512 x i32 vector value
+def v1024i32 : VTVec<1024, i32, 74>;  // 1024 x i32 vector value
+def v2048i32 : VTVec<2048, i32, 75>;  // 2048 x i32 vector value
+
+def v1i64   : VTVec<1,   i64, 76>;  //   1 x i64 vector value
+def v2i64   : VTVec<2,   i64, 77>;  //   2 x i64 vector value
+def v3i64   : VTVec<3,   i64, 78>;  //   3 x i64 vector value
+def v4i64   : VTVec<4,   i64, 79>;  //   4 x i64 vector value
+def v8i64   : VTVec<8,   i64, 80>;  //   8 x i64 vector value
+def v16i64  : VTVec<16,  i64, 81>;  //  16 x i64 vector value
+def v32i64  : VTVec<32,  i64, 82>;  //  32 x i64 vector value
+def v64i64  : VTVec<64,  i64, 83>;  //  64 x i64 vector value
+def v128i64 : VTVec<128, i64, 84>;  // 128 x i64 vector value
+def v256i64 : VTVec<256, i64, 85>;  // 256 x i64 vector value
+
+def v1i128  : VTVec<1,  i128, 86>;  //  1 x i128 vector value
+
+def v1f16    : VTVec<1,    f16,  87>;  //    1 x f16 vector value
+def v2f16    : VTVec<2,    f16,  88>;  //    2 x f16 vector value
+def v3f16    : VTVec<3,    f16,  89>;  //    3 x f16 vector value
+def v4f16    : VTVec<4,    f16,  90>;  //    4 x f16 vector value
+def v8f16    : VTVec<8,    f16,  91>;  //    8 x f16 vector value
+def v16f16   : VTVec<16,   f16,  92>;  //   16 x f16 vector value
+def v32f16   : VTVec<32,   f16,  93>;  //   32 x f16 vector value
+def v64f16   : VTVec<64,   f16,  94>;  //   64 x f16 vector value
+def v128f16  : VTVec<128,  f16,  95>;  //  128 x f16 vector value
+def v256f16  : VTVec<256,  f16,  96>;  //  256 x f16 vector value
+def v512f16  : VTVec<512,  f16,  97>;  //  512 x f16 vector value
+
+def v2bf16   : VTVec<2,   bf16,  98>;  //    2 x bf16 vector value
+def v3bf16   : VTVec<3,   bf16,  99>;  //    3 x bf16 vector value
+def v4bf16   : VTVec<4,   bf16, 100>;  //    4 x bf16 vector value
+def v8bf16   : VTVec<8,   bf16, 101>;  //    8 x bf16 vector value
+def v16bf16  : VTVec<16,  bf16, 102>;  //   16 x bf16 vector value
+def v32bf16  : VTVec<32,  bf16, 103>;  //   32 x bf16 vector value
+def v64bf16  : VTVec<64,  bf16, 104>;  //   64 x bf16 vector value
+def v128bf16 : VTVec<128, bf16, 105>;  //  128 x bf16 vector value
+
+def v1f32    : VTVec<1,    f32, 106>;  //    1 x f32 vector value
+def v2f32    : VTVec<2,    f32, 107>;  //    2 x f32 vector value
+def v3f32    : VTVec<3,    f32, 108>;  //    3 x f32 vector value
+def v4f32    : VTVec<4,    f32, 109>;  //    4 x f32 vector value
+def v5f32    : VTVec<5,    f32, 110>;  //    5 x f32 vector value
+def v6f32    : VTVec<6,    f32, 111>;  //    6 x f32 vector value
+def v7f32    : VTVec<7,    f32, 112>;  //    7 x f32 vector value
+def v8f32    : VTVec<8,    f32, 113>;  //    8 x f32 vector value
+def v9f32    : VTVec<9,    f32, 114>;  //    9 x f32 vector value
+def v10f32   : VTVec<10,   f32, 115>;  //   10 x f32 vector value
+def v11f32   : VTVec<11,   f32, 116>;  //   11 x f32 vector value
+def v12f32   : VTVec<12,   f32, 117>;  //   12 x f32 vector value
+def v16f32   : VTVec<16,   f32, 118>;  //   16 x f32 vector value
+def v32f32   : VTVec<32,   f32, 119>;  //   32 x f32 vector value
+def v64f32   : VTVec<64,   f32, 120>;  //   64 x f32 vector value
+def v128f32  : VTVec<128,  f32, 121>;  //  128 x f32 vector value
+def v256f32  : VTVec<256,  f32, 122>;  //  256 x f32 vector value
+def v512f32  : VTVec<512,  f32, 123>;  //  512 x f32 vector value
+def v1024f32 : VTVec<1024, f32, 124>;  // 1024 x f32 vector value
+def v2048f32 : VTVec<2048, f32, 125>;  // 2048 x f32 vector value
+
+def v1f64    : VTVec<1,    f64, 126>;  //    1 x f64 vector value
+def v2f64    : VTVec<2,    f64, 127>;  //    2 x f64 vector value
+def v3f64    : VTVec<3,    f64, 128>;  //    3 x f64 vector value
+def v4f64    : VTVec<4,    f64, 129>;  //    4 x f64 vector value
+def v8f64    : VTVec<8,    f64, 130>;  //    8 x f64 vector value
+def v16f64   : VTVec<16,   f64, 131>;  //   16 x f64 vector value
+def v32f64   : VTVec<32,   f64, 132>;  //   32 x f64 vector value
+def v64f64   : VTVec<64,   f64, 133>;  //   64 x f64 vector value
+def v128f64  : VTVec<128,  f64, 134>;  //  128 x f64 vector value
+def v256f64  : VTVec<256,  f64, 135>;  //  256 x f64 vector value
+
+def nxv1i1  : VTScalableVec<1,  i1, 136>;  // n x  1 x i1  vector value
+def nxv2i1  : VTScalableVec<2,  i1, 137>;  // n x  2 x i1  vector value
+def nxv4i1  : VTScalableVec<4,  i1, 138>;  // n x  4 x i1  vector value
+def nxv8i1  : VTScalableVec<8,  i1, 139>;  // n x  8 x i1  vector value
+def nxv16i1 : VTScalableVec<16, i1, 140>;  // n x 16 x i1  vector value
+def nxv32i1 : VTScalableVec<32, i1, 141>;  // n x 32 x i1  vector value
+def nxv64i1 : VTScalableVec<64, i1, 142>;  // n x 64 x i1  vector value
+
+def nxv1i8  : VTScalableVec<1,  i8, 143>;  // n x  1 x i8  vector value
+def nxv2i8  : VTScalableVec<2,  i8, 144>;  // n x  2 x i8  vector value
+def nxv4i8  : VTScalableVec<4,  i8, 145>;  // n x  4 x i8  vector value
+def nxv8i8  : VTScalableVec<8,  i8, 146>;  // n x  8 x i8  vector value
+def nxv16i8 : VTScalableVec<16, i8, 147>;  // n x 16 x i8  vector value
+def nxv32i8 : VTScalableVec<32, i8, 148>;  // n x 32 x i8  vector value
+def nxv64i8 : VTScalableVec<64, i8, 149>;  // n x 64 x i8  vector value
+
+def nxv1i16  : VTScalableVec<1,  i16, 150>;  // n x  1 x i16 vector value
+def nxv2i16  : VTScalableVec<2,  i16, 151>;  // n x  2 x i16 vector value
+def nxv4i16  : VTScalableVec<4,  i16, 152>;  // n x  4 x i16 vector value
+def nxv8i16  : VTScalableVec<8,  i16, 153>;  // n x  8 x i16 vector value
+def nxv16i16 : VTScalableVec<16, i16, 154>;  // n x 16 x i16 vector value
+def nxv32i16 : VTScalableVec<32, i16, 155>;  // n x 32 x i16 vector value
+
+def nxv1i32  : VTScalableVec<1,  i32, 156>;  // n x  1 x i32 vector value
+def nxv2i32  : VTScalableVec<2,  i32, 157>;  // n x  2 x i32 vector value
+def nxv4i32  : VTScalableVec<4,  i32, 158>;  // n x  4 x i32 vector value
+def nxv8i32  : VTScalableVec<8,  i32, 159>;  // n x  8 x i32 vector value
+def nxv16i32 : VTScalableVec<16, i32, 160>;  // n x 16 x i32 vector value
+def nxv32i32 : VTScalableVec<32, i32, 161>;  // n x 32 x i32 vector value
+
+def nxv1i64  : VTScalableVec<1,  i64, 162>;  // n x  1 x i64 vector value
+def nxv2i64  : VTScalableVec<2,  i64, 163>;  // n x  2 x i64 vector value
+def nxv4i64  : VTScalableVec<4,  i64, 164>;  // n x  4 x i64 vector value
+def nxv8i64  : VTScalableVec<8,  i64, 165>;  // n x  8 x i64 vector value
+def nxv16i64 : VTScalableVec<16, i64, 166>;  // n x 16 x i64 vector value
+def nxv32i64 : VTScalableVec<32, i64, 167>;  // n x 32 x i64 vector value
+
+def nxv1f16  : VTScalableVec<1,  f16, 168>;  // n x  1 x  f16 vector value
+def nxv2f16  : VTScalableVec<2,  f16, 169>;  // n x  2 x  f16 vector value
+def nxv4f16  : VTScalableVec<4,  f16, 170>;  // n x  4 x  f16 vector value
+def nxv8f16  : VTScalableVec<8,  f16, 171>;  // n x  8 x  f16 vector value
+def nxv16f16 : VTScalableVec<16, f16, 172>;  // n x 16 x  f16 vector value
+def nxv32f16 : VTScalableVec<32, f16, 173>;  // n x 32 x  f16 vector value
+
+def nxv1bf16  : VTScalableVec<1,  bf16, 174>;  // n x  1 x bf16 vector value
+def nxv2bf16  : VTScalableVec<2,  bf16, 175>;  // n x  2 x bf16 vector value
+def nxv4bf16  : VTScalableVec<4,  bf16, 176>;  // n x  4 x bf16 vector value
+def nxv8bf16  : VTScalableVec<8,  bf16, 177>;  // n x  8 x bf16 vector value
+def nxv16bf16 : VTScalableVec<16, bf16, 178>;  // n x 16 x bf16 vector value
+def nxv32bf16 : VTScalableVec<32, bf16, 179>;  // n x 32 x bf16 vector value
+
+def nxv1f32  : VTScalableVec<1,  f32, 180>;  // n x  1 x  f32 vector value
+def nxv2f32  : VTScalableVec<2,  f32, 181>;  // n x  2 x  f32 vector value
+def nxv4f32  : VTScalableVec<4,  f32, 182>;  // n x  4 x  f32 vector value
+def nxv8f32  : VTScalableVec<8,  f32, 183>;  // n x  8 x  f32 vector value
+def nxv16f32 : VTScalableVec<16, f32, 184>;  // n x 16 x  f32 vector value
+
+def nxv1f64  : VTScalableVec<1,  f64, 185>;  // n x  1 x  f64 vector value
+def nxv2f64  : VTScalableVec<2,  f64, 186>;  // n x  2 x  f64 vector value
+def nxv4f64  : VTScalableVec<4,  f64, 187>;  // n x  4 x  f64 vector value
+def nxv8f64  : VTScalableVec<8,  f64, 188>;  // n x  8 x  f64 vector value
+
+def x86mmx    : ValueType<64,   189>;  // X86 MMX value
+def FlagVT    : ValueType<0,    190> { // Pre-RA sched glue
   let LLVMName = "Glue";
 }
-def isVoid    : ValueType<0,    190>;  // Produces no value
-def untyped   : ValueType<8,    191> { // Produces an untyped value
+def isVoid    : ValueType<0,    191>;  // Produces no value
+def untyped   : ValueType<8,    192> { // Produces an untyped value
   let LLVMName = "Untyped";
 }
-def funcref   : ValueType<0,    192>;  // WebAssembly's funcref type
-def externref : ValueType<0,    193>;  // WebAssembly's externref type
-def x86amx    : ValueType<8192, 194>;  // X86 AMX value
-def i64x8     : ValueType<512,  195>;  // 8 Consecutive GPRs (AArch64)
+def funcref   : ValueType<0,    193>;  // WebAssembly's funcref type
+def externref : ValueType<0,    194>;  // WebAssembly's externref type
+def x86amx    : ValueType<8192, 195>;  // X86 AMX value
+def i64x8     : ValueType<512,  196>;  // 8 Consecutive GPRs (AArch64)
 def aarch64svcount
-              : ValueType<16,   196>;  // AArch64 predicate-as-counter
-def spirvbuiltin : ValueType<0,  197>; // SPIR-V's builtin type
+              : ValueType<16,  197>;  // AArch64 predicate-as-counter
+def spirvbuiltin : ValueType<0, 198>; // SPIR-V's builtin type
 
 def token      : ValueType<0, 248>;  // TokenTy
 def MetadataVT : ValueType<0, 249> { // Metadata
diff --git a/llvm/lib/CodeGen/ValueTypes.cpp b/llvm/lib/CodeGen/ValueTypes.cpp
index ba3b9e00e34e94..731fcabaee402c 100644
--- a/llvm/lib/CodeGen/ValueTypes.cpp
+++ b/llvm/lib/CodeGen/ValueTypes.cpp
@@ -264,6 +264,8 @@ Type *EVT::getTypeForEVT(LLVMContext &Context) const {
     return FixedVectorType::get(Type::getInt8Ty(Context), 1);
   case MVT::v2i8:
     return FixedVectorType::get(Type::getInt8Ty(Context), 2);
+  case MVT::v3i8:
+    return FixedVectorType::get(Type::getInt8Ty(Context), 3);
   case MVT::v4i8:
     return FixedVectorType::get(Type::getInt8Ty(Context), 4);
   case MVT::v8i8:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
index 10569d97248b96..528257ead585ef 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
@@ -308,8 +308,11 @@ AMDGPUTargetLowering::AMDGPUTargetLowering(const TargetMachine &TM,
   setTruncStoreAction(MVT::v2f64, MVT::v2f32, Expand);
   setTruncStoreAction(MVT::v2f64, MVT::v2f16, Expand);
 
+  setTruncStoreAction(MVT::v3i32, MVT::v3i8, Expand);
+
   setTruncStoreAction(MVT::v3i64, MVT::v3i32, Expand);
   setTruncStoreAction(MVT::v3i64, MVT::v3i16, Expand);
+  setTruncStoreAction(MVT::v3i64, MVT::v3i8, Expand);
   setTruncStoreAction(MVT::v3f64, MVT::v3f32, Expand);
   setTruncStoreAction(MVT::v3f64, MVT::v3f16, Expand);
 

From d190358d2b5bbe531ca43a2569cdf801bb7e1dea Mon Sep 17 00:00:00 2001
From: erichkeane <ekeane at nvidia.com>
Date: Thu, 8 Feb 2024 07:57:57 -0800
Subject: [PATCH 44/72] [OpenACC][NFC] Fix parse result from 'set'

Apparently 'set' was being parsed as 'shutdown'.  There isn't really any
way to detect this without a Sema implementation, but I am fixing it now
since I noticed it.
---
 clang/lib/Parse/ParseOpenACC.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
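
A minimal sketch of the mapping being fixed, written without LLVM's StringSwitch so that it stands alone. `DirectiveKind`, `parseDirectiveKind`, `spellingsMapToDistinctKinds`, and `selfCheck` are hypothetical names for illustration and are not clang's API; the enum is a reduced subset. The distinctness check is one assumed way such a copy-paste slip could be caught in a standalone model. It is not something the patch adds.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, reduced stand-in for clang's OpenACCDirectiveKind.
enum class DirectiveKind { Init, Shutdown, Set, Update, Wait, Invalid };

// Maps a directive spelling to its kind. Before the fix, the "set" case
// returned Shutdown, so two spellings produced the same kind.
DirectiveKind parseDirectiveKind(const std::string &name) {
  if (name == "init")
    return DirectiveKind::Init;
  if (name == "shutdown")
    return DirectiveKind::Shutdown;
  if (name == "set")
    return DirectiveKind::Set;
  if (name == "update")
    return DirectiveKind::Update;
  if (name == "wait")
    return DirectiveKind::Wait;
  return DirectiveKind::Invalid;
}

// Returns true when every spelling maps to a different, valid kind.
bool spellingsMapToDistinctKinds(const std::vector<std::string> &spellings) {
  std::set<DirectiveKind> seen;
  for (const std::string &spelling : spellings) {
    DirectiveKind kind = parseDirectiveKind(spelling);
    if (kind == DirectiveKind::Invalid || !seen.insert(kind).second)
      return false;
  }
  return true;
}

// Usage example: the two spellings from the commit message must differ.
inline void selfCheck() {
  assert(parseDirectiveKind("set") != parseDirectiveKind("shutdown"));
}
```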

diff --git a/clang/lib/Parse/ParseOpenACC.cpp b/clang/lib/Parse/ParseOpenACC.cpp
index 1fee9f82b3e6a3..e099d077198d09 100644
--- a/clang/lib/Parse/ParseOpenACC.cpp
+++ b/clang/lib/Parse/ParseOpenACC.cpp
@@ -54,7 +54,7 @@ OpenACCDirectiveKindEx getOpenACCDirectiveKind(Token Tok) {
           .Case("declare", OpenACCDirectiveKind::Declare)
           .Case("init", OpenACCDirectiveKind::Init)
           .Case("shutdown", OpenACCDirectiveKind::Shutdown)
-          .Case("set", OpenACCDirectiveKind::Shutdown)
+          .Case("set", OpenACCDirectiveKind::Set)
           .Case("update", OpenACCDirectiveKind::Update)
           .Case("wait", OpenACCDirectiveKind::Wait)
           .Default(OpenACCDirectiveKind::Invalid);

From 1887caeb8e37ae4cad8f5e839961f1eb420273d7 Mon Sep 17 00:00:00 2001
From: ian Bearman <ianb at microsoft.com>
Date: Thu, 8 Feb 2024 07:59:37 -0800
Subject: [PATCH 45/72] [MLIR] Setting MemorySpace During Bufferization
 (#78484)

A collection of changes whose goal is to allow converting `encoding`
to `memorySpace` during bufferization:
- add a new API for the encoder that lets an implementation select the
  destination memory space
- update the existing bufferization implementations to support the new
  interface
---
 .../Bufferization/IR/BufferizableOpInterface.h  | 15 ++++++++++-----
 .../Transforms/BufferizableOpInterfaceImpl.cpp  | 13 +++++++------
 .../IR/BufferizableOpInterface.cpp              | 14 ++++++++------
 .../Bufferization/IR/BufferizationOps.cpp       |  4 ++--
 .../Bufferization/Transforms/Bufferize.cpp      |  8 ++++++--
 .../FuncBufferizableOpInterfaceImpl.cpp         |  5 +++--
 .../Transforms/BufferizableOpInterfaceImpl.cpp  | 17 ++++++++++-------
 .../Bufferization/TestTensorCopyInsertion.cpp   |  6 ++++--
 8 files changed, 50 insertions(+), 32 deletions(-)
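
The shape of the new option can be sketched without MLIR: a single optional default value is replaced by a callback that receives the tensor type and may decline by returning std::nullopt. What follows is a minimal sketch under stated assumptions. `TensorTypeStub`, `AttributeStub`, `OptionsStub`, `inferMemorySpace`, and `makeEncodingAsMemorySpaceOptions` are hypothetical stand-ins and not MLIR types or functions; only the `DefaultMemorySpaceFn` and `defaultMemorySpaceFn` names and the default lambda mirror the patch.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical stand-ins for mlir::TensorType and mlir::Attribute.
struct TensorTypeStub {
  std::string encoding;  // empty means the tensor has no encoding
};
struct AttributeStub {
  std::string name;  // empty models a default-constructed attribute
};

// Same shape as the callback type introduced by the patch.
using DefaultMemorySpaceFn =
    std::function<std::optional<AttributeStub>(TensorTypeStub)>;

struct OptionsStub {
  // Mirrors the patch's default: always succeed with a default attribute.
  DefaultMemorySpaceFn defaultMemorySpaceFn =
      [](TensorTypeStub) -> std::optional<AttributeStub> {
    return AttributeStub{};
  };
};

// Returns the memory space name, or std::nullopt at the point where the
// real code emits "could not infer memory space".
std::optional<std::string> inferMemorySpace(const OptionsStub &options,
                                            const TensorTypeStub &type) {
  assert(options.defaultMemorySpaceFn && "callback must be set");
  if (auto memSpace = options.defaultMemorySpaceFn(type))
    return memSpace->name;
  return std::nullopt;
}

// Example policy (an assumption, not part of the patch): forward the tensor
// encoding as the memory space and reject tensors that have none.
OptionsStub makeEncodingAsMemorySpaceOptions() {
  OptionsStub options;
  options.defaultMemorySpaceFn =
      [](TensorTypeStub t) -> std::optional<AttributeStub> {
    if (t.encoding.empty())
      return std::nullopt;
    return AttributeStub{t.encoding};
  };
  return options;
}
```

With the default options every tensor type yields a default memory space, which keeps the behavior of the removed `defaultMemorySpace = Attribute()` field. A custom callback such as the example policy can make the choice depend on the tensor type.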

diff --git a/mlir/include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h b/mlir/include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h
index 226a2fbd08563c..d8cfeee2466360 100644
--- a/mlir/include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h
+++ b/mlir/include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h
@@ -257,6 +257,9 @@ struct BufferizationOptions {
   /// Parameters: Value, memory space, bufferization options
   using UnknownTypeConverterFn = std::function<BaseMemRefType(
       Value, Attribute memorySpace, const BufferizationOptions &)>;
+  // Produce a MemorySpace attribute from a tensor type
+  using DefaultMemorySpaceFn =
+      std::function<std::optional<Attribute>(TensorType t)>;
 
   BufferizationOptions();
 
@@ -296,11 +299,6 @@ struct BufferizationOptions {
   /// bufferized or not.
   bool bufferizeFunctionBoundaries = false;
 
-  /// The default memory space that should be used when it cannot be inferred
-  /// from the context. If case of std::nullopt, bufferization fails when the
-  /// memory space cannot be inferred at any point.
-  std::optional<Attribute> defaultMemorySpace = Attribute();
-
   /// Certain ops have aliasing OpOperand/OpResult invariants (e.g., scf.for).
   /// If this flag is set to `false`, those invariants are no longer enforced
   /// with buffer copies.
@@ -351,6 +349,13 @@ struct BufferizationOptions {
   /// used.
   UnknownTypeConverterFn unknownTypeConverterFn = nullptr;
 
+  // Use during type conversion to determine the memory space for memref based
+  // on the original tensor type if the memory space cannot be inferred.
+  // Returning std::nullopt will cause bufferization to fail (useful to indicate
+  // failure to determine memory space for a tensor type).
+  DefaultMemorySpaceFn defaultMemorySpaceFn =
+      [](TensorType t) -> std::optional<Attribute> { return Attribute(); };
+
   /// Seed for the analysis fuzzer. If set to `0`, the fuzzer is deactivated.
   /// Should be used only with `testAnalysisOnly = true`.
   unsigned analysisFuzzerSeed = 0;
diff --git a/mlir/lib/Dialect/Arith/Transforms/BufferizableOpInterfaceImpl.cpp b/mlir/lib/Dialect/Arith/Transforms/BufferizableOpInterfaceImpl.cpp
index f69b2557eec922..d7492c9e25db31 100644
--- a/mlir/lib/Dialect/Arith/Transforms/BufferizableOpInterfaceImpl.cpp
+++ b/mlir/lib/Dialect/Arith/Transforms/BufferizableOpInterfaceImpl.cpp
@@ -26,17 +26,18 @@ struct ConstantOpInterface
   LogicalResult bufferize(Operation *op, RewriterBase &rewriter,
                           const BufferizationOptions &options) const {
     auto constantOp = cast<arith::ConstantOp>(op);
+    auto type = constantOp.getType().dyn_cast<RankedTensorType>();
+
+    // Only ranked tensors are supported.
+    if (!type)
+      return failure();
 
     Attribute memorySpace;
-    if (options.defaultMemorySpace.has_value())
-      memorySpace = *options.defaultMemorySpace;
+    if (auto memSpace = options.defaultMemorySpaceFn(type))
+      memorySpace = *memSpace;
     else
       return constantOp->emitError("could not infer memory space");
 
-    // Only ranked tensors are supported.
-    if (!isa<RankedTensorType>(constantOp.getType()))
-      return failure();
-
     // Only constants inside a module are supported.
     auto moduleOp = constantOp->getParentOfType<ModuleOp>();
     if (!moduleOp)
diff --git a/mlir/lib/Dialect/Bufferization/IR/BufferizableOpInterface.cpp b/mlir/lib/Dialect/Bufferization/IR/BufferizableOpInterface.cpp
index 6ca9702cbbc66b..8f0f6d1fcc8490 100644
--- a/mlir/lib/Dialect/Bufferization/IR/BufferizableOpInterface.cpp
+++ b/mlir/lib/Dialect/Bufferization/IR/BufferizableOpInterface.cpp
@@ -682,11 +682,12 @@ bufferization::getBufferType(Value value, const BufferizationOptions &options,
     return bufferizableOp.getBufferType(value, options, invocationStack);
 
   // Op is not bufferizable.
-  if (!options.defaultMemorySpace.has_value())
+  auto memSpace =
+      options.defaultMemorySpaceFn(value.getType().cast<TensorType>());
+  if (!memSpace.has_value())
     return op->emitError("could not infer memory space");
 
-  return getMemRefType(value, options, /*layout=*/{},
-                       *options.defaultMemorySpace);
+  return getMemRefType(value, options, /*layout=*/{}, *memSpace);
 }
 
 bool bufferization::hasTensorSemantics(Operation *op) {
@@ -936,11 +937,12 @@ FailureOr<BaseMemRefType> bufferization::detail::defaultGetBufferType(
 
   // If we do not know the memory space and there is no default memory space,
   // report a failure.
-  if (!options.defaultMemorySpace.has_value())
+  auto memSpace =
+      options.defaultMemorySpaceFn(value.getType().cast<TensorType>());
+  if (!memSpace.has_value())
     return op->emitError("could not infer memory space");
 
-  return getMemRefType(value, options, /*layout=*/{},
-                       *options.defaultMemorySpace);
+  return getMemRefType(value, options, /*layout=*/{}, *memSpace);
 }
 
 bool bufferization::detail::defaultIsRepetitiveRegion(
diff --git a/mlir/lib/Dialect/Bufferization/IR/BufferizationOps.cpp b/mlir/lib/Dialect/Bufferization/IR/BufferizationOps.cpp
index eb4a96f3549904..34a0c594a5a5a3 100644
--- a/mlir/lib/Dialect/Bufferization/IR/BufferizationOps.cpp
+++ b/mlir/lib/Dialect/Bufferization/IR/BufferizationOps.cpp
@@ -234,8 +234,8 @@ AllocTensorOp::getBufferType(Value value, const BufferizationOptions &options,
     if (failed(copyBufferType))
       return failure();
     memorySpace = copyBufferType->getMemorySpace();
-  } else if (options.defaultMemorySpace.has_value()) {
-    memorySpace = *options.defaultMemorySpace;
+  } else if (auto ms = options.defaultMemorySpaceFn(getType())) {
+    memorySpace = *ms;
   } else {
     return getOperation()->emitError("could not infer memory space");
   }
diff --git a/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp b/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp
index dc94b72edcdf0c..208cbda3a9eb63 100644
--- a/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp
+++ b/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp
@@ -210,8 +210,12 @@ struct OneShotBufferizePass
       opt.dumpAliasSets = dumpAliasSets;
       opt.setFunctionBoundaryTypeConversion(
           parseLayoutMapOption(functionBoundaryTypeConversion));
-      if (mustInferMemorySpace)
-        opt.defaultMemorySpace = std::nullopt;
+      if (mustInferMemorySpace) {
+        opt.defaultMemorySpaceFn =
+            [](TensorType t) -> std::optional<Attribute> {
+          return std::nullopt;
+        };
+      }
       opt.printConflicts = printConflicts;
       opt.testAnalysisOnly = testAnalysisOnly;
       opt.bufferizeFunctionBoundaries = bufferizeFunctionBoundaries;
diff --git a/mlir/lib/Dialect/Bufferization/Transforms/FuncBufferizableOpInterfaceImpl.cpp b/mlir/lib/Dialect/Bufferization/Transforms/FuncBufferizableOpInterfaceImpl.cpp
index 07cd1f90b17df4..4cdbbf35dc876b 100644
--- a/mlir/lib/Dialect/Bufferization/Transforms/FuncBufferizableOpInterfaceImpl.cpp
+++ b/mlir/lib/Dialect/Bufferization/Transforms/FuncBufferizableOpInterfaceImpl.cpp
@@ -66,7 +66,7 @@ getBufferizedFunctionArgType(FuncOp funcOp, int64_t index,
   assert(tensorType && "expected TensorType");
 
   BaseMemRefType memrefType = options.functionArgTypeConverterFn(
-      tensorType, *options.defaultMemorySpace, funcOp, options);
+      tensorType, *options.defaultMemorySpaceFn(tensorType), funcOp, options);
 
   auto layoutAttr = funcOp.getArgAttrOfType<AffineMapAttr>(
       index, BufferizationDialect::kBufferLayoutAttrName);
@@ -443,7 +443,8 @@ struct FuncOpInterface
       // Note: If `inferFunctionResultLayout = true`, cast are later folded
       // away.
       BaseMemRefType resultType = options.functionArgTypeConverterFn(
-          tensorType, *options.defaultMemorySpace, funcOp, options);
+          tensorType, *options.defaultMemorySpaceFn(tensorType), funcOp,
+          options);
       Value toMemrefOp = rewriter.create<bufferization::ToMemrefOp>(
           loc, resultType, returnVal);
       returnValues.push_back(toMemrefOp);
diff --git a/mlir/lib/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.cpp b/mlir/lib/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.cpp
index 678b7c099fa369..957f6314f35876 100644
--- a/mlir/lib/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.cpp
+++ b/mlir/lib/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.cpp
@@ -473,14 +473,14 @@ struct FromElementsOpInterface
   LogicalResult bufferize(Operation *op, RewriterBase &rewriter,
                           const BufferizationOptions &options) const {
     auto fromElementsOp = cast<tensor::FromElementsOp>(op);
+    auto tensorType = cast<RankedTensorType>(fromElementsOp.getType());
 
     // TODO: Implement memory space for this op.
-    if (options.defaultMemorySpace != Attribute())
+    if (options.defaultMemorySpaceFn(tensorType) != Attribute())
       return op->emitError("memory space not implemented yet");
 
     // Allocate a buffer for the result.
     Location loc = op->getLoc();
-    auto tensorType = cast<RankedTensorType>(fromElementsOp.getType());
     auto shape = tensorType.getShape();
     // TODO: Create alloc_tensor ops during TensorCopyInsertion.
     FailureOr<Value> tensorAlloc = allocateTensorForShapedValue(
@@ -588,8 +588,10 @@ struct GenerateOpInterface
                           const BufferizationOptions &options) const {
     auto generateOp = cast<tensor::GenerateOp>(op);
 
+    auto type = generateOp.getResult().getType();
+
     // TODO: Implement memory space for this op.
-    if (options.defaultMemorySpace != Attribute())
+    if (options.defaultMemorySpaceFn(type) != Attribute())
       return op->emitError("memory space not implemented yet");
 
     // Allocate memory.
@@ -1007,10 +1009,6 @@ struct SplatOpInterface
     OpBuilder::InsertionGuard g(rewriter);
     auto splatOp = cast<tensor::SplatOp>(op);
 
-    // TODO: Implement memory space for this op.
-    if (options.defaultMemorySpace != Attribute())
-      return op->emitError("memory space not implemented yet");
-
     // Allocate memory.
     Location loc = op->getLoc();
     FailureOr<Value> tensorAlloc = allocateTensorForShapedValue(
@@ -1021,6 +1019,11 @@ struct SplatOpInterface
 
     // Create linalg::MapOp.
     auto tensorType = cast<RankedTensorType>(tensorAlloc->getType());
+
+    // TODO: Implement memory space for this op.
+    if (options.defaultMemorySpaceFn(tensorType) != Attribute())
+      return op->emitError("memory space not implemented yet");
+
     auto linalgOp =
         rewriter.create<linalg::MapOp>(loc, tensorType, /*inputs=*/ValueRange(),
                                        /*init=*/*tensorAlloc);
diff --git a/mlir/test/lib/Dialect/Bufferization/TestTensorCopyInsertion.cpp b/mlir/test/lib/Dialect/Bufferization/TestTensorCopyInsertion.cpp
index fedfbe350a51a9..2991a3c165ee2d 100644
--- a/mlir/test/lib/Dialect/Bufferization/TestTensorCopyInsertion.cpp
+++ b/mlir/test/lib/Dialect/Bufferization/TestTensorCopyInsertion.cpp
@@ -44,8 +44,10 @@ struct TestTensorCopyInsertionPass
     bufferization::OneShotBufferizationOptions options;
     options.allowReturnAllocsFromLoops = allowReturnAllocsFromLoops;
     options.bufferizeFunctionBoundaries = bufferizeFunctionBoundaries;
-    if (mustInferMemorySpace)
-      options.defaultMemorySpace = std::nullopt;
+    if (mustInferMemorySpace) {
+      options.defaultMemorySpaceFn =
+          [](TensorType t) -> std::optional<Attribute> { return std::nullopt; };
+    }
     if (failed(bufferization::insertTensorCopies(getOperation(), options)))
       signalPassFailure();
   }

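The callback pattern introduced by the patch above can be sketched outside MLIR. This is a minimal illustration, not the MLIR API: `ToyTensorType`, `ToyAttribute`, `ToyOptions`, `resolveMemorySpace` and `makeRankBasedOptions` are hypothetical stand-ins, and the rank-based policy is an assumption chosen only to show a callback that can fail.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical stand-ins for mlir::TensorType and mlir::Attribute; they only
// model the parts this sketch needs.
struct ToyTensorType {
  int rank; // number of dimensions
};

struct ToyAttribute {
  std::string name; // the empty string models the null Attribute()
};

// Same shape as DefaultMemorySpaceFn in the patch: tensor type in, optional
// memory space out.
using ToyMemorySpaceFn =
    std::function<std::optional<ToyAttribute>(ToyTensorType)>;

struct ToyOptions {
  // Default mirrors the patch: always succeed with the null attribute.
  ToyMemorySpaceFn defaultMemorySpaceFn =
      [](ToyTensorType) -> std::optional<ToyAttribute> {
    return ToyAttribute{};
  };
};

// Returns the memory space name, or std::nullopt where the real code would
// emit "could not infer memory space".
std::optional<std::string> resolveMemorySpace(const ToyOptions &options,
                                              ToyTensorType type) {
  if (auto memSpace = options.defaultMemorySpaceFn(type))
    return memSpace->name;
  return std::nullopt;
}

// Example policy (an assumption, not from the patch): rank-0 tensors cannot
// be placed, everything else goes to a space named "shared".
ToyOptions makeRankBasedOptions() {
  ToyOptions options;
  options.defaultMemorySpaceFn =
      [](ToyTensorType t) -> std::optional<ToyAttribute> {
    if (t.rank == 0)
      return std::nullopt;
    return ToyAttribute{"shared"};
  };
  return options;
}
```

With the default options every tensor type resolves to the null memory space, while `makeRankBasedOptions()` makes the decision depend on the tensor type, which a single `std::optional<Attribute>` field could not express.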
From d2a70a469d4637b565d9fbb99a3d8838ddeae5c9 Mon Sep 17 00:00:00 2001
From: Jeremy Morse <jeremy.morse at sony.com>
Date: Thu, 8 Feb 2024 16:13:22 +0000
Subject: [PATCH 46/72] [NFC][RemoveDIs] Remove conditional compilation for
 RemoveDIs (#81149)

A colleague observes that switching the default value of
LLVM_EXPERIMENTAL_DEBUGINFO_ITERATORS to "On" hasn't flipped the value
in their CMakeCache.txt. This probably means that everyone with an
existing build tree will be left without support built in, so everyone
in LLVM would need to clean and rebuild their worktree when we flip the
switch on, which doesn't sound good.

So instead, just delete the flag and everything it does, making everyone
build and run ~400 lit tests in RemoveDIs mode. None of the buildbots
have had trouble with this, so it Should Be Fine (TM).

(Sending for review as this is changing various comments, and touches
several different areas -- I don't want to get too punchy).
---
 llvm/CMakeLists.txt                          |  3 ---
 llvm/cmake/modules/HandleLLVMOptions.cmake   |  4 ----
 llvm/include/llvm/ADT/ilist_iterator.h       | 23 --------------------
 llvm/tools/llc/llc.cpp                       |  8 ++-----
 llvm/tools/llvm-link/llvm-link.cpp           |  8 ++-----
 llvm/tools/llvm-lto/llvm-lto.cpp             |  8 ++-----
 llvm/tools/llvm-lto2/llvm-lto2.cpp           |  8 ++-----
 llvm/tools/llvm-reduce/llvm-reduce.cpp       |  8 ++-----
 llvm/tools/opt/optdriver.cpp                 |  8 ++-----
 llvm/unittests/ADT/IListIteratorBitsTest.cpp | 18 ++-------------
 llvm/unittests/IR/BasicBlockDbgInfoTest.cpp  |  6 -----
 11 files changed, 14 insertions(+), 88 deletions(-)

diff --git a/llvm/CMakeLists.txt b/llvm/CMakeLists.txt
index c31980a47f39b7..81f2753a4edd85 100644
--- a/llvm/CMakeLists.txt
+++ b/llvm/CMakeLists.txt
@@ -653,9 +653,6 @@ option(LLVM_USE_OPROFILE
 option(LLVM_EXTERNALIZE_DEBUGINFO
   "Generate dSYM files and strip executables and libraries (Darwin Only)" OFF)
 
-option(LLVM_EXPERIMENTAL_DEBUGINFO_ITERATORS
-  "Add extra Booleans to ilist_iterators to communicate facts for debug-info" ON)
-
 set(LLVM_CODESIGNING_IDENTITY "" CACHE STRING
   "Sign executables and dylibs with the given identity or skip if empty (Darwin Only)")
 
diff --git a/llvm/cmake/modules/HandleLLVMOptions.cmake b/llvm/cmake/modules/HandleLLVMOptions.cmake
index 0699a8586fcc7e..486df22c2c1bb6 100644
--- a/llvm/cmake/modules/HandleLLVMOptions.cmake
+++ b/llvm/cmake/modules/HandleLLVMOptions.cmake
@@ -140,10 +140,6 @@ if(LLVM_ENABLE_EXPENSIVE_CHECKS)
   endif()
 endif()
 
-if(LLVM_EXPERIMENTAL_DEBUGINFO_ITERATORS)
-  add_compile_definitions(EXPERIMENTAL_DEBUGINFO_ITERATORS)
-endif()
-
 if (LLVM_ENABLE_STRICT_FIXED_SIZE_VECTORS)
   add_compile_definitions(STRICT_FIXED_SIZE_VECTORS)
 endif()
diff --git a/llvm/include/llvm/ADT/ilist_iterator.h b/llvm/include/llvm/ADT/ilist_iterator.h
index 9047b9b73959ee..2393c4d2c403c6 100644
--- a/llvm/include/llvm/ADT/ilist_iterator.h
+++ b/llvm/include/llvm/ADT/ilist_iterator.h
@@ -202,17 +202,12 @@ class ilist_iterator_w_bits : ilist_detail::SpecificNodeAccess<OptionsT> {
 
   node_pointer NodePtr = nullptr;
 
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
-  // (Default: Off) Allow extra position-information flags to be stored
-  // in iterators, in aid of removing debug-info intrinsics from LLVM.
-
   /// Is this position intended to contain any debug-info immediately before
   /// the position?
   mutable bool HeadInclusiveBit = false;
   /// Is this position intended to contain any debug-info immediately after
   /// the position?
   mutable bool TailInclusiveBit = false;
-#endif
 
 public:
   /// Create from an ilist_node.
@@ -231,10 +226,8 @@ class ilist_iterator_w_bits : ilist_detail::SpecificNodeAccess<OptionsT> {
       const ilist_iterator_w_bits<OptionsT, IsReverse, RHSIsConst> &RHS,
       std::enable_if_t<IsConst || !RHSIsConst, void *> = nullptr)
       : NodePtr(RHS.NodePtr) {
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
     HeadInclusiveBit = RHS.HeadInclusiveBit;
     TailInclusiveBit = RHS.TailInclusiveBit;
-#endif
   }
 
   // This is templated so that we can allow assigning to a const iterator from
@@ -243,10 +236,8 @@ class ilist_iterator_w_bits : ilist_detail::SpecificNodeAccess<OptionsT> {
   std::enable_if_t<IsConst || !RHSIsConst, ilist_iterator_w_bits &>
   operator=(const ilist_iterator_w_bits<OptionsT, IsReverse, RHSIsConst> &RHS) {
     NodePtr = RHS.NodePtr;
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
     HeadInclusiveBit = RHS.HeadInclusiveBit;
     TailInclusiveBit = RHS.TailInclusiveBit;
-#endif
     return *this;
   }
 
@@ -280,10 +271,8 @@ class ilist_iterator_w_bits : ilist_detail::SpecificNodeAccess<OptionsT> {
           const_cast<typename ilist_iterator_w_bits<OptionsT, IsReverse,
                                                     false>::node_reference>(
               *NodePtr));
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
       New.HeadInclusiveBit = HeadInclusiveBit;
       New.TailInclusiveBit = TailInclusiveBit;
-#endif
       return New;
     }
     return ilist_iterator_w_bits<OptionsT, IsReverse, false>();
@@ -309,18 +298,14 @@ class ilist_iterator_w_bits : ilist_detail::SpecificNodeAccess<OptionsT> {
   // Increment and decrement operators...
   ilist_iterator_w_bits &operator--() {
     NodePtr = IsReverse ? NodePtr->getNext() : NodePtr->getPrev();
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
     HeadInclusiveBit = false;
     TailInclusiveBit = false;
-#endif
     return *this;
   }
   ilist_iterator_w_bits &operator++() {
     NodePtr = IsReverse ? NodePtr->getPrev() : NodePtr->getNext();
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
     HeadInclusiveBit = false;
     TailInclusiveBit = false;
-#endif
     return *this;
   }
   ilist_iterator_w_bits operator--(int) {
@@ -340,18 +325,10 @@ class ilist_iterator_w_bits : ilist_detail::SpecificNodeAccess<OptionsT> {
   /// Check for end.  Only valid if ilist_sentinel_tracking<true>.
   bool isEnd() const { return NodePtr ? NodePtr->isSentinel() : false; }
 
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
   bool getHeadBit() const { return HeadInclusiveBit; }
   bool getTailBit() const { return TailInclusiveBit; }
   void setHeadBit(bool SetBit) const { HeadInclusiveBit = SetBit; }
   void setTailBit(bool SetBit) const { TailInclusiveBit = SetBit; }
-#else
-  // Store and return no information if we're not using this feature.
-  bool getHeadBit() const { return false; }
-  bool getTailBit() const { return false; }
-  void setHeadBit(bool SetBit) const { (void)SetBit; }
-  void setTailBit(bool SetBit) const { (void)SetBit; }
-#endif
 };
 
 template <typename From> struct simplify_type;
diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp
index 3e2567c441df5c..b292f70ba89dee 100644
--- a/llvm/tools/llc/llc.cpp
+++ b/llvm/tools/llc/llc.cpp
@@ -365,15 +365,11 @@ int main(int argc, char **argv) {
   }
 
   // RemoveDIs debug-info transition: tests may request that we /try/ to use the
-  // new debug-info format, if it's built in.
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
+  // new debug-info format.
   if (TryUseNewDbgInfoFormat) {
-    // If LLVM was built with support for this, turn the new debug-info format
-    // on.
+    // Turn the new debug-info format on.
     UseNewDbgInfoFormat = true;
   }
-#endif
-  (void)TryUseNewDbgInfoFormat;
 
   if (TimeTrace)
     timeTraceProfilerInitialize(TimeTraceGranularity, argv[0]);
diff --git a/llvm/tools/llvm-link/llvm-link.cpp b/llvm/tools/llvm-link/llvm-link.cpp
index d50e0678f46102..e6c219a8cd7ece 100644
--- a/llvm/tools/llvm-link/llvm-link.cpp
+++ b/llvm/tools/llvm-link/llvm-link.cpp
@@ -473,15 +473,11 @@ int main(int argc, char **argv) {
   cl::ParseCommandLineOptions(argc, argv, "llvm linker\n");
 
   // RemoveDIs debug-info transition: tests may request that we /try/ to use the
-  // new debug-info format, if it's built in.
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
+  // new debug-info format.
   if (TryUseNewDbgInfoFormat) {
-    // If LLVM was built with support for this, turn the new debug-info format
-    // on.
+    // Turn the new debug-info format on.
     UseNewDbgInfoFormat = true;
   }
-#endif
-  (void)TryUseNewDbgInfoFormat;
 
   LLVMContext Context;
   Context.setDiagnosticHandler(std::make_unique<LLVMLinkDiagnosticHandler>(),
diff --git a/llvm/tools/llvm-lto/llvm-lto.cpp b/llvm/tools/llvm-lto/llvm-lto.cpp
index f27281438282ba..7943d6952b828d 100644
--- a/llvm/tools/llvm-lto/llvm-lto.cpp
+++ b/llvm/tools/llvm-lto/llvm-lto.cpp
@@ -945,15 +945,11 @@ int main(int argc, char **argv) {
   cl::ParseCommandLineOptions(argc, argv, "llvm LTO linker\n");
 
   // RemoveDIs debug-info transition: tests may request that we /try/ to use the
-  // new debug-info format, if it's built in.
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
+  // new debug-info format.
   if (TryUseNewDbgInfoFormat) {
-    // If LLVM was built with support for this, turn the new debug-info format
-    // on.
+    // Turn the new debug-info format on.
     UseNewDbgInfoFormat = true;
   }
-#endif
-  (void)TryUseNewDbgInfoFormat;
 
   if (OptLevel < '0' || OptLevel > '3')
     error("optimization level must be between 0 and 3");
diff --git a/llvm/tools/llvm-lto2/llvm-lto2.cpp b/llvm/tools/llvm-lto2/llvm-lto2.cpp
index c212374a0eccb6..d5de4f6b1a277c 100644
--- a/llvm/tools/llvm-lto2/llvm-lto2.cpp
+++ b/llvm/tools/llvm-lto2/llvm-lto2.cpp
@@ -230,15 +230,11 @@ static int run(int argc, char **argv) {
   cl::ParseCommandLineOptions(argc, argv, "Resolution-based LTO test harness");
 
   // RemoveDIs debug-info transition: tests may request that we /try/ to use the
-  // new debug-info format, if it's built in.
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
+  // new debug-info format.
   if (TryUseNewDbgInfoFormat) {
-    // If LLVM was built with support for this, turn the new debug-info format
-    // on.
+    // Turn the new debug-info format on.
     UseNewDbgInfoFormat = true;
   }
-#endif
-  (void)TryUseNewDbgInfoFormat;
 
   // FIXME: Workaround PR30396 which means that a symbol can appear
   // more than once if it is defined in module-level assembly and
diff --git a/llvm/tools/llvm-reduce/llvm-reduce.cpp b/llvm/tools/llvm-reduce/llvm-reduce.cpp
index 71ce0ca5ab6abd..f913771487afe1 100644
--- a/llvm/tools/llvm-reduce/llvm-reduce.cpp
+++ b/llvm/tools/llvm-reduce/llvm-reduce.cpp
@@ -151,15 +151,11 @@ int main(int Argc, char **Argv) {
   cl::ParseCommandLineOptions(Argc, Argv, "LLVM automatic testcase reducer.\n");
 
   // RemoveDIs debug-info transition: tests may request that we /try/ to use the
-  // new debug-info format, if it's built in.
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
+  // new debug-info format.
   if (TryUseNewDbgInfoFormat) {
-    // If LLVM was built with support for this, turn the new debug-info format
-    // on.
+    // Turn the new debug-info format on.
     UseNewDbgInfoFormat = true;
   }
-#endif
-  (void)TryUseNewDbgInfoFormat;
 
   if (Argc == 1) {
     cl::PrintHelpMessage();
diff --git a/llvm/tools/opt/optdriver.cpp b/llvm/tools/opt/optdriver.cpp
index 3f66bfc9f01763..85f52941a85b48 100644
--- a/llvm/tools/opt/optdriver.cpp
+++ b/llvm/tools/opt/optdriver.cpp
@@ -462,15 +462,11 @@ extern "C" int optMain(
       argc, argv, "llvm .bc -> .bc modular optimizer and analysis printer\n");
 
   // RemoveDIs debug-info transition: tests may request that we /try/ to use the
-  // new debug-info format, if it's built in.
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
+  // new debug-info format.
   if (TryUseNewDbgInfoFormat) {
-    // If LLVM was built with support for this, turn the new debug-info format
-    // on.
+    // Turn the new debug-info format on.
     UseNewDbgInfoFormat = true;
   }
-#endif
-  (void)TryUseNewDbgInfoFormat;
 
   LLVMContext Context;
 
diff --git a/llvm/unittests/ADT/IListIteratorBitsTest.cpp b/llvm/unittests/ADT/IListIteratorBitsTest.cpp
index 167b30a5e30851..8ae73b1ed5f787 100644
--- a/llvm/unittests/ADT/IListIteratorBitsTest.cpp
+++ b/llvm/unittests/ADT/IListIteratorBitsTest.cpp
@@ -55,10 +55,8 @@ TEST(IListIteratorBitsTest, ConsAndAssignment) {
 
   simple_ilist<Node, ilist_iterator_bits<true>>::iterator I, I2;
 
-// Two sets of tests: if we've compiled in the iterator bits, then check that
-// HeadInclusiveBit and TailInclusiveBit are preserved on assignment and copy
-// construction, but not on other operations.
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
+  // Check that HeadInclusiveBit and TailInclusiveBit are preserved on
+  // assignment and copy construction, but not on other operations.
   I = L.begin();
   EXPECT_FALSE(I.getHeadBit());
   EXPECT_FALSE(I.getTailBit());
@@ -85,18 +83,6 @@ TEST(IListIteratorBitsTest, ConsAndAssignment) {
   simple_ilist<Node, ilist_iterator_bits<true>>::iterator I3(I);
   EXPECT_TRUE(I3.getHeadBit());
   EXPECT_TRUE(I3.getTailBit());
-#else
-  // The calls should be available, but shouldn't actually store information.
-  I = L.begin();
-  EXPECT_FALSE(I.getHeadBit());
-  EXPECT_FALSE(I.getTailBit());
-  I.setHeadBit(true);
-  I.setTailBit(true);
-  EXPECT_FALSE(I.getHeadBit());
-  EXPECT_FALSE(I.getTailBit());
-  // Suppress warnings as we don't test with this variable.
-  (void)I2;
-#endif
 }
 
 class dummy {
diff --git a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
index ef2b288d859a7a..53b191c6883841 100644
--- a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
+++ b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
@@ -27,11 +27,6 @@ using namespace llvm;
 
 extern cl::opt<bool> UseNewDbgInfoFormat;
 
-// None of these tests are meaningful or do anything if we do not have the
-// experimental "head" bit compiled into ilist_iterator (aka
-// ilist_iterator_w_bits), thus there's no point compiling these tests in.
-#ifdef EXPERIMENTAL_DEBUGINFO_ITERATORS
-
 static std::unique_ptr<Module> parseIR(LLVMContext &C, const char *IR) {
   SMDiagnostic Err;
   std::unique_ptr<Module> Mod = parseAssemblyString(IR, Err, C);
@@ -1535,4 +1530,3 @@ TEST(BasicBlockDbgInfoTest, DbgMoveToEnd) {
 }
 
 } // End anonymous namespace.
-#endif // EXPERIMENTAL_DEBUGINFO_ITERATORS

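The behaviour that the updated IListIteratorBitsTest comment describes (bits preserved on copy construction and assignment, cleared by increment and decrement) can be sketched in isolation. This is a hypothetical miniature, not the LLVM class: `ToyIteratorWithBits` and its integer `Position` are assumptions standing in for `ilist_iterator_w_bits` and its node pointer.

```cpp
#include <cassert>

// Hypothetical miniature of ilist_iterator_w_bits: an integer position stands
// in for the node pointer.
class ToyIteratorWithBits {
  int Position = 0;
  // Mutable so that the setters can be const, as in the patched header.
  mutable bool HeadInclusiveBit = false;
  mutable bool TailInclusiveBit = false;

public:
  explicit ToyIteratorWithBits(int Pos) : Position(Pos) {}

  // The implicit copy constructor and copy assignment copy both bits.

  // Moving the iterator discards the position-specific bits.
  ToyIteratorWithBits &operator++() {
    ++Position;
    HeadInclusiveBit = false;
    TailInclusiveBit = false;
    return *this;
  }
  ToyIteratorWithBits &operator--() {
    --Position;
    HeadInclusiveBit = false;
    TailInclusiveBit = false;
    return *this;
  }

  int position() const { return Position; }
  bool getHeadBit() const { return HeadInclusiveBit; }
  bool getTailBit() const { return TailInclusiveBit; }
  void setHeadBit(bool SetBit) const { HeadInclusiveBit = SetBit; }
  void setTailBit(bool SetBit) const { TailInclusiveBit = SetBit; }
};
```

Copying an iterator whose bits are set yields a copy with the same bits, while incrementing that copy resets them and leaves the original untouched. With the conditional compilation removed, this is the only behaviour left to test.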
From dbd4c15485c50a03f0fd0b574838d2dc6aef7009 Mon Sep 17 00:00:00 2001
From: Ivan Kosarev <ivan.kosarev at amd.com>
Date: Thu, 8 Feb 2024 18:23:00 +0200
Subject: [PATCH 47/72] [AMDGPU][True16] Support VOP3 source DPP operands.
 (#80892)

---
 .../AMDGPU/AsmParser/AMDGPUAsmParser.cpp      |  43 +++++--
 .../Disassembler/AMDGPUDisassembler.cpp       |  38 ++++++
 .../AMDGPU/Disassembler/AMDGPUDisassembler.h  |   1 +
 llvm/lib/Target/AMDGPU/SIFoldOperands.cpp     |  32 +++--
 llvm/lib/Target/AMDGPU/SIInstrInfo.td         |  23 +++-
 llvm/lib/Target/AMDGPU/SIRegisterInfo.td      |   6 +
 .../GlobalISel/inst-select-fceil.s16.mir      |   6 +-
 .../GlobalISel/inst-select-ffloor.s16.mir     |   6 +-
 .../CodeGen/AMDGPU/fix-sgpr-copies-f16.mir    |   4 +-
 .../gfx11_asm_vop3_dpp16_from_vop1-fake16.s   |  85 ++++++++++++++
 .../AMDGPU/gfx11_asm_vop3_dpp16_from_vop1.s   |  64 +++++-----
 .../gfx11_asm_vop3_dpp8_from_vop1-fake16.s    |  25 ++++
 .../MC/AMDGPU/gfx11_asm_vop3_dpp8_from_vop1.s |  24 ++--
 .../gfx11_dasm_vop3_dpp16_from_vop1.txt       | 111 +++++++++++++-----
 .../AMDGPU/gfx11_dasm_vop3_dpp8_from_vop1.txt |  51 ++++++--
 15 files changed, 410 insertions(+), 109 deletions(-)
 create mode 100644 llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp16_from_vop1-fake16.s
 create mode 100644 llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp8_from_vop1-fake16.s

diff --git a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
index 225e781588668f..a94da992b33859 100644
--- a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+++ b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
@@ -314,8 +314,9 @@ class AMDGPUOperand : public MCParsedAsmOperand {
     return isRegOrImmWithInputMods(AMDGPU::VS_64RegClassID, MVT::f64);
   }
 
-  bool isRegOrInlineImmWithFP16InputMods() const {
-    return isRegOrInline(AMDGPU::VS_32RegClassID, MVT::f16);
+  template <bool IsFake16> bool isRegOrInlineImmWithFP16InputMods() const {
+    return isRegOrInline(
+        IsFake16 ? AMDGPU::VS_32RegClassID : AMDGPU::VS_16RegClassID, MVT::f16);
   }
 
   bool isRegOrInlineImmWithFP32InputMods() const {
@@ -8151,7 +8152,7 @@ ParseStatus AMDGPUAsmParser::parseOModSI(OperandVector &Operands) {
 
 // Determines which bit DST_OP_SEL occupies in the op_sel operand according to
 // the number of src operands present, then copies that bit into src0_modifiers.
-void cvtVOP3DstOpSelOnly(MCInst &Inst) {
+static void cvtVOP3DstOpSelOnly(MCInst &Inst, const MCRegisterInfo &MRI) {
   int Opc = Inst.getOpcode();
   int OpSelIdx = AMDGPU::getNamedOperandIdx(Opc, AMDGPU::OpName::op_sel);
   if (OpSelIdx == -1)
@@ -8168,23 +8169,34 @@ void cvtVOP3DstOpSelOnly(MCInst &Inst) {
 
   unsigned OpSel = Inst.getOperand(OpSelIdx).getImm();
 
-  if ((OpSel & (1 << SrcNum)) != 0) {
-    int ModIdx = AMDGPU::getNamedOperandIdx(Opc, AMDGPU::OpName::src0_modifiers);
-    uint32_t ModVal = Inst.getOperand(ModIdx).getImm();
-    Inst.getOperand(ModIdx).setImm(ModVal | SISrcMods::DST_OP_SEL);
+  int DstIdx = AMDGPU::getNamedOperandIdx(Opc, AMDGPU::OpName::vdst);
+  if (DstIdx == -1)
+    return;
+
+  const MCOperand &DstOp = Inst.getOperand(DstIdx);
+  int ModIdx = AMDGPU::getNamedOperandIdx(Opc, AMDGPU::OpName::src0_modifiers);
+  uint32_t ModVal = Inst.getOperand(ModIdx).getImm();
+  if (DstOp.isReg() &&
+      MRI.getRegClass(AMDGPU::VGPR_16RegClassID).contains(DstOp.getReg())) {
+    if (AMDGPU::isHi(DstOp.getReg(), MRI))
+      ModVal |= SISrcMods::DST_OP_SEL;
+  } else {
+    if ((OpSel & (1 << SrcNum)) != 0)
+      ModVal |= SISrcMods::DST_OP_SEL;
   }
+  Inst.getOperand(ModIdx).setImm(ModVal);
 }
 
 void AMDGPUAsmParser::cvtVOP3OpSel(MCInst &Inst,
                                    const OperandVector &Operands) {
   cvtVOP3P(Inst, Operands);
-  cvtVOP3DstOpSelOnly(Inst);
+  cvtVOP3DstOpSelOnly(Inst, *getMRI());
 }
 
 void AMDGPUAsmParser::cvtVOP3OpSel(MCInst &Inst, const OperandVector &Operands,
                                    OptionalImmIndexMap &OptionalIdx) {
   cvtVOP3P(Inst, Operands, OptionalIdx);
-  cvtVOP3DstOpSelOnly(Inst);
+  cvtVOP3DstOpSelOnly(Inst, *getMRI());
 }
 
 static bool isRegOrImmWithInputMods(const MCInstrDesc &Desc, unsigned OpNum) {
@@ -8433,8 +8445,17 @@ void AMDGPUAsmParser::cvtVOP3P(MCInst &Inst, const OperandVector &Operands,
 
     uint32_t ModVal = 0;
 
-    if ((OpSel & (1 << J)) != 0)
-      ModVal |= SISrcMods::OP_SEL_0;
+    const MCOperand &SrcOp = Inst.getOperand(OpIdx);
+    if (SrcOp.isReg() && getMRI()
+                             ->getRegClass(AMDGPU::VGPR_16RegClassID)
+                             .contains(SrcOp.getReg())) {
+      bool VGPRSuffixIsHi = AMDGPU::isHi(SrcOp.getReg(), *getMRI());
+      if (VGPRSuffixIsHi)
+        ModVal |= SISrcMods::OP_SEL_0;
+    } else {
+      if ((OpSel & (1 << J)) != 0)
+        ModVal |= SISrcMods::OP_SEL_0;
+    }
 
     if ((OpSelHi & (1 << J)) != 0)
       ModVal |= SISrcMods::OP_SEL_1;
diff --git a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
index fba9eb53c8a8b4..85377d07c52dcb 100644
--- a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
+++ b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
@@ -913,6 +913,41 @@ static VOPModifiers collectVOPModifiers(const MCInst &MI,
   return Modifiers;
 }
 
+// Instructions decode the op_sel/suffix bits into the src_modifier
+// operands. Copy those bits into the src operands for true16 VGPRs.
+void AMDGPUDisassembler::convertTrue16OpSel(MCInst &MI) const {
+  const unsigned Opc = MI.getOpcode();
+  const MCRegisterClass &ConversionRC =
+      MRI.getRegClass(AMDGPU::VGPR_16RegClassID);
+  constexpr std::array<std::tuple<int, int, unsigned>, 4> OpAndOpMods = {
+      {{AMDGPU::OpName::src0, AMDGPU::OpName::src0_modifiers,
+        SISrcMods::OP_SEL_0},
+       {AMDGPU::OpName::src1, AMDGPU::OpName::src1_modifiers,
+        SISrcMods::OP_SEL_0},
+       {AMDGPU::OpName::src2, AMDGPU::OpName::src2_modifiers,
+        SISrcMods::OP_SEL_0},
+       {AMDGPU::OpName::vdst, AMDGPU::OpName::src0_modifiers,
+        SISrcMods::DST_OP_SEL}}};
+  for (const auto &[OpName, OpModsName, OpSelMask] : OpAndOpMods) {
+    int OpIdx = AMDGPU::getNamedOperandIdx(Opc, OpName);
+    int OpModsIdx = AMDGPU::getNamedOperandIdx(Opc, OpModsName);
+    if (OpIdx == -1 || OpModsIdx == -1)
+      continue;
+    MCOperand &Op = MI.getOperand(OpIdx);
+    if (!Op.isReg())
+      continue;
+    if (!ConversionRC.contains(Op.getReg()))
+      continue;
+    unsigned OpEnc = MRI.getEncodingValue(Op.getReg());
+    const MCOperand &OpMods = MI.getOperand(OpModsIdx);
+    unsigned ModVal = OpMods.getImm();
+    if (ModVal & OpSelMask) { // isHi
+      unsigned RegIdx = OpEnc & AMDGPU::HWEncoding::REG_IDX_MASK;
+      Op.setReg(ConversionRC.getRegister(RegIdx * 2 + 1));
+    }
+  }
+}
+
 // MAC opcodes have special old and src2 operands.
 // src2 is tied to dst, while old is not tied (but assumed to be).
 bool AMDGPUDisassembler::isMacDPP(MCInst &MI) const {
@@ -968,6 +1003,7 @@ DecodeStatus AMDGPUDisassembler::convertDPP8Inst(MCInst &MI) const {
     unsigned DescNumOps = MCII->get(Opc).getNumOperands();
     if (MI.getNumOperands() < DescNumOps &&
         AMDGPU::hasNamedOperand(Opc, AMDGPU::OpName::op_sel)) {
+      convertTrue16OpSel(MI);
       auto Mods = collectVOPModifiers(MI);
       insertNamedMCOperand(MI, MCOperand::createImm(Mods.OpSel),
                            AMDGPU::OpName::op_sel);
@@ -991,6 +1027,8 @@ DecodeStatus AMDGPUDisassembler::convertVOP3DPPInst(MCInst &MI) const {
   if (isMacDPP(MI))
     convertMacDPPInst(MI);
 
+  convertTrue16OpSel(MI);
+
   int VDstInIdx =
       AMDGPU::getNamedOperandIdx(MI.getOpcode(), AMDGPU::OpName::vdst_in);
   if (VDstInIdx != -1)
diff --git a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h
index 5a89b30f6fb36a..02feaf553c0c45 100644
--- a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h
+++ b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h
@@ -203,6 +203,7 @@ class AMDGPUDisassembler : public MCDisassembler {
   DecodeStatus convertVOP3PDPPInst(MCInst &MI) const;
   DecodeStatus convertVOPCDPPInst(MCInst &MI) const;
   void convertMacDPPInst(MCInst &MI) const;
+  void convertTrue16OpSel(MCInst &MI) const;
 
   enum OpWidthTy {
     OPW32,
diff --git a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
index a812cdc61500cc..8bf05682cbe7ea 100644
--- a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
@@ -756,14 +756,14 @@ void SIFoldOperands::foldOperand(
   int UseOpIdx,
   SmallVectorImpl<FoldCandidate> &FoldList,
   SmallVectorImpl<MachineInstr *> &CopiesToReplace) const {
-  const MachineOperand &UseOp = UseMI->getOperand(UseOpIdx);
+  const MachineOperand *UseOp = &UseMI->getOperand(UseOpIdx);
 
-  if (!isUseSafeToFold(*UseMI, UseOp))
+  if (!isUseSafeToFold(*UseMI, *UseOp))
     return;
 
   // FIXME: Fold operands with subregs.
-  if (UseOp.isReg() && OpToFold.isReg() &&
-      (UseOp.isImplicit() || UseOp.getSubReg() != AMDGPU::NoSubRegister))
+  if (UseOp->isReg() && OpToFold.isReg() &&
+      (UseOp->isImplicit() || UseOp->getSubReg() != AMDGPU::NoSubRegister))
     return;
 
   // Special case for REG_SEQUENCE: We can't fold literals into
@@ -859,7 +859,6 @@ void SIFoldOperands::foldOperand(
     if (MovOp == AMDGPU::COPY)
       return;
 
-    UseMI->setDesc(TII->get(MovOp));
     MachineInstr::mop_iterator ImpOpI = UseMI->implicit_operands().begin();
     MachineInstr::mop_iterator ImpOpE = UseMI->implicit_operands().end();
     while (ImpOpI != ImpOpE) {
@@ -867,6 +866,19 @@ void SIFoldOperands::foldOperand(
       ImpOpI++;
       UseMI->removeOperand(UseMI->getOperandNo(Tmp));
     }
+    UseMI->setDesc(TII->get(MovOp));
+
+    if (MovOp == AMDGPU::V_MOV_B16_t16_e64) {
+      const auto &SrcOp = UseMI->getOperand(UseOpIdx);
+      MachineOperand NewSrcOp(SrcOp);
+      MachineFunction *MF = UseMI->getParent()->getParent();
+      UseMI->removeOperand(1);
+      UseMI->addOperand(*MF, MachineOperand::CreateImm(0)); // src0_modifiers
+      UseMI->addOperand(NewSrcOp);                          // src0
+      UseMI->addOperand(*MF, MachineOperand::CreateImm(0)); // op_sel
+      UseOpIdx = 2;
+      UseOp = &UseMI->getOperand(UseOpIdx);
+    }
     CopiesToReplace.push_back(UseMI);
   } else {
     if (UseMI->isCopy() && OpToFold.isReg() &&
@@ -1027,7 +1039,7 @@ void SIFoldOperands::foldOperand(
 
     // Don't fold into target independent nodes.  Target independent opcodes
     // don't have defined register classes.
-    if (UseDesc.isVariadic() || UseOp.isImplicit() ||
+    if (UseDesc.isVariadic() || UseOp->isImplicit() ||
         UseDesc.operands()[UseOpIdx].RegClass == -1)
       return;
   }
@@ -1062,17 +1074,17 @@ void SIFoldOperands::foldOperand(
       TRI->getRegClass(FoldDesc.operands()[0].RegClass);
 
   // Split 64-bit constants into 32-bits for folding.
-  if (UseOp.getSubReg() && AMDGPU::getRegBitWidth(*FoldRC) == 64) {
-    Register UseReg = UseOp.getReg();
+  if (UseOp->getSubReg() && AMDGPU::getRegBitWidth(*FoldRC) == 64) {
+    Register UseReg = UseOp->getReg();
     const TargetRegisterClass *UseRC = MRI->getRegClass(UseReg);
     if (AMDGPU::getRegBitWidth(*UseRC) != 64)
       return;
 
     APInt Imm(64, OpToFold.getImm());
-    if (UseOp.getSubReg() == AMDGPU::sub0) {
+    if (UseOp->getSubReg() == AMDGPU::sub0) {
       Imm = Imm.getLoBits(32);
     } else {
-      assert(UseOp.getSubReg() == AMDGPU::sub1);
+      assert(UseOp->getSubReg() == AMDGPU::sub1);
       Imm = Imm.getHiBits(32);
     }
 
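Reviewer note on the SIFoldOperands.cpp change above: `UseOp` becomes a pointer presumably because the new `V_MOV_B16_t16_e64` branch rebuilds the operand list and then has to point `UseOp` at the operand's new index, which a C++ reference cannot do. A minimal sketch of that pattern follows. `Operand`, `Inst`, and `foldedSource` are hypothetical stand-ins for illustration only, not LLVM types, and the fixed-size array only imitates `removeOperand`/`addOperand`.

```cpp
#include <cstddef>

// Hypothetical stand-ins for MachineOperand / MachineInstr (not LLVM types).
struct Operand {
  int Imm;
};

struct Inst {
  Operand Ops[8] = {};
  std::size_t NumOps = 0;

  // Append an operand at the end of the list.
  constexpr void add(Operand Op) { Ops[NumOps++] = Op; }

  // Remove the operand at Idx and shift the later operands down.
  constexpr void remove(std::size_t Idx) {
    for (std::size_t I = Idx; I + 1 < NumOps; ++I)
      Ops[I] = Ops[I + 1];
    --NumOps;
  }
};

constexpr int foldedSource() {
  Inst MI;
  MI.add({100}); // dst
  MI.add({42});  // src0

  std::size_t UseIdx = 1;
  const Operand *Use = &MI.Ops[UseIdx];

  // Same order of steps as the patch: copy the source operand, drop it,
  // then append src0_modifiers, src0 and op_sel.
  Operand Saved = *Use;
  MI.remove(1);
  MI.add({0});   // src0_modifiers
  MI.add(Saved); // src0
  MI.add({0});   // op_sel

  // The source now lives at index 2. A pointer can be re-seated here;
  // a reference would still name the old slot.
  UseIdx = 2;
  Use = &MI.Ops[UseIdx];
  return Use->Imm;
}

static_assert(foldedSource() == 42, "re-seated pointer sees the moved src0");
```

The later uses of `UseOp` in `foldOperand` (the `isImplicit()` and `getSubReg()` checks) then read the re-seated operand rather than a stale one.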
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.td b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
index 7edec5a7a5505b..22599773d562cb 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.td
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
@@ -1148,7 +1148,13 @@ def FPT16InputModsMatchClass : FPInputModsMatchClass<16> {
 def FP32InputModsMatchClass : FPInputModsMatchClass<32>;
 def FP64InputModsMatchClass : FPInputModsMatchClass<64>;
 
-def FP16VCSrcInputModsMatchClass : FPVCSrcInputModsMatchClass<16>;
+class FP16VCSrcInputModsMatchClass<bit IsFake16>
+    : FPVCSrcInputModsMatchClass<16> {
+  let Name = !if(IsFake16, "RegOrInlineImmWithFPFake16InputMods",
+                 "RegOrInlineImmWithFPT16InputMods");
+  let PredicateMethod = "isRegOrInlineImmWithFP16InputMods<" #
+                        !if(IsFake16, "true", "false") # ">";
+}
 def FP32VCSrcInputModsMatchClass : FPVCSrcInputModsMatchClass<32>;
 
 class InputMods <AsmOperandClass matchClass> : Operand <i32> {
@@ -1166,7 +1172,8 @@ def FPT16InputMods : FPInputMods<FPT16InputModsMatchClass>;
 def FP32InputMods : FPInputMods<FP32InputModsMatchClass>;
 def FP64InputMods : FPInputMods<FP64InputModsMatchClass>;
 
-def FP16VCSrcInputMods : FPInputMods<FP16VCSrcInputModsMatchClass>;
+class FP16VCSrcInputMods<bit IsFake16>
+  : FPInputMods<FP16VCSrcInputModsMatchClass<IsFake16>>;
 def FP32VCSrcInputMods : FPInputMods<FP32VCSrcInputModsMatchClass>;
 
 class IntInputModsMatchClass <int opSize> : AsmOperandClass {
@@ -1653,11 +1660,11 @@ class getSrcModDPP_t16 <ValueType VT, bit IsFake16 = 1> {
 }
 
 // Return type of input modifiers operand for specified input operand for DPP
-class getSrcModVOP3DPP <ValueType VT> {
+class getSrcModVOP3DPP <ValueType VT, bit IsFake16 = 1> {
   Operand ret =
       !if (VT.isFP,
            !if (!or(!eq(VT.Value, f16.Value), !eq(VT.Value, bf16.Value)),
-                FP16VCSrcInputMods, FP32VCSrcInputMods),
+                FP16VCSrcInputMods<IsFake16>, FP32VCSrcInputMods),
            Int32VCSrcInputMods);
 }
 
@@ -2450,6 +2457,10 @@ class VOP_PAT_GEN <VOPProfile p, int mode=PatGenMode.NoPattern> : VOPProfile <p.
 class VOPProfile_True16<VOPProfile P> : VOPProfile<P.ArgVT> {
   let IsTrue16 = 1;
   let IsRealTrue16 = 1;
+
+  let HasOpSel = 1;
+  let HasModifiers = 1; // All instructions at least have OpSel.
+
   // Most DstVT are 16-bit, but not all.
   let DstRC = getVALUDstForVT<DstVT, 1 /*IsTrue16*/, 0 /*IsVOP3Encoding*/>.ret;
   let DstRC64 = getVALUDstForVT<DstVT>.ret;
@@ -2461,6 +2472,10 @@ class VOPProfile_True16<VOPProfile P> : VOPProfile<P.ArgVT> {
   let Src0ModDPP = getSrcModDPP_t16<Src0VT, 0 /*IsFake16*/>.ret;
   let Src1ModDPP = getSrcModDPP_t16<Src1VT, 0 /*IsFake16*/>.ret;
   let Src2ModDPP = getSrcModDPP_t16<Src2VT, 0 /*IsFake16*/>.ret;
+  let Src0VOP3DPP = VGPRSrc_16;
+  let Src0ModVOP3DPP = getSrcModVOP3DPP<Src0VT, 0 /*IsFake16*/>.ret;
+  let Src1ModVOP3DPP = getSrcModVOP3DPP<Src1VT, 0 /*IsFake16*/>.ret;
+  let Src2ModVOP3DPP = getSrcModVOP3DPP<Src2VT, 0 /*IsFake16*/>.ret;
 
   let DstRC64 = getVALUDstForVT<DstVT, 1 /*IsTrue16*/, 1 /*IsVOP3Encoding*/>.ret;
   let Src0RC64 = getVOP3SrcForVT<Src0VT, 1 /*IsTrue16*/>.ret;
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.td b/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
index c9dbe02037ef2e..aabb6c29062114 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
@@ -1235,6 +1235,12 @@ def VGPRSrc_16_Lo128 : RegisterOperand<VGPR_16_Lo128> {
   let EncoderMethod = "getMachineOpValueT16Lo128";
 }
 
+// True 16 operands.
+def VGPRSrc_16 : RegisterOperand<VGPR_16> {
+  let DecoderMethod = "DecodeVGPR_16RegisterClass";
+  let EncoderMethod = "getMachineOpValueT16";
+}
+
 //===----------------------------------------------------------------------===//
 //  ASrc_* Operands with an AccVGPR
 //===----------------------------------------------------------------------===//
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fceil.s16.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fceil.s16.mir
index 84da311108ce38..014534ab79fe64 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fceil.s16.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fceil.s16.mir
@@ -50,7 +50,7 @@ body: |
     ; GFX11-NEXT: {{  $}}
     ; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
     ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_16 = COPY [[COPY]]
-    ; GFX11-NEXT: [[V_CEIL_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_CEIL_F16_t16_e64 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec
+    ; GFX11-NEXT: [[V_CEIL_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_CEIL_F16_t16_e64 0, [[COPY1]], 0, 0, 0, implicit $mode, implicit $exec
     ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_CEIL_F16_t16_e64_]]
     ; GFX11-NEXT: $vgpr0 = COPY [[COPY2]]
     ;
@@ -88,7 +88,7 @@ body: |
     ; GFX11: liveins: $sgpr0
     ; GFX11-NEXT: {{  $}}
     ; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
-    ; GFX11-NEXT: [[V_CEIL_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_CEIL_F16_t16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
+    ; GFX11-NEXT: [[V_CEIL_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_CEIL_F16_t16_e64 0, [[COPY]], 0, 0, 0, implicit $mode, implicit $exec
     ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[V_CEIL_F16_t16_e64_]]
     ; GFX11-NEXT: $vgpr0 = COPY [[COPY1]]
     ;
@@ -127,7 +127,7 @@ body: |
     ; GFX11-NEXT: {{  $}}
     ; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
     ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_16 = COPY [[COPY]]
-    ; GFX11-NEXT: [[V_CEIL_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_CEIL_F16_t16_e64 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec
+    ; GFX11-NEXT: [[V_CEIL_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_CEIL_F16_t16_e64 1, [[COPY1]], 0, 0, 0, implicit $mode, implicit $exec
     ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_CEIL_F16_t16_e64_]]
     ; GFX11-NEXT: $vgpr0 = COPY [[COPY2]]
     ;
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ffloor.s16.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ffloor.s16.mir
index 30975a8937db62..dcf9e169f586be 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ffloor.s16.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ffloor.s16.mir
@@ -59,7 +59,7 @@ body: |
     ; GFX11-NEXT: {{  $}}
     ; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
     ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_16 = COPY [[COPY]]
-    ; GFX11-NEXT: [[V_FLOOR_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_FLOOR_F16_t16_e64 0, [[COPY1]], 0, 0, implicit $mode, implicit $exec
+    ; GFX11-NEXT: [[V_FLOOR_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_FLOOR_F16_t16_e64 0, [[COPY1]], 0, 0, 0, implicit $mode, implicit $exec
     ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_FLOOR_F16_t16_e64_]]
     ; GFX11-NEXT: $vgpr0 = COPY [[COPY2]]
     ;
@@ -97,7 +97,7 @@ body: |
     ; GFX11: liveins: $sgpr0
     ; GFX11-NEXT: {{  $}}
     ; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
-    ; GFX11-NEXT: [[V_FLOOR_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_FLOOR_F16_t16_e64 0, [[COPY]], 0, 0, implicit $mode, implicit $exec
+    ; GFX11-NEXT: [[V_FLOOR_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_FLOOR_F16_t16_e64 0, [[COPY]], 0, 0, 0, implicit $mode, implicit $exec
     ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[V_FLOOR_F16_t16_e64_]]
     ; GFX11-NEXT: $vgpr0 = COPY [[COPY1]]
     ;
@@ -136,7 +136,7 @@ body: |
     ; GFX11-NEXT: {{  $}}
     ; GFX11-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
     ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_16 = COPY [[COPY]]
-    ; GFX11-NEXT: [[V_FLOOR_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_FLOOR_F16_t16_e64 1, [[COPY1]], 0, 0, implicit $mode, implicit $exec
+    ; GFX11-NEXT: [[V_FLOOR_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_FLOOR_F16_t16_e64 1, [[COPY1]], 0, 0, 0, implicit $mode, implicit $exec
     ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_FLOOR_F16_t16_e64_]]
     ; GFX11-NEXT: $vgpr0 = COPY [[COPY2]]
     ;
diff --git a/llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-f16.mir b/llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-f16.mir
index 7767aa54c81519..9ae5f559e860af 100644
--- a/llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-f16.mir
+++ b/llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-f16.mir
@@ -66,7 +66,7 @@ body:             |
     ; REAL16: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
     ; REAL16-NEXT: [[V_CVT_F32_U32_e64_:%[0-9]+]]:vgpr_32 = V_CVT_F32_U32_e64 [[DEF]], 0, 0, implicit $mode, implicit $exec
     ; REAL16-NEXT: [[DEF1:%[0-9]+]]:sreg_32 = IMPLICIT_DEF
-    ; REAL16-NEXT: [[V_CEIL_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_CEIL_F16_t16_e64 0, [[V_CVT_F32_U32_e64_]].lo16, 0, 0, implicit $mode, implicit $exec
+    ; REAL16-NEXT: [[V_CEIL_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_CEIL_F16_t16_e64 0, [[V_CVT_F32_U32_e64_]].lo16, 0, 0, 0, implicit $mode, implicit $exec
     ;
     ; FAKE16-LABEL: name: ceil_f16
     ; FAKE16: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
@@ -87,7 +87,7 @@ body:             |
     ; REAL16: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
     ; REAL16-NEXT: [[V_CVT_F32_U32_e64_:%[0-9]+]]:vgpr_32 = V_CVT_F32_U32_e64 [[DEF]], 0, 0, implicit $mode, implicit $exec
     ; REAL16-NEXT: [[DEF1:%[0-9]+]]:sreg_32 = IMPLICIT_DEF
-    ; REAL16-NEXT: [[V_FLOOR_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_FLOOR_F16_t16_e64 0, [[V_CVT_F32_U32_e64_]].lo16, 0, 0, implicit $mode, implicit $exec
+    ; REAL16-NEXT: [[V_FLOOR_F16_t16_e64_:%[0-9]+]]:vgpr_16 = nofpexcept V_FLOOR_F16_t16_e64 0, [[V_CVT_F32_U32_e64_]].lo16, 0, 0, 0, implicit $mode, implicit $exec
     ;
     ; FAKE16-LABEL: name: floor_f16
     ; FAKE16: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp16_from_vop1-fake16.s b/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp16_from_vop1-fake16.s
new file mode 100644
index 00000000000000..1871a41ec5983e
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp16_from_vop1-fake16.s
@@ -0,0 +1,85 @@
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=-real-true16,+wavefrontsize32,-wavefrontsize64 -show-encoding %s | FileCheck --check-prefixes=GFX11 %s
+
+v_ceil_f16_e64_dpp v5, v1 quad_perm:[3,2,1,0]
+// GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+
+v_ceil_f16_e64_dpp v5, v1 quad_perm:[0,1,2,3]
+// GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0xe4,0x00,0xff]
+
+v_ceil_f16_e64_dpp v5, v1 row_mirror
+// GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x40,0x01,0xff]
+
+v_ceil_f16_e64_dpp v5, v1 row_half_mirror
+// GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x41,0x01,0xff]
+
+v_ceil_f16_e64_dpp v5, v1 row_shl:1
+// GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x01,0x01,0xff]
+
+v_ceil_f16_e64_dpp v5, v1 row_shl:15
+// GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x0f,0x01,0xff]
+
+v_ceil_f16_e64_dpp v5, v1 row_shr:1
+// GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x11,0x01,0xff]
+
+v_ceil_f16_e64_dpp v5, v1 row_shr:15
+// GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1f,0x01,0xff]
+
+v_ceil_f16_e64_dpp v5, v1 row_ror:1
+// GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x21,0x01,0xff]
+
+v_ceil_f16_e64_dpp v5, v1 row_ror:15
+// GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x2f,0x01,0xff]
+
+v_ceil_f16_e64_dpp v5, v1 row_share:0 row_mask:0xf bank_mask:0xf
+// GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x50,0x01,0xff]
+
+v_ceil_f16_e64_dpp v5, v1 mul:2 row_share:15 row_mask:0x0 bank_mask:0x1
+// GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01]
+
+v_ceil_f16_e64_dpp v5, v1 mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1 fi:0
+// GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x09,0x13]
+
+v_ceil_f16_e64_dpp v255, -|v255| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:0 fi:1
+// GFX11: [0xff,0x81,0xdc,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x05,0x30]
+
+v_floor_f16_e64_dpp v5, v1 quad_perm:[3,2,1,0]
+// GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+
+v_floor_f16_e64_dpp v5, v1 quad_perm:[0,1,2,3]
+// GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0xe4,0x00,0xff]
+
+v_floor_f16_e64_dpp v5, v1 row_mirror
+// GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x40,0x01,0xff]
+
+v_floor_f16_e64_dpp v5, v1 row_half_mirror
+// GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x41,0x01,0xff]
+
+v_floor_f16_e64_dpp v5, v1 row_shl:1
+// GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x01,0x01,0xff]
+
+v_floor_f16_e64_dpp v5, v1 row_shl:15
+// GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x0f,0x01,0xff]
+
+v_floor_f16_e64_dpp v5, v1 row_shr:1
+// GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x11,0x01,0xff]
+
+v_floor_f16_e64_dpp v5, v1 row_shr:15
+// GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1f,0x01,0xff]
+
+v_floor_f16_e64_dpp v5, v1 row_ror:1
+// GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x21,0x01,0xff]
+
+v_floor_f16_e64_dpp v5, v1 row_ror:15
+// GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x2f,0x01,0xff]
+
+v_floor_f16_e64_dpp v5, v1 row_share:0 row_mask:0xf bank_mask:0xf
+// GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x50,0x01,0xff]
+
+v_floor_f16_e64_dpp v5, v1 mul:2 row_share:15 row_mask:0x0 bank_mask:0x1
+// GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01]
+
+v_floor_f16_e64_dpp v5, v1 mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1 fi:0
+// GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x09,0x13]
+
+v_floor_f16_e64_dpp v255, -|v255| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:0 fi:1
+// GFX11: [0xff,0x81,0xdb,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x05,0x30]
diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp16_from_vop1.s b/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp16_from_vop1.s
index 9a65c6687f3f84..701a72597e45d2 100644
--- a/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp16_from_vop1.s
+++ b/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp16_from_vop1.s
@@ -1,4 +1,4 @@
-// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize32,-wavefrontsize64 -show-encoding %s | FileCheck --check-prefixes=GFX11 %s
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=+real-true16,+wavefrontsize32,-wavefrontsize64 -show-encoding %s | FileCheck --check-prefixes=GFX11 %s
 
 v_bfrev_b32_e64_dpp v5, v1 quad_perm:[3,2,1,0]
 // GFX11: [0x05,0x00,0xb8,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
@@ -42,46 +42,52 @@ v_bfrev_b32_e64_dpp v5, v1 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1 f
 v_bfrev_b32_e64_dpp v255, v255 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:0 fi:1
 // GFX11: [0xff,0x00,0xb8,0xd5,0xfa,0x00,0x00,0x00,0xff,0x6f,0x05,0x30]
 
-v_ceil_f16_e64_dpp v5, v1 quad_perm:[3,2,1,0]
+v_ceil_f16_e64_dpp v5.l, v1.l quad_perm:[3,2,1,0]
 // GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
 
-v_ceil_f16_e64_dpp v5, v1 quad_perm:[0,1,2,3]
+v_ceil_f16_e64_dpp v5.l, v1.h quad_perm:[3,2,1,0]
+// GFX11: [0x05,0x08,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+
+v_ceil_f16_e64_dpp v5.h, v1.l quad_perm:[3,2,1,0]
+// GFX11: [0x05,0x40,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+
+v_ceil_f16_e64_dpp v5.l, v1.l quad_perm:[0,1,2,3]
 // GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0xe4,0x00,0xff]
 
-v_ceil_f16_e64_dpp v5, v1 row_mirror
+v_ceil_f16_e64_dpp v5.l, v1.l row_mirror
 // GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x40,0x01,0xff]
 
-v_ceil_f16_e64_dpp v5, v1 row_half_mirror
+v_ceil_f16_e64_dpp v5.l, v1.l row_half_mirror
 // GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x41,0x01,0xff]
 
-v_ceil_f16_e64_dpp v5, v1 row_shl:1
+v_ceil_f16_e64_dpp v5.l, v1.l row_shl:1
 // GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x01,0x01,0xff]
 
-v_ceil_f16_e64_dpp v5, v1 row_shl:15
+v_ceil_f16_e64_dpp v5.l, v1.l row_shl:15
 // GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x0f,0x01,0xff]
 
-v_ceil_f16_e64_dpp v5, v1 row_shr:1
+v_ceil_f16_e64_dpp v5.l, v1.l row_shr:1
 // GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x11,0x01,0xff]
 
-v_ceil_f16_e64_dpp v5, v1 row_shr:15
+v_ceil_f16_e64_dpp v5.l, v1.l row_shr:15
 // GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1f,0x01,0xff]
 
-v_ceil_f16_e64_dpp v5, v1 row_ror:1
+v_ceil_f16_e64_dpp v5.l, v1.l row_ror:1
 // GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x21,0x01,0xff]
 
-v_ceil_f16_e64_dpp v5, v1 row_ror:15
+v_ceil_f16_e64_dpp v5.l, v1.l row_ror:15
 // GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x2f,0x01,0xff]
 
-v_ceil_f16_e64_dpp v5, v1 row_share:0 row_mask:0xf bank_mask:0xf
+v_ceil_f16_e64_dpp v5.l, v1.l row_share:0 row_mask:0xf bank_mask:0xf
 // GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x50,0x01,0xff]
 
-v_ceil_f16_e64_dpp v5, v1 mul:2 row_share:15 row_mask:0x0 bank_mask:0x1
+v_ceil_f16_e64_dpp v5.l, v1.l mul:2 row_share:15 row_mask:0x0 bank_mask:0x1
 // GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01]
 
-v_ceil_f16_e64_dpp v5, v1 mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1 fi:0
+v_ceil_f16_e64_dpp v5.l, v1.l mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1 fi:0
 // GFX11: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x09,0x13]
 
-v_ceil_f16_e64_dpp v255, -|v255| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:0 fi:1
+v_ceil_f16_e64_dpp v255.l, -|v255.l| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:0 fi:1
 // GFX11: [0xff,0x81,0xdc,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x05,0x30]
 
 v_ceil_f32_e64_dpp v5, v1 quad_perm:[3,2,1,0]
@@ -1512,46 +1518,46 @@ v_ffbl_b32_e64_dpp v5, v1 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1 fi
 v_ffbl_b32_e64_dpp v255, v255 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:0 fi:1
 // GFX11: [0xff,0x00,0xba,0xd5,0xfa,0x00,0x00,0x00,0xff,0x6f,0x05,0x30]
 
-v_floor_f16_e64_dpp v5, v1 quad_perm:[3,2,1,0]
+v_floor_f16_e64_dpp v5.l, v1.l quad_perm:[3,2,1,0]
 // GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
 
-v_floor_f16_e64_dpp v5, v1 quad_perm:[0,1,2,3]
+v_floor_f16_e64_dpp v5.l, v1.l quad_perm:[0,1,2,3]
 // GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0xe4,0x00,0xff]
 
-v_floor_f16_e64_dpp v5, v1 row_mirror
+v_floor_f16_e64_dpp v5.l, v1.l row_mirror
 // GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x40,0x01,0xff]
 
-v_floor_f16_e64_dpp v5, v1 row_half_mirror
+v_floor_f16_e64_dpp v5.l, v1.l row_half_mirror
 // GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x41,0x01,0xff]
 
-v_floor_f16_e64_dpp v5, v1 row_shl:1
+v_floor_f16_e64_dpp v5.l, v1.l row_shl:1
 // GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x01,0x01,0xff]
 
-v_floor_f16_e64_dpp v5, v1 row_shl:15
+v_floor_f16_e64_dpp v5.l, v1.l row_shl:15
 // GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x0f,0x01,0xff]
 
-v_floor_f16_e64_dpp v5, v1 row_shr:1
+v_floor_f16_e64_dpp v5.l, v1.l row_shr:1
 // GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x11,0x01,0xff]
 
-v_floor_f16_e64_dpp v5, v1 row_shr:15
+v_floor_f16_e64_dpp v5.l, v1.l row_shr:15
 // GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1f,0x01,0xff]
 
-v_floor_f16_e64_dpp v5, v1 row_ror:1
+v_floor_f16_e64_dpp v5.l, v1.l row_ror:1
 // GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x21,0x01,0xff]
 
-v_floor_f16_e64_dpp v5, v1 row_ror:15
+v_floor_f16_e64_dpp v5.l, v1.l row_ror:15
 // GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x2f,0x01,0xff]
 
-v_floor_f16_e64_dpp v5, v1 row_share:0 row_mask:0xf bank_mask:0xf
+v_floor_f16_e64_dpp v5.l, v1.l row_share:0 row_mask:0xf bank_mask:0xf
 // GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x50,0x01,0xff]
 
-v_floor_f16_e64_dpp v5, v1 mul:2 row_share:15 row_mask:0x0 bank_mask:0x1
+v_floor_f16_e64_dpp v5.l, v1.l mul:2 row_share:15 row_mask:0x0 bank_mask:0x1
 // GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01]
 
-v_floor_f16_e64_dpp v5, v1 mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1 fi:0
+v_floor_f16_e64_dpp v5.l, v1.l mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 bound_ctrl:1 fi:0
 // GFX11: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x09,0x13]
 
-v_floor_f16_e64_dpp v255, -|v255| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:0 fi:1
+v_floor_f16_e64_dpp v255.l, -|v255.l| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:0 fi:1
 // GFX11: [0xff,0x81,0xdb,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x05,0x30]
 
 v_floor_f32_e64_dpp v5, v1 quad_perm:[3,2,1,0]
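Reviewer note on the expected encodings in the tests above: the new `.h` cases differ from the `.l` cases only in byte 1 (`0x08` for `v1.h`, `0x40` for `v5.h`). A minimal sketch of how those bytes can be read, assuming the usual GFX11 VOP3 layout (abs, op_sel and clamp in byte 1; omod and neg in the top bits of byte 7). The bit positions are inferred from the test vectors, and `Vop3Mods` and `decodeMods` are hypothetical names, not LLVM APIs.

```cpp
#include <cstdint>

// Hypothetical helper for reading the test vectors; not an LLVM API.
struct Vop3Mods {
  unsigned Abs;   // byte 1, bits 0-2: |src0|, |src1|, |src2|
  unsigned OpSel; // byte 1, bits 3-6: bit 0 = src0 high half, bit 3 = dst high half
  bool Clamp;     // byte 1, bit 7
  unsigned OMod;  // byte 7, bits 3-4: 1 = mul:2, 2 = mul:4, 3 = div:2
  unsigned Neg;   // byte 7, bits 5-7: -src0, -src1, -src2
};

constexpr Vop3Mods decodeMods(std::uint8_t Byte1, std::uint8_t Byte7) {
  return {Byte1 & 0x7u, (Byte1 >> 3) & 0xFu, (Byte1 & 0x80u) != 0,
          (Byte7 >> 3) & 0x3u, (Byte7 >> 5) & 0x7u};
}

// v_ceil_f16_e64_dpp v5.l, v1.h ...        -> byte 1 = 0x08
static_assert(decodeMods(0x08, 0x00).OpSel == 0x1, "src0 high half");
// v_ceil_f16_e64_dpp v5.h, v1.l ...        -> byte 1 = 0x40
static_assert(decodeMods(0x40, 0x00).OpSel == 0x8, "dst high half");
// v_ceil_f16_e64_dpp v255.l, -|v255.l| clamp div:2 ... -> bytes 0x81 / 0x38
static_assert(decodeMods(0x81, 0x38).Abs == 0x1, "|src0|");
static_assert(decodeMods(0x81, 0x38).Clamp, "clamp");
static_assert(decodeMods(0x81, 0x38).OMod == 3, "div:2");
static_assert(decodeMods(0x81, 0x38).Neg == 0x1, "-src0");
```

Under this reading, the fake16 and real-true16 test files share every encoding where op_sel is zero, which matches the identical `GFX11:` byte strings for the `v5, v1` and `v5.l, v1.l` forms.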
diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp8_from_vop1-fake16.s b/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp8_from_vop1-fake16.s
new file mode 100644
index 00000000000000..1bef1fe215acdd
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp8_from_vop1-fake16.s
@@ -0,0 +1,25 @@
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=-real-true16,+wavefrontsize32,-wavefrontsize64 -show-encoding %s | FileCheck --check-prefix=GFX11 %s
+
+v_ceil_f16_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: [0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+
+v_ceil_f16_e64_dpp v5, v1 mul:2 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: [0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05]
+
+v_ceil_f16_e64_dpp v5, v1 mul:4 dpp8:[7,6,5,4,3,2,1,0] fi:1
+// GFX11: [0x05,0x00,0xdc,0xd5,0xea,0x00,0x00,0x10,0x01,0x77,0x39,0x05]
+
+v_ceil_f16_e64_dpp v255, -|v255| clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:0
+// GFX11: [0xff,0x81,0xdc,0xd5,0xe9,0x00,0x00,0x38,0xff,0x00,0x00,0x00]
+
+v_floor_f16_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: [0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+
+v_floor_f16_e64_dpp v5, v1 mul:2 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: [0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05]
+
+v_floor_f16_e64_dpp v5, v1 mul:4 dpp8:[7,6,5,4,3,2,1,0] fi:1
+// GFX11: [0x05,0x00,0xdb,0xd5,0xea,0x00,0x00,0x10,0x01,0x77,0x39,0x05]
+
+v_floor_f16_e64_dpp v255, -|v255| clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:0
+// GFX11: [0xff,0x81,0xdb,0xd5,0xe9,0x00,0x00,0x38,0xff,0x00,0x00,0x00]
diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp8_from_vop1.s b/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp8_from_vop1.s
index 3897b82785f65b..043e0f9334ad84 100644
--- a/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp8_from_vop1.s
+++ b/llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp8_from_vop1.s
@@ -1,4 +1,4 @@
-// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=+wavefrontsize32,-wavefrontsize64 -show-encoding %s | FileCheck --check-prefix=GFX11 %s
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=+real-true16,+wavefrontsize32,-wavefrontsize64 -show-encoding %s | FileCheck --check-prefix=GFX11 %s
 
 v_bfrev_b32_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0]
 // GFX11: [0x05,0x00,0xb8,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
@@ -9,16 +9,22 @@ v_bfrev_b32_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] fi:1
 v_bfrev_b32_e64_dpp v255, v255 dpp8:[0,0,0,0,0,0,0,0] fi:0
 // GFX11: [0xff,0x00,0xb8,0xd5,0xe9,0x00,0x00,0x00,0xff,0x00,0x00,0x00]
 
-v_ceil_f16_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0]
+v_ceil_f16_e64_dpp v5.l, v1.l dpp8:[7,6,5,4,3,2,1,0]
 // GFX11: [0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
 
-v_ceil_f16_e64_dpp v5, v1 mul:2 dpp8:[7,6,5,4,3,2,1,0]
+v_ceil_f16_e64_dpp v5.l, v1.h dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: [0x05,0x08,0xdc,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+
+v_ceil_f16_e64_dpp v5.h, v1.l dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: [0x05,0x40,0xdc,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+
+v_ceil_f16_e64_dpp v5.l, v1.l mul:2 dpp8:[7,6,5,4,3,2,1,0]
 // GFX11: [0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05]
 
-v_ceil_f16_e64_dpp v5, v1 mul:4 dpp8:[7,6,5,4,3,2,1,0] fi:1
+v_ceil_f16_e64_dpp v5.l, v1.l mul:4 dpp8:[7,6,5,4,3,2,1,0] fi:1
 // GFX11: [0x05,0x00,0xdc,0xd5,0xea,0x00,0x00,0x10,0x01,0x77,0x39,0x05]
 
-v_ceil_f16_e64_dpp v255, -|v255| clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:0
+v_ceil_f16_e64_dpp v255.l, -|v255.l| clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:0
 // GFX11: [0xff,0x81,0xdc,0xd5,0xe9,0x00,0x00,0x38,0xff,0x00,0x00,0x00]
 
 v_ceil_f32_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0]
@@ -375,16 +381,16 @@ v_ffbl_b32_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] fi:1
 v_ffbl_b32_e64_dpp v255, v255 dpp8:[0,0,0,0,0,0,0,0] fi:0
 // GFX11: [0xff,0x00,0xba,0xd5,0xe9,0x00,0x00,0x00,0xff,0x00,0x00,0x00]
 
-v_floor_f16_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0]
+v_floor_f16_e64_dpp v5.l, v1.l dpp8:[7,6,5,4,3,2,1,0]
 // GFX11: [0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
 
-v_floor_f16_e64_dpp v5, v1 mul:2 dpp8:[7,6,5,4,3,2,1,0]
+v_floor_f16_e64_dpp v5.l, v1.l mul:2 dpp8:[7,6,5,4,3,2,1,0]
 // GFX11: [0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05]
 
-v_floor_f16_e64_dpp v5, v1 mul:4 dpp8:[7,6,5,4,3,2,1,0] fi:1
+v_floor_f16_e64_dpp v5.l, v1.l mul:4 dpp8:[7,6,5,4,3,2,1,0] fi:1
 // GFX11: [0x05,0x00,0xdb,0xd5,0xea,0x00,0x00,0x10,0x01,0x77,0x39,0x05]
 
-v_floor_f16_e64_dpp v255, -|v255| clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:0
+v_floor_f16_e64_dpp v255.l, -|v255.l| clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:0
 // GFX11: [0xff,0x81,0xdb,0xd5,0xe9,0x00,0x00,0x38,0xff,0x00,0x00,0x00]
 
 v_floor_f32_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0]
diff --git a/llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_dpp16_from_vop1.txt b/llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_dpp16_from_vop1.txt
index cf29efa5ff56bb..fe5084539aedee 100644
--- a/llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_dpp16_from_vop1.txt
+++ b/llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_dpp16_from_vop1.txt
@@ -1,4 +1,5 @@
-# RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -disassemble -show-encoding < %s | FileCheck -check-prefixes=GFX11 %s
+# RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=+real-true16 -disassemble -show-encoding < %s | FileCheck -check-prefixes=GFX11,GFX11-REAL16 %s
+# RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=-real-true16 -disassemble -show-encoding < %s | FileCheck -check-prefixes=GFX11,GFX11-FAKE16 %s
 
 # GFX11: v_bfrev_b32_e64_dpp v5, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xb8,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
 0x05,0x00,0xb8,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff
@@ -42,48 +43,74 @@
 # GFX11: v_bfrev_b32_e64_dpp v255, v255 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:1 fi:1 ; encoding: [0xff,0x00,0xb8,0xd5,0xfa,0x00,0x00,0x00,0xff,0x6f,0x0d,0x30]
 0xff,0x00,0xb8,0xd5,0xfa,0x00,0x00,0x00,0xff,0x6f,0x0d,0x30
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
 0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 quad_perm:[0,1,2,3] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0xe4,0x00,0xff]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l quad_perm:[0,1,2,3] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0xe4,0x00,0xff]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 quad_perm:[0,1,2,3] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0xe4,0x00,0xff]
 0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0xe4,0x00,0xff
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 row_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x40,0x01,0xff]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l row_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x40,0x01,0xff]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 row_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x40,0x01,0xff]
 0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x40,0x01,0xff
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 row_half_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x41,0x01,0xff]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l row_half_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x41,0x01,0xff]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 row_half_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x41,0x01,0xff]
 0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x41,0x01,0xff
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 row_shl:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x01,0x01,0xff]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l row_shl:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x01,0x01,0xff]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 row_shl:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x01,0x01,0xff]
 0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x01,0x01,0xff
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 row_shl:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x0f,0x01,0xff]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l row_shl:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x0f,0x01,0xff]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 row_shl:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x0f,0x01,0xff]
 0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x0f,0x01,0xff
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 row_shr:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x11,0x01,0xff]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l row_shr:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x11,0x01,0xff]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 row_shr:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x11,0x01,0xff]
 0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x11,0x01,0xff
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 row_shr:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1f,0x01,0xff]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l row_shr:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1f,0x01,0xff]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 row_shr:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1f,0x01,0xff]
 0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1f,0x01,0xff
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 row_ror:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x21,0x01,0xff]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l row_ror:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x21,0x01,0xff]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 row_ror:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x21,0x01,0xff]
 0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x21,0x01,0xff
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 row_ror:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x2f,0x01,0xff]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l row_ror:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x2f,0x01,0xff]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 row_ror:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x2f,0x01,0xff]
 0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x2f,0x01,0xff
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 row_share:0 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x50,0x01,0xff]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l row_share:0 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x50,0x01,0xff]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 row_share:0 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x50,0x01,0xff]
 0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x50,0x01,0xff
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 mul:2 row_share:15 row_mask:0x0 bank_mask:0x1 ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l mul:2 row_share:15 row_mask:0x0 bank_mask:0x1 ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 mul:2 row_share:15 row_mask:0x0 bank_mask:0x1 ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01]
 0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x01,0x13]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.h, v1.h op_sel:[1,1] mul:2 row_share:15 row_mask:0x0 bank_mask:0x1 ; encoding: [0x05,0x48,0xdc,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01]
+# COM: GFX11-FAKE16: warning: invalid instruction encoding
+0x05,0x48,0xdc,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01
+
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x01,0x13]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x01,0x13]
 0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x01,0x13
 
-# GFX11: v_ceil_f16_e64_dpp v255, -|v255| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:1 fi:1 ; encoding: [0xff,0x81,0xdc,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x0d,0x30]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.h op_sel:[1,0] mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 ; encoding: [0x05,0x08,0xdc,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x01,0x13]
+# COM: GFX11-FAKE16: warning: invalid instruction encoding
+0x05,0x08,0xdc,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x01,0x13
+
+# GFX11-REAL16: v_ceil_f16_e64_dpp v255.l, -|v255.l| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:1 fi:1 ; encoding: [0xff,0x81,0xdc,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x0d,0x30]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v255, -|v255| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:1 fi:1 ; encoding: [0xff,0x81,0xdc,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x0d,0x30]
 0xff,0x81,0xdc,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x0d,0x30
 
+# GFX11-REAL16: v_ceil_f16_e64_dpp v255.h, -|v255.l| op_sel:[0,1] clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:1 fi:1 ; encoding: [0xff,0xc1,0xdc,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x0d,0x30]
+# COM: GFX11-FAKE16: warning: invalid instruction encoding
+0xff,0xc1,0xdc,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x0d,0x30
+
 # GFX11: v_ceil_f32_e64_dpp v5, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xa2,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
 0x05,0x00,0xa2,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff
 
@@ -1302,48 +1329,74 @@
 # GFX11: v_exp_f32_e64_dpp v255, -|v255| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:1 fi:1 ; encoding: [0xff,0x81,0xa5,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x0d,0x30]
 0xff,0x81,0xa5,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x0d,0x30
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
 0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 quad_perm:[0,1,2,3] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0xe4,0x00,0xff]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l quad_perm:[0,1,2,3] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0xe4,0x00,0xff]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 quad_perm:[0,1,2,3] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0xe4,0x00,0xff]
 0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0xe4,0x00,0xff
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 row_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x40,0x01,0xff]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l row_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x40,0x01,0xff]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 row_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x40,0x01,0xff]
 0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x40,0x01,0xff
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 row_half_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x41,0x01,0xff]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l row_half_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x41,0x01,0xff]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 row_half_mirror row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x41,0x01,0xff]
 0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x41,0x01,0xff
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 row_shl:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x01,0x01,0xff]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l row_shl:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x01,0x01,0xff]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 row_shl:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x01,0x01,0xff]
 0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x01,0x01,0xff
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 row_shl:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x0f,0x01,0xff]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l row_shl:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x0f,0x01,0xff]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 row_shl:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x0f,0x01,0xff]
 0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x0f,0x01,0xff
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 row_shr:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x11,0x01,0xff]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l row_shr:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x11,0x01,0xff]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 row_shr:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x11,0x01,0xff]
 0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x11,0x01,0xff
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 row_shr:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1f,0x01,0xff]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l row_shr:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1f,0x01,0xff]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 row_shr:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1f,0x01,0xff]
 0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1f,0x01,0xff
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 row_ror:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x21,0x01,0xff]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l row_ror:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x21,0x01,0xff]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 row_ror:1 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x21,0x01,0xff]
 0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x21,0x01,0xff
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 row_ror:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x2f,0x01,0xff]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l row_ror:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x2f,0x01,0xff]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 row_ror:15 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x2f,0x01,0xff]
 0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x2f,0x01,0xff
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 row_share:0 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x50,0x01,0xff]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l row_share:0 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x50,0x01,0xff]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 row_share:0 row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x50,0x01,0xff]
 0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x00,0x01,0x50,0x01,0xff
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 mul:2 row_share:15 row_mask:0x0 bank_mask:0x1 ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l mul:2 row_share:15 row_mask:0x0 bank_mask:0x1 ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 mul:2 row_share:15 row_mask:0x0 bank_mask:0x1 ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01]
 0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x01,0x13]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.h, v1.h op_sel:[1,1] mul:2 row_share:15 row_mask:0x0 bank_mask:0x1 ; encoding: [0x05,0x48,0xdb,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01]
+# COM: GFX11-FAKE16: warning: invalid instruction encoding
+0x05,0x48,0xdb,0xd5,0xfa,0x00,0x00,0x08,0x01,0x5f,0x01,0x01
+
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x01,0x13]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 ; encoding: [0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x01,0x13]
 0x05,0x00,0xdb,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x01,0x13
 
-# GFX11: v_floor_f16_e64_dpp v255, -|v255| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:1 fi:1 ; encoding: [0xff,0x81,0xdb,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x0d,0x30]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.h op_sel:[1,0] mul:4 row_xmask:0 row_mask:0x1 bank_mask:0x3 ; encoding: [0x05,0x08,0xdb,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x01,0x13]
+# COM: GFX11-FAKE16: warning: invalid instruction encoding
+0x05,0x08,0xdb,0xd5,0xfa,0x00,0x00,0x10,0x01,0x60,0x01,0x13
+
+# GFX11-REAL16: v_floor_f16_e64_dpp v255.l, -|v255.l| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:1 fi:1 ; encoding: [0xff,0x81,0xdb,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x0d,0x30]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v255, -|v255| clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:1 fi:1 ; encoding: [0xff,0x81,0xdb,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x0d,0x30]
 0xff,0x81,0xdb,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x0d,0x30
 
+# GFX11-REAL16: v_floor_f16_e64_dpp v255.h, -|v255.l| op_sel:[0,1] clamp div:2 row_xmask:15 row_mask:0x3 bank_mask:0x0 bound_ctrl:1 fi:1 ; encoding: [0xff,0xc1,0xdb,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x0d,0x30]
+# COM: GFX11-FAKE16: warning: invalid instruction encoding
+0xff,0xc1,0xdb,0xd5,0xfa,0x00,0x00,0x38,0xff,0x6f,0x0d,0x30
+
 # GFX11: v_floor_f32_e64_dpp v5, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xa4,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
 0x05,0x00,0xa4,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff
 
diff --git a/llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_dpp8_from_vop1.txt b/llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_dpp8_from_vop1.txt
index bfda6d10c2f6d4..c1b500e495fe70 100644
--- a/llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_dpp8_from_vop1.txt
+++ b/llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_dpp8_from_vop1.txt
@@ -1,4 +1,5 @@
-# RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -disassemble -show-encoding < %s | FileCheck -check-prefixes=GFX11 %s
+# RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=+real-true16 -disassemble -show-encoding < %s | FileCheck -check-prefixes=GFX11,GFX11-REAL16 %s
+# RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=-real-true16 -disassemble -show-encoding < %s | FileCheck -check-prefixes=GFX11,GFX11-FAKE16 %s
 
 # GFX11: v_bfrev_b32_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xb8,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
 0x05,0x00,0xb8,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05
@@ -6,18 +7,34 @@
 # GFX11: v_bfrev_b32_e64_dpp v255, v255 dpp8:[0,0,0,0,0,0,0,0] fi:1 ; encoding: [0xff,0x00,0xb8,0xd5,0xea,0x00,0x00,0x00,0xff,0x00,0x00,0x00]
 0xff,0x00,0xb8,0xd5,0xea,0x00,0x00,0x00,0xff,0x00,0x00,0x00
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
 0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 mul:2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l mul:2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 mul:2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05]
 0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05
 
-# GFX11: v_ceil_f16_e64_dpp v5, v1 mul:4 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x10,0x01,0x77,0x39,0x05]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.h, v1.h op_sel:[1,1] mul:2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x48,0xdc,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05]
+# COM: GFX11-FAKE16: warning: invalid instruction encoding
+0x05,0x48,0xdc,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05
+
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.l mul:4 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x10,0x01,0x77,0x39,0x05]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v5, v1 mul:4 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x10,0x01,0x77,0x39,0x05]
 0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x10,0x01,0x77,0x39,0x05
 
-# GFX11: v_ceil_f16_e64_dpp v255, -|v255| clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:1 ; encoding: [0xff,0x81,0xdc,0xd5,0xea,0x00,0x00,0x38,0xff,0x00,0x00,0x00]
+# GFX11-REAL16: v_ceil_f16_e64_dpp v5.l, v1.h op_sel:[1,0] mul:4 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x08,0xdc,0xd5,0xe9,0x00,0x00,0x10,0x01,0x77,0x39,0x05]
+# COM: GFX11-FAKE16: warning: invalid instruction encoding
+0x05,0x08,0xdc,0xd5,0xe9,0x00,0x00,0x10,0x01,0x77,0x39,0x05
+
+# GFX11-REAL16: v_ceil_f16_e64_dpp v255.l, -|v255.l| clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:1 ; encoding: [0xff,0x81,0xdc,0xd5,0xea,0x00,0x00,0x38,0xff,0x00,0x00,0x00]
+# GFX11-FAKE16: v_ceil_f16_e64_dpp v255, -|v255| clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:1 ; encoding: [0xff,0x81,0xdc,0xd5,0xea,0x00,0x00,0x38,0xff,0x00,0x00,0x00]
 0xff,0x81,0xdc,0xd5,0xea,0x00,0x00,0x38,0xff,0x00,0x00,0x00
 
+# GFX11-REAL16: v_ceil_f16_e64_dpp v255.h, -|v255.l| op_sel:[0,1] clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:1 ; encoding: [0xff,0xc1,0xdc,0xd5,0xea,0x00,0x00,0x38,0xff,0x00,0x00,0x00]
+# COM: GFX11-FAKE16: warning: invalid instruction encoding
+0xff,0xc1,0xdc,0xd5,0xea,0x00,0x00,0x38,0xff,0x00,0x00,0x00
+
 # GFX11: v_ceil_f32_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xa2,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
 0x05,0x00,0xa2,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05
 
@@ -288,18 +305,34 @@
 # GFX11: v_exp_f32_e64_dpp v255, -|v255| clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:1 ; encoding: [0xff,0x81,0xa5,0xd5,0xea,0x00,0x00,0x38,0xff,0x00,0x00,0x00]
 0xff,0x81,0xa5,0xd5,0xea,0x00,0x00,0x38,0xff,0x00,0x00,0x00
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
 0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 mul:2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l mul:2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 mul:2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05]
 0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05
 
-# GFX11: v_floor_f16_e64_dpp v5, v1 mul:4 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x10,0x01,0x77,0x39,0x05]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.h, v1.h op_sel:[1,1] mul:2 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x48,0xdb,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05]
+# COM: GFX11-FAKE16: warning: invalid instruction encoding
+0x05,0x48,0xdb,0xd5,0xe9,0x00,0x00,0x08,0x01,0x77,0x39,0x05
+
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.l mul:4 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x10,0x01,0x77,0x39,0x05]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v5, v1 mul:4 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x10,0x01,0x77,0x39,0x05]
 0x05,0x00,0xdb,0xd5,0xe9,0x00,0x00,0x10,0x01,0x77,0x39,0x05
 
-# GFX11: v_floor_f16_e64_dpp v255, -|v255| clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:1 ; encoding: [0xff,0x81,0xdb,0xd5,0xea,0x00,0x00,0x38,0xff,0x00,0x00,0x00]
+# GFX11-REAL16: v_floor_f16_e64_dpp v5.l, v1.h op_sel:[1,0] mul:4 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x08,0xdb,0xd5,0xe9,0x00,0x00,0x10,0x01,0x77,0x39,0x05]
+# COM: GFX11-FAKE16: warning: invalid instruction encoding
+0x05,0x08,0xdb,0xd5,0xe9,0x00,0x00,0x10,0x01,0x77,0x39,0x05
+
+# GFX11-REAL16: v_floor_f16_e64_dpp v255.l, -|v255.l| clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:1 ; encoding: [0xff,0x81,0xdb,0xd5,0xea,0x00,0x00,0x38,0xff,0x00,0x00,0x00]
+# GFX11-FAKE16: v_floor_f16_e64_dpp v255, -|v255| clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:1 ; encoding: [0xff,0x81,0xdb,0xd5,0xea,0x00,0x00,0x38,0xff,0x00,0x00,0x00]
 0xff,0x81,0xdb,0xd5,0xea,0x00,0x00,0x38,0xff,0x00,0x00,0x00
 
+# GFX11-REAL16: v_floor_f16_e64_dpp v255.h, -|v255.l| op_sel:[0,1] clamp div:2 dpp8:[0,0,0,0,0,0,0,0] fi:1 ; encoding: [0xff,0xc1,0xdb,0xd5,0xea,0x00,0x00,0x38,0xff,0x00,0x00,0x00]
+# COM: GFX11-FAKE16: warning: invalid instruction encoding
+0xff,0xc1,0xdb,0xd5,0xea,0x00,0x00,0x38,0xff,0x00,0x00,0x00
+
 # GFX11: v_floor_f32_e64_dpp v5, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xa4,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
 0x05,0x00,0xa4,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05
 

>From 129fc723d1388a419110a51220985cc9c12b60e7 Mon Sep 17 00:00:00 2001
From: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: Thu, 8 Feb 2024 14:01:38 +0000
Subject: [PATCH 48/72] [X86] X86FixupVectorConstants - add destination
 register width to rebuildSplatCst/rebuildZeroUpperCst/rebuildExtCst callbacks

As found on #81136 - we aren't correctly handling cases where the constant pool entry is wider than the destination register width, causing incorrect scaling of the truncated constant for load-extension cases.

This first patch just pulls out the destination register width argument; it's still currently driven by the constant pool entry, but that will be addressed in a follow-up.
---
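Note (illustration only, not part of the patch): a minimal standalone sketch of why the source of NumBits matters for the extension case. The helper name dstEltBitWidth and the 256/128/4 values are assumptions picked for the example; only the expression NumBits / NumElts is taken from rebuildExtCst.

```cpp
// Hypothetical helper mirroring the `NumBits / NumElts` expression that
// rebuildExtCst uses to derive the destination element width.
constexpr unsigned dstEltBitWidth(unsigned NumBits, unsigned NumElts) {
  return NumBits / NumElts;
}

// Assumed example: a 256-bit constant pool entry feeding a 128-bit register
// through an extending load that produces 4 elements.
// Driven by the constant pool entry, each element appears 64 bits wide.
static_assert(dstEltBitWidth(256, 4) == 64, "width from constant pool entry");
// Driven by the destination register, each element is 32 bits wide.
static_assert(dstEltBitWidth(128, 4) == 32, "width from destination register");
```

With the width passed in as an argument, the caller can later supply the register width rather than the constant's width, which is the follow-up the message above refers to.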
 .../Target/X86/X86FixupVectorConstants.cpp    | 52 +++++++++++--------
 1 file changed, 29 insertions(+), 23 deletions(-)

diff --git a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
index 9c46cee572fc91..9b90b5e4bc1ea0 100644
--- a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
+++ b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
@@ -121,6 +121,13 @@ static std::optional<APInt> extractConstantBits(const Constant *C) {
   return std::nullopt;
 }
 
+static std::optional<APInt> extractConstantBits(const Constant *C,
+                                                unsigned NumBits) {
+  if (std::optional<APInt> Bits = extractConstantBits(C))
+    return Bits->zextOrTrunc(NumBits);
+  return std::nullopt;
+}
+
 // Attempt to compute the splat width of bits data by normalizing the splat to
 // remove undefs.
 static std::optional<APInt> getSplatableConstant(const Constant *C,
@@ -217,16 +224,15 @@ static Constant *rebuildConstant(LLVMContext &Ctx, Type *SclTy,
 
 // Attempt to rebuild a normalized splat vector constant of the requested splat
 // width, built up of potentially smaller scalar values.
-static Constant *rebuildSplatCst(const Constant *C, unsigned /*NumElts*/,
-                                 unsigned SplatBitWidth) {
+static Constant *rebuildSplatCst(const Constant *C, unsigned /*NumBits*/,
+                                 unsigned /*NumElts*/, unsigned SplatBitWidth) {
   std::optional<APInt> Splat = getSplatableConstant(C, SplatBitWidth);
   if (!Splat)
     return nullptr;
 
   // Determine scalar size to use for the constant splat vector, clamping as we
   // might have found a splat smaller than the original constant data.
-  const Type *OriginalType = C->getType();
-  Type *SclTy = OriginalType->getScalarType();
+  Type *SclTy = C->getType()->getScalarType();
   unsigned NumSclBits = SclTy->getPrimitiveSizeInBits();
   NumSclBits = std::min<unsigned>(NumSclBits, SplatBitWidth);
 
@@ -236,20 +242,19 @@ static Constant *rebuildSplatCst(const Constant *C, unsigned /*NumElts*/,
                    : 64;
 
   // Extract per-element bits.
-  return rebuildConstant(OriginalType->getContext(), SclTy, *Splat, NumSclBits);
+  return rebuildConstant(C->getContext(), SclTy, *Splat, NumSclBits);
 }
 
-static Constant *rebuildZeroUpperCst(const Constant *C, unsigned /*NumElts*/,
+static Constant *rebuildZeroUpperCst(const Constant *C, unsigned NumBits,
+                                     unsigned /*NumElts*/,
                                      unsigned ScalarBitWidth) {
-  Type *Ty = C->getType();
-  Type *SclTy = Ty->getScalarType();
-  unsigned NumBits = Ty->getPrimitiveSizeInBits();
+  Type *SclTy = C->getType()->getScalarType();
   unsigned NumSclBits = SclTy->getPrimitiveSizeInBits();
   LLVMContext &Ctx = C->getContext();
 
   if (NumBits > ScalarBitWidth) {
     // Determine if the upper bits are all zero.
-    if (std::optional<APInt> Bits = extractConstantBits(C)) {
+    if (std::optional<APInt> Bits = extractConstantBits(C, NumBits)) {
       if (Bits->countLeadingZeros() >= (NumBits - ScalarBitWidth)) {
         // If the original constant was made of smaller elements, try to retain
         // those types.
@@ -266,16 +271,15 @@ static Constant *rebuildZeroUpperCst(const Constant *C, unsigned /*NumElts*/,
   return nullptr;
 }
 
-static Constant *rebuildExtCst(const Constant *C, bool IsSExt, unsigned NumElts,
+static Constant *rebuildExtCst(const Constant *C, bool IsSExt,
+                               unsigned NumBits, unsigned NumElts,
                                unsigned SrcEltBitWidth) {
-  Type *Ty = C->getType();
-  unsigned NumBits = Ty->getPrimitiveSizeInBits();
   unsigned DstEltBitWidth = NumBits / NumElts;
   assert((NumBits % NumElts) == 0 && (NumBits % SrcEltBitWidth) == 0 &&
          (DstEltBitWidth % SrcEltBitWidth) == 0 &&
          (DstEltBitWidth > SrcEltBitWidth) && "Illegal extension width");
 
-  if (std::optional<APInt> Bits = extractConstantBits(C)) {
+  if (std::optional<APInt> Bits = extractConstantBits(C, NumBits)) {
     assert((Bits->getBitWidth() / DstEltBitWidth) == NumElts &&
            (Bits->getBitWidth() % DstEltBitWidth) == 0 &&
            "Unexpected constant extension");
@@ -290,19 +294,20 @@ static Constant *rebuildExtCst(const Constant *C, bool IsSExt, unsigned NumElts,
       TruncBits.insertBits(Elt.trunc(SrcEltBitWidth), I * SrcEltBitWidth);
     }
 
+    Type *Ty = C->getType();
     return rebuildConstant(Ty->getContext(), Ty->getScalarType(), TruncBits,
                            SrcEltBitWidth);
   }
 
   return nullptr;
 }
-static Constant *rebuildSExtCst(const Constant *C, unsigned NumElts,
-                                unsigned SrcEltBitWidth) {
-  return rebuildExtCst(C, true, NumElts, SrcEltBitWidth);
+static Constant *rebuildSExtCst(const Constant *C, unsigned NumBits,
+                                unsigned NumElts, unsigned SrcEltBitWidth) {
+  return rebuildExtCst(C, true, NumBits, NumElts, SrcEltBitWidth);
 }
-static Constant *rebuildZExtCst(const Constant *C, unsigned NumElts,
-                                unsigned SrcEltBitWidth) {
-  return rebuildExtCst(C, false, NumElts, SrcEltBitWidth);
+static Constant *rebuildZExtCst(const Constant *C, unsigned NumBits,
+                                unsigned NumElts, unsigned SrcEltBitWidth) {
+  return rebuildExtCst(C, false, NumBits, NumElts, SrcEltBitWidth);
 }
 
 bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
@@ -320,7 +325,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
     int Op;
     int NumCstElts;
     int BitWidth;
-    std::function<Constant *(const Constant *, unsigned, unsigned)>
+    std::function<Constant *(const Constant *, unsigned, unsigned, unsigned)>
         RebuildConstant;
   };
   auto FixupConstant = [&](ArrayRef<FixupEntry> Fixups, unsigned OperandNo) {
@@ -335,12 +340,13 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
     assert(MI.getNumOperands() >= (OperandNo + X86::AddrNumOperands) &&
            "Unexpected number of operands!");
     if (auto *C = X86::getConstantFromPool(MI, OperandNo)) {
+      unsigned NumBits = C->getType()->getPrimitiveSizeInBits();
       for (const FixupEntry &Fixup : Fixups) {
         if (Fixup.Op) {
           // Construct a suitable constant and adjust the MI to use the new
           // constant pool entry.
-          if (Constant *NewCst =
-                  Fixup.RebuildConstant(C, Fixup.NumCstElts, Fixup.BitWidth)) {
+          if (Constant *NewCst = Fixup.RebuildConstant(
+                  C, NumBits, Fixup.NumCstElts, Fixup.BitWidth)) {
             unsigned NewCPI =
                 CP->getConstantPoolIndex(NewCst, Align(Fixup.BitWidth / 8));
             MI.setDesc(TII->get(Fixup.Op));

>From 167d4304b8a159f3f789d6364056da72178544f8 Mon Sep 17 00:00:00 2001
From: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: Thu, 8 Feb 2024 15:59:19 +0000
Subject: [PATCH 49/72] [X86] Add test case for #81136

---
 llvm/test/CodeGen/X86/pr81136.ll | 46 ++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)
 create mode 100644 llvm/test/CodeGen/X86/pr81136.ll

diff --git a/llvm/test/CodeGen/X86/pr81136.ll b/llvm/test/CodeGen/X86/pr81136.ll
new file mode 100644
index 00000000000000..8843adca0933c2
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr81136.ll
@@ -0,0 +1,46 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+; RUN: llc < %s -mtriple=x86_64-- -mcpu=btver2 | FileCheck %s
+
+; FIXME: Should be vpmovzxbq[128,1] instead of vpmovzxbd[128,1,0,0]
+define i64 @PR81136(i32 %a0, i32 %a1, ptr %a2) {
+; CHECK-LABEL: PR81136:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vmovd %edi, %xmm0
+; CHECK-NEXT:    vmovd %esi, %xmm1
+; CHECK-NEXT:    vmovdqa (%rdx), %ymm2
+; CHECK-NEXT:    vpxor %xmm3, %xmm3, %xmm3
+; CHECK-NEXT:    vpmovzxbd {{.*#+}} xmm4 = [128,1,0,0]
+; CHECK-NEXT:    vpcmpgtq %xmm3, %xmm4, %xmm4
+; CHECK-NEXT:    vpcmpgtw %xmm0, %xmm1, %xmm0
+; CHECK-NEXT:    vpcmpeqd %xmm1, %xmm1, %xmm1
+; CHECK-NEXT:    vpxor %xmm1, %xmm0, %xmm0
+; CHECK-NEXT:    vpmovsxwq %xmm0, %xmm0
+; CHECK-NEXT:    vpalignr {{.*#+}} xmm0 = mem[8,9,10,11,12,13,14,15],xmm0[0,1,2,3,4,5,6,7]
+; CHECK-NEXT:    vpcmpeqq %xmm3, %xmm0, %xmm0
+; CHECK-NEXT:    vpxor %xmm1, %xmm0, %xmm0
+; CHECK-NEXT:    vpcmpeqq {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm2, %xmm1
+; CHECK-NEXT:    vextractf128 $1, %ymm2, %xmm2
+; CHECK-NEXT:    vpcmpeqq {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm2, %xmm2
+; CHECK-NEXT:    vinsertf128 $1, %xmm0, %ymm4, %ymm0
+; CHECK-NEXT:    vinsertf128 $1, %xmm2, %ymm1, %ymm1
+; CHECK-NEXT:    vandnpd %ymm0, %ymm1, %ymm0
+; CHECK-NEXT:    vmovmskpd %ymm0, %eax
+; CHECK-NEXT:    popcntl %eax, %eax
+; CHECK-NEXT:    negq %rax
+; CHECK-NEXT:    retq
+  %v0 = bitcast i32 %a0 to <2 x i16>
+  %v1 = bitcast i32 %a1 to <2 x i16>
+  %cmp15 = icmp sle <2 x i16> %v1, %v0
+  %conv16 = sext <2 x i1> %cmp15 to <2 x i64>
+  %shuffle29 = shufflevector <2 x i64> %conv16, <2 x i64> <i64 128, i64 1>, <4 x i32> <i32 2, i32 3, i32 3, i32 0>
+  %data = load volatile <4 x i64>, ptr %a2, align 32
+  %cmp65 = icmp ne <4 x i64> %data, <i64 -2071602529, i64 -1537047284, i64 717942021, i64 597457239>
+  %cmp67 = icmp ne <4 x i64> %shuffle29, zeroinitializer
+  %and = and <4 x i1> %cmp65, %cmp67
+  %mask = bitcast <4 x i1> %and to i4
+  %cnt = tail call i4 @llvm.ctpop.i4(i4 %mask)
+  %cntz = zext i4 %cnt to i64
+  %res = sub nsw i64 0, %cntz
+  ret i64 %res
+}
+declare i4 @llvm.ctpop.i4(i4)

From ffae34095e1f504cbdfdbdd9a9e33b9d580a41bd Mon Sep 17 00:00:00 2001
From: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: Thu, 8 Feb 2024 16:31:09 +0000
Subject: [PATCH 50/72] [X86] X86FixupVectorConstants - rename
 FixupEntry::BitWidth to FixupEntry::MemBitWidth NFC.

Make it clearer that this refers to the width of the constant element stored in memory, which won't match the register element width after a sext/zextload.
---
 llvm/lib/Target/X86/X86FixupVectorConstants.cpp | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)
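
A minimal sketch of the distinction the rename draws, assuming a constant held in memory as narrow elements and widened on load. FixupEntrySketch, memSizeInBits and zextLoad are hypothetical names for illustration and are not part of the pass; only the NumCstElts * MemBitWidth product is taken from the sort predicate in the diff.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two size-related fields of the fixup entry.
struct FixupEntrySketch {
  int NumCstElts;  // number of constant elements stored in memory
  int MemBitWidth; // width of each element as stored in memory
};

// Size of the constant in memory, matching the product used by the
// sort predicate in the patch.
inline int memSizeInBits(const FixupEntrySketch &E) {
  return E.NumCstElts * E.MemBitWidth;
}

// Models a zero-extending load: each 8-bit memory element becomes one
// 64-bit register lane, so the register element width (64) differs from
// the memory element width (8).
inline std::vector<uint64_t> zextLoad(const std::vector<uint8_t> &Mem) {
  return std::vector<uint64_t>(Mem.begin(), Mem.end());
}
```

With two 8-bit elements the constant occupies 16 bits in memory, while the zero-extended result fills two 64-bit lanes, which is why a single "BitWidth" name was ambiguous.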

diff --git a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
index 9b90b5e4bc1ea0..32ca9c164c579b 100644
--- a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
+++ b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
@@ -324,7 +324,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
   struct FixupEntry {
     int Op;
     int NumCstElts;
-    int BitWidth;
+    int MemBitWidth;
     std::function<Constant *(const Constant *, unsigned, unsigned, unsigned)>
         RebuildConstant;
   };
@@ -332,23 +332,23 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
 #ifdef EXPENSIVE_CHECKS
     assert(llvm::is_sorted(Fixups,
                            [](const FixupEntry &A, const FixupEntry &B) {
-                             return (A.NumCstElts * A.BitWidth) <
-                                    (B.NumCstElts * B.BitWidth);
+                             return (A.NumCstElts * A.MemBitWidth) <
+                                    (B.NumCstElts * B.MemBitWidth);
                            }) &&
            "Constant fixup table not sorted in ascending constant size");
 #endif
     assert(MI.getNumOperands() >= (OperandNo + X86::AddrNumOperands) &&
            "Unexpected number of operands!");
     if (auto *C = X86::getConstantFromPool(MI, OperandNo)) {
-      unsigned NumBits = C->getType()->getPrimitiveSizeInBits();
+      unsigned RegBitWidth = C->getType()->getPrimitiveSizeInBits();
       for (const FixupEntry &Fixup : Fixups) {
         if (Fixup.Op) {
           // Construct a suitable constant and adjust the MI to use the new
           // constant pool entry.
           if (Constant *NewCst = Fixup.RebuildConstant(
-                  C, NumBits, Fixup.NumCstElts, Fixup.BitWidth)) {
+                  C, RegBitWidth, Fixup.NumCstElts, Fixup.MemBitWidth)) {
             unsigned NewCPI =
-                CP->getConstantPoolIndex(NewCst, Align(Fixup.BitWidth / 8));
+                CP->getConstantPoolIndex(NewCst, Align(Fixup.MemBitWidth / 8));
             MI.setDesc(TII->get(Fixup.Op));
             MI.getOperand(OperandNo + X86::AddrDisp).setIndex(NewCPI);
             return true;

From 3fe0dfd32d2a62ad5b8458c8e362b1eb1e79b2f4 Mon Sep 17 00:00:00 2001
From: stephenpeckham <118857872+stephenpeckham at users.noreply.github.com>
Date: Thu, 8 Feb 2024 10:44:19 -0600
Subject: [PATCH 51/72] [XCOFF][obj2yaml] Support SymbolAlignmentAndType as 2
 separate fields in YAML. (#76828)

XCOFF encodes a symbol type and alignment in a single 8-bit field. It is
easier to read and write YAML files if the fields can be specified
separately. This PR causes obj2yaml to write the fields separately and
allows yaml2obj to read either the single combined field or the separate
fields.
---
 llvm/include/llvm/ObjectYAML/XCOFFYAML.h      |   7 ++
 llvm/lib/ObjectYAML/XCOFFEmitter.cpp          |  99 ++++++++++-----
 llvm/lib/ObjectYAML/XCOFFYAML.cpp             |  16 ++-
 llvm/test/tools/obj2yaml/XCOFF/aix.yaml       |  12 +-
 .../tools/obj2yaml/XCOFF/aux-symbols.yaml     |  12 +-
 .../tools/yaml2obj/XCOFF/aux-aligntype.yaml   | 114 ++++++++++++++++++
 .../tools/yaml2obj/XCOFF/aux-symbols.yaml     |  25 ++++
 llvm/tools/obj2yaml/xcoff2yaml.cpp            |   4 +-
 8 files changed, 250 insertions(+), 39 deletions(-)
 create mode 100644 llvm/test/tools/yaml2obj/XCOFF/aux-aligntype.yaml
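
A minimal sketch of the combined 8-bit encoding described in the commit message, assuming the symbol type sits in the low 3 bits and the log2 alignment in the high 5 bits. The literal mask and offset values and the helper names below are assumptions for illustration; the patch itself uses the XCOFFCsectAuxRef constants. The layout is consistent with the updated tests, where 41 maps to XTY_SD with alignment 5 and 17 to XTY_SD with alignment 2.

```cpp
#include <cstdint>

// Assumed values standing in for XCOFFCsectAuxRef::SymbolTypeMask,
// SymbolAlignmentMask and SymbolAlignmentBitOffset.
constexpr uint8_t kSymbolTypeMask = 0x07;
constexpr uint8_t kSymbolAlignmentMask = 0xF8;
constexpr unsigned kSymbolAlignmentBitOffset = 3;

// Hypothetical helper: packs a log2 alignment and a symbol type into one
// byte. Returns false (leaving Out untouched) if either value is out of
// range, mirroring the two range checks added to writeAuxSymbol.
inline bool packAlignAndType(uint8_t AlignLog2, uint8_t Type, uint8_t &Out) {
  const uint8_t ShiftedAlignMask =
      kSymbolAlignmentMask >> kSymbolAlignmentBitOffset; // 0x1F
  if (Type & ~kSymbolTypeMask)
    return false; // type must be less than 8
  if (AlignLog2 & ~ShiftedAlignMask)
    return false; // alignment must be less than 32
  Out = static_cast<uint8_t>(Type | (AlignLog2 << kSymbolAlignmentBitOffset));
  return true;
}

// Hypothetical inverse helpers: split the combined byte back into fields.
inline uint8_t unpackType(uint8_t Combined) {
  return Combined & kSymbolTypeMask;
}
inline uint8_t unpackAlignLog2(uint8_t Combined) {
  return (Combined & kSymbolAlignmentMask) >> kSymbolAlignmentBitOffset;
}
```

Under these assumptions, packing alignment 5 with type 1 (XTY_SD) gives 41, and an alignment of 32 or a type of 8 is rejected, matching the error cases exercised by the new yaml2obj tests.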

diff --git a/llvm/include/llvm/ObjectYAML/XCOFFYAML.h b/llvm/include/llvm/ObjectYAML/XCOFFYAML.h
index f1e821fe5fa369..dd359ac8e53dd3 100644
--- a/llvm/include/llvm/ObjectYAML/XCOFFYAML.h
+++ b/llvm/include/llvm/ObjectYAML/XCOFFYAML.h
@@ -121,6 +121,9 @@ struct CsectAuxEnt : AuxSymbolEnt {
   // Common fields for both XCOFF32 and XCOFF64.
   std::optional<uint32_t> ParameterHashIndex;
   std::optional<uint16_t> TypeChkSectNum;
+  std::optional<XCOFF::SymbolType> SymbolType;
+  std::optional<uint8_t> SymbolAlignment;
+  // The two previous values can be encoded as a single value.
   std::optional<uint8_t> SymbolAlignmentAndType;
   std::optional<XCOFF::StorageMappingClass> StorageMappingClass;
 
@@ -237,6 +240,10 @@ template <> struct ScalarEnumerationTraits<XCOFF::StorageMappingClass> {
   static void enumeration(IO &IO, XCOFF::StorageMappingClass &Value);
 };
 
+template <> struct ScalarEnumerationTraits<XCOFF::SymbolType> {
+  static void enumeration(IO &IO, XCOFF::SymbolType &Value);
+};
+
 template <> struct ScalarEnumerationTraits<XCOFF::CFileStringType> {
   static void enumeration(IO &IO, XCOFF::CFileStringType &Type);
 };
diff --git a/llvm/lib/ObjectYAML/XCOFFEmitter.cpp b/llvm/lib/ObjectYAML/XCOFFEmitter.cpp
index ccf768c06aebfe..5b244ffccd1056 100644
--- a/llvm/lib/ObjectYAML/XCOFFEmitter.cpp
+++ b/llvm/lib/ObjectYAML/XCOFFEmitter.cpp
@@ -23,6 +23,7 @@
 #include "llvm/Support/raw_ostream.h"
 
 using namespace llvm;
+using namespace llvm::object;
 
 namespace {
 
@@ -56,14 +57,14 @@ class XCOFFWriter {
   bool writeSymbols();
   void writeStringTable();
 
-  void writeAuxSymbol(const XCOFFYAML::CsectAuxEnt &AuxSym);
-  void writeAuxSymbol(const XCOFFYAML::FileAuxEnt &AuxSym);
-  void writeAuxSymbol(const XCOFFYAML::FunctionAuxEnt &AuxSym);
-  void writeAuxSymbol(const XCOFFYAML::ExcpetionAuxEnt &AuxSym);
-  void writeAuxSymbol(const XCOFFYAML::BlockAuxEnt &AuxSym);
-  void writeAuxSymbol(const XCOFFYAML::SectAuxEntForDWARF &AuxSym);
-  void writeAuxSymbol(const XCOFFYAML::SectAuxEntForStat &AuxSym);
-  void writeAuxSymbol(const std::unique_ptr<XCOFFYAML::AuxSymbolEnt> &AuxSym);
+  bool writeAuxSymbol(const XCOFFYAML::CsectAuxEnt &AuxSym);
+  bool writeAuxSymbol(const XCOFFYAML::FileAuxEnt &AuxSym);
+  bool writeAuxSymbol(const XCOFFYAML::FunctionAuxEnt &AuxSym);
+  bool writeAuxSymbol(const XCOFFYAML::ExcpetionAuxEnt &AuxSym);
+  bool writeAuxSymbol(const XCOFFYAML::BlockAuxEnt &AuxSym);
+  bool writeAuxSymbol(const XCOFFYAML::SectAuxEntForDWARF &AuxSym);
+  bool writeAuxSymbol(const XCOFFYAML::SectAuxEntForStat &AuxSym);
+  bool writeAuxSymbol(const std::unique_ptr<XCOFFYAML::AuxSymbolEnt> &AuxSym);
 
   XCOFFYAML::Object &Obj;
   bool Is64Bit = false;
@@ -181,7 +182,7 @@ bool XCOFFWriter::initStringTable() {
   StrTblBuilder.clear();
 
   if (Obj.StrTbl.Strings) {
-    // All specified strings should be added to the string table.
+    // Add all specified strings to the string table.
     for (StringRef StringEnt : *Obj.StrTbl.Strings)
       StrTblBuilder.add(StringEnt);
 
@@ -524,12 +525,44 @@ bool XCOFFWriter::writeRelocations() {
   return true;
 }
 
-void XCOFFWriter::writeAuxSymbol(const XCOFFYAML::CsectAuxEnt &AuxSym) {
+bool XCOFFWriter::writeAuxSymbol(const XCOFFYAML::CsectAuxEnt &AuxSym) {
+  uint8_t SymAlignAndType = 0;
+  if (AuxSym.SymbolAlignmentAndType) {
+    if (AuxSym.SymbolType || AuxSym.SymbolAlignment) {
+      ErrHandler("cannot specify SymbolType or SymbolAlignment if "
+                 "SymbolAlignmentAndType is specified");
+      return false;
+    }
+    SymAlignAndType = *AuxSym.SymbolAlignmentAndType;
+  } else {
+    if (AuxSym.SymbolType) {
+      uint8_t SymbolType = *AuxSym.SymbolType;
+      if (SymbolType & ~XCOFFCsectAuxRef::SymbolTypeMask) {
+        ErrHandler("symbol type must be less than " +
+                   Twine(1 + XCOFFCsectAuxRef::SymbolTypeMask));
+        return false;
+      }
+      SymAlignAndType = SymbolType;
+    }
+    if (AuxSym.SymbolAlignment) {
+      const uint8_t ShiftedSymbolAlignmentMask =
+          XCOFFCsectAuxRef::SymbolAlignmentMask >>
+          XCOFFCsectAuxRef::SymbolAlignmentBitOffset;
+
+      if (*AuxSym.SymbolAlignment & ~ShiftedSymbolAlignmentMask) {
+        ErrHandler("symbol alignment must be less than " +
+                   Twine(1 + ShiftedSymbolAlignmentMask));
+        return false;
+      }
+      SymAlignAndType |= (*AuxSym.SymbolAlignment
+                          << XCOFFCsectAuxRef::SymbolAlignmentBitOffset);
+    }
+  }
   if (Is64Bit) {
     W.write<uint32_t>(AuxSym.SectionOrLengthLo.value_or(0));
     W.write<uint32_t>(AuxSym.ParameterHashIndex.value_or(0));
     W.write<uint16_t>(AuxSym.TypeChkSectNum.value_or(0));
-    W.write<uint8_t>(AuxSym.SymbolAlignmentAndType.value_or(0));
+    W.write<uint8_t>(SymAlignAndType);
     W.write<uint8_t>(AuxSym.StorageMappingClass.value_or(XCOFF::XMC_PR));
     W.write<uint32_t>(AuxSym.SectionOrLengthHi.value_or(0));
     W.write<uint8_t>(0);
@@ -538,23 +571,25 @@ void XCOFFWriter::writeAuxSymbol(const XCOFFYAML::CsectAuxEnt &AuxSym) {
     W.write<uint32_t>(AuxSym.SectionOrLength.value_or(0));
     W.write<uint32_t>(AuxSym.ParameterHashIndex.value_or(0));
     W.write<uint16_t>(AuxSym.TypeChkSectNum.value_or(0));
-    W.write<uint8_t>(AuxSym.SymbolAlignmentAndType.value_or(0));
+    W.write<uint8_t>(SymAlignAndType);
     W.write<uint8_t>(AuxSym.StorageMappingClass.value_or(XCOFF::XMC_PR));
     W.write<uint32_t>(AuxSym.StabInfoIndex.value_or(0));
     W.write<uint16_t>(AuxSym.StabSectNum.value_or(0));
   }
+  return true;
 }
 
-void XCOFFWriter::writeAuxSymbol(const XCOFFYAML::ExcpetionAuxEnt &AuxSym) {
+bool XCOFFWriter::writeAuxSymbol(const XCOFFYAML::ExcpetionAuxEnt &AuxSym) {
   assert(Is64Bit && "can't write the exception auxiliary symbol for XCOFF32");
   W.write<uint64_t>(AuxSym.OffsetToExceptionTbl.value_or(0));
   W.write<uint32_t>(AuxSym.SizeOfFunction.value_or(0));
   W.write<uint32_t>(AuxSym.SymIdxOfNextBeyond.value_or(0));
   W.write<uint8_t>(0);
   W.write<uint8_t>(XCOFF::AUX_EXCEPT);
+  return true;
 }
 
-void XCOFFWriter::writeAuxSymbol(const XCOFFYAML::FunctionAuxEnt &AuxSym) {
+bool XCOFFWriter::writeAuxSymbol(const XCOFFYAML::FunctionAuxEnt &AuxSym) {
   if (Is64Bit) {
     W.write<uint64_t>(AuxSym.PtrToLineNum.value_or(0));
     W.write<uint32_t>(AuxSym.SizeOfFunction.value_or(0));
@@ -568,9 +603,10 @@ void XCOFFWriter::writeAuxSymbol(const XCOFFYAML::FunctionAuxEnt &AuxSym) {
     W.write<uint32_t>(AuxSym.SymIdxOfNextBeyond.value_or(0));
     W.OS.write_zeros(2);
   }
+  return true;
 }
 
-void XCOFFWriter::writeAuxSymbol(const XCOFFYAML::FileAuxEnt &AuxSym) {
+bool XCOFFWriter::writeAuxSymbol(const XCOFFYAML::FileAuxEnt &AuxSym) {
   StringRef FileName = AuxSym.FileNameOrString.value_or("");
   if (nameShouldBeInStringTable(FileName)) {
     W.write<int32_t>(0);
@@ -586,9 +622,10 @@ void XCOFFWriter::writeAuxSymbol(const XCOFFYAML::FileAuxEnt &AuxSym) {
   } else {
     W.OS.write_zeros(3);
   }
+  return true;
 }
 
-void XCOFFWriter::writeAuxSymbol(const XCOFFYAML::BlockAuxEnt &AuxSym) {
+bool XCOFFWriter::writeAuxSymbol(const XCOFFYAML::BlockAuxEnt &AuxSym) {
   if (Is64Bit) {
     W.write<uint32_t>(AuxSym.LineNum.value_or(0));
     W.OS.write_zeros(13);
@@ -599,9 +636,10 @@ void XCOFFWriter::writeAuxSymbol(const XCOFFYAML::BlockAuxEnt &AuxSym) {
     W.write<uint16_t>(AuxSym.LineNumLo.value_or(0));
     W.OS.write_zeros(12);
   }
+  return true;
 }
 
-void XCOFFWriter::writeAuxSymbol(const XCOFFYAML::SectAuxEntForDWARF &AuxSym) {
+bool XCOFFWriter::writeAuxSymbol(const XCOFFYAML::SectAuxEntForDWARF &AuxSym) {
   if (Is64Bit) {
     W.write<uint64_t>(AuxSym.LengthOfSectionPortion.value_or(0));
     W.write<uint64_t>(AuxSym.NumberOfRelocEnt.value_or(0));
@@ -613,34 +651,36 @@ void XCOFFWriter::writeAuxSymbol(const XCOFFYAML::SectAuxEntForDWARF &AuxSym) {
     W.write<uint32_t>(AuxSym.NumberOfRelocEnt.value_or(0));
     W.OS.write_zeros(6);
   }
+  return true;
 }
 
-void XCOFFWriter::writeAuxSymbol(const XCOFFYAML::SectAuxEntForStat &AuxSym) {
+bool XCOFFWriter::writeAuxSymbol(const XCOFFYAML::SectAuxEntForStat &AuxSym) {
   assert(!Is64Bit && "can't write the stat auxiliary symbol for XCOFF64");
   W.write<uint32_t>(AuxSym.SectionLength.value_or(0));
   W.write<uint16_t>(AuxSym.NumberOfRelocEnt.value_or(0));
   W.write<uint16_t>(AuxSym.NumberOfLineNum.value_or(0));
   W.OS.write_zeros(10);
+  return true;
 }
 
-void XCOFFWriter::writeAuxSymbol(
+bool XCOFFWriter::writeAuxSymbol(
     const std::unique_ptr<XCOFFYAML::AuxSymbolEnt> &AuxSym) {
   if (auto AS = dyn_cast<XCOFFYAML::CsectAuxEnt>(AuxSym.get()))
-    writeAuxSymbol(*AS);
+    return writeAuxSymbol(*AS);
   else if (auto AS = dyn_cast<XCOFFYAML::FunctionAuxEnt>(AuxSym.get()))
-    writeAuxSymbol(*AS);
+    return writeAuxSymbol(*AS);
   else if (auto AS = dyn_cast<XCOFFYAML::ExcpetionAuxEnt>(AuxSym.get()))
-    writeAuxSymbol(*AS);
+    return writeAuxSymbol(*AS);
   else if (auto AS = dyn_cast<XCOFFYAML::FileAuxEnt>(AuxSym.get()))
-    writeAuxSymbol(*AS);
+    return writeAuxSymbol(*AS);
   else if (auto AS = dyn_cast<XCOFFYAML::BlockAuxEnt>(AuxSym.get()))
-    writeAuxSymbol(*AS);
+    return writeAuxSymbol(*AS);
   else if (auto AS = dyn_cast<XCOFFYAML::SectAuxEntForDWARF>(AuxSym.get()))
-    writeAuxSymbol(*AS);
+    return writeAuxSymbol(*AS);
   else if (auto AS = dyn_cast<XCOFFYAML::SectAuxEntForStat>(AuxSym.get()))
-    writeAuxSymbol(*AS);
-  else
-    llvm_unreachable("unknown auxiliary symbol type");
+    return writeAuxSymbol(*AS);
+  llvm_unreachable("unknown auxiliary symbol type");
+  return false;
 }
 
 bool XCOFFWriter::writeSymbols() {
@@ -698,7 +738,8 @@ bool XCOFFWriter::writeSymbols() {
     } else {
       for (const std::unique_ptr<XCOFFYAML::AuxSymbolEnt> &AuxSym :
            YamlSym.AuxEntries) {
-        writeAuxSymbol(AuxSym);
+        if (!writeAuxSymbol(AuxSym))
+          return false;
       }
       // Pad with zeros.
       if (NumOfAuxSym > YamlSym.AuxEntries.size())
diff --git a/llvm/lib/ObjectYAML/XCOFFYAML.cpp b/llvm/lib/ObjectYAML/XCOFFYAML.cpp
index 398b09c72170ba..83bf61301387f0 100644
--- a/llvm/lib/ObjectYAML/XCOFFYAML.cpp
+++ b/llvm/lib/ObjectYAML/XCOFFYAML.cpp
@@ -127,6 +127,17 @@ void ScalarEnumerationTraits<XCOFF::StorageMappingClass>::enumeration(
 #undef ECase
 }
 
+void ScalarEnumerationTraits<XCOFF::SymbolType>::enumeration(
+    IO &IO, XCOFF::SymbolType &Value) {
+#define ECase(X) IO.enumCase(Value, #X, XCOFF::X)
+  ECase(XTY_ER);
+  ECase(XTY_SD);
+  ECase(XTY_LD);
+  ECase(XTY_CM);
+#undef ECase
+  IO.enumFallback<Hex8>(Value);
+}
+
 void ScalarEnumerationTraits<XCOFFYAML::AuxSymbolType>::enumeration(
     IO &IO, XCOFFYAML::AuxSymbolType &Type) {
 #define ECase(X) IO.enumCase(Type, #X, XCOFFYAML::X)
@@ -229,6 +240,8 @@ static void auxSymMapping(IO &IO, XCOFFYAML::CsectAuxEnt &AuxSym, bool Is64) {
   IO.mapOptional("ParameterHashIndex", AuxSym.ParameterHashIndex);
   IO.mapOptional("TypeChkSectNum", AuxSym.TypeChkSectNum);
   IO.mapOptional("SymbolAlignmentAndType", AuxSym.SymbolAlignmentAndType);
+  IO.mapOptional("SymbolType", AuxSym.SymbolType);
+  IO.mapOptional("SymbolAlignment", AuxSym.SymbolAlignment);
   IO.mapOptional("StorageMappingClass", AuxSym.StorageMappingClass);
   if (Is64) {
     IO.mapOptional("SectionOrLengthLo", AuxSym.SectionOrLengthLo);
@@ -350,7 +363,8 @@ void MappingTraits<XCOFFYAML::Symbol>::mapping(IO &IO, XCOFFYAML::Symbol &S) {
   IO.mapOptional("AuxEntries", S.AuxEntries);
 }
 
-void MappingTraits<XCOFFYAML::StringTable>::mapping(IO &IO, XCOFFYAML::StringTable &Str) {
+void MappingTraits<XCOFFYAML::StringTable>::mapping(
+    IO &IO, XCOFFYAML::StringTable &Str) {
   IO.mapOptional("ContentSize", Str.ContentSize);
   IO.mapOptional("Length", Str.Length);
   IO.mapOptional("Strings", Str.Strings);
diff --git a/llvm/test/tools/obj2yaml/XCOFF/aix.yaml b/llvm/test/tools/obj2yaml/XCOFF/aix.yaml
index fbd5fa0629d10b..9f2f68b646b6f4 100644
--- a/llvm/test/tools/obj2yaml/XCOFF/aix.yaml
+++ b/llvm/test/tools/obj2yaml/XCOFF/aix.yaml
@@ -56,7 +56,8 @@
 # CHECK32-NEXT:       - Type:            AUX_CSECT
 # CHECK32-NEXT:         ParameterHashIndex: 0
 # CHECK32-NEXT:         TypeChkSectNum:  0
-# CHECK32-NEXT:         SymbolAlignmentAndType: 0
+# CHECK32-NEXT:         SymbolType:      XTY_ER
+# CHECK32-NEXT:         SymbolAlignment: 0
 # CHECK32-NEXT:         StorageMappingClass: XMC_PR
 # CHECK32-NEXT:         SectionOrLength: 0
 # CHECK32-NEXT:         StabInfoIndex:   0
@@ -71,7 +72,8 @@
 # CHECK32-NEXT:       - Type:            AUX_CSECT
 # CHECK32-NEXT:         ParameterHashIndex: 0
 # CHECK32-NEXT:         TypeChkSectNum:  0
-# CHECK32-NEXT:         SymbolAlignmentAndType: 0
+# CHECK32-NEXT:         SymbolType:      XTY_ER
+# CHECK32-NEXT:         SymbolAlignment: 0
 # CHECK32-NEXT:         StorageMappingClass: XMC_PR
 # CHECK32-NEXT:         SectionOrLength: 0
 # CHECK32-NEXT:         StabInfoIndex:   0
@@ -128,7 +130,8 @@
 # CHECK64-NEXT:       - Type:            AUX_CSECT
 # CHECK64-NEXT:         ParameterHashIndex: 0
 # CHECK64-NEXT:         TypeChkSectNum:  0
-# CHECK64-NEXT:         SymbolAlignmentAndType: 0
+# CHECK64-NEXT:         SymbolType:      XTY_ER
+# CHECK64-NEXT:         SymbolAlignment: 0
 # CHECK64-NEXT:         StorageMappingClass: XMC_PR
 # CHECK64-NEXT:         SectionOrLengthLo: 0
 # CHECK64-NEXT:         SectionOrLengthHi: 0
@@ -142,7 +145,8 @@
 # CHECK64-NEXT:       - Type:            AUX_CSECT
 # CHECK64-NEXT:         ParameterHashIndex: 0
 # CHECK64-NEXT:         TypeChkSectNum:  0
-# CHECK64-NEXT:         SymbolAlignmentAndType: 0
+# CHECK64-NEXT:         SymbolType:      XTY_ER
+# CHECK64-NEXT:         SymbolAlignment: 0
 # CHECK64-NEXT:         StorageMappingClass: XMC_PR
 # CHECK64-NEXT:         SectionOrLengthLo: 0
 # CHECK64-NEXT:         SectionOrLengthHi: 0
diff --git a/llvm/test/tools/obj2yaml/XCOFF/aux-symbols.yaml b/llvm/test/tools/obj2yaml/XCOFF/aux-symbols.yaml
index 7f93b8dae0ca9b..8155ac1acd186b 100644
--- a/llvm/test/tools/obj2yaml/XCOFF/aux-symbols.yaml
+++ b/llvm/test/tools/obj2yaml/XCOFF/aux-symbols.yaml
@@ -34,7 +34,8 @@
 # CHECK32-NEXT:       - Type:            AUX_CSECT
 # CHECK32-NEXT:         ParameterHashIndex: 1
 # CHECK32-NEXT:         TypeChkSectNum:  2
-# CHECK32-NEXT:         SymbolAlignmentAndType: 41
+# CHECK32-NEXT:         SymbolType: XTY_SD
+# CHECK32-NEXT:         SymbolAlignment: 5
 # CHECK32-NEXT:         StorageMappingClass: XMC_PR
 # CHECK32-NEXT:         SectionOrLength: 3
 # CHECK32-NEXT:         StabInfoIndex:   4
@@ -54,7 +55,8 @@
 # CHECK32-NEXT:       - Type:            AUX_CSECT
 # CHECK32-NEXT:         ParameterHashIndex: 1
 # CHECK32-NEXT:         TypeChkSectNum:  2
-# CHECK32-NEXT:         SymbolAlignmentAndType: 17
+# CHECK32-NEXT:         SymbolType: XTY_SD
+# CHECK32-NEXT:         SymbolAlignment: 2
 # CHECK32-NEXT:         StorageMappingClass: XMC_PR
 # CHECK32-NEXT:         SectionOrLength: 4
 # CHECK32-NEXT:         StabInfoIndex:   5
@@ -174,7 +176,8 @@ Symbols:
 # CHECK64-NEXT:       - Type:            AUX_CSECT
 # CHECK64-NEXT:         ParameterHashIndex: 1
 # CHECK64-NEXT:         TypeChkSectNum:  2
-# CHECK64-NEXT:         SymbolAlignmentAndType: 41
+# CHECK64-NEXT:         SymbolType: XTY_SD
+# CHECK64-NEXT:         SymbolAlignment: 5
 # CHECK64-NEXT:         StorageMappingClass: XMC_PR
 # CHECK64-NEXT:         SectionOrLengthLo: 3
 # CHECK64-NEXT:         SectionOrLengthHi: 4
@@ -196,7 +199,8 @@ Symbols:
 # CHECK64-NEXT:       - Type:            AUX_CSECT
 # CHECK64-NEXT:         ParameterHashIndex: 1
 # CHECK64-NEXT:         TypeChkSectNum:  2
-# CHECK64-NEXT:         SymbolAlignmentAndType: 17
+# CHECK64-NEXT:         SymbolType: XTY_SD
+# CHECK64-NEXT:         SymbolAlignment: 2
 # CHECK64-NEXT:         StorageMappingClass: XMC_PR
 # CHECK64-NEXT:         SectionOrLengthLo: 3
 # CHECK64-NEXT:         SectionOrLengthHi: 4
diff --git a/llvm/test/tools/yaml2obj/XCOFF/aux-aligntype.yaml b/llvm/test/tools/yaml2obj/XCOFF/aux-aligntype.yaml
new file mode 100644
index 00000000000000..190224dd620603
--- /dev/null
+++ b/llvm/test/tools/yaml2obj/XCOFF/aux-aligntype.yaml
@@ -0,0 +1,114 @@
+## Check that yaml2obj can parse SymbolAlignmentAndType, SymbolAlignment,
+## and SymbolType.
+
+# RUN: yaml2obj %s --docnum=1 -DMAGIC=0x01DF -o %t32
+# RUN: obj2yaml %t32 | FileCheck %s --check-prefix=CHECK
+# RUN: yaml2obj %s --docnum=1 -DMAGIC=0x01F7 -o %t64
+# RUN: obj2yaml %t64 | FileCheck %s --check-prefix=CHECK
+
+# CHECK:        --- !XCOFF
+# CHECK-NEXT: FileHeader:
+# CHECK-NEXT:   MagicNumber:
+# CHECK:      Symbols:
+# CHECK:       - Name:            .fcn1
+# CHECK:         NumberOfAuxEntries: 1
+# CHECK-NEXT:    AuxEntries:
+# CHECK-NEXT:      - Type:            AUX_CSECT
+# CHECK:             SymbolType:      XTY_ER
+# CHECK-NEXT:        SymbolAlignment: 4
+# CHECK:       - Name:            .fcn2
+# CHECK:         NumberOfAuxEntries: 1
+# CHECK-NEXT:    AuxEntries:
+# CHECK-NEXT:      - Type:            AUX_CSECT
+# CHECK:             SymbolType:      XTY_SD
+# CHECK-NEXT:        SymbolAlignment: 2
+# CHECK:       - Name:            .fcn3
+# CHECK:         NumberOfAuxEntries: 1
+# CHECK-NEXT:    AuxEntries:
+# CHECK-NEXT:      - Type:            AUX_CSECT
+# CHECK:             SymbolType:      XTY_SD
+# CHECK-NEXT:        SymbolAlignment: 0
+
+--- !XCOFF
+FileHeader:
+  MagicNumber: [[MAGIC]]
+Symbols:
+  - StorageClass: C_EXT
+    Name: .fcn1
+    AuxEntries:
+      - Type: AUX_CSECT
+        SymbolAlignment: 4
+  - StorageClass: C_EXT
+    Name: .fcn2
+    AuxEntries:
+      - Type: AUX_CSECT
+        SymbolAlignment: 2
+        SymbolType: XTY_SD
+  - StorageClass:    C_EXT
+    Name: .fcn3
+    AuxEntries:
+      - Type: AUX_CSECT
+        SymbolType: XTY_SD
+
+## Ensure that SymbolAlignment is in range.
+# RUN: not yaml2obj %s --docnum=2 -o %t 2>&1 | FileCheck %s --check-prefix=ERROR1
+# ERROR1: symbol alignment must be less than 32
+
+--- !XCOFF
+FileHeader:
+  MagicNumber:     0x1F7
+Symbols:
+  - StorageClass:    C_EXT
+    Name:               .fcn1
+    AuxEntries:
+      - Type:               AUX_CSECT
+        SymbolType: XTY_SD
+        SymbolAlignment: 32
+        SectionOrLengthLo:    4
+
+## Ensure that neither SymbolAlignment nor SymbolType can be specified if
+## SymbolAlignmentAndType is specified.
+# RUN: not yaml2obj %s --docnum=3 -o %t 2>&1 | FileCheck %s --check-prefix=ERROR2
+# ERROR2: cannot specify SymbolType or SymbolAlignment if SymbolAlignmentAndType is specified
+
+--- !XCOFF
+FileHeader:
+  MagicNumber: 0x1DF
+Symbols:
+  - StorageClass: C_EXT
+    Name: .fcn1
+    AuxEntries:
+      - Type: AUX_CSECT
+        SymbolAlignmentAndType: 17
+        SymbolAlignment: 4
+        SectionOrLength: 4
+
+# RUN: not yaml2obj %s --docnum=4 -o %t 2>&1 | FileCheck %s --check-prefix=ERROR2
+
+--- !XCOFF
+FileHeader:
+  MagicNumber: 0x1DF
+Symbols:
+  - StorageClass: C_EXT
+    Name: .fcn1
+    AuxEntries:
+      - Type: AUX_CSECT
+        SymbolAlignmentAndType: 17
+        SymbolAlignment: 4
+        SymbolType: XTY_CM
+        SectionOrLength: 4
+
+# RUN: not yaml2obj %s --docnum=5 -o %t 2>&1 | FileCheck %s --check-prefix=ERROR2
+
+--- !XCOFF
+FileHeader:
+  MagicNumber: 0x1F7
+Symbols:
+  - StorageClass: C_EXT
+  - StorageClass: C_EXT
+    Name: .fcn2
+    AuxEntries:
+      - Type: AUX_CSECT
+        SymbolAlignmentAndType: 18
+        SymbolType: XTY_SD
+        SectionOrLengthLo: 4
diff --git a/llvm/test/tools/yaml2obj/XCOFF/aux-symbols.yaml b/llvm/test/tools/yaml2obj/XCOFF/aux-symbols.yaml
index fe75c1941bc16f..04c774dcc3ae26 100644
--- a/llvm/test/tools/yaml2obj/XCOFF/aux-symbols.yaml
+++ b/llvm/test/tools/yaml2obj/XCOFF/aux-symbols.yaml
@@ -579,3 +579,28 @@ Symbols:
     AuxEntries:
       - Type: AUX_FILE
         FileNameOrString: foo
+
+## Case10: Specify a SymbolType outside the range of field definition.
+# RUN: not yaml2obj %s -DSYMTYPE=8 --docnum=8 -o %t10 2>&1 | \ 
+# RUN:   FileCheck %s --check-prefix BADSYMTYPE
+
+# BADSYMTYPE: error: symbol type must be less than 8
+
+## Case11: Specify a SymbolType outside the range of its enumeration.
+# RUN: yaml2obj %s -DSYMTYPE=7 --docnum=8 -o %t11
+# RUN: llvm-readobj --syms %t11 | FileCheck %s --check-prefix=STYPE
+
+--- !XCOFF
+FileHeader:
+  MagicNumber: 0x1DF
+Symbols:
+  - Name:               aux_fcn_csect
+    StorageClass:       C_EXT
+    Type:               0x20
+    AuxEntries:
+      - Type:                   AUX_CSECT
+        SymbolAlignment: 4
+        SymbolType: [[SYMTYPE=<none>]]
+
+# STYPE:      SymbolAlignmentLog2: 4
+# STYPE-NEXT:   SymbolType: 0x7
diff --git a/llvm/tools/obj2yaml/xcoff2yaml.cpp b/llvm/tools/obj2yaml/xcoff2yaml.cpp
index 0acbf486622369..e426b645cbeff6 100644
--- a/llvm/tools/obj2yaml/xcoff2yaml.cpp
+++ b/llvm/tools/obj2yaml/xcoff2yaml.cpp
@@ -209,7 +209,9 @@ void XCOFFDumper::dumpCsectAuxSym(XCOFFYAML::Symbol &Sym,
   XCOFFYAML::CsectAuxEnt CsectAuxSym;
   CsectAuxSym.ParameterHashIndex = AuxEntPtr.getParameterHashIndex();
   CsectAuxSym.TypeChkSectNum = AuxEntPtr.getTypeChkSectNum();
-  CsectAuxSym.SymbolAlignmentAndType = AuxEntPtr.getSymbolAlignmentAndType();
+  CsectAuxSym.SymbolAlignment = AuxEntPtr.getAlignmentLog2();
+  CsectAuxSym.SymbolType =
+      static_cast<XCOFF::SymbolType>(AuxEntPtr.getSymbolType());
   CsectAuxSym.StorageMappingClass = AuxEntPtr.getStorageMappingClass();
 
   if (Obj.is64Bit()) {

From 92b9879e9baf71a0efc77f4e1d3e8f09d9d92e55 Mon Sep 17 00:00:00 2001
From: Valentin Clement (バレンタイン クレメン) <clementval at gmail.com>
Date: Thu, 8 Feb 2024 08:49:11 -0800
Subject: [PATCH 52/72] [flang][openacc] Use original input for base address
 with optional (#80931)

In #80317 the data op generation was updated to correctly use the #0
result from the hlfir.declare op. For optional arguments that are not
descriptors, it is preferable to use the original input for the varPtr
value of the OpenACC data op.
This patch also makes sure that the descriptor value of an optional is
only accessed when present.
---
 flang/lib/Lower/DirectivesCommon.h      | 93 +++++++++++++++++++------
 flang/lib/Lower/OpenACC.cpp             | 20 ++++--
 flang/test/Lower/OpenACC/acc-bounds.f90 | 38 +++++++++-
 3 files changed, 124 insertions(+), 27 deletions(-)

diff --git a/flang/lib/Lower/DirectivesCommon.h b/flang/lib/Lower/DirectivesCommon.h
index bd880376517dd8..8d560db34e05bf 100644
--- a/flang/lib/Lower/DirectivesCommon.h
+++ b/flang/lib/Lower/DirectivesCommon.h
@@ -52,10 +52,13 @@ namespace lower {
 /// operations.
 struct AddrAndBoundsInfo {
   explicit AddrAndBoundsInfo() {}
-  explicit AddrAndBoundsInfo(mlir::Value addr) : addr(addr) {}
-  explicit AddrAndBoundsInfo(mlir::Value addr, mlir::Value isPresent)
-      : addr(addr), isPresent(isPresent) {}
+  explicit AddrAndBoundsInfo(mlir::Value addr, mlir::Value rawInput)
+      : addr(addr), rawInput(rawInput) {}
+  explicit AddrAndBoundsInfo(mlir::Value addr, mlir::Value rawInput,
+                             mlir::Value isPresent)
+      : addr(addr), rawInput(rawInput), isPresent(isPresent) {}
   mlir::Value addr = nullptr;
+  mlir::Value rawInput = nullptr;
   mlir::Value isPresent = nullptr;
 };
 
@@ -615,20 +618,30 @@ getDataOperandBaseAddr(Fortran::lower::AbstractConverter &converter,
                        fir::FirOpBuilder &builder,
                        Fortran::lower::SymbolRef sym, mlir::Location loc) {
   mlir::Value symAddr = converter.getSymbolAddress(sym);
+  mlir::Value rawInput = symAddr;
   if (auto declareOp =
-          mlir::dyn_cast_or_null<hlfir::DeclareOp>(symAddr.getDefiningOp()))
+          mlir::dyn_cast_or_null<hlfir::DeclareOp>(symAddr.getDefiningOp())) {
     symAddr = declareOp.getResults()[0];
+    rawInput = declareOp.getResults()[1];
+  }
 
   // TODO: Might need revisiting to handle for non-shared clauses
   if (!symAddr) {
     if (const auto *details =
-            sym->detailsIf<Fortran::semantics::HostAssocDetails>())
+            sym->detailsIf<Fortran::semantics::HostAssocDetails>()) {
       symAddr = converter.getSymbolAddress(details->symbol());
+      rawInput = symAddr;
+    }
   }
 
   if (!symAddr)
     llvm::report_fatal_error("could not retrieve symbol address");
 
+  mlir::Value isPresent;
+  if (Fortran::semantics::IsOptional(sym))
+    isPresent =
+        builder.create<fir::IsPresentOp>(loc, builder.getI1Type(), rawInput);
+
   if (auto boxTy =
           fir::unwrapRefType(symAddr.getType()).dyn_cast<fir::BaseBoxType>()) {
     if (boxTy.getEleTy().isa<fir::RecordType>())
@@ -638,8 +651,6 @@ getDataOperandBaseAddr(Fortran::lower::AbstractConverter &converter,
     // `fir.ref<fir.class<T>>` type.
     if (symAddr.getType().isa<fir::ReferenceType>()) {
       if (Fortran::semantics::IsOptional(sym)) {
-        mlir::Value isPresent =
-            builder.create<fir::IsPresentOp>(loc, builder.getI1Type(), symAddr);
         mlir::Value addr =
             builder.genIfOp(loc, {boxTy}, isPresent, /*withElseRegion=*/true)
                 .genThen([&]() {
@@ -652,14 +663,13 @@ getDataOperandBaseAddr(Fortran::lower::AbstractConverter &converter,
                   builder.create<fir::ResultOp>(loc, mlir::ValueRange{absent});
                 })
                 .getResults()[0];
-        return AddrAndBoundsInfo(addr, isPresent);
+        return AddrAndBoundsInfo(addr, rawInput, isPresent);
       }
       mlir::Value addr = builder.create<fir::LoadOp>(loc, symAddr);
-      return AddrAndBoundsInfo(addr);
-      ;
+      return AddrAndBoundsInfo(addr, rawInput, isPresent);
     }
   }
-  return AddrAndBoundsInfo(symAddr);
+  return AddrAndBoundsInfo(symAddr, rawInput, isPresent);
 }
 
 template <typename BoundsOp, typename BoundsType>
@@ -807,7 +817,7 @@ genBoundsOps(fir::FirOpBuilder &builder, mlir::Location loc,
              Fortran::lower::StatementContext &stmtCtx,
              const std::list<Fortran::parser::SectionSubscript> &subscripts,
              std::stringstream &asFortran, fir::ExtendedValue &dataExv,
-             bool dataExvIsAssumedSize, mlir::Value baseAddr,
+             bool dataExvIsAssumedSize, AddrAndBoundsInfo &info,
              bool treatIndexAsSection = false) {
   int dimension = 0;
   mlir::Type idxTy = builder.getIndexType();
@@ -831,11 +841,30 @@ genBoundsOps(fir::FirOpBuilder &builder, mlir::Location loc,
       mlir::Value stride = one;
       bool strideInBytes = false;
 
-      if (fir::unwrapRefType(baseAddr.getType()).isa<fir::BaseBoxType>()) {
-        mlir::Value d = builder.createIntegerConstant(loc, idxTy, dimension);
-        auto dimInfo = builder.create<fir::BoxDimsOp>(loc, idxTy, idxTy, idxTy,
-                                                      baseAddr, d);
-        stride = dimInfo.getByteStride();
+      if (fir::unwrapRefType(info.addr.getType()).isa<fir::BaseBoxType>()) {
+        if (info.isPresent) {
+          stride =
+              builder
+                  .genIfOp(loc, idxTy, info.isPresent, /*withElseRegion=*/true)
+                  .genThen([&]() {
+                    mlir::Value d =
+                        builder.createIntegerConstant(loc, idxTy, dimension);
+                    auto dimInfo = builder.create<fir::BoxDimsOp>(
+                        loc, idxTy, idxTy, idxTy, info.addr, d);
+                    builder.create<fir::ResultOp>(loc, dimInfo.getByteStride());
+                  })
+                  .genElse([&] {
+                    mlir::Value zero =
+                        builder.createIntegerConstant(loc, idxTy, 0);
+                    builder.create<fir::ResultOp>(loc, zero);
+                  })
+                  .getResults()[0];
+        } else {
+          mlir::Value d = builder.createIntegerConstant(loc, idxTy, dimension);
+          auto dimInfo = builder.create<fir::BoxDimsOp>(loc, idxTy, idxTy,
+                                                        idxTy, info.addr, d);
+          stride = dimInfo.getByteStride();
+        }
         strideInBytes = true;
       }
 
@@ -919,7 +948,26 @@ genBoundsOps(fir::FirOpBuilder &builder, mlir::Location loc,
           }
         }
 
-        extent = fir::factory::readExtent(builder, loc, dataExv, dimension);
+        if (info.isPresent &&
+            fir::unwrapRefType(info.addr.getType()).isa<fir::BaseBoxType>()) {
+          extent =
+              builder
+                  .genIfOp(loc, idxTy, info.isPresent, /*withElseRegion=*/true)
+                  .genThen([&]() {
+                    mlir::Value ext = fir::factory::readExtent(
+                        builder, loc, dataExv, dimension);
+                    builder.create<fir::ResultOp>(loc, ext);
+                  })
+                  .genElse([&] {
+                    mlir::Value zero =
+                        builder.createIntegerConstant(loc, idxTy, 0);
+                    builder.create<fir::ResultOp>(loc, zero);
+                  })
+                  .getResults()[0];
+        } else {
+          extent = fir::factory::readExtent(builder, loc, dataExv, dimension);
+        }
+
         if (dataExvIsAssumedSize && dimension + 1 == dataExvRank) {
           extent = zero;
           if (ubound && lbound) {
@@ -976,6 +1024,7 @@ AddrAndBoundsInfo gatherDataOperandAddrAndBounds(
                   dataExv = converter.genExprAddr(operandLocation, *exprBase,
                                                   stmtCtx);
                   info.addr = fir::getBase(dataExv);
+                  info.rawInput = info.addr;
                   asFortran << (*exprBase).AsFortran();
                 } else {
                   const Fortran::parser::Name &name =
@@ -993,7 +1042,7 @@ AddrAndBoundsInfo gatherDataOperandAddrAndBounds(
                   bounds = genBoundsOps<BoundsOp, BoundsType>(
                       builder, operandLocation, converter, stmtCtx,
                       arrayElement->subscripts, asFortran, dataExv,
-                      dataExvIsAssumedSize, info.addr, treatIndexAsSection);
+                      dataExvIsAssumedSize, info, treatIndexAsSection);
                 }
                 asFortran << ')';
               } else if (auto structComp = Fortran::parser::Unwrap<
@@ -1001,6 +1050,7 @@ AddrAndBoundsInfo gatherDataOperandAddrAndBounds(
                 fir::ExtendedValue compExv =
                     converter.genExprAddr(operandLocation, *expr, stmtCtx);
                 info.addr = fir::getBase(compExv);
+                info.rawInput = info.addr;
                 if (fir::unwrapRefType(info.addr.getType())
                         .isa<fir::SequenceType>())
                   bounds = genBaseBoundsOps<BoundsOp, BoundsType>(
@@ -1012,7 +1062,7 @@ AddrAndBoundsInfo gatherDataOperandAddrAndBounds(
                     *Fortran::parser::GetLastName(*structComp).symbol);
                 if (isOptional)
                   info.isPresent = builder.create<fir::IsPresentOp>(
-                      operandLocation, builder.getI1Type(), info.addr);
+                      operandLocation, builder.getI1Type(), info.rawInput);
 
                 if (auto loadOp = mlir::dyn_cast_or_null<fir::LoadOp>(
                         info.addr.getDefiningOp())) {
@@ -1020,6 +1070,7 @@ AddrAndBoundsInfo gatherDataOperandAddrAndBounds(
                       fir::isPointerType(loadOp.getType()))
                     info.addr = builder.create<fir::BoxAddrOp>(operandLocation,
                                                                info.addr);
+                  info.rawInput = info.addr;
                 }
 
                 // If the component is an allocatable or pointer the result of
@@ -1029,6 +1080,7 @@ AddrAndBoundsInfo gatherDataOperandAddrAndBounds(
                 if (auto boxAddrOp = mlir::dyn_cast_or_null<fir::BoxAddrOp>(
                         info.addr.getDefiningOp())) {
                   info.addr = boxAddrOp.getVal();
+                  info.rawInput = info.addr;
                   bounds = genBoundsOpsFromBox<BoundsOp, BoundsType>(
                       builder, operandLocation, converter, compExv, info);
                 }
@@ -1043,6 +1095,7 @@ AddrAndBoundsInfo gatherDataOperandAddrAndBounds(
                   fir::ExtendedValue compExv =
                       converter.genExprAddr(operandLocation, *expr, stmtCtx);
                   info.addr = fir::getBase(compExv);
+                  info.rawInput = info.addr;
                   asFortran << (*expr).AsFortran();
                 } else if (const auto *dataRef{
                                std::get_if<Fortran::parser::DataRef>(
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index 43f54c6d2a71bb..6ae270f63f5cf4 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -67,9 +67,12 @@ static Op createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc,
   mlir::Value varPtrPtr;
   if (auto boxTy = baseAddr.getType().dyn_cast<fir::BaseBoxType>()) {
     if (isPresent) {
+      mlir::Type ifRetTy = boxTy.getEleTy();
+      if (!fir::isa_ref_type(ifRetTy))
+        ifRetTy = fir::ReferenceType::get(ifRetTy);
       baseAddr =
           builder
-              .genIfOp(loc, {boxTy.getEleTy()}, isPresent,
+              .genIfOp(loc, {ifRetTy}, isPresent,
                        /*withElseRegion=*/true)
               .genThen([&]() {
                 mlir::Value boxAddr =
@@ -78,7 +81,7 @@ static Op createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc,
               })
               .genElse([&] {
                 mlir::Value absent =
-                    builder.create<fir::AbsentOp>(loc, boxTy.getEleTy());
+                    builder.create<fir::AbsentOp>(loc, ifRetTy);
                 builder.create<fir::ResultOp>(loc, mlir::ValueRange{absent});
               })
               .getResults()[0];
@@ -295,9 +298,16 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList,
                                        asFortran, bounds,
                                        /*treatIndexAsSection=*/true);
 
-    Op op = createDataEntryOp<Op>(
-        builder, operandLocation, info.addr, asFortran, bounds, structured,
-        implicit, dataClause, info.addr.getType(), info.isPresent);
+    // If the input value is optional and is not a descriptor, we use the
+    // rawInput directly.
+    mlir::Value baseAddr =
+        ((info.addr.getType() != fir::unwrapRefType(info.rawInput.getType())) &&
+         info.isPresent)
+            ? info.rawInput
+            : info.addr;
+    Op op = createDataEntryOp<Op>(builder, operandLocation, baseAddr, asFortran,
+                                  bounds, structured, implicit, dataClause,
+                                  baseAddr.getType(), info.isPresent);
     dataOperands.push_back(op.getAccPtr());
   }
 }
diff --git a/flang/test/Lower/OpenACC/acc-bounds.f90 b/flang/test/Lower/OpenACC/acc-bounds.f90
index bd96bc8bcba359..df97cbcd187d2b 100644
--- a/flang/test/Lower/OpenACC/acc-bounds.f90
+++ b/flang/test/Lower/OpenACC/acc-bounds.f90
@@ -126,8 +126,8 @@ subroutine acc_optional_data(a)
   
 ! CHECK-LABEL: func.func @_QMopenacc_boundsPacc_optional_data(
 ! CHECK-SAME: %[[ARG0:.*]]: !fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>> {fir.bindc_name = "a", fir.optional}) {
-! CHECK: %[[ARG0_DECL:.*]]:2 = hlfir.declare %arg0 {fortran_attrs = #fir.var_attrs<optional, pointer>, uniq_name = "_QMopenacc_boundsFacc_optional_dataEa"} : (!fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>) -> (!fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>, !fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>)
-! CHECK: %[[IS_PRESENT:.*]] = fir.is_present %[[ARG0_DECL]]#0 : (!fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>) -> i1
+! CHECK: %[[ARG0_DECL:.*]]:2 = hlfir.declare %[[ARG0]] {fortran_attrs = #fir.var_attrs<optional, pointer>, uniq_name = "_QMopenacc_boundsFacc_optional_dataEa"} : (!fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>) -> (!fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>, !fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>)
+! CHECK: %[[IS_PRESENT:.*]] = fir.is_present %[[ARG0_DECL]]#1 : (!fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>) -> i1
 ! CHECK: %[[BOX:.*]] = fir.if %[[IS_PRESENT]] -> (!fir.box<!fir.ptr<!fir.array<?xf32>>>) {
 ! CHECK:   %[[LOAD:.*]] = fir.load %[[ARG0_DECL]]#0 : !fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>
 ! CHECK:   fir.result %[[LOAD]] : !fir.box<!fir.ptr<!fir.array<?xf32>>>
@@ -153,4 +153,38 @@ subroutine acc_optional_data(a)
 ! CHECK: %[[ATTACH:.*]] = acc.attach varPtr(%[[BOX_ADDR]] : !fir.ptr<!fir.array<?xf32>>) bounds(%[[BOUND]]) -> !fir.ptr<!fir.array<?xf32>> {name = "a"}
 ! CHECK: acc.data dataOperands(%[[ATTACH]] : !fir.ptr<!fir.array<?xf32>>)
 
+  subroutine acc_optional_data2(a, n)
+    integer :: n
+    real, optional :: a(n)
+    !$acc data no_create(a)
+    !$acc end data
+  end subroutine
+
+! CHECK-LABEL: func.func @_QMopenacc_boundsPacc_optional_data2(
+! CHECK-SAME: %[[A:.*]]: !fir.ref<!fir.array<?xf32>> {fir.bindc_name = "a", fir.optional}, %[[N:.*]]: !fir.ref<i32> {fir.bindc_name = "n"}) {
+! CHECK: %[[DECL_A:.*]]:2 = hlfir.declare %[[A]](%{{.*}}) {fortran_attrs = #fir.var_attrs<optional>, uniq_name = "_QMopenacc_boundsFacc_optional_data2Ea"} : (!fir.ref<!fir.array<?xf32>>, !fir.shape<1>) -> (!fir.box<!fir.array<?xf32>>, !fir.ref<!fir.array<?xf32>>)
+! CHECK: %[[NO_CREATE:.*]] = acc.nocreate varPtr(%[[DECL_A]]#1 : !fir.ref<!fir.array<?xf32>>) bounds(%10) -> !fir.ref<!fir.array<?xf32>> {name = "a"}
+! CHECK: acc.data dataOperands(%[[NO_CREATE]] : !fir.ref<!fir.array<?xf32>>) {
+
+  subroutine acc_optional_data3(a, n)
+    integer :: n
+    real, optional :: a(n)
+    !$acc data no_create(a(1:n))
+    !$acc end data
+  end subroutine
+
+! CHECK-LABEL: func.func @_QMopenacc_boundsPacc_optional_data3(
+! CHECK-SAME: %[[A:.*]]: !fir.ref<!fir.array<?xf32>> {fir.bindc_name = "a", fir.optional}, %[[N:.*]]: !fir.ref<i32> {fir.bindc_name = "n"}) {
+! CHECK: %[[DECL_A:.*]]:2 = hlfir.declare %[[A]](%{{.*}}) {fortran_attrs = #fir.var_attrs<optional>, uniq_name = "_QMopenacc_boundsFacc_optional_data3Ea"} : (!fir.ref<!fir.array<?xf32>>, !fir.shape<1>) -> (!fir.box<!fir.array<?xf32>>, !fir.ref<!fir.array<?xf32>>)
+! CHECK: %[[PRES:.*]] = fir.is_present %[[DECL_A]]#1 : (!fir.ref<!fir.array<?xf32>>) -> i1
+! CHECK: %[[STRIDE:.*]] = fir.if %[[PRES]] -> (index) {
+! CHECK:   %[[DIMS:.*]]:3 = fir.box_dims %[[DECL_A]]#0, %c0{{.*}} : (!fir.box<!fir.array<?xf32>>, index) -> (index, index, index)
+! CHECK:   fir.result %[[DIMS]]#2 : index
+! CHECK: } else {
+! CHECK:   fir.result %c0{{.*}} : index
+! CHECK: }
+! CHECK: %[[BOUNDS:.*]] = acc.bounds lowerbound(%c0{{.*}} : index) upperbound(%{{.*}} : index) extent(%{{.*}} : index) stride(%[[STRIDE]] : index) startIdx(%c1 : index) {strideInBytes = true}
+! CHECK: %[[NOCREATE:.*]] = acc.nocreate varPtr(%[[DECL_A]]#1 : !fir.ref<!fir.array<?xf32>>) bounds(%14) -> !fir.ref<!fir.array<?xf32>> {name = "a(1:n)"}
+! CHECK: acc.data dataOperands(%[[NOCREATE]] : !fir.ref<!fir.array<?xf32>>) {
+
 end module

>From 3d71314243ed433811f525073bdeb7174d4561ef Mon Sep 17 00:00:00 2001
From: Adrian Prantl <aprantl at apple.com>
Date: Thu, 8 Feb 2024 08:54:52 -0800
Subject: [PATCH 53/72] Add missing textual header to module map

---
 clang/include/module.modulemap | 1 +
 1 file changed, 1 insertion(+)

diff --git a/clang/include/module.modulemap b/clang/include/module.modulemap
index 794526bc289c0b..9285595af11baf 100644
--- a/clang/include/module.modulemap
+++ b/clang/include/module.modulemap
@@ -81,6 +81,7 @@ module Clang_Basic {
   textual header "clang/Basic/RISCVVTypes.def"
   textual header "clang/Basic/Sanitizers.def"
   textual header "clang/Basic/TargetCXXABI.def"
+  textual header "clang/Basic/TargetOSMacros.def"
   textual header "clang/Basic/TransformTypeTraits.def"
   textual header "clang/Basic/TokenKinds.def"
   textual header "clang/Basic/WebAssemblyReferenceTypes.def"

>From b7142dea623811d064b98e20685830a327eee57f Mon Sep 17 00:00:00 2001
From: Adrian Prantl <aprantl at apple.com>
Date: Thu, 8 Feb 2024 09:03:47 -0800
Subject: [PATCH 54/72] Fix a truly strange triple in testcase

---
 lldb/test/API/macosx/universal/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lldb/test/API/macosx/universal/Makefile b/lldb/test/API/macosx/universal/Makefile
index 8712fdecf56666..7d4762f240874c 100644
--- a/lldb/test/API/macosx/universal/Makefile
+++ b/lldb/test/API/macosx/universal/Makefile
@@ -14,7 +14,7 @@ testit.x86_64: testit.x86_64.o
 	$(CC) -isysroot $(SDKROOT) -target x86_64-apple-macosx10.9 -o testit.x86_64 $<
 
 testit.x86_64h.o: main.c
-	$(CC) -isysroot $(SDKROOT) -g -O0 -target x86_64h-apple-macosx10.9-apple-macosx10.9-apple-macosx10.9-apple-macosx10.9 -c -o testit.x86_64h.o $<
+	$(CC) -isysroot $(SDKROOT) -g -O0 -target x86_64h-apple-macosx10.9 -c -o testit.x86_64h.o $<
 
 testit.x86_64.o: main.c
 	$(CC) -isysroot $(SDKROOT) -g -O0 -target x86_64-apple-macosx10.9 -c -o testit.x86_64.o $<

>From f84ec903a53944a0d1a42148ff859e51ccdc5157 Mon Sep 17 00:00:00 2001
From: Jeremy Morse <jeremy.morse at sony.com>
Date: Thu, 8 Feb 2024 16:40:48 +0000
Subject: [PATCH 55/72] [DebugInfo][RemoveDIs] Turn on non-intrinsic
 debug-info by default

This patch causes all variable-location debug-info to be converted into
non-intrinsic records as it passes through the optimisation /
instrumentation passes. There's a brief introduction here [0] and a more
detailed thread on what this means on Discourse at [1].

If this commit is breaking your downstream tests, please see comment 12 in
[1], which documents the kind of variation in tests we'd expect to see from
this change and what to do about it.

[0] https://llvm.org/docs/RemoveDIsDebugInfo.html
[1] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
---
 llvm/lib/IR/BasicBlock.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/lib/IR/BasicBlock.cpp b/llvm/lib/IR/BasicBlock.cpp
index fe9d0d08c5fe97..bf02eba9fb448d 100644
--- a/llvm/lib/IR/BasicBlock.cpp
+++ b/llvm/lib/IR/BasicBlock.cpp
@@ -34,7 +34,7 @@ cl::opt<bool>
     UseNewDbgInfoFormat("experimental-debuginfo-iterators",
                         cl::desc("Enable communicating debuginfo positions "
                                  "through iterators, eliminating intrinsics"),
-                        cl::init(false));
+                        cl::init(true));
 
 DPMarker *BasicBlock::createMarker(Instruction *I) {
   assert(IsNewDbgInfoFormat &&

>From f16c144f8e81f121a23a5e28ce59287070300f74 Mon Sep 17 00:00:00 2001
From: Jason Molenda <jmolenda at apple.com>
Date: Thu, 8 Feb 2024 09:16:12 -0800
Subject: [PATCH 56/72] [lldb] Fix printf formatting of std::time_t seconds
 (#81078)

The formatter added in
https://github.com/llvm/llvm-project/pull/78609
originally passed the signed seconds (which can be negative, referring to
times before the epoch) to an unsigned printf format specifier, yet its
tests expected negative values in the output. Those tests always failed on
macOS, and I'm not clear how they ever passed on any platform.

Fix the printf to print seconds as a signed value, and re-enable the
tests.
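
As a rough illustration of the mismatch described above (a minimal
standalone sketch, not the lldb code: `format_timestamp` and
`check_format_timestamp` are hypothetical names, and a signed `time_t` is
assumed), the same negative value prints very differently under the two
specifiers:

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <ctime>
#include <string>

// Hypothetical helper for illustration only; it is not part of lldb.
// Formats `seconds` with either the old unsigned specifier or the new
// signed one.
std::string format_timestamp(std::time_t seconds, bool as_signed) {
  char buf[64];
  if (as_signed)
    std::snprintf(buf, sizeof(buf), "timestamp=%" PRId64 " s",
                  static_cast<std::int64_t>(seconds));
  else
    std::snprintf(buf, sizeof(buf), "timestamp=%" PRIu64 " s",
                  static_cast<std::uint64_t>(seconds));
  return std::string(buf);
}

// Assumes time_t is a signed type, as on the common desktop platforms.
void check_format_timestamp() {
  // Signed specifier keeps the sign.
  assert(format_timestamp(-1, true) == "timestamp=-1 s");
  // Unsigned specifier wraps -1 around to 2^64 - 1.
  assert(format_timestamp(-1, false) == "timestamp=18446744073709551615 s");
}
```

The cast and the specifier have to agree; changing only one of them would
leave the output wrong or the call ill-formed.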
---
 .../Plugins/Language/CPlusPlus/LibCxx.cpp     |  6 ++--
 .../chrono/TestDataFormatterLibcxxChrono.py   | 30 +++++++++----------
 2 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp
index a7d7066bb2c11d..7893aa7cc1f9df 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp
@@ -1108,7 +1108,7 @@ bool lldb_private::formatters::LibcxxChronoSysSecondsSummaryProvider(
 
   const std::time_t seconds = ptr_sp->GetValueAsSigned(0);
   if (seconds < chrono_timestamp_min || seconds > chrono_timestamp_max)
-    stream.Printf("timestamp=%" PRIu64 " s", static_cast<uint64_t>(seconds));
+    stream.Printf("timestamp=%" PRId64 " s", static_cast<int64_t>(seconds));
   else {
     std::array<char, 128> str;
     std::size_t size =
@@ -1116,8 +1116,8 @@ bool lldb_private::formatters::LibcxxChronoSysSecondsSummaryProvider(
     if (size == 0)
       return false;
 
-    stream.Printf("date/time=%s timestamp=%" PRIu64 " s", str.data(),
-                  static_cast<uint64_t>(seconds));
+    stream.Printf("date/time=%s timestamp=%" PRId64 " s", str.data(),
+                  static_cast<int64_t>(seconds));
   }
 
   return true;
diff --git a/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/chrono/TestDataFormatterLibcxxChrono.py b/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/chrono/TestDataFormatterLibcxxChrono.py
index 9706f9e94e922f..a90fb828d121a7 100644
--- a/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/chrono/TestDataFormatterLibcxxChrono.py
+++ b/lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/chrono/TestDataFormatterLibcxxChrono.py
@@ -54,17 +54,16 @@ def test_with_run_command(self):
             substrs=["ss_0 = date/time=1970-01-01T00:00:00Z timestamp=0 s"],
         )
 
-        # FIXME disabled temporarily, macOS is printing this as an unsigned?
-        #self.expect(
-        #    "frame variable ss_neg_date_time",
-        #    substrs=[
-        #        "ss_neg_date_time = date/time=-32767-01-01T00:00:00Z timestamp=-1096193779200 s"
-        #    ],
-        #)
-        #self.expect(
-        #    "frame variable ss_neg_seconds",
-        #    substrs=["ss_neg_seconds = timestamp=-1096193779201 s"],
-        #)
+        self.expect(
+            "frame variable ss_neg_date_time",
+            substrs=[
+                "ss_neg_date_time = date/time=-32767-01-01T00:00:00Z timestamp=-1096193779200 s"
+            ],
+        )
+        self.expect(
+            "frame variable ss_neg_seconds",
+            substrs=["ss_neg_seconds = timestamp=-1096193779201 s"],
+        )
 
         self.expect(
             "frame variable ss_pos_date_time",
@@ -77,11 +76,10 @@ def test_with_run_command(self):
             substrs=["ss_pos_seconds = timestamp=971890963200 s"],
         )
 
-        # FIXME disabled temporarily, macOS is printing this as an unsigned?
-        #self.expect(
-        #    "frame variable ss_min",
-        #    substrs=["ss_min = timestamp=-9223372036854775808 s"],
-        #)
+        self.expect(
+            "frame variable ss_min",
+            substrs=["ss_min = timestamp=-9223372036854775808 s"],
+        )
         self.expect(
             "frame variable ss_max",
             substrs=["ss_max = timestamp=9223372036854775807 s"],

>From 3c2b0cd338fe6158c4a89b9150ee401a31af79c2 Mon Sep 17 00:00:00 2001
From: Dave Lee <davelee.com at gmail.com>
Date: Thu, 8 Feb 2024 09:32:12 -0800
Subject: [PATCH 57/72] [lldb] Refactor GetFormatFromCString to always check
 for partial matches (NFC) (#81018)

Refactors logic in `ParseInternal` that was previously calling
`GetFormatFromCString` twice, once with `partial_match_ok` set to false,
and the second time set to true.

With this change, lldb formats (e.g. `%@`, `%S`) are checked first.
If a format is not one of those, then `GetFormatFromCString` is called
once, and now always checks for partial matches.
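
The lookup order after this change (exact name first, then prefix) can be
sketched roughly as follows. This is a simplified standalone model, not
the lldb implementation: the `Format` enum, the `kFormats` table and the
helper names are all hypothetical, and it uses only the standard library
instead of `llvm::StringRef`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cctype>
#include <string_view>

// Hypothetical subset of formats, for illustration only.
enum class Format { Invalid, Binary, Bytes, Decimal, Hex };

struct FormatInfo {
  Format format;
  std::string_view name;
};

constexpr std::array<FormatInfo, 4> kFormats{{{Format::Binary, "binary"},
                                              {Format::Bytes, "bytes"},
                                              {Format::Decimal, "decimal"},
                                              {Format::Hex, "hex"}}};

// Case-insensitive equality of two strings.
static bool iequals(std::string_view a, std::string_view b) {
  return a.size() == b.size() &&
         std::equal(a.begin(), a.end(), b.begin(),
                    [](unsigned char x, unsigned char y) {
                      return std::tolower(x) == std::tolower(y);
                    });
}

// Case-insensitive test that `s` starts with `prefix`.
static bool istarts_with(std::string_view s, std::string_view prefix) {
  return s.size() >= prefix.size() &&
         iequals(s.substr(0, prefix.size()), prefix);
}

// Exact matches win; otherwise the first table entry whose name starts
// with `name` is used. An empty name never matches.
Format lookup_format(std::string_view name) {
  if (name.empty())
    return Format::Invalid;
  for (const FormatInfo &info : kFormats)
    if (iequals(name, info.name))
      return info.format;
  for (const FormatInfo &info : kFormats)
    if (istarts_with(info.name, name))
      return info.format;
  return Format::Invalid;
}

void check_lookup_format() {
  assert(lookup_format("hex") == Format::Hex);   // exact match
  assert(lookup_format("DEC") == Format::Decimal); // prefix, any case
  assert(lookup_format("b") == Format::Binary);  // first prefix hit wins
  assert(lookup_format("zzz") == Format::Invalid);
}
```

Because an ambiguous prefix resolves to whichever entry comes first in the
table, the table order is part of the behaviour in this sketch.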
---
 .../lldb/DataFormatters/FormatManager.h       |  2 +-
 lldb/source/Core/FormatEntity.cpp             | 26 ++++++++-----------
 lldb/source/DataFormatters/FormatManager.cpp  | 17 +++++-------
 lldb/source/Interpreter/OptionArgParser.cpp   |  3 +--
 4 files changed, 20 insertions(+), 28 deletions(-)

diff --git a/lldb/include/lldb/DataFormatters/FormatManager.h b/lldb/include/lldb/DataFormatters/FormatManager.h
index 986614f0c5e431..db2fe99c44cafc 100644
--- a/lldb/include/lldb/DataFormatters/FormatManager.h
+++ b/lldb/include/lldb/DataFormatters/FormatManager.h
@@ -138,7 +138,7 @@ class FormatManager : public IFormatChangeListener {
   }
 
   static bool GetFormatFromCString(const char *format_cstr,
-                                   bool partial_match_ok, lldb::Format &format);
+                                   lldb::Format &format);
 
   static char GetFormatAsFormatChar(lldb::Format format);
 
diff --git a/lldb/source/Core/FormatEntity.cpp b/lldb/source/Core/FormatEntity.cpp
index 3c665c2eb2133b..fa5eadc6ff4e9a 100644
--- a/lldb/source/Core/FormatEntity.cpp
+++ b/lldb/source/Core/FormatEntity.cpp
@@ -2151,11 +2151,7 @@ static Status ParseInternal(llvm::StringRef &format, Entry &parent_entry,
             if (entry.printf_format.find('%') == std::string::npos) {
               bool clear_printf = false;
 
-              if (FormatManager::GetFormatFromCString(
-                      entry.printf_format.c_str(), false, entry.fmt)) {
-                // We have an LLDB format, so clear the printf format
-                clear_printf = true;
-              } else if (entry.printf_format.size() == 1) {
+              if (entry.printf_format.size() == 1) {
                 switch (entry.printf_format[0]) {
                 case '@': // if this is an @ sign, print ObjC description
                   entry.number = ValueObject::
@@ -2198,20 +2194,20 @@ static Status ParseInternal(llvm::StringRef &format, Entry &parent_entry,
                       eValueObjectRepresentationStyleExpressionPath;
                   clear_printf = true;
                   break;
-                default:
+                }
+              }
+
+              if (entry.number == 0) {
+                if (FormatManager::GetFormatFromCString(
+                        entry.printf_format.c_str(), entry.fmt)) {
+                  clear_printf = true;
+                } else if (entry.printf_format == "tid") {
+                  verify_is_thread_id = true;
+                } else {
                   error.SetErrorStringWithFormat("invalid format: '%s'",
                                                  entry.printf_format.c_str());
                   return error;
                 }
-              } else if (FormatManager::GetFormatFromCString(
-                             entry.printf_format.c_str(), true, entry.fmt)) {
-                clear_printf = true;
-              } else if (entry.printf_format == "tid") {
-                verify_is_thread_id = true;
-              } else {
-                error.SetErrorStringWithFormat("invalid format: '%s'",
-                                               entry.printf_format.c_str());
-                return error;
               }
 
               // Our format string turned out to not be a printf style format
diff --git a/lldb/source/DataFormatters/FormatManager.cpp b/lldb/source/DataFormatters/FormatManager.cpp
index f1f135de32ca87..092fa3c8ce496d 100644
--- a/lldb/source/DataFormatters/FormatManager.cpp
+++ b/lldb/source/DataFormatters/FormatManager.cpp
@@ -91,7 +91,7 @@ static bool GetFormatFromFormatChar(char format_char, Format &format) {
 }
 
 static bool GetFormatFromFormatName(llvm::StringRef format_name,
-                                    bool partial_match_ok, Format &format) {
+                                    Format &format) {
   uint32_t i;
   for (i = 0; i < g_num_format_infos; ++i) {
     if (format_name.equals_insensitive(g_format_infos[i].format_name)) {
@@ -100,13 +100,11 @@ static bool GetFormatFromFormatName(llvm::StringRef format_name,
     }
   }
 
-  if (partial_match_ok) {
-    for (i = 0; i < g_num_format_infos; ++i) {
-      if (llvm::StringRef(g_format_infos[i].format_name)
-              .starts_with_insensitive(format_name)) {
-        format = g_format_infos[i].format;
-        return true;
-      }
+  for (i = 0; i < g_num_format_infos; ++i) {
+    if (llvm::StringRef(g_format_infos[i].format_name)
+            .starts_with_insensitive(format_name)) {
+      format = g_format_infos[i].format;
+      return true;
     }
   }
   format = eFormatInvalid;
@@ -124,7 +122,6 @@ void FormatManager::Changed() {
 }
 
 bool FormatManager::GetFormatFromCString(const char *format_cstr,
-                                         bool partial_match_ok,
                                          lldb::Format &format) {
   bool success = false;
   if (format_cstr && format_cstr[0]) {
@@ -134,7 +131,7 @@ bool FormatManager::GetFormatFromCString(const char *format_cstr,
         return true;
     }
 
-    success = GetFormatFromFormatName(format_cstr, partial_match_ok, format);
+    success = GetFormatFromFormatName(format_cstr, format);
   }
   if (!success)
     format = eFormatInvalid;
diff --git a/lldb/source/Interpreter/OptionArgParser.cpp b/lldb/source/Interpreter/OptionArgParser.cpp
index d13805a75ffbf7..75ccad87467e95 100644
--- a/lldb/source/Interpreter/OptionArgParser.cpp
+++ b/lldb/source/Interpreter/OptionArgParser.cpp
@@ -93,8 +93,7 @@ Status OptionArgParser::ToFormat(const char *s, lldb::Format &format,
         *byte_size_ptr = 0;
     }
 
-    const bool partial_match_ok = true;
-    if (!FormatManager::GetFormatFromCString(s, partial_match_ok, format)) {
+    if (!FormatManager::GetFormatFromCString(s, format)) {
       StreamString error_strm;
       error_strm.Printf(
           "Invalid format character or name '%s'. Valid values are:\n", s);

>From 9339b466df9bcfeba2001ce72fd593c6d97072f6 Mon Sep 17 00:00:00 2001
From: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: Thu, 8 Feb 2024 17:31:06 +0000
Subject: [PATCH 58/72] [X86] X86FixupVectorConstants - use explicit register
 bitwidth for the loaded vector instead of using constant pool bitwidth

Fixes #81136 - we might be loading from a constant pool entry wider than the destination register bitwidth, affecting the vextload scale calculation.

ConvertToBroadcastAVX512 doesn't yet set an explicit bitwidth (it will default to the constant pool bitwidth) due to difficulties in looking up the original register width through the fold tables, but as we only use rebuildSplatCst this shouldn't cause any miscompilations, although it might prevent folding to broadcast if only the lower bits match a splatable pattern.
---
 .../Target/X86/X86FixupVectorConstants.cpp    | 35 +++++++++++--------
 llvm/test/CodeGen/X86/pr81136.ll              |  3 +-
 2 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
index 32ca9c164c579b..da7dcbb25a9577 100644
--- a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
+++ b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
@@ -226,6 +226,7 @@ static Constant *rebuildConstant(LLVMContext &Ctx, Type *SclTy,
 // width, built up of potentially smaller scalar values.
 static Constant *rebuildSplatCst(const Constant *C, unsigned /*NumBits*/,
                                  unsigned /*NumElts*/, unsigned SplatBitWidth) {
+  // TODO: Truncate to NumBits once ConvertToBroadcastAVX512 support this.
   std::optional<APInt> Splat = getSplatableConstant(C, SplatBitWidth);
   if (!Splat)
     return nullptr;
@@ -328,7 +329,8 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
     std::function<Constant *(const Constant *, unsigned, unsigned, unsigned)>
         RebuildConstant;
   };
-  auto FixupConstant = [&](ArrayRef<FixupEntry> Fixups, unsigned OperandNo) {
+  auto FixupConstant = [&](ArrayRef<FixupEntry> Fixups, unsigned RegBitWidth,
+                           unsigned OperandNo) {
 #ifdef EXPENSIVE_CHECKS
     assert(llvm::is_sorted(Fixups,
                            [](const FixupEntry &A, const FixupEntry &B) {
@@ -340,7 +342,8 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
     assert(MI.getNumOperands() >= (OperandNo + X86::AddrNumOperands) &&
            "Unexpected number of operands!");
     if (auto *C = X86::getConstantFromPool(MI, OperandNo)) {
-      unsigned RegBitWidth = C->getType()->getPrimitiveSizeInBits();
+      RegBitWidth =
+          RegBitWidth ? RegBitWidth : C->getType()->getPrimitiveSizeInBits();
       for (const FixupEntry &Fixup : Fixups) {
         if (Fixup.Op) {
           // Construct a suitable constant and adjust the MI to use the new
@@ -377,7 +380,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
     // TODO: SSE3 MOVDDUP Handling
     return FixupConstant({{X86::MOVSSrm, 1, 32, rebuildZeroUpperCst},
                           {X86::MOVSDrm, 1, 64, rebuildZeroUpperCst}},
-                         1);
+                         128, 1);
   case X86::VMOVAPDrm:
   case X86::VMOVAPSrm:
   case X86::VMOVUPDrm:
@@ -386,7 +389,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
                           {X86::VBROADCASTSSrm, 1, 32, rebuildSplatCst},
                           {X86::VMOVSDrm, 1, 64, rebuildZeroUpperCst},
                           {X86::VMOVDDUPrm, 1, 64, rebuildSplatCst}},
-                         1);
+                         128, 1);
   case X86::VMOVAPDYrm:
   case X86::VMOVAPSYrm:
   case X86::VMOVUPDYrm:
@@ -394,7 +397,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
     return FixupConstant({{X86::VBROADCASTSSYrm, 1, 32, rebuildSplatCst},
                           {X86::VBROADCASTSDYrm, 1, 64, rebuildSplatCst},
                           {X86::VBROADCASTF128rm, 1, 128, rebuildSplatCst}},
-                         1);
+                         256, 1);
   case X86::VMOVAPDZ128rm:
   case X86::VMOVAPSZ128rm:
   case X86::VMOVUPDZ128rm:
@@ -403,7 +406,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
                           {X86::VBROADCASTSSZ128rm, 1, 32, rebuildSplatCst},
                           {X86::VMOVSDZrm, 1, 64, rebuildZeroUpperCst},
                           {X86::VMOVDDUPZ128rm, 1, 64, rebuildSplatCst}},
-                         1);
+                         128, 1);
   case X86::VMOVAPDZ256rm:
   case X86::VMOVAPSZ256rm:
   case X86::VMOVUPDZ256rm:
@@ -412,7 +415,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
         {{X86::VBROADCASTSSZ256rm, 1, 32, rebuildSplatCst},
          {X86::VBROADCASTSDZ256rm, 1, 64, rebuildSplatCst},
          {X86::VBROADCASTF32X4Z256rm, 1, 128, rebuildSplatCst}},
-        1);
+        256, 1);
   case X86::VMOVAPDZrm:
   case X86::VMOVAPSZrm:
   case X86::VMOVUPDZrm:
@@ -421,7 +424,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
                           {X86::VBROADCASTSDZrm, 1, 64, rebuildSplatCst},
                           {X86::VBROADCASTF32X4rm, 1, 128, rebuildSplatCst},
                           {X86::VBROADCASTF64X4rm, 1, 256, rebuildSplatCst}},
-                         1);
+                         512, 1);
     /* Integer Loads */
   case X86::MOVDQArm:
   case X86::MOVDQUrm: {
@@ -440,7 +443,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
         {HasSSE41 ? X86::PMOVZXWDrm : 0, 4, 16, rebuildZExtCst},
         {HasSSE41 ? X86::PMOVSXDQrm : 0, 2, 32, rebuildSExtCst},
         {HasSSE41 ? X86::PMOVZXDQrm : 0, 2, 32, rebuildZExtCst}};
-    return FixupConstant(Fixups, 1);
+    return FixupConstant(Fixups, 128, 1);
   }
   case X86::VMOVDQArm:
   case X86::VMOVDQUrm: {
@@ -465,7 +468,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
         {X86::VPMOVZXWDrm, 4, 16, rebuildZExtCst},
         {X86::VPMOVSXDQrm, 2, 32, rebuildSExtCst},
         {X86::VPMOVZXDQrm, 2, 32, rebuildZExtCst}};
-    return FixupConstant(Fixups, 1);
+    return FixupConstant(Fixups, 128, 1);
   }
   case X86::VMOVDQAYrm:
   case X86::VMOVDQUYrm: {
@@ -490,7 +493,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
         {HasAVX2 ? X86::VPMOVZXWDYrm : 0, 8, 16, rebuildZExtCst},
         {HasAVX2 ? X86::VPMOVSXDQYrm : 0, 4, 32, rebuildSExtCst},
         {HasAVX2 ? X86::VPMOVZXDQYrm : 0, 4, 32, rebuildZExtCst}};
-    return FixupConstant(Fixups, 1);
+    return FixupConstant(Fixups, 256, 1);
   }
   case X86::VMOVDQA32Z128rm:
   case X86::VMOVDQA64Z128rm:
@@ -515,7 +518,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
         {X86::VPMOVZXWDZ128rm, 4, 16, rebuildZExtCst},
         {X86::VPMOVSXDQZ128rm, 2, 32, rebuildSExtCst},
         {X86::VPMOVZXDQZ128rm, 2, 32, rebuildZExtCst}};
-    return FixupConstant(Fixups, 1);
+    return FixupConstant(Fixups, 128, 1);
   }
   case X86::VMOVDQA32Z256rm:
   case X86::VMOVDQA64Z256rm:
@@ -539,7 +542,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
         {X86::VPMOVZXWDZ256rm, 8, 16, rebuildZExtCst},
         {X86::VPMOVSXDQZ256rm, 4, 32, rebuildSExtCst},
         {X86::VPMOVZXDQZ256rm, 4, 32, rebuildZExtCst}};
-    return FixupConstant(Fixups, 1);
+    return FixupConstant(Fixups, 256, 1);
   }
   case X86::VMOVDQA32Zrm:
   case X86::VMOVDQA64Zrm:
@@ -564,7 +567,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
         {X86::VPMOVZXWDZrm, 16, 16, rebuildZExtCst},
         {X86::VPMOVSXDQZrm, 8, 32, rebuildSExtCst},
         {X86::VPMOVZXDQZrm, 8, 32, rebuildZExtCst}};
-    return FixupConstant(Fixups, 1);
+    return FixupConstant(Fixups, 512, 1);
   }
   }
 
@@ -592,7 +595,9 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
       unsigned OpNo = OpBcst32 == 0 ? OpNoBcst64 : OpNoBcst32;
       FixupEntry Fixups[] = {{(int)OpBcst32, 32, 32, rebuildSplatCst},
                              {(int)OpBcst64, 64, 64, rebuildSplatCst}};
-      return FixupConstant(Fixups, OpNo);
+      // TODO: Add support for RegBitWidth, but currently rebuildSplatCst
+      // doesn't require it (defaults to Constant::getPrimitiveSizeInBits).
+      return FixupConstant(Fixups, 0, OpNo);
     }
     return false;
   };
diff --git a/llvm/test/CodeGen/X86/pr81136.ll b/llvm/test/CodeGen/X86/pr81136.ll
index 8843adca0933c2..b4ac3fc783e0a9 100644
--- a/llvm/test/CodeGen/X86/pr81136.ll
+++ b/llvm/test/CodeGen/X86/pr81136.ll
@@ -1,7 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
 ; RUN: llc < %s -mtriple=x86_64-- -mcpu=btver2 | FileCheck %s
 
-; FIXME: Should be vpmovzxbq[128,1] instead of vpmovzxbd[128,1,0,0]
 define i64 @PR81136(i32 %a0, i32 %a1, ptr %a2) {
 ; CHECK-LABEL: PR81136:
 ; CHECK:       # %bb.0:
@@ -9,7 +8,7 @@ define i64 @PR81136(i32 %a0, i32 %a1, ptr %a2) {
 ; CHECK-NEXT:    vmovd %esi, %xmm1
 ; CHECK-NEXT:    vmovdqa (%rdx), %ymm2
 ; CHECK-NEXT:    vpxor %xmm3, %xmm3, %xmm3
-; CHECK-NEXT:    vpmovzxbd {{.*#+}} xmm4 = [128,1,0,0]
+; CHECK-NEXT:    vpmovzxbq {{.*#+}} xmm4 = [128,1]
 ; CHECK-NEXT:    vpcmpgtq %xmm3, %xmm4, %xmm4
 ; CHECK-NEXT:    vpcmpgtw %xmm0, %xmm1, %xmm0
 ; CHECK-NEXT:    vpcmpeqd %xmm1, %xmm1, %xmm1

>From d84ac25face69657dea2f2c1fb09310c25da1ea5 Mon Sep 17 00:00:00 2001
From: Philip Reames <preames at rivosinc.com>
Date: Thu, 8 Feb 2024 09:40:11 -0800
Subject: [PATCH 59/72] [riscv] Add test coverage in advance of an upcoming fix

This is a reduced test case for a fix for the issue identified in
https://github.com/llvm/llvm-project/issues/80910.
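
The FIXME in the added test says the shift and the truncation happen in
the wrong order. A minimal standalone sketch of why that order matters,
in plain C++ rather than LLVM code (the function names are hypothetical
and chosen only for illustration):

```cpp
#include <cassert>
#include <cstdint>

// Order requested by the IR: shift the 64-bit value, then keep the low 32 bits.
constexpr uint32_t shiftThenTrunc(int64_t v) {
  return static_cast<uint32_t>(v >> 1);
}

// Order the miscompile effectively uses: keep the low 32 bits, then shift.
constexpr uint32_t truncThenShift(int64_t v) {
  return static_cast<uint32_t>(v) >> 1;
}

// Hypothetical helper: checks the UINT32_MAX + 1 case named in the FIXME.
inline void checkTruncOrder() {
  const int64_t v = static_cast<int64_t>(UINT32_MAX) + 1; // 0x100000000
  assert(shiftThenTrunc(v) == 0x80000000u); // bit 32 moves down into bit 31
  assert(truncThenShift(v) == 0u);          // bit 32 is dropped before the shift
}
```

Any input with bit 32 set shows the same divergence, because that bit
only reaches the result when the shift runs on the full 64-bit value.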
---
 .../rvv/fixed-vectors-buildvec-of-binop.ll    | 34 +++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-buildvec-of-binop.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-buildvec-of-binop.ll
index c8531ed1f7cf60..e376688aca8a76 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-buildvec-of-binop.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-buildvec-of-binop.ll
@@ -588,3 +588,37 @@ define <8 x i32> @add_constant_rhs_8xi32_partial(<8 x i32> %vin, i32 %a, i32 %b,
   %v3 = insertelement <8 x i32> %v2, i32 %e3, i32 7
   ret <8 x i32> %v3
 }
+
+; FIXME: This is currently showing a miscompile, we effectively
+; truncate before the ashr instead of after it, so if %a or %b
+; is e.g. UINT32_MAX+1 we get different result.
+define <2 x i32> @build_vec_of_trunc_op(i64 %a, i64 %b) {
+; RV32-LABEL: build_vec_of_trunc_op:
+; RV32:       # %bb.0: # %entry
+; RV32-NEXT:    slli a1, a1, 31
+; RV32-NEXT:    srli a0, a0, 1
+; RV32-NEXT:    or a0, a0, a1
+; RV32-NEXT:    slli a3, a3, 31
+; RV32-NEXT:    srli a2, a2, 1
+; RV32-NEXT:    or a2, a2, a3
+; RV32-NEXT:    vsetivli zero, 2, e32, mf2, ta, ma
+; RV32-NEXT:    vmv.v.x v8, a0
+; RV32-NEXT:    vslide1down.vx v8, v8, a2
+; RV32-NEXT:    ret
+;
+; RV64-LABEL: build_vec_of_trunc_op:
+; RV64:       # %bb.0: # %entry
+; RV64-NEXT:    vsetivli zero, 2, e32, mf2, ta, ma
+; RV64-NEXT:    vmv.v.x v8, a0
+; RV64-NEXT:    vslide1down.vx v8, v8, a1
+; RV64-NEXT:    vsrl.vi v8, v8, 1
+; RV64-NEXT:    ret
+entry:
+  %conv11.i = ashr i64 %a, 1
+  %conv11.2 = ashr i64 %b, 1
+  %0 = trunc i64 %conv11.i to i32
+  %1 = trunc i64 %conv11.2 to i32
+  %2 = insertelement <2 x i32> zeroinitializer, i32 %0, i64 0
+  %3 = insertelement <2 x i32> %2, i32 %1, i64 1
+  ret <2 x i32> %3
+}

>From 32fd67b5ce8f16e57b354378cebb601fc2e06708 Mon Sep 17 00:00:00 2001
From: Cooper Partin <coopp at microsoft.com>
Date: Thu, 8 Feb 2024 09:50:21 -0800
Subject: [PATCH 60/72] [DirectX] Fix HLSL bitshifts to leverage the OpenCL
 pipeline for bitshifting (#81030)

Fixes #55106

In HLSL, bit shifts are defined to shift by the shift amount modulo the
bit width of the shifted type. This change contains the following:

HLSL codegen now emits bit shifts as x << (y & (bitwidth(x) - 1)),
reusing the shift-masking path that already exists for OpenCL.

Tests were also added to validate this behavior.


Before this change the following was being emitted:
; Function Attrs: noinline nounwind optnone
define noundef i32 @"?shl32@@YAHHH@Z"(i32 noundef %V, i32 noundef %S) #0
{
entry:
  %S.addr = alloca i32, align 4
  %V.addr = alloca i32, align 4
  store i32 %S, ptr %S.addr, align 4
  store i32 %V, ptr %V.addr, align 4
  %0 = load i32, ptr %V.addr, align 4
  %1 = load i32, ptr %S.addr, align 4
  %shl = shl i32 %0, %1
  ret i32 %shl
}

After this change:
; Function Attrs: noinline nounwind optnone
define noundef i32 @"?shl32@@YAHHH@Z"(i32 noundef %V, i32 noundef %S) #0
{
entry:
  %S.addr = alloca i32, align 4
  %V.addr = alloca i32, align 4
  store i32 %S, ptr %S.addr, align 4
  store i32 %V, ptr %V.addr, align 4
  %0 = load i32, ptr %V.addr, align 4
  %1 = load i32, ptr %S.addr, align 4
  %shl.mask = and i32 %1, 31
  %shl = shl i32 %0, %shl.mask
  ret i32 %shl
}
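
The masking shown in the IR above can be modelled outside the compiler.
This is a minimal sketch in plain C++, not the clang implementation; the
names maskedShl and checkMaskedShl are hypothetical, and it covers only
unsigned left shifts:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Shift left by (amount mod bit width), mirroring the emitted
// "and i32 %1, 31" followed by "shl".
template <typename T>
constexpr T maskedShl(T value, T amount) {
  static_assert(std::is_unsigned<T>::value, "sketch covers unsigned types only");
  // digits is the bit width for unsigned types: 32 -> mask 31, 64 -> mask 63.
  return static_cast<T>(value << (amount & (std::numeric_limits<T>::digits - 1)));
}

// Hypothetical helper exercising the 32-bit and 64-bit cases.
inline void checkMaskedShl() {
  assert(maskedShl<uint32_t>(1u, 33u) == 2u); // 33 & 31 == 1
  assert(maskedShl<uint64_t>(1u, 64u) == 1u); // 64 & 63 == 0
}
```

The AND is equivalent to the modulo only because the bit widths here are
powers of two.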

---------

Co-authored-by: Cooper Partin <coopp at ntdev.microsoft.com>
---
 clang/lib/CodeGen/CGExprScalar.cpp     |  4 +--
 clang/test/CodeGenHLSL/shift-mask.hlsl | 35 ++++++++++++++++++++++++++
 2 files changed, 37 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGenHLSL/shift-mask.hlsl

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index df8f71cf1d9008..fa03163bbde577 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -4168,7 +4168,7 @@ Value *ScalarExprEmitter::EmitShl(const BinOpInfo &Ops) {
   bool SanitizeBase = SanitizeSignedBase || SanitizeUnsignedBase;
   bool SanitizeExponent = CGF.SanOpts.has(SanitizerKind::ShiftExponent);
   // OpenCL 6.3j: shift values are effectively % word size of LHS.
-  if (CGF.getLangOpts().OpenCL)
+  if (CGF.getLangOpts().OpenCL || CGF.getLangOpts().HLSL)
     RHS = ConstrainShiftValue(Ops.LHS, RHS, "shl.mask");
   else if ((SanitizeBase || SanitizeExponent) &&
            isa<llvm::IntegerType>(Ops.LHS->getType())) {
@@ -4237,7 +4237,7 @@ Value *ScalarExprEmitter::EmitShr(const BinOpInfo &Ops) {
     RHS = Builder.CreateIntCast(RHS, Ops.LHS->getType(), false, "sh_prom");
 
   // OpenCL 6.3j: shift values are effectively % word size of LHS.
-  if (CGF.getLangOpts().OpenCL)
+  if (CGF.getLangOpts().OpenCL || CGF.getLangOpts().HLSL)
     RHS = ConstrainShiftValue(Ops.LHS, RHS, "shr.mask");
   else if (CGF.SanOpts.has(SanitizerKind::ShiftExponent) &&
            isa<llvm::IntegerType>(Ops.LHS->getType())) {
diff --git a/clang/test/CodeGenHLSL/shift-mask.hlsl b/clang/test/CodeGenHLSL/shift-mask.hlsl
new file mode 100644
index 00000000000000..d046efaf9c1f9c
--- /dev/null
+++ b/clang/test/CodeGenHLSL/shift-mask.hlsl
@@ -0,0 +1,35 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
+// RUN:   dxil-pc-shadermodel6.3-library %s \
+// RUN:   -emit-llvm -disable-llvm-passes -o - | FileCheck %s
+
+int shl32(int V, int S) {
+  return V << S;
+}
+
+// CHECK: define noundef i32 @"?shl32{{[@$?.A-Za-z0-9_]+}}"(i32 noundef %V, i32 noundef %S) #0 {
+// CHECK-DAG:  %[[Masked:.*]] = and i32 %{{.*}}, 31
+// CHECK-DAG:  %{{.*}} = shl i32 %{{.*}}, %[[Masked]]
+
+int shr32(int V, int S) {
+  return V >> S;
+}
+
+// CHECK: define noundef i32 @"?shr32{{[@$?.A-Za-z0-9_]+}}"(i32 noundef %V, i32 noundef %S) #0 {
+// CHECK-DAG:  %[[Masked:.*]] = and i32 %{{.*}}, 31
+// CHECK-DAG:  %{{.*}} = ashr i32 %{{.*}}, %[[Masked]]
+
+int64_t shl64(int64_t V, int64_t S) {
+  return V << S;
+}
+
+// CHECK: define noundef i64 @"?shl64{{[@$?.A-Za-z0-9_]+}}"(i64 noundef %V, i64 noundef %S) #0 {
+// CHECK-DAG:  %[[Masked:.*]] = and i64 %{{.*}}, 63
+// CHECK-DAG:  %{{.*}} = shl i64 %{{.*}}, %[[Masked]]
+
+int64_t shr64(int64_t V, int64_t S) {
+  return V >> S;
+}
+
+// CHECK: define noundef i64 @"?shr64{{[@$?.A-Za-z0-9_]+}}"(i64 noundef %V, i64 noundef %S) #0 {
+// CHECK-DAG:  %[[Masked:.*]] = and i64 %{{.*}}, 63
+// CHECK-DAG:  %{{.*}} = ashr i64 %{{.*}}, %[[Masked]]

>From d7a267bddfec10ca4cfd7201ef886125c7f079cc Mon Sep 17 00:00:00 2001
From: "S. Bharadwaj Yadavalli" <Bharadwaj.Yadavalli at microsoft.com>
Date: Thu, 8 Feb 2024 13:02:32 -0500
Subject: [PATCH 61/72] [DirectX][NFC] Change usage pattern *Dxil* to *DXIL*
 for uniformity (#80778)

Match DXIL TableGen class names with structure names in DXIL Emitter.
Delete unnecessary Name field.
---
 llvm/lib/Target/DirectX/DXIL.td          |  89 ++++++++--------
 llvm/lib/Target/DirectX/DXILMetadata.cpp |   8 +-
 llvm/utils/TableGen/DXILEmitter.cpp      | 125 +++++++++++------------
 3 files changed, 107 insertions(+), 115 deletions(-)

diff --git a/llvm/lib/Target/DirectX/DXIL.td b/llvm/lib/Target/DirectX/DXIL.td
index aec64607e24602..3f3ace5a1a3a36 100644
--- a/llvm/lib/Target/DirectX/DXIL.td
+++ b/llvm/lib/Target/DirectX/DXIL.td
@@ -14,28 +14,28 @@
 include "llvm/IR/Intrinsics.td"
 
 // Abstract representation of the class a DXIL Operation belongs to.
-class DxilOpClass<string name> {
+class DXILOpClass<string name> {
   string Name = name;
 }
 
 // Abstract representation of the category a DXIL Operation belongs to
-class DxilOpCategory<string name> {
+class DXILOpCategory<string name> {
   string Name = name;
 }
 
-def UnaryClass : DxilOpClass<"Unary">;
-def BinaryClass : DxilOpClass<"Binary">;
-def FlattenedThreadIdInGroupClass : DxilOpClass<"FlattenedThreadIdInGroup">;
-def ThreadIdInGroupClass : DxilOpClass<"ThreadIdInGroup">;
-def ThreadIdClass : DxilOpClass<"ThreadId">;
-def GroupIdClass : DxilOpClass<"GroupId">;
+def UnaryClass : DXILOpClass<"Unary">;
+def BinaryClass : DXILOpClass<"Binary">;
+def FlattenedThreadIdInGroupClass : DXILOpClass<"FlattenedThreadIdInGroup">;
+def ThreadIdInGroupClass : DXILOpClass<"ThreadIdInGroup">;
+def ThreadIdClass : DXILOpClass<"ThreadId">;
+def GroupIdClass : DXILOpClass<"GroupId">;
 
-def BinaryUintCategory : DxilOpCategory<"Binary uint">;
-def UnaryFloatCategory : DxilOpCategory<"Unary float">;
-def ComputeIDCategory : DxilOpCategory<"Compute/Mesh/Amplification shader">;
+def BinaryUintCategory : DXILOpCategory<"Binary uint">;
+def UnaryFloatCategory : DXILOpCategory<"Unary float">;
+def ComputeIDCategory : DXILOpCategory<"Compute/Mesh/Amplification shader">;
 
 // The parameter description for a DXIL operation
-class DxilOpParameter<int pos, string type, string name, string doc,
+class DXILOpParameter<int pos, string type, string name, string doc,
                  bit isConstant = 0, string enumName = "",
                  int maxValue = 0> {
   int Pos = pos;               // Position in parameter list
@@ -49,16 +49,13 @@ class DxilOpParameter<int pos, string type, string name, string doc,
 }
 
 // A representation for a DXIL operation
-class DxilOperationDesc<string name> {
-  // TODO : Appears redundant. OpName should serve the same purpose
-  string Name = name; // short, unique name
-
+class DXILOperationDesc {
   string OpName = "";         // Name of DXIL operation
   int OpCode = 0;             // Unique non-negative integer associated with the operation
-  DxilOpClass  OpClass;       // Class of the operation
-  DxilOpCategory OpCategory;  // Category of the operation
+  DXILOpClass  OpClass;       // Class of the operation
+  DXILOpCategory OpCategory;  // Category of the operation
   string Doc = "";            // Description of the operation
-  list<DxilOpParameter> Params = []; // Parameter list of the operation
+  list<DXILOpParameter> Params = []; // Parameter list of the operation
   string OverloadTypes = "";  // Overload types, if applicable
   string Attributes = "";     // Attribute shorthands: rn=does not access
                               // memory,ro=only reads from memory,
@@ -73,9 +70,9 @@ class DxilOperationDesc<string name> {
   list<string> StatsGroup = [];
 }
 
-class DxilOperation<string name, int opCode, DxilOpClass opClass, DxilOpCategory opCategory, string doc,
-              string oloadTypes, string attrs, list<DxilOpParameter> params,
-              list<string> statsGroup = []> : DxilOperationDesc<name> {
+class DXILOperation<string name, int opCode, DXILOpClass opClass, DXILOpCategory opCategory, string doc,
+              string oloadTypes, string attrs, list<DXILOpParameter> params,
+              list<string> statsGroup = []> : DXILOperationDesc {
   let OpName = name;
   let OpCode = opCode;
   let Doc = doc;
@@ -90,56 +87,56 @@ class DxilOperation<string name, int opCode, DxilOpClass opClass, DxilOpCategory
 // LLVM intrinsic that DXIL operation maps to.
 class LLVMIntrinsic<Intrinsic llvm_intrinsic_> { Intrinsic llvm_intrinsic = llvm_intrinsic_; }
 
-def Sin : DxilOperation<"Sin", 13, UnaryClass, UnaryFloatCategory, "returns sine(theta) for theta in radians.",
+def Sin : DXILOperation<"Sin", 13, UnaryClass, UnaryFloatCategory, "returns sine(theta) for theta in radians.",
   "half;float;", "rn",
   [
-    DxilOpParameter<0, "$o", "", "operation result">,
-    DxilOpParameter<1, "i32", "opcode", "DXIL opcode">,
-    DxilOpParameter<2, "$o", "value", "input value">
+    DXILOpParameter<0, "$o", "", "operation result">,
+    DXILOpParameter<1, "i32", "opcode", "DXIL opcode">,
+    DXILOpParameter<2, "$o", "value", "input value">
   ],
   ["floats"]>,
   LLVMIntrinsic<int_sin>;
 
-def UMax : DxilOperation< "UMax", 39,  BinaryClass,  BinaryUintCategory, "unsigned integer maximum. UMax(a,b) = a > b ? a : b",
+def UMax : DXILOperation< "UMax", 39,  BinaryClass,  BinaryUintCategory, "unsigned integer maximum. UMax(a,b) = a > b ? a : b",
     "i16;i32;i64;",  "rn",
   [
-    DxilOpParameter<0,  "$o",  "",  "operation result">,
-    DxilOpParameter<1,  "i32",  "opcode",  "DXIL opcode">,
-    DxilOpParameter<2,  "$o",  "a",  "input value">,
-    DxilOpParameter<3,  "$o",  "b",  "input value">
+    DXILOpParameter<0,  "$o",  "",  "operation result">,
+    DXILOpParameter<1,  "i32",  "opcode",  "DXIL opcode">,
+    DXILOpParameter<2,  "$o",  "a",  "input value">,
+    DXILOpParameter<3,  "$o",  "b",  "input value">
   ],
   ["uints"]>,
   LLVMIntrinsic<int_umax>;
 
-def ThreadId : DxilOperation< "ThreadId", 93,  ThreadIdClass, ComputeIDCategory, "reads the thread ID", "i32;",  "rn",
+def ThreadId : DXILOperation< "ThreadId", 93,  ThreadIdClass, ComputeIDCategory, "reads the thread ID", "i32;",  "rn",
   [
-    DxilOpParameter<0,  "i32",  "",  "thread ID component">,
-    DxilOpParameter<1,  "i32",  "opcode",  "DXIL opcode">,
-    DxilOpParameter<2,  "i32",  "component",  "component to read (x,y,z)">
+    DXILOpParameter<0,  "i32",  "",  "thread ID component">,
+    DXILOpParameter<1,  "i32",  "opcode",  "DXIL opcode">,
+    DXILOpParameter<2,  "i32",  "component",  "component to read (x,y,z)">
   ]>,
   LLVMIntrinsic<int_dx_thread_id>;
 
-def GroupId : DxilOperation< "GroupId", 94,  GroupIdClass, ComputeIDCategory, "reads the group ID (SV_GroupID)", "i32;",  "rn",
+def GroupId : DXILOperation< "GroupId", 94,  GroupIdClass, ComputeIDCategory, "reads the group ID (SV_GroupID)", "i32;",  "rn",
   [
-    DxilOpParameter<0,  "i32",  "",  "group ID component">,
-    DxilOpParameter<1,  "i32",  "opcode",  "DXIL opcode">,
-    DxilOpParameter<2,  "i32",  "component",  "component to read">
+    DXILOpParameter<0,  "i32",  "",  "group ID component">,
+    DXILOpParameter<1,  "i32",  "opcode",  "DXIL opcode">,
+    DXILOpParameter<2,  "i32",  "component",  "component to read">
   ]>,
   LLVMIntrinsic<int_dx_group_id>;
 
-def ThreadIdInGroup : DxilOperation< "ThreadIdInGroup", 95,  ThreadIdInGroupClass, ComputeIDCategory,
+def ThreadIdInGroup : DXILOperation< "ThreadIdInGroup", 95,  ThreadIdInGroupClass, ComputeIDCategory,
   "reads the thread ID within the group (SV_GroupThreadID)", "i32;",  "rn",
   [
-    DxilOpParameter<0,  "i32",  "",  "thread ID in group component">,
-    DxilOpParameter<1,  "i32",  "opcode",  "DXIL opcode">,
-    DxilOpParameter<2,  "i32",  "component",  "component to read (x,y,z)">
+    DXILOpParameter<0,  "i32",  "",  "thread ID in group component">,
+    DXILOpParameter<1,  "i32",  "opcode",  "DXIL opcode">,
+    DXILOpParameter<2,  "i32",  "component",  "component to read (x,y,z)">
   ]>,
   LLVMIntrinsic<int_dx_thread_id_in_group>;
 
-def FlattenedThreadIdInGroup : DxilOperation< "FlattenedThreadIdInGroup", 96,  FlattenedThreadIdInGroupClass, ComputeIDCategory,
+def FlattenedThreadIdInGroup : DXILOperation< "FlattenedThreadIdInGroup", 96,  FlattenedThreadIdInGroupClass, ComputeIDCategory,
    "provides a flattened index for a given thread within a given group (SV_GroupIndex)", "i32;",  "rn",
   [
-    DxilOpParameter<0,  "i32",  "",  "result">,
-    DxilOpParameter<1,  "i32",  "opcode",  "DXIL opcode">
+    DXILOpParameter<0,  "i32",  "",  "result">,
+    DXILOpParameter<1,  "i32",  "opcode",  "DXIL opcode">
   ]>,
   LLVMIntrinsic<int_dx_flattened_thread_id_in_group>;
diff --git a/llvm/lib/Target/DirectX/DXILMetadata.cpp b/llvm/lib/Target/DirectX/DXILMetadata.cpp
index db55f25c50774d..2d94490a7f24c3 100644
--- a/llvm/lib/Target/DirectX/DXILMetadata.cpp
+++ b/llvm/lib/Target/DirectX/DXILMetadata.cpp
@@ -213,7 +213,7 @@ class EntryMD {
     // FIXME: add signature for profile other than CS.
     // See https://github.com/llvm/llvm-project/issues/57928.
     MDTuple *Signatures = nullptr;
-    return emitDxilEntryPointTuple(
+    return emitDXILEntryPointTuple(
         &F, F.getName().str(), Signatures, Resources,
         Props.emitDXILEntryProps(RawShaderFlag, Ctx, /*IsLib*/ false), Ctx);
   }
@@ -222,7 +222,7 @@ class EntryMD {
     // FIXME: add signature for profile other than CS.
     // See https://github.com/llvm/llvm-project/issues/57928.
     MDTuple *Signatures = nullptr;
-    return emitDxilEntryPointTuple(
+    return emitDXILEntryPointTuple(
         &F, F.getName().str(), Signatures,
         /*entry in lib doesn't need resources metadata*/ nullptr,
         Props.emitDXILEntryProps(RawShaderFlag, Ctx, /*IsLib*/ true), Ctx);
@@ -233,13 +233,13 @@ class EntryMD {
   static MDTuple *emitEmptyEntryForLib(MDTuple *Resources,
                                        uint64_t RawShaderFlag,
                                        LLVMContext &Ctx) {
-    return emitDxilEntryPointTuple(
+    return emitDXILEntryPointTuple(
         nullptr, "", nullptr, Resources,
         EntryProps::emitEntryPropsForEmptyEntry(RawShaderFlag, Ctx), Ctx);
   }
 
 private:
-  static MDTuple *emitDxilEntryPointTuple(Function *Fn, const std::string &Name,
+  static MDTuple *emitDXILEntryPointTuple(Function *Fn, const std::string &Name,
                                           MDTuple *Signatures,
                                           MDTuple *Resources,
                                           MDTuple *Properties,
diff --git a/llvm/utils/TableGen/DXILEmitter.cpp b/llvm/utils/TableGen/DXILEmitter.cpp
index 475a57a0cadf86..cb9f9c6b03c636 100644
--- a/llvm/utils/TableGen/DXILEmitter.cpp
+++ b/llvm/utils/TableGen/DXILEmitter.cpp
@@ -30,7 +30,7 @@ struct DXILShaderModel {
   int Minor = 0;
 };
 
-struct DXILParam {
+struct DXILParameter {
   int Pos; // position in parameter list
   ParameterKind Kind;
   StringRef Name; // short, unique name
@@ -38,23 +38,21 @@ struct DXILParam {
   bool IsConst;   // whether this argument requires a constant value in the IR
   StringRef EnumName; // the name of the enum type if applicable
   int MaxValue;       // the maximum value for this parameter if applicable
-  DXILParam(const Record *R);
+  DXILParameter(const Record *R);
 };
 
-struct DXILOperationData {
-  StringRef Name; // short, unique name
-
-  StringRef DXILOp;    // name of DXIL operation
-  int DXILOpID;        // ID of DXIL operation
-  StringRef DXILClass; // name of the opcode class
+struct DXILOperationDesc {
+  StringRef OpName;    // name of DXIL operation
+  int OpCode;          // ID of DXIL operation
+  StringRef OpClass;   // name of the opcode class
   StringRef Category;  // classification for this instruction
   StringRef Doc;       // the documentation description of this instruction
 
-  SmallVector<DXILParam> Params; // the operands that this instruction takes
+  SmallVector<DXILParameter> Params; // the operands that this instruction takes
   StringRef OverloadTypes;       // overload types if applicable
   StringRef FnAttr;              // attribute shorthands: rn=does not access
                                  // memory,ro=only reads from memory
-  StringRef Intrinsic; // The llvm intrinsic map to DXILOp. Default is "" which
+  StringRef Intrinsic; // The llvm intrinsic map to OpName. Default is "" which
                        // means no map exist
   bool IsDeriv = false;    // whether this is some kind of derivative
   bool IsGradient = false; // whether this requires a gradient calculation
@@ -71,11 +69,10 @@ struct DXILOperationData {
   int OverloadParamIndex; // parameter index which control the overload.
                           // When < 0, should be only 1 overload type.
   SmallVector<StringRef, 4> counters; // counters for this inst.
-  DXILOperationData(const Record *R) {
-    Name = R->getValueAsString("Name");
-    DXILOp = R->getValueAsString("OpName");
-    DXILOpID = R->getValueAsInt("OpCode");
-    DXILClass = R->getValueAsDef("OpClass")->getValueAsString("Name");
+  DXILOperationDesc(const Record *R) {
+    OpName = R->getValueAsString("OpName");
+    OpCode = R->getValueAsInt("OpCode");
+    OpClass = R->getValueAsDef("OpClass")->getValueAsString("Name");
     Category = R->getValueAsDef("OpCategory")->getValueAsString("Name");
 
     if (R->getValue("llvm_intrinsic")) {
@@ -92,7 +89,7 @@ struct DXILOperationData {
     OverloadParamIndex = -1;
     for (unsigned I = 0; I < ParamList->size(); ++I) {
       Record *Param = ParamList->getElementAsRecord(I);
-      Params.emplace_back(DXILParam(Param));
+      Params.emplace_back(DXILParameter(Param));
       auto &CurParam = Params.back();
       if (CurParam.Kind >= ParameterKind::OVERLOAD)
         OverloadParamIndex = I;
@@ -121,7 +118,7 @@ static ParameterKind parameterTypeNameToKind(StringRef Name) {
       .Default(ParameterKind::INVALID);
 }
 
-DXILParam::DXILParam(const Record *R) {
+DXILParameter::DXILParameter(const Record *R) {
   Name = R->getValueAsString("Name");
   Pos = R->getValueAsInt("Pos");
   Kind = parameterTypeNameToKind(R->getValueAsString("LLVMType"));
@@ -166,10 +163,9 @@ static std::string parameterKindToString(ParameterKind Kind) {
   llvm_unreachable("Unknown llvm::dxil::ParameterKind enum");
 }
 
-static void emitDXILOpEnum(DXILOperationData &DXILOp, raw_ostream &OS) {
+static void emitDXILOpEnum(DXILOperationDesc &Op, raw_ostream &OS) {
   // Name = ID, // Doc
-  OS << DXILOp.Name << " = " << DXILOp.DXILOpID << ", // " << DXILOp.Doc
-     << "\n";
+  OS << Op.OpName << " = " << Op.OpCode << ", // " << Op.Doc << "\n";
 }
 
 static std::string buildCategoryStr(StringSet<> &Cetegorys) {
@@ -182,14 +178,14 @@ static std::string buildCategoryStr(StringSet<> &Cetegorys) {
 }
 
 // Emit enum declaration for DXIL.
-static void emitDXILEnums(std::vector<DXILOperationData> &DXILOps,
+static void emitDXILEnums(std::vector<DXILOperationDesc> &Ops,
                           raw_ostream &OS) {
   // Sort by Category + OpName.
-  llvm::sort(DXILOps, [](DXILOperationData &A, DXILOperationData &B) {
+  llvm::sort(Ops, [](DXILOperationDesc &A, DXILOperationDesc &B) {
     // Group by Category first.
     if (A.Category == B.Category)
       // Inside same Category, order by OpName.
-      return A.DXILOp < B.DXILOp;
+      return A.OpName < B.OpName;
     else
       return A.Category < B.Category;
   });
@@ -199,18 +195,18 @@ static void emitDXILEnums(std::vector<DXILOperationData> &DXILOps,
 
   StringMap<StringSet<>> ClassMap;
   StringRef PrevCategory = "";
-  for (auto &DXILOp : DXILOps) {
-    StringRef Category = DXILOp.Category;
+  for (auto &Op : Ops) {
+    StringRef Category = Op.Category;
     if (Category != PrevCategory) {
       OS << "\n// " << Category << "\n";
       PrevCategory = Category;
     }
-    emitDXILOpEnum(DXILOp, OS);
-    auto It = ClassMap.find(DXILOp.DXILClass);
+    emitDXILOpEnum(Op, OS);
+    auto It = ClassMap.find(Op.OpClass);
     if (It != ClassMap.end()) {
-      It->second.insert(DXILOp.Category);
+      It->second.insert(Op.Category);
     } else {
-      ClassMap[DXILOp.DXILClass].insert(DXILOp.Category);
+      ClassMap[Op.OpClass].insert(Op.Category);
     }
   }
 
@@ -253,18 +249,18 @@ static void emitDXILEnums(std::vector<DXILOperationData> &DXILOps,
 }
 
 // Emit map from llvm intrinsic to DXIL operation.
-static void emitDXILIntrinsicMap(std::vector<DXILOperationData> &DXILOps,
+static void emitDXILIntrinsicMap(std::vector<DXILOperationDesc> &Ops,
                                  raw_ostream &OS) {
   OS << "\n";
   // FIXME: use array instead of SmallDenseMap.
   OS << "static const SmallDenseMap<Intrinsic::ID, dxil::OpCode> LowerMap = "
         "{\n";
-  for (auto &DXILOp : DXILOps) {
-    if (DXILOp.Intrinsic.empty())
+  for (auto &Op : Ops) {
+    if (Op.Intrinsic.empty())
       continue;
     // {Intrinsic::sin, dxil::OpCode::Sin},
-    OS << "  { Intrinsic::" << DXILOp.Intrinsic
-       << ", dxil::OpCode::" << DXILOp.DXILOp << "},\n";
+    OS << "  { Intrinsic::" << Op.Intrinsic << ", dxil::OpCode::" << Op.OpName
+       << "},\n";
   }
   OS << "};\n";
   OS << "\n";
@@ -315,20 +311,20 @@ static std::string lowerFirstLetter(StringRef Name) {
   return LowerName;
 }
 
-static std::string getDXILOpClassName(StringRef DXILOpClass) {
+static std::string getDXILOpClassName(StringRef OpClass) {
   // Lower first letter expect for special case.
-  return StringSwitch<std::string>(DXILOpClass)
+  return StringSwitch<std::string>(OpClass)
       .Case("CBufferLoad", "cbufferLoad")
       .Case("CBufferLoadLegacy", "cbufferLoadLegacy")
       .Case("GSInstanceID", "gsInstanceID")
-      .Default(lowerFirstLetter(DXILOpClass));
+      .Default(lowerFirstLetter(OpClass));
 }
 
-static void emitDXILOperationTable(std::vector<DXILOperationData> &DXILOps,
+static void emitDXILOperationTable(std::vector<DXILOperationDesc> &Ops,
                                    raw_ostream &OS) {
-  // Sort by DXILOpID.
-  llvm::sort(DXILOps, [](DXILOperationData &A, DXILOperationData &B) {
-    return A.DXILOpID < B.DXILOpID;
+  // Sort by OpCode.
+  llvm::sort(Ops, [](DXILOperationDesc &A, DXILOperationDesc &B) {
+    return A.OpCode < B.OpCode;
   });
 
   // Collect Names.
@@ -338,18 +334,18 @@ static void emitDXILOperationTable(std::vector<DXILOperationData> &DXILOps,
 
   StringMap<SmallVector<ParameterKind>> ParameterMap;
   StringSet<> ClassSet;
-  for (auto &DXILOp : DXILOps) {
-    OpStrings.add(DXILOp.DXILOp.str());
+  for (auto &Op : Ops) {
+    OpStrings.add(Op.OpName.str());
 
-    if (ClassSet.contains(DXILOp.DXILClass))
+    if (ClassSet.contains(Op.OpClass))
       continue;
-    ClassSet.insert(DXILOp.DXILClass);
-    OpClassStrings.add(getDXILOpClassName(DXILOp.DXILClass));
+    ClassSet.insert(Op.OpClass);
+    OpClassStrings.add(getDXILOpClassName(Op.OpClass));
     SmallVector<ParameterKind> ParamKindVec;
-    for (auto &Param : DXILOp.Params) {
+    for (auto &Param : Op.Params) {
       ParamKindVec.emplace_back(Param.Kind);
     }
-    ParameterMap[DXILOp.DXILClass] = ParamKindVec;
+    ParameterMap[Op.OpClass] = ParamKindVec;
     Parameters.add(ParamKindVec);
   }
 
@@ -363,26 +359,25 @@ static void emitDXILOperationTable(std::vector<DXILOperationData> &DXILOps,
   // OpCodeClassNameIndex,
   // OverloadKind::FLOAT | OverloadKind::HALF, Attribute::AttrKind::ReadNone, 0,
   // 3, ParameterTableOffset},
-  OS << "static const OpCodeProperty *getOpCodeProperty(dxil::OpCode DXILOp) "
+  OS << "static const OpCodeProperty *getOpCodeProperty(dxil::OpCode Op) "
         "{\n";
 
   OS << "  static const OpCodeProperty OpCodeProps[] = {\n";
-  for (auto &DXILOp : DXILOps) {
-    OS << "  { dxil::OpCode::" << DXILOp.DXILOp << ", "
-       << OpStrings.get(DXILOp.DXILOp.str())
-       << ", OpCodeClass::" << DXILOp.DXILClass << ", "
-       << OpClassStrings.get(getDXILOpClassName(DXILOp.DXILClass)) << ", "
-       << getDXILOperationOverload(DXILOp.OverloadTypes) << ", "
-       << emitDXILOperationFnAttr(DXILOp.FnAttr) << ", "
-       << DXILOp.OverloadParamIndex << ", " << DXILOp.Params.size() << ", "
-       << Parameters.get(ParameterMap[DXILOp.DXILClass]) << " },\n";
+  for (auto &Op : Ops) {
+    OS << "  { dxil::OpCode::" << Op.OpName << ", "
+       << OpStrings.get(Op.OpName.str()) << ", OpCodeClass::" << Op.OpClass
+       << ", " << OpClassStrings.get(getDXILOpClassName(Op.OpClass)) << ", "
+       << getDXILOperationOverload(Op.OverloadTypes) << ", "
+       << emitDXILOperationFnAttr(Op.FnAttr) << ", " << Op.OverloadParamIndex
+       << ", " << Op.Params.size() << ", "
+       << Parameters.get(ParameterMap[Op.OpClass]) << " },\n";
   }
   OS << "  };\n";
 
   OS << "  // FIXME: change search to indexing with\n";
-  OS << "  // DXILOp once all DXIL op is added.\n";
+  OS << "  // Op once all DXIL operations are added.\n";
   OS << "  OpCodeProperty TmpProp;\n";
-  OS << "  TmpProp.OpCode = DXILOp;\n";
+  OS << "  TmpProp.OpCode = Op;\n";
   OS << "  const OpCodeProperty *Prop =\n";
   OS << "      llvm::lower_bound(OpCodeProps, TmpProp,\n";
   OS << "                        [](const OpCodeProperty &A, const "
@@ -394,12 +389,12 @@ static void emitDXILOperationTable(std::vector<DXILOperationData> &DXILOps,
   OS << "}\n\n";
 
   // Emit the string tables.
-  OS << "static const char *getOpCodeName(dxil::OpCode DXILOp) {\n\n";
+  OS << "static const char *getOpCodeName(dxil::OpCode Op) {\n\n";
 
   OpStrings.emitStringLiteralDef(OS,
                                  "  static const char DXILOpCodeNameTable[]");
 
-  OS << "  auto *Prop = getOpCodeProperty(DXILOp);\n";
+  OS << "  auto *Prop = getOpCodeProperty(Op);\n";
   OS << "  unsigned Index = Prop->OpCodeNameOffset;\n";
   OS << "  return DXILOpCodeNameTable + Index;\n";
   OS << "}\n\n";
@@ -431,14 +426,14 @@ static void emitDXILOperationTable(std::vector<DXILOperationData> &DXILOps,
 }
 
 static void EmitDXILOperation(RecordKeeper &Records, raw_ostream &OS) {
-  std::vector<Record *> Ops = Records.getAllDerivedDefinitions("DxilOperation");
+  std::vector<Record *> Ops = Records.getAllDerivedDefinitions("DXILOperation");
   OS << "// Generated code, do not edit.\n";
   OS << "\n";
 
-  std::vector<DXILOperationData> DXILOps;
+  std::vector<DXILOperationDesc> DXILOps;
   DXILOps.reserve(Ops.size());
   for (auto *Record : Ops) {
-    DXILOps.emplace_back(DXILOperationData(Record));
+    DXILOps.emplace_back(DXILOperationDesc(Record));
   }
 
   OS << "#ifdef DXIL_OP_ENUM\n";

>From 46a2c0cf1dc3eaaa8bb2aa929795d81ed3b2fef9 Mon Sep 17 00:00:00 2001
From: Valentin Clement (バレンタイン クレメン) <clementval at gmail.com>
Date: Thu, 8 Feb 2024 10:03:08 -0800
Subject: [PATCH 62/72] [flang][cuda] Lower attribute for local variable
 (#81076)

This is a first simple patch to introduce a new FIR attribute to carry
the CUDA variable attribute information to hlfir.declare and fir.declare
operations. It currently lowers this information for local variables.

The texture attribute is omitted since it is rejected by semantics and
will not make its way to MLIR.

This new attribute is added as an optional attribute to the hlfir.declare
and fir.declare operations.
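
The translation described above can be sketched outside of flang as follows. This is a minimal stand-alone model, not the code in this patch: `DataAttr`, `FirCudaAttr`, `translate`, `mnemonic`, and `checkTranslateSketch` are hypothetical names standing in for `Fortran::common::CUDADataAttr`, `fir::CUDAAttribute`, and `translateSymbolCUDAAttribute`, and an empty `std::optional` plays the role of a null attribute. Returning an empty result for `Texture` is an assumption about how "omitted" is best modelled.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for Fortran::common::CUDADataAttr.
enum class DataAttr { Constant, Device, Managed, Pinned, Shared, Texture };

// Hypothetical stand-in for fir::CUDAAttribute; there is no Texture case.
enum class FirCudaAttr { Constant, Device, Managed, Pinned, Shared, Unified };

// Map the front-end attribute to the IR-level one. An empty optional models
// "no attribute", both for symbols without a CUDA attribute and for Texture.
std::optional<FirCudaAttr> translate(std::optional<DataAttr> attr) {
  if (!attr)
    return std::nullopt;
  switch (*attr) {
  case DataAttr::Constant:
    return FirCudaAttr::Constant;
  case DataAttr::Device:
    return FirCudaAttr::Device;
  case DataAttr::Managed:
    return FirCudaAttr::Managed;
  case DataAttr::Pinned:
    return FirCudaAttr::Pinned;
  case DataAttr::Shared:
    return FirCudaAttr::Shared;
  case DataAttr::Texture:
    return std::nullopt; // obsolete, never lowered
  }
  return std::nullopt;
}

// Keyword as it appears inside #fir.cuda<...> in the printed IR.
std::string mnemonic(FirCudaAttr attr) {
  switch (attr) {
  case FirCudaAttr::Constant:
    return "constant";
  case FirCudaAttr::Device:
    return "device";
  case FirCudaAttr::Managed:
    return "managed";
  case FirCudaAttr::Pinned:
    return "pinned";
  case FirCudaAttr::Shared:
    return "shared";
  case FirCudaAttr::Unified:
    return "unified";
  }
  return "";
}

// Small usage check for the sketch.
void checkTranslateSketch() {
  assert(translate(DataAttr::Device) == FirCudaAttr::Device);
  assert(!translate(DataAttr::Texture).has_value());
  assert(!translate(std::nullopt).has_value());
  assert(mnemonic(*translate(DataAttr::Managed)) == "managed");
}
```

Returning from every case, rather than assigning to a local and breaking, means no path can read an unset value.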
---
 flang/include/flang/Lower/ConvertVariable.h   |  6 +++
 .../flang/Optimizer/Builder/HLFIRTools.h      | 10 ++---
 .../flang/Optimizer/Dialect/FIRAttr.td        | 23 ++++++++++-
 .../include/flang/Optimizer/Dialect/FIROps.td |  3 +-
 .../include/flang/Optimizer/HLFIR/HLFIROps.td |  6 ++-
 flang/lib/Lower/ConvertVariable.cpp           | 41 ++++++++++++++++++-
 flang/lib/Optimizer/Builder/HLFIRTools.cpp    |  5 ++-
 flang/lib/Optimizer/Dialect/FIRAttr.cpp       |  3 +-
 flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp     |  5 ++-
 .../HLFIR/Transforms/ConvertToFIR.cpp         |  6 ++-
 flang/test/Lower/CUDA/cuda-data-attribute.cuf | 22 ++++++++++
 .../Optimizer/FortranVariableTest.cpp         | 12 ++++--
 12 files changed, 121 insertions(+), 21 deletions(-)
 create mode 100644 flang/test/Lower/CUDA/cuda-data-attribute.cuf

diff --git a/flang/include/flang/Lower/ConvertVariable.h b/flang/include/flang/Lower/ConvertVariable.h
index 0ff3ca9bdeac7e..cdbf050e4a7b3a 100644
--- a/flang/include/flang/Lower/ConvertVariable.h
+++ b/flang/include/flang/Lower/ConvertVariable.h
@@ -137,6 +137,12 @@ translateSymbolAttributes(mlir::MLIRContext *mlirContext,
                           fir::FortranVariableFlagsEnum extraFlags =
                               fir::FortranVariableFlagsEnum::None);
 
+/// Translate the CUDA Fortran attributes of \p sym into the FIR CUDA attribute
+/// representation.
+fir::CUDAAttributeAttr
+translateSymbolCUDAAttribute(mlir::MLIRContext *mlirContext,
+                             const Fortran::semantics::Symbol &sym);
+
 /// Map a symbol to a given fir::ExtendedValue. This will generate an
 /// hlfir.declare when lowering to HLFIR and map the hlfir.declare result to the
 /// symbol.
diff --git a/flang/include/flang/Optimizer/Builder/HLFIRTools.h b/flang/include/flang/Optimizer/Builder/HLFIRTools.h
index efbd57c77de5d5..fe69ffa27dc35b 100644
--- a/flang/include/flang/Optimizer/Builder/HLFIRTools.h
+++ b/flang/include/flang/Optimizer/Builder/HLFIRTools.h
@@ -233,11 +233,11 @@ translateToExtendedValue(mlir::Location loc, fir::FirOpBuilder &builder,
                          fir::FortranVariableOpInterface fortranVariable);
 
 /// Generate declaration for a fir::ExtendedValue in memory.
-fir::FortranVariableOpInterface genDeclare(mlir::Location loc,
-                                           fir::FirOpBuilder &builder,
-                                           const fir::ExtendedValue &exv,
-                                           llvm::StringRef name,
-                                           fir::FortranVariableFlagsAttr flags);
+fir::FortranVariableOpInterface
+genDeclare(mlir::Location loc, fir::FirOpBuilder &builder,
+           const fir::ExtendedValue &exv, llvm::StringRef name,
+           fir::FortranVariableFlagsAttr flags,
+           fir::CUDAAttributeAttr cudaAttr = {});
 
 /// Generate an hlfir.associate to build a variable from an expression value.
 /// The type of the variable must be provided so that scalar logicals are
diff --git a/flang/include/flang/Optimizer/Dialect/FIRAttr.td b/flang/include/flang/Optimizer/Dialect/FIRAttr.td
index 114bf7d1df913d..bc7312453896d8 100644
--- a/flang/include/flang/Optimizer/Dialect/FIRAttr.td
+++ b/flang/include/flang/Optimizer/Dialect/FIRAttr.td
@@ -55,7 +55,28 @@ def fir_FortranVariableFlagsAttr : fir_Attr<"FortranVariableFlags"> {
   let returnType = "::fir::FortranVariableFlagsEnum";
   let convertFromStorage = "$_self.getFlags()";
   let constBuilderCall =
-          "::fir::FortranVariableFlagsAttr::get($_builder.getContext(), $0)";
+        "::fir::FortranVariableFlagsAttr::get($_builder.getContext(), $0)";
+}
+
+def CUDAconstant : I32EnumAttrCase<"Constant", 0, "constant">;
+def CUDAdevice   : I32EnumAttrCase<"Device", 1, "device">;
+def CUDAmanaged  : I32EnumAttrCase<"Managed", 2, "managed">;
+def CUDApinned   : I32EnumAttrCase<"Pinned", 3, "pinned">;
+def CUDAshared   : I32EnumAttrCase<"Shared", 4, "shared">;
+def CUDAunified  : I32EnumAttrCase<"Unified", 5, "unified">;
+// Texture is omitted since it is obsolete and rejected by semantic.
+
+def fir_CUDAAttribute : I32EnumAttr<
+    "CUDAAttribute",
+    "CUDA Fortran variable attributes",
+    [CUDAconstant, CUDAdevice, CUDAmanaged, CUDApinned, CUDAshared,
+     CUDAunified]> {
+  let genSpecializedAttr = 0;
+  let cppNamespace = "::fir";
+}
+
+def fir_CUDAAttributeAttr : EnumAttr<fir_Dialect, fir_CUDAAttribute, "cuda"> {
+  let assemblyFormat = [{ ```<` $value `>` }];
 }
 
 def fir_BoxFieldAttr : I32EnumAttr<
diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.td b/flang/include/flang/Optimizer/Dialect/FIROps.td
index fcecc605dfa5cd..b954a0cc74d0e1 100644
--- a/flang/include/flang/Optimizer/Dialect/FIROps.td
+++ b/flang/include/flang/Optimizer/Dialect/FIROps.td
@@ -3027,7 +3027,8 @@ def fir_DeclareOp : fir_Op<"declare", [AttrSizedOperandSegments,
     Optional<AnyShapeOrShiftType>:$shape,
     Variadic<AnyIntegerType>:$typeparams,
     Builtin_StringAttr:$uniq_name,
-    OptionalAttr<fir_FortranVariableFlagsAttr>:$fortran_attrs
+    OptionalAttr<fir_FortranVariableFlagsAttr>:$fortran_attrs,
+    OptionalAttr<fir_CUDAAttributeAttr>:$cuda_attr
   );
 
   let results = (outs AnyRefOrBox);
diff --git a/flang/include/flang/Optimizer/HLFIR/HLFIROps.td b/flang/include/flang/Optimizer/HLFIR/HLFIROps.td
index 753ede2112a476..f22e9a740da341 100644
--- a/flang/include/flang/Optimizer/HLFIR/HLFIROps.td
+++ b/flang/include/flang/Optimizer/HLFIR/HLFIROps.td
@@ -88,7 +88,8 @@ def hlfir_DeclareOp : hlfir_Op<"declare", [AttrSizedOperandSegments,
     Optional<AnyShapeOrShiftType>:$shape,
     Variadic<AnyIntegerType>:$typeparams,
     Builtin_StringAttr:$uniq_name,
-    OptionalAttr<fir_FortranVariableFlagsAttr>:$fortran_attrs
+    OptionalAttr<fir_FortranVariableFlagsAttr>:$fortran_attrs,
+    OptionalAttr<fir_CUDAAttributeAttr>:$cuda_attr
   );
 
   let results = (outs AnyFortranVariable, AnyRefOrBoxLike);
@@ -101,7 +102,8 @@ def hlfir_DeclareOp : hlfir_Op<"declare", [AttrSizedOperandSegments,
   let builders = [
     OpBuilder<(ins "mlir::Value":$memref, "llvm::StringRef":$uniq_name,
       CArg<"mlir::Value", "{}">:$shape, CArg<"mlir::ValueRange", "{}">:$typeparams,
-      CArg<"fir::FortranVariableFlagsAttr", "{}">:$fortran_attrs)>];
+      CArg<"fir::FortranVariableFlagsAttr", "{}">:$fortran_attrs,
+      CArg<"fir::CUDAAttributeAttr", "{}">:$cuda_attr)>];
 
   let extraClassDeclaration = [{
     /// Get the variable original base (same as input). It lacks
diff --git a/flang/lib/Lower/ConvertVariable.cpp b/flang/lib/Lower/ConvertVariable.cpp
index 8ea2557c42b373..f761e14e64794d 100644
--- a/flang/lib/Lower/ConvertVariable.cpp
+++ b/flang/lib/Lower/ConvertVariable.cpp
@@ -1579,6 +1579,38 @@ fir::FortranVariableFlagsAttr Fortran::lower::translateSymbolAttributes(
   return fir::FortranVariableFlagsAttr::get(mlirContext, flags);
 }
 
+fir::CUDAAttributeAttr Fortran::lower::translateSymbolCUDAAttribute(
+    mlir::MLIRContext *mlirContext, const Fortran::semantics::Symbol &sym) {
+  std::optional<Fortran::common::CUDADataAttr> cudaAttr =
+      Fortran::semantics::GetCUDADataAttr(&sym);
+  if (cudaAttr) {
+    fir::CUDAAttribute attr;
+    switch (*cudaAttr) {
+    case Fortran::common::CUDADataAttr::Constant:
+      attr = fir::CUDAAttribute::Constant;
+      break;
+    case Fortran::common::CUDADataAttr::Device:
+      attr = fir::CUDAAttribute::Device;
+      break;
+    case Fortran::common::CUDADataAttr::Managed:
+      attr = fir::CUDAAttribute::Managed;
+      break;
+    case Fortran::common::CUDADataAttr::Pinned:
+      attr = fir::CUDAAttribute::Pinned;
+      break;
+    case Fortran::common::CUDADataAttr::Shared:
+      attr = fir::CUDAAttribute::Shared;
+      break;
+    case Fortran::common::CUDADataAttr::Texture:
+      // Obsolete attribute: `attr` is never set, so return a null attribute.
+      return {};
+    }
+
+    return fir::CUDAAttributeAttr::get(mlirContext, attr);
+  }
+  return {};
+}
+
 /// Map a symbol to its FIR address and evaluated specification expressions.
 /// Not for symbols lowered to fir.box.
 /// Will optionally create fir.declare.
@@ -1618,6 +1650,8 @@ static void genDeclareSymbol(Fortran::lower::AbstractConverter &converter,
     auto name = converter.mangleName(sym);
     fir::FortranVariableFlagsAttr attributes =
         Fortran::lower::translateSymbolAttributes(builder.getContext(), sym);
+    fir::CUDAAttributeAttr cudaAttr =
+        Fortran::lower::translateSymbolCUDAAttribute(builder.getContext(), sym);
 
     if (isCrayPointee) {
       mlir::Type baseType =
@@ -1664,7 +1698,7 @@ static void genDeclareSymbol(Fortran::lower::AbstractConverter &converter,
       return;
     }
     auto newBase = builder.create<hlfir::DeclareOp>(
-        loc, base, name, shapeOrShift, lenParams, attributes);
+        loc, base, name, shapeOrShift, lenParams, attributes, cudaAttr);
     symMap.addVariableDefinition(sym, newBase, force);
     return;
   }
@@ -1709,9 +1743,12 @@ void Fortran::lower::genDeclareSymbol(
     fir::FortranVariableFlagsAttr attributes =
         Fortran::lower::translateSymbolAttributes(
             builder.getContext(), sym.GetUltimate(), extraFlags);
+    fir::CUDAAttributeAttr cudaAttr =
+        Fortran::lower::translateSymbolCUDAAttribute(builder.getContext(),
+                                                     sym.GetUltimate());
     auto name = converter.mangleName(sym);
     hlfir::EntityWithAttributes declare =
-        hlfir::genDeclare(loc, builder, exv, name, attributes);
+        hlfir::genDeclare(loc, builder, exv, name, attributes, cudaAttr);
     symMap.addVariableDefinition(sym, declare.getIfVariableInterface(), force);
     return;
   }
diff --git a/flang/lib/Optimizer/Builder/HLFIRTools.cpp b/flang/lib/Optimizer/Builder/HLFIRTools.cpp
index 94f723b4bae703..61e53117da44da 100644
--- a/flang/lib/Optimizer/Builder/HLFIRTools.cpp
+++ b/flang/lib/Optimizer/Builder/HLFIRTools.cpp
@@ -198,7 +198,8 @@ mlir::Value hlfir::Entity::getFirBase() const {
 fir::FortranVariableOpInterface
 hlfir::genDeclare(mlir::Location loc, fir::FirOpBuilder &builder,
                   const fir::ExtendedValue &exv, llvm::StringRef name,
-                  fir::FortranVariableFlagsAttr flags) {
+                  fir::FortranVariableFlagsAttr flags,
+                  fir::CUDAAttributeAttr cudaAttr) {
 
   mlir::Value base = fir::getBase(exv);
   assert(fir::conformsWithPassByRef(base.getType()) &&
@@ -228,7 +229,7 @@ hlfir::genDeclare(mlir::Location loc, fir::FirOpBuilder &builder,
       },
       [](const auto &) {});
   auto declareOp = builder.create<hlfir::DeclareOp>(
-      loc, base, name, shapeOrShift, lenParams, flags);
+      loc, base, name, shapeOrShift, lenParams, flags, cudaAttr);
   return mlir::cast<fir::FortranVariableOpInterface>(declareOp.getOperation());
 }
 
diff --git a/flang/lib/Optimizer/Dialect/FIRAttr.cpp b/flang/lib/Optimizer/Dialect/FIRAttr.cpp
index 487109121db0c1..04431b6afdce28 100644
--- a/flang/lib/Optimizer/Dialect/FIRAttr.cpp
+++ b/flang/lib/Optimizer/Dialect/FIRAttr.cpp
@@ -14,6 +14,7 @@
 #include "flang/Optimizer/Dialect/FIRDialect.h"
 #include "flang/Optimizer/Dialect/Support/KindMapping.h"
 #include "mlir/IR/AttributeSupport.h"
+#include "mlir/IR/Builders.h"
 #include "mlir/IR/BuiltinTypes.h"
 #include "mlir/IR/DialectImplementation.h"
 #include "llvm/ADT/SmallString.h"
@@ -297,5 +298,5 @@ void fir::printFirAttribute(FIROpsDialect *dialect, mlir::Attribute attr,
 void FIROpsDialect::registerAttributes() {
   addAttributes<ClosedIntervalAttr, ExactTypeAttr, FortranVariableFlagsAttr,
                 LowerBoundAttr, PointIntervalAttr, RealAttr, SubclassAttr,
-                UpperBoundAttr>();
+                UpperBoundAttr, CUDAAttributeAttr>();
 }
diff --git a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp
index ce12e6fd49c6d9..85644c14748fc9 100644
--- a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp
+++ b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp
@@ -123,14 +123,15 @@ void hlfir::DeclareOp::build(mlir::OpBuilder &builder,
                              mlir::OperationState &result, mlir::Value memref,
                              llvm::StringRef uniq_name, mlir::Value shape,
                              mlir::ValueRange typeparams,
-                             fir::FortranVariableFlagsAttr fortran_attrs) {
+                             fir::FortranVariableFlagsAttr fortran_attrs,
+                             fir::CUDAAttributeAttr cuda_attr) {
   auto nameAttr = builder.getStringAttr(uniq_name);
   mlir::Type inputType = memref.getType();
   bool hasExplicitLbs = hasExplicitLowerBounds(shape);
   mlir::Type hlfirVariableType =
       getHLFIRVariableType(inputType, hasExplicitLbs);
   build(builder, result, {hlfirVariableType, inputType}, memref, shape,
-        typeparams, nameAttr, fortran_attrs);
+        typeparams, nameAttr, fortran_attrs, cuda_attr);
 }
 
 mlir::LogicalResult hlfir::DeclareOp::verify() {
diff --git a/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp b/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp
index b69018560e3a3e..b15fb590620150 100644
--- a/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp
+++ b/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp
@@ -320,12 +320,16 @@ class DeclareOpConversion : public mlir::OpRewritePattern<hlfir::DeclareOp> {
     mlir::Location loc = declareOp->getLoc();
     mlir::Value memref = declareOp.getMemref();
     fir::FortranVariableFlagsAttr fortranAttrs;
+    fir::CUDAAttributeAttr cudaAttr;
     if (auto attrs = declareOp.getFortranAttrs())
       fortranAttrs =
           fir::FortranVariableFlagsAttr::get(rewriter.getContext(), *attrs);
+    if (auto attr = declareOp.getCudaAttr())
+      cudaAttr = fir::CUDAAttributeAttr::get(rewriter.getContext(), *attr);
     auto firDeclareOp = rewriter.create<fir::DeclareOp>(
         loc, memref.getType(), memref, declareOp.getShape(),
-        declareOp.getTypeparams(), declareOp.getUniqName(), fortranAttrs);
+        declareOp.getTypeparams(), declareOp.getUniqName(), fortranAttrs,
+        cudaAttr);
 
     // Propagate other attributes from hlfir.declare to fir.declare.
     // OpenACC's acc.declare is one example. Right now, the propagation
diff --git a/flang/test/Lower/CUDA/cuda-data-attribute.cuf b/flang/test/Lower/CUDA/cuda-data-attribute.cuf
new file mode 100644
index 00000000000000..caa8ac7baff383
--- /dev/null
+++ b/flang/test/Lower/CUDA/cuda-data-attribute.cuf
@@ -0,0 +1,22 @@
+! RUN: bbc -emit-hlfir -fcuda %s -o - | FileCheck %s
+! RUN: bbc -emit-hlfir -fcuda %s -o - | fir-opt -convert-hlfir-to-fir | FileCheck %s --check-prefix=FIR
+
+! Test lowering of CUDA attribute on local variables.
+
+subroutine local_var_attrs
+  real, constant :: rc
+  real, device :: rd
+  real, allocatable, managed :: rm
+  real, allocatable, pinned :: rp
+end subroutine
+
+! CHECK-LABEL: func.func @_QPlocal_var_attrs()
+! CHECK: %{{.*}}:2 = hlfir.declare %{{.*}} {cuda_attr = #fir.cuda<constant>, uniq_name = "_QFlocal_var_attrsErc"} : (!fir.ref<f32>) -> (!fir.ref<f32>, !fir.ref<f32>)
+! CHECK: %{{.*}}:2 = hlfir.declare %{{.*}} {cuda_attr = #fir.cuda<device>, uniq_name = "_QFlocal_var_attrsErd"} : (!fir.ref<f32>) -> (!fir.ref<f32>, !fir.ref<f32>)
+! CHECK: %{{.*}}:2 = hlfir.declare %{{.*}} {cuda_attr = #fir.cuda<managed>, fortran_attrs = #fir.var_attrs<allocatable>, uniq_name = "_QFlocal_var_attrsErm"} : (!fir.ref<!fir.box<!fir.heap<f32>>>) -> (!fir.ref<!fir.box<!fir.heap<f32>>>, !fir.ref<!fir.box<!fir.heap<f32>>>)
+! CHECK: %{{.*}}:2 = hlfir.declare %{{.*}} {cuda_attr = #fir.cuda<pinned>, fortran_attrs = #fir.var_attrs<allocatable>, uniq_name = "_QFlocal_var_attrsErp"} : (!fir.ref<!fir.box<!fir.heap<f32>>>) -> (!fir.ref<!fir.box<!fir.heap<f32>>>, !fir.ref<!fir.box<!fir.heap<f32>>>)
+
+! FIR: %{{.*}} = fir.declare %{{.*}} {cuda_attr = #fir.cuda<constant>, uniq_name = "_QFlocal_var_attrsErc"} : (!fir.ref<f32>) -> !fir.ref<f32>
+! FIR: %{{.*}} = fir.declare %{{.*}} {cuda_attr = #fir.cuda<device>, uniq_name = "_QFlocal_var_attrsErd"} : (!fir.ref<f32>) -> !fir.ref<f32>
+! FIR: %{{.*}} = fir.declare %{{.*}} {cuda_attr = #fir.cuda<managed>, fortran_attrs = #fir.var_attrs<allocatable>, uniq_name = "_QFlocal_var_attrsErm"} : (!fir.ref<!fir.box<!fir.heap<f32>>>) -> !fir.ref<!fir.box<!fir.heap<f32>>>
+! FIR: %{{.*}} = fir.declare %{{.*}} {cuda_attr = #fir.cuda<pinned>, fortran_attrs = #fir.var_attrs<allocatable>, uniq_name = "_QFlocal_var_attrsErp"} : (!fir.ref<!fir.box<!fir.heap<f32>>>) -> !fir.ref<!fir.box<!fir.heap<f32>>>
diff --git a/flang/unittests/Optimizer/FortranVariableTest.cpp b/flang/unittests/Optimizer/FortranVariableTest.cpp
index 42ed2257f58057..4b101ce61f93ba 100644
--- a/flang/unittests/Optimizer/FortranVariableTest.cpp
+++ b/flang/unittests/Optimizer/FortranVariableTest.cpp
@@ -49,7 +49,8 @@ TEST_F(FortranVariableTest, SimpleScalar) {
   auto name = mlir::StringAttr::get(&context, "x");
   auto declare = builder->create<fir::DeclareOp>(loc, addr.getType(), addr,
       /*shape=*/mlir::Value{}, /*typeParams=*/std::nullopt, name,
-      /*fortran_attrs=*/fir::FortranVariableFlagsAttr{});
+      /*fortran_attrs=*/fir::FortranVariableFlagsAttr{},
+      /*cuda_attr=*/fir::CUDAAttributeAttr{});
 
   fir::FortranVariableOpInterface fortranVariable = declare;
   EXPECT_FALSE(fortranVariable.isArray());
@@ -74,7 +75,8 @@ TEST_F(FortranVariableTest, CharacterScalar) {
   auto name = mlir::StringAttr::get(&context, "x");
   auto declare = builder->create<fir::DeclareOp>(loc, addr.getType(), addr,
       /*shape=*/mlir::Value{}, typeParams, name,
-      /*fortran_attrs=*/fir::FortranVariableFlagsAttr{});
+      /*fortran_attrs=*/fir::FortranVariableFlagsAttr{},
+      /*cuda_attr=*/fir::CUDAAttributeAttr{});
 
   fir::FortranVariableOpInterface fortranVariable = declare;
   EXPECT_FALSE(fortranVariable.isArray());
@@ -104,7 +106,8 @@ TEST_F(FortranVariableTest, SimpleArray) {
   auto name = mlir::StringAttr::get(&context, "x");
   auto declare = builder->create<fir::DeclareOp>(loc, addr.getType(), addr,
       shape, /*typeParams*/ std::nullopt, name,
-      /*fortran_attrs=*/fir::FortranVariableFlagsAttr{});
+      /*fortran_attrs=*/fir::FortranVariableFlagsAttr{},
+      /*cuda_attr=*/fir::CUDAAttributeAttr{});
 
   fir::FortranVariableOpInterface fortranVariable = declare;
   EXPECT_TRUE(fortranVariable.isArray());
@@ -134,7 +137,8 @@ TEST_F(FortranVariableTest, CharacterArray) {
   auto name = mlir::StringAttr::get(&context, "x");
   auto declare = builder->create<fir::DeclareOp>(loc, addr.getType(), addr,
       shape, typeParams, name,
-      /*fortran_attrs=*/fir::FortranVariableFlagsAttr{});
+      /*fortran_attrs=*/fir::FortranVariableFlagsAttr{},
+      /*cuda_attr=*/fir::CUDAAttributeAttr{});
 
   fir::FortranVariableOpInterface fortranVariable = declare;
   EXPECT_TRUE(fortranVariable.isArray());

>From 807048b2c6e40d1ae625edb968133614a24b2989 Mon Sep 17 00:00:00 2001
From: Krystian Stasiowski <sdkrystian at gmail.com>
Date: Thu, 8 Feb 2024 13:04:10 -0500
Subject: [PATCH 63/72] [Clang][Sema] Abbreviated function templates do not
 append invented parameters to empty template parameter lists (#80864)

According to [dcl.fct] p23:
> An abbreviated function template can have a _template-head_. The
invented _template-parameters_ are appended to the
_template-parameter-list_ after the explicitly declared
_template-parameters_.

`template<>` is not a _template-head_ -- a _template-head_ must have at
least one _template-parameter_. This patch corrects our current behavior
of appending the invented template parameters to the innermost template
parameter list, regardless of whether it is empty. Example:
```
template<typename T>
struct A
{
    void f(auto);
};

template<>
void A<int>::f(auto); // ok

template<>
template<> // warning: extraneous template parameter list in template specialization
void A<int>::f(auto);
```
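
The rule for where the invented parameters end up can be sketched as a small stand-alone model. This is not Clang code: `ParamList`, `attachInvented`, and `checkAttachSketch` are hypothetical names, the depth values are chosen only for illustration, and the behaviour of the `else` branch (appending the invented list as a new list) is an assumption based on the surrounding code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified model of a template parameter list.
// An empty `params` vector models `template<>`.
struct ParamList {
  unsigned depth;
  std::vector<std::string> params;
};

// Attach the invented list to the explicitly written lists. `invented` is
// expected to already hold any explicit parameters of the same depth,
// followed by the parameters invented for `auto`.
void attachInvented(std::vector<ParamList> &lists, const ParamList &invented) {
  if (!lists.empty() && !lists.back().params.empty() &&
      invented.depth == lists.back().depth)
    lists.back() = invented; // extends a real template-head
  else
    lists.push_back(invented); // `template<>` is left untouched
}

// Small usage check for the sketch.
void checkAttachSketch() {
  // template<typename T> void f(auto);
  std::vector<ParamList> withHead;
  withHead.push_back(ParamList{0, {"T"}});
  attachInvented(withHead, ParamList{0, {"T", "auto:1"}});
  assert(withHead.size() == 1);
  assert(withHead.back().params.size() == 2);

  // template<> void A<int>::f(auto);
  std::vector<ParamList> withEmpty;
  withEmpty.push_back(ParamList{0, {}});
  attachInvented(withEmpty, ParamList{0, {"auto:1"}});
  assert(withEmpty.size() == 2);
  assert(withEmpty.front().params.empty());
}
```

In the second case the empty list survives as its own entry, so a later check can still see it as a specialization header and diagnose an extra one.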
---
 clang/docs/ReleaseNotes.rst                   |  2 ++
 clang/include/clang/AST/DeclTemplate.h        |  1 +
 clang/lib/AST/DeclPrinter.cpp                 |  4 ++++
 clang/lib/Sema/SemaDecl.cpp                   |  2 +-
 clang/lib/Sema/SemaDeclCXX.cpp                | 11 ++++++++-
 clang/test/AST/ast-print-method-decl.cpp      |  3 +--
 .../CXX/dcl.decl/dcl.meaning/dcl.fct/p23.cpp  | 24 +++++++++++++++++++
 clang/test/OpenMP/for_loop_auto.cpp           |  2 +-
 8 files changed, 44 insertions(+), 5 deletions(-)
 create mode 100644 clang/test/CXX/dcl.decl/dcl.meaning/dcl.fct/p23.cpp

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 57c46c36955a8a..0072495354b8eb 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -216,6 +216,8 @@ Bug Fixes to C++ Support
   Fixes (`#68490 <https://github.com/llvm/llvm-project/issues/68490>`_)
 - Fix a crash when trying to call a varargs function that also has an explicit object parameter.
   Fixes (`#80971 ICE when explicit object parameter be a function parameter pack`)
+- Fixed a bug where abbreviated function templates would append their invented template parameters to
+  an empty template parameter list.
 
 Bug Fixes to AST Handling
 ^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/clang/include/clang/AST/DeclTemplate.h b/clang/include/clang/AST/DeclTemplate.h
index baf71145d99dc6..e3b6a7efb1127a 100644
--- a/clang/include/clang/AST/DeclTemplate.h
+++ b/clang/include/clang/AST/DeclTemplate.h
@@ -134,6 +134,7 @@ class TemplateParameterList final
   const_iterator end() const { return begin() + NumParams; }
 
   unsigned size() const { return NumParams; }
+  bool empty() const { return NumParams == 0; }
 
   ArrayRef<NamedDecl *> asArray() { return llvm::ArrayRef(begin(), end()); }
   ArrayRef<const NamedDecl*> asArray() const {
diff --git a/clang/lib/AST/DeclPrinter.cpp b/clang/lib/AST/DeclPrinter.cpp
index 822ac12c4c7dd4..43d221968ea3fb 100644
--- a/clang/lib/AST/DeclPrinter.cpp
+++ b/clang/lib/AST/DeclPrinter.cpp
@@ -1215,6 +1215,10 @@ void DeclPrinter::printTemplateParameters(const TemplateParameterList *Params,
                                           bool OmitTemplateKW) {
   assert(Params);
 
+  // Don't print invented template parameter lists.
+  if (!Params->empty() && Params->getParam(0)->isImplicit())
+    return;
+
   if (!OmitTemplateKW)
     Out << "template ";
   Out << '<';
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 18a5d93ab8e8c6..2c526cd0d0e675 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -9759,7 +9759,7 @@ Sema::ActOnFunctionDeclarator(Scope *S, Declarator &D, DeclContext *DC,
   SmallVector<TemplateParameterList *, 4> TemplateParamLists;
   llvm::append_range(TemplateParamLists, TemplateParamListsRef);
   if (TemplateParameterList *Invented = D.getInventedTemplateParameterList()) {
-    if (!TemplateParamLists.empty() &&
+    if (!TemplateParamLists.empty() && !TemplateParamLists.back()->empty() &&
         Invented->getDepth() == TemplateParamLists.back()->getDepth())
       TemplateParamLists.back() = Invented;
     else
diff --git a/clang/lib/Sema/SemaDeclCXX.cpp b/clang/lib/Sema/SemaDeclCXX.cpp
index 940d407be808a9..2e9963c46cbefa 100644
--- a/clang/lib/Sema/SemaDeclCXX.cpp
+++ b/clang/lib/Sema/SemaDeclCXX.cpp
@@ -19313,7 +19313,16 @@ void Sema::ActOnStartFunctionDeclarationDeclarator(
         ExplicitLists, /*IsFriend=*/false, IsMemberSpecialization, IsInvalid,
         /*SuppressDiagnostic=*/true);
   }
-  if (ExplicitParams) {
+  // C++23 [dcl.fct]p23:
+  //   An abbreviated function template can have a template-head. The invented
+  //   template-parameters are appended to the template-parameter-list after
+  //   the explicitly declared template-parameters.
+  //
+  // A template-head must have one or more template-parameters (read:
+  // 'template<>' is *not* a template-head). Only append the invented
+  // template parameters if we matched the nested-name-specifier to a non-empty
+  // TemplateParameterList.
+  if (ExplicitParams && !ExplicitParams->empty()) {
     Info.AutoTemplateParameterDepth = ExplicitParams->getDepth();
     llvm::append_range(Info.TemplateParams, *ExplicitParams);
     Info.NumExplicitTemplateParams = ExplicitParams->size();
diff --git a/clang/test/AST/ast-print-method-decl.cpp b/clang/test/AST/ast-print-method-decl.cpp
index 9f5d1126099442..75dea0cac16be1 100644
--- a/clang/test/AST/ast-print-method-decl.cpp
+++ b/clang/test/AST/ast-print-method-decl.cpp
@@ -32,8 +32,7 @@ struct DelegatingCtor2 {
 
 // CHECK: struct DelegatingCtor3 {
 struct DelegatingCtor3 {
-  // FIXME: template <> should not be output
-  // CHECK: template <> DelegatingCtor3(auto);
+  // CHECK: DelegatingCtor3(auto);
   DelegatingCtor3(auto);
 
   // FIXME: Implicitly specialized method should not be output
diff --git a/clang/test/CXX/dcl.decl/dcl.meaning/dcl.fct/p23.cpp b/clang/test/CXX/dcl.decl/dcl.meaning/dcl.fct/p23.cpp
new file mode 100644
index 00000000000000..469c4e091953c3
--- /dev/null
+++ b/clang/test/CXX/dcl.decl/dcl.meaning/dcl.fct/p23.cpp
@@ -0,0 +1,24 @@
+// RUN: %clang_cc1 -std=c++20 -pedantic-errors -verify %s
+
+// FIXME: This should be an error with -pedantic-errors.
+template<> // expected-warning {{extraneous template parameter list in template specialization}}
+void f(auto);
+
+template<typename>
+void f(auto);
+
+template<typename T>
+struct A {
+  void g(auto);
+};
+
+template<typename T>
+void A<T>::g(auto) { }
+
+template<>
+void A<int>::g(auto) { }
+
+// FIXME: This should be an error with -pedantic-errors.
+template<>
+template<> // expected-warning {{extraneous template parameter list in template specialization}}
+void A<long>::g(auto) { }
diff --git a/clang/test/OpenMP/for_loop_auto.cpp b/clang/test/OpenMP/for_loop_auto.cpp
index b2c5540a7785ab..4467de6bba18dc 100644
--- a/clang/test/OpenMP/for_loop_auto.cpp
+++ b/clang/test/OpenMP/for_loop_auto.cpp
@@ -10,7 +10,7 @@
 #ifndef HEADER
 #define HEADER
 
-// CHECK:      template <> void do_loop(const auto &v) {
+// CHECK:      void do_loop(const auto &v) {
 // CHECK-NEXT: #pragma omp parallel for
 // CHECK-NEXT:    for (const auto &i : v)
 // CHECK-NEXT:      ;

>From 49f96507eaa09cdfaec7e5f03c5f5734c299011d Mon Sep 17 00:00:00 2001
From: Peiming Liu <36770114+PeimingLiu at users.noreply.github.com>
Date: Thu, 8 Feb 2024 12:12:24 -0600
Subject: [PATCH 64/72] [mlir][sparse] using non-static field to avoid data
 races. (#81165)

---
 .../Transforms/Utils/LoopEmitter.cpp          | 15 +++---
 .../Transforms/Utils/LoopEmitter.h            |  1 +
 .../Transforms/Utils/SparseTensorLevel.cpp    | 48 ++++++++++++-------
 .../Transforms/Utils/SparseTensorLevel.h      | 20 ++++----
 4 files changed, 50 insertions(+), 34 deletions(-)

diff --git a/mlir/lib/Dialect/SparseTensor/Transforms/Utils/LoopEmitter.cpp b/mlir/lib/Dialect/SparseTensor/Transforms/Utils/LoopEmitter.cpp
index 1c2857d868a604..0ead135c90d305 100644
--- a/mlir/lib/Dialect/SparseTensor/Transforms/Utils/LoopEmitter.cpp
+++ b/mlir/lib/Dialect/SparseTensor/Transforms/Utils/LoopEmitter.cpp
@@ -94,7 +94,7 @@ void LoopEmitter::initialize(ValueRange ts, StringAttr loopTag, bool hasOutput,
   this->loopTag = loopTag;
   this->hasOutput = hasOutput;
   this->isSparseOut = isSparseOut;
-  SparseIterator::setSparseEmitStrategy(emitStrategy);
+  this->emitStrategy = emitStrategy;
 
   const unsigned numManifestTensors = ts.size();
   const unsigned synTensorId = numManifestTensors;
@@ -166,13 +166,13 @@ void LoopEmitter::initialize(ValueRange ts, StringAttr loopTag, bool hasOutput,
 std::unique_ptr<SparseIterator>
 LoopEmitter::makeLevelIterator(OpBuilder &builder, Location loc, TensorId t,
                                Level l) {
-  auto it = makeSimpleIterator(*lvls[t][l]);
+  auto it = makeSimpleIterator(*lvls[t][l], emitStrategy);
   auto stt = getSparseTensorType(tensors[t]);
   if (stt.hasEncoding() && stt.getEncoding().isSlice()) {
     Value offset = genSliceOffset(builder, loc, tensors[t], l);
     Value stride = genSliceStride(builder, loc, tensors[t], l);
-    auto slicedIt = makeSlicedLevelIterator(std::move(it), offset, stride,
-                                            lvls[t][l]->getSize());
+    auto slicedIt = makeSlicedLevelIterator(
+        std::move(it), offset, stride, lvls[t][l]->getSize(), emitStrategy);
     return slicedIt;
   }
   return it;
@@ -186,7 +186,7 @@ void LoopEmitter::initializeLoopEmit(
     TensorId synId = getSynTensorId();
     for (unsigned i = 0, e = loopHighs.size(); i < e; i++) {
       Value sz = loopHighs[i] = synSetter(builder, loc, i);
-      auto [stl, it] = makeSynLevelAndIterator(sz, synId, i);
+      auto [stl, it] = makeSynLevelAndIterator(sz, synId, i, emitStrategy);
       lvls[synId][i] = std::move(stl);
       iters[synId][i].emplace_back(std::move(it));
     }
@@ -317,12 +317,13 @@ void LoopEmitter::initSubSectIterator(OpBuilder &builder, Location loc) {
           size = ADDI(size, ADDI(MULI(idxMax, C_IDX(stride)), C_IDX(1)));
         }
         it = makeNonEmptySubSectIterator(builder, loc, parent, loopHighs[loop],
-                                         std::move(lvlIt), size, curDep.second);
+                                         std::move(lvlIt), size, curDep.second,
+                                         emitStrategy);
       } else {
         const SparseIterator &subSectIter = *iters[t][lvl].back();
         it = makeTraverseSubSectIterator(builder, loc, subSectIter, *parent,
                                          std::move(lvlIt), loopHighs[loop],
-                                         curDep.second);
+                                         curDep.second, emitStrategy);
       }
       lastIter[t] = it.get();
       iters[t][lvl].emplace_back(std::move(it));
diff --git a/mlir/lib/Dialect/SparseTensor/Transforms/Utils/LoopEmitter.h b/mlir/lib/Dialect/SparseTensor/Transforms/Utils/LoopEmitter.h
index 5bab2c6a86081f..7bfe713cdd9f74 100644
--- a/mlir/lib/Dialect/SparseTensor/Transforms/Utils/LoopEmitter.h
+++ b/mlir/lib/Dialect/SparseTensor/Transforms/Utils/LoopEmitter.h
@@ -380,6 +380,7 @@ class LoopEmitter {
   /// tensor.
   bool hasOutput;
   bool isSparseOut;
+  SparseEmitStrategy emitStrategy;
 
   //
   // Fields which have `numTensor` many entries.
diff --git a/mlir/lib/Dialect/SparseTensor/Transforms/Utils/SparseTensorLevel.cpp b/mlir/lib/Dialect/SparseTensor/Transforms/Utils/SparseTensorLevel.cpp
index 04b49c320f07a5..4ba9ecbe03c72d 100644
--- a/mlir/lib/Dialect/SparseTensor/Transforms/Utils/SparseTensorLevel.cpp
+++ b/mlir/lib/Dialect/SparseTensor/Transforms/Utils/SparseTensorLevel.cpp
@@ -773,9 +773,6 @@ class SubSectIterator : public SparseIterator {
 // SparseIterator derived classes implementation.
 //===----------------------------------------------------------------------===//
 
-SparseEmitStrategy SparseIterator::emitStrategy =
-    SparseEmitStrategy::kFunctional;
-
 void SparseIterator::genInit(OpBuilder &b, Location l,
                              const SparseIterator *p) {
   if (emitStrategy == SparseEmitStrategy::kDebugInterface) {
@@ -1303,27 +1300,38 @@ sparse_tensor::makeSparseTensorLevel(OpBuilder &b, Location l, Value t,
 }
 
 std::pair<std::unique_ptr<SparseTensorLevel>, std::unique_ptr<SparseIterator>>
-sparse_tensor::makeSynLevelAndIterator(Value sz, unsigned tid, unsigned lvl) {
+sparse_tensor::makeSynLevelAndIterator(Value sz, unsigned tid, unsigned lvl,
+                                       SparseEmitStrategy strategy) {
   auto stl = std::make_unique<DenseLevel>(tid, lvl, sz, /*encoded=*/false);
   auto it = std::make_unique<TrivialIterator>(*stl);
+  it->setSparseEmitStrategy(strategy);
   return std::make_pair(std::move(stl), std::move(it));
 }
 
 std::unique_ptr<SparseIterator>
-sparse_tensor::makeSimpleIterator(const SparseTensorLevel &stl) {
+sparse_tensor::makeSimpleIterator(const SparseTensorLevel &stl,
+                                  SparseEmitStrategy strategy) {
+  std::unique_ptr<SparseIterator> ret;
   if (!isUniqueLT(stl.getLT())) {
     // We always dedupliate the non-unique level, but we should optimize it away
     // if possible.
-    return std::make_unique<DedupIterator>(stl);
+    ret = std::make_unique<DedupIterator>(stl);
+  } else {
+    ret = std::make_unique<TrivialIterator>(stl);
   }
-  return std::make_unique<TrivialIterator>(stl);
+  ret->setSparseEmitStrategy(strategy);
+  return ret;
 }
 
 std::unique_ptr<SparseIterator>
 sparse_tensor::makeSlicedLevelIterator(std::unique_ptr<SparseIterator> &&sit,
-                                       Value offset, Value stride, Value size) {
+                                       Value offset, Value stride, Value size,
+                                       SparseEmitStrategy strategy) {
 
-  return std::make_unique<FilterIterator>(std::move(sit), offset, stride, size);
+  auto ret =
+      std::make_unique<FilterIterator>(std::move(sit), offset, stride, size);
+  ret->setSparseEmitStrategy(strategy);
+  return ret;
 }
 
 static const SparseIterator *tryUnwrapFilter(const SparseIterator *it) {
@@ -1335,38 +1343,42 @@ static const SparseIterator *tryUnwrapFilter(const SparseIterator *it) {
 
 std::unique_ptr<SparseIterator> sparse_tensor::makeNonEmptySubSectIterator(
     OpBuilder &b, Location l, const SparseIterator *parent, Value loopBound,
-    std::unique_ptr<SparseIterator> &&delegate, Value size, unsigned stride) {
+    std::unique_ptr<SparseIterator> &&delegate, Value size, unsigned stride,
+    SparseEmitStrategy strategy) {
 
   // Try unwrap the NonEmptySubSectIterator from a filter parent.
   parent = tryUnwrapFilter(parent);
-  auto it = std::make_unique<NonEmptySubSectIterator>(
-      b, l, parent, std::move(delegate), size);
+  std::unique_ptr<SparseIterator> it =
+      std::make_unique<NonEmptySubSectIterator>(b, l, parent,
+                                                std::move(delegate), size);
 
   if (stride != 1) {
     // TODO: We can safely skip bound checking on sparse levels, but for dense
     // iteration space, we need the bound to infer the dense loop range.
-    return std::make_unique<FilterIterator>(std::move(it), /*offset=*/C_IDX(0),
-                                            C_IDX(stride), /*size=*/loopBound);
+    it = std::make_unique<FilterIterator>(std::move(it), /*offset=*/C_IDX(0),
+                                          C_IDX(stride), /*size=*/loopBound);
   }
+  it->setSparseEmitStrategy(strategy);
   return it;
 }
 
 std::unique_ptr<SparseIterator> sparse_tensor::makeTraverseSubSectIterator(
     OpBuilder &b, Location l, const SparseIterator &subSectIter,
     const SparseIterator &parent, std::unique_ptr<SparseIterator> &&wrap,
-    Value loopBound, unsigned stride) {
+    Value loopBound, unsigned stride, SparseEmitStrategy strategy) {
 
   // This must be a subsection iterator or a filtered subsection iterator.
   auto &subSect =
       llvm::cast<NonEmptySubSectIterator>(*tryUnwrapFilter(&subSectIter));
 
-  auto it = std::make_unique<SubSectIterator>(
+  std::unique_ptr<SparseIterator> it = std::make_unique<SubSectIterator>(
       subSect, *tryUnwrapFilter(&parent), std::move(wrap));
 
   if (stride != 1) {
-    return std::make_unique<FilterIterator>(std::move(it), /*offset=*/C_IDX(0),
-                                            C_IDX(stride), /*size=*/loopBound);
+    it = std::make_unique<FilterIterator>(std::move(it), /*offset=*/C_IDX(0),
+                                          C_IDX(stride), /*size=*/loopBound);
   }
+  it->setSparseEmitStrategy(strategy);
   return it;
 }
 
diff --git a/mlir/lib/Dialect/SparseTensor/Transforms/Utils/SparseTensorLevel.h b/mlir/lib/Dialect/SparseTensor/Transforms/Utils/SparseTensorLevel.h
index fc2d9de66cfe72..d1e94b790bea6b 100644
--- a/mlir/lib/Dialect/SparseTensor/Transforms/Utils/SparseTensorLevel.h
+++ b/mlir/lib/Dialect/SparseTensor/Transforms/Utils/SparseTensorLevel.h
@@ -111,8 +111,8 @@ class SparseIterator {
 public:
   virtual ~SparseIterator() = default;
 
-  static void setSparseEmitStrategy(SparseEmitStrategy strategy) {
-    SparseIterator::emitStrategy = strategy;
+  void setSparseEmitStrategy(SparseEmitStrategy strategy) {
+    emitStrategy = strategy;
   }
 
   virtual std::string getDebugInterfacePrefix() const = 0;
@@ -248,7 +248,7 @@ class SparseIterator {
     return ref.take_front(cursorValsCnt);
   }
 
-  static SparseEmitStrategy emitStrategy;
+  SparseEmitStrategy emitStrategy;
 
 public:
   const IterKind kind;     // For LLVM-style RTTI.
@@ -277,32 +277,34 @@ std::unique_ptr<SparseTensorLevel> makeSparseTensorLevel(OpBuilder &builder,
 
 /// Helper function to create a simple SparseIterator object that iterate over
 /// the SparseTensorLevel.
-std::unique_ptr<SparseIterator>
-makeSimpleIterator(const SparseTensorLevel &stl);
+std::unique_ptr<SparseIterator> makeSimpleIterator(const SparseTensorLevel &stl,
+                                                   SparseEmitStrategy strategy);
 
 /// Helper function to create a synthetic SparseIterator object that iterate
 /// over a dense space specified by [0,`sz`).
 std::pair<std::unique_ptr<SparseTensorLevel>, std::unique_ptr<SparseIterator>>
-makeSynLevelAndIterator(Value sz, unsigned tid, unsigned lvl);
+makeSynLevelAndIterator(Value sz, unsigned tid, unsigned lvl,
+                        SparseEmitStrategy strategy);
 
 /// Helper function to create a SparseIterator object that iterate over a
 /// sliced space, the orignal space (before slicing) is traversed by `sit`.
 std::unique_ptr<SparseIterator>
 makeSlicedLevelIterator(std::unique_ptr<SparseIterator> &&sit, Value offset,
-                        Value stride, Value size);
+                        Value stride, Value size, SparseEmitStrategy strategy);
 
 /// Helper function to create a SparseIterator object that iterate over the
 /// non-empty subsections set.
 std::unique_ptr<SparseIterator> makeNonEmptySubSectIterator(
     OpBuilder &b, Location l, const SparseIterator *parent, Value loopBound,
-    std::unique_ptr<SparseIterator> &&delegate, Value size, unsigned stride);
+    std::unique_ptr<SparseIterator> &&delegate, Value size, unsigned stride,
+    SparseEmitStrategy strategy);
 
 /// Helper function to create a SparseIterator object that iterate over a
 /// non-empty subsection created by NonEmptySubSectIterator.
 std::unique_ptr<SparseIterator> makeTraverseSubSectIterator(
     OpBuilder &b, Location l, const SparseIterator &subsectIter,
     const SparseIterator &parent, std::unique_ptr<SparseIterator> &&wrap,
-    Value loopBound, unsigned stride);
+    Value loopBound, unsigned stride, SparseEmitStrategy strategy);
 
 } // namespace sparse_tensor
 } // namespace mlir

>From 84a30459df885da37c3e5a8c161ec0b7924fa773 Mon Sep 17 00:00:00 2001
From: Jan Svoboda <jan_svoboda at apple.com>
Date: Thu, 8 Feb 2024 19:19:18 +0100
Subject: [PATCH 65/72] [clang][lex] Always pass suggested module to
 `InclusionDirective()` callback (#81061)

This patch provides more information to the
`PPCallbacks::InclusionDirective()` hook. We now always pass the
suggested module, regardless of whether it was actually imported or not.
The extra `bool ModuleImported` parameter then denotes whether the
header `#include` will be automatically translated into an import of the
module.
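
As a rough illustration of the new contract (not code from this patch),
the sketch below models only the two trailing parameters. `Module`,
`IncludeRecord`, `RecordingCallbacks` and `countImported` are
hypothetical stand-ins invented for this example; the real callback is
`PPCallbacks::InclusionDirective()` and takes many more arguments. The
point it tries to show: callers that previously tested the module
pointer for null should now test the flag, because the pointer can be
non-null for a textual include.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for clang::Module; only a name is modelled here.
struct Module {
  std::string Name;
};

// One recorded #include, keeping just the two values this patch separates.
struct IncludeRecord {
  std::string FileName;
  const Module *SuggestedModule; // may be non-null even for a textual include
  bool ModuleImported;           // true only if the #include became an import
};

// Hypothetical, simplified callback holder (the real one derives from
// PPCallbacks and overrides InclusionDirective with the full signature).
class RecordingCallbacks {
public:
  void InclusionDirective(const std::string &FileName,
                          const Module *SuggestedModule,
                          bool ModuleImported) {
    Records.push_back({FileName, SuggestedModule, ModuleImported});
  }

  std::vector<IncludeRecord> Records;
};

// Counts includes that were translated into imports. Before the patch a
// callback would have checked `Imported != nullptr`; with the new
// parameters the equivalent check is the boolean flag.
inline std::size_t countImported(const RecordingCallbacks &CB) {
  std::size_t N = 0;
  for (const IncludeRecord &R : CB.Records)
    if (R.ModuleImported)
      ++N;
  return N;
}
```

In this sketch a record with a non-null `SuggestedModule` and
`ModuleImported == false` stands for a header that belongs to a module
but was still included textually.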

The main change is in `clang/lib/Lex/PPDirectives.cpp`, where we take
care to not modify `SuggestedModule` after it's been populated by
`LookupHeaderIncludeOrImport()`. We now exclusively use the `SM`
(`ModuleToImport`) variable instead, which has been equivalent to
`SuggestedModule` until now. This allows us to use the original
non-modified `SuggestedModule` for the callback itself.

(This patch turns out to be necessary for
https://github.com/apple/llvm-project/pull/8011).
---
 clang-tools-extra/clang-move/Move.cpp         |  3 +-
 .../ExpandModularHeadersPPCallbacks.cpp       |  6 +-
 .../ExpandModularHeadersPPCallbacks.h         |  2 +-
 .../altera/KernelNameRestrictionCheck.cpp     |  5 +-
 .../bugprone/SuspiciousIncludeCheck.cpp       |  7 +-
 .../clang-tidy/llvm/IncludeOrderCheck.cpp     |  7 +-
 .../RestrictSystemLibcHeadersCheck.cpp        |  9 +--
 .../misc/HeaderIncludeCycleCheck.cpp          |  2 +-
 .../modernize/DeprecatedHeadersCheck.cpp      |  7 +-
 .../clang-tidy/modernize/MacroToEnumCheck.cpp |  3 +-
 .../RestrictSystemIncludesCheck.cpp           |  4 +-
 .../portability/RestrictSystemIncludesCheck.h |  3 +-
 .../readability/DuplicateIncludeCheck.cpp     |  7 +-
 .../clang-tidy/utils/IncludeInserter.cpp      |  3 +-
 clang-tools-extra/clangd/Headers.cpp          |  3 +-
 clang-tools-extra/clangd/ParsedAST.cpp        |  2 +-
 .../clangd/index/IndexAction.cpp              |  3 +-
 .../clangd/unittests/ReplayPeambleTests.cpp   |  2 +-
 .../include-cleaner/lib/Record.cpp            |  6 +-
 .../modularize/CoverageChecker.cpp            |  3 +-
 .../modularize/PreprocessorTracker.cpp        | 20 +++---
 .../pp-trace/PPCallbacksTracker.cpp           |  6 +-
 .../pp-trace/PPCallbacksTracker.h             |  3 +-
 .../test/pp-trace/pp-trace-include.cpp        | 12 ++--
 clang/include/clang/Lex/PPCallbacks.h         | 16 +++--
 clang/include/clang/Lex/PreprocessingRecord.h |  3 +-
 .../DependencyScanning/ModuleDepCollector.h   |  3 +-
 clang/lib/CodeGen/MacroPPCallbacks.cpp        |  4 +-
 clang/lib/CodeGen/MacroPPCallbacks.h          |  3 +-
 clang/lib/Frontend/DependencyFile.cpp         |  3 +-
 clang/lib/Frontend/DependencyGraph.cpp        |  7 +-
 .../Frontend/ModuleDependencyCollector.cpp    |  3 +-
 clang/lib/Frontend/PrecompiledPreamble.cpp    |  3 +-
 .../lib/Frontend/PrintPreprocessedOutput.cpp  | 11 +--
 .../Frontend/Rewrite/InclusionRewriter.cpp    | 10 +--
 clang/lib/Lex/PPDirectives.cpp                | 70 +++++++++----------
 clang/lib/Lex/PreprocessingRecord.cpp         | 11 ++-
 .../DependencyScanning/ModuleDepCollector.cpp |  8 +--
 clang/tools/libclang/Indexing.cpp             |  5 +-
 clang/unittests/Lex/PPCallbacksTest.cpp       |  9 ++-
 40 files changed, 168 insertions(+), 129 deletions(-)

diff --git a/clang-tools-extra/clang-move/Move.cpp b/clang-tools-extra/clang-move/Move.cpp
index 1d10348430c281..ac16803b46783e 100644
--- a/clang-tools-extra/clang-move/Move.cpp
+++ b/clang-tools-extra/clang-move/Move.cpp
@@ -133,7 +133,8 @@ class FindAllIncludes : public PPCallbacks {
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef /*File*/, StringRef SearchPath,
                           StringRef /*RelativePath*/,
-                          const Module * /*Imported*/,
+                          const Module * /*SuggestedModule*/,
+                          bool /*ModuleImported*/,
                           SrcMgr::CharacteristicKind /*FileType*/) override {
     if (auto FileEntry = SM.getFileEntryRefForID(SM.getFileID(HashLoc)))
       MoveTool->addIncludes(FileName, IsAngled, SearchPath,
diff --git a/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp b/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp
index 5ecd4fb19131e4..5e2cc207560d33 100644
--- a/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp
+++ b/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp
@@ -166,12 +166,12 @@ void ExpandModularHeadersPPCallbacks::InclusionDirective(
     SourceLocation DirectiveLoc, const Token &IncludeToken,
     StringRef IncludedFilename, bool IsAngled, CharSourceRange FilenameRange,
     OptionalFileEntryRef IncludedFile, StringRef SearchPath,
-    StringRef RelativePath, const Module *Imported,
+    StringRef RelativePath, const Module *SuggestedModule, bool ModuleImported,
     SrcMgr::CharacteristicKind FileType) {
-  if (Imported) {
+  if (ModuleImported) {
     serialization::ModuleFile *MF =
         Compiler.getASTReader()->getModuleManager().lookup(
-            *Imported->getASTFile());
+            *SuggestedModule->getASTFile());
     handleModuleFile(MF);
   }
   parseToLocation(DirectiveLoc);
diff --git a/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.h b/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.h
index 3f6abc315e5b90..0742c21bc43720 100644
--- a/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.h
+++ b/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.h
@@ -69,7 +69,7 @@ class ExpandModularHeadersPPCallbacks : public PPCallbacks {
                           bool IsAngled, CharSourceRange FilenameRange,
                           OptionalFileEntryRef IncludedFile,
                           StringRef SearchPath, StringRef RelativePath,
-                          const Module *Imported,
+                          const Module *SuggestedModule, bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override;
 
   void EndOfMainFile() override;
diff --git a/clang-tools-extra/clang-tidy/altera/KernelNameRestrictionCheck.cpp b/clang-tools-extra/clang-tidy/altera/KernelNameRestrictionCheck.cpp
index 084e44a714d1ff..fb1e0e82a3149b 100644
--- a/clang-tools-extra/clang-tidy/altera/KernelNameRestrictionCheck.cpp
+++ b/clang-tools-extra/clang-tidy/altera/KernelNameRestrictionCheck.cpp
@@ -29,7 +29,8 @@ class KernelNameRestrictionPPCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FileNameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override;
 
   void EndOfMainFile() override;
@@ -61,7 +62,7 @@ void KernelNameRestrictionCheck::registerPPCallbacks(const SourceManager &SM,
 void KernelNameRestrictionPPCallbacks::InclusionDirective(
     SourceLocation HashLoc, const Token &, StringRef FileName, bool,
     CharSourceRange, OptionalFileEntryRef, StringRef, StringRef, const Module *,
-    SrcMgr::CharacteristicKind) {
+    bool, SrcMgr::CharacteristicKind) {
   IncludeDirective ID = {HashLoc, FileName};
   IncludeDirectives.push_back(std::move(ID));
 }
diff --git a/clang-tools-extra/clang-tidy/bugprone/SuspiciousIncludeCheck.cpp b/clang-tools-extra/clang-tidy/bugprone/SuspiciousIncludeCheck.cpp
index 61d89cf3081306..09ba79f0557525 100644
--- a/clang-tools-extra/clang-tidy/bugprone/SuspiciousIncludeCheck.cpp
+++ b/clang-tools-extra/clang-tidy/bugprone/SuspiciousIncludeCheck.cpp
@@ -26,7 +26,8 @@ class SuspiciousIncludePPCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override;
 
 private:
@@ -51,8 +52,8 @@ void SuspiciousIncludeCheck::registerPPCallbacks(
 void SuspiciousIncludePPCallbacks::InclusionDirective(
     SourceLocation HashLoc, const Token &IncludeTok, StringRef FileName,
     bool IsAngled, CharSourceRange FilenameRange, OptionalFileEntryRef File,
-    StringRef SearchPath, StringRef RelativePath, const Module *Imported,
-    SrcMgr::CharacteristicKind FileType) {
+    StringRef SearchPath, StringRef RelativePath, const Module *SuggestedModule,
+    bool ModuleImported, SrcMgr::CharacteristicKind FileType) {
   if (IncludeTok.getIdentifierInfo()->getPPKeywordID() == tok::pp_import)
     return;
 
diff --git a/clang-tools-extra/clang-tidy/llvm/IncludeOrderCheck.cpp b/clang-tools-extra/clang-tidy/llvm/IncludeOrderCheck.cpp
index bdd72f85e2a27c..4246c8c574c50d 100644
--- a/clang-tools-extra/clang-tidy/llvm/IncludeOrderCheck.cpp
+++ b/clang-tools-extra/clang-tidy/llvm/IncludeOrderCheck.cpp
@@ -27,7 +27,8 @@ class IncludeOrderPPCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override;
   void EndOfMainFile() override;
 
@@ -81,8 +82,8 @@ static int getPriority(StringRef Filename, bool IsAngled, bool IsMainModule) {
 void IncludeOrderPPCallbacks::InclusionDirective(
     SourceLocation HashLoc, const Token &IncludeTok, StringRef FileName,
     bool IsAngled, CharSourceRange FilenameRange, OptionalFileEntryRef File,
-    StringRef SearchPath, StringRef RelativePath, const Module *Imported,
-    SrcMgr::CharacteristicKind FileType) {
+    StringRef SearchPath, StringRef RelativePath, const Module *SuggestedModule,
+    bool ModuleImported, SrcMgr::CharacteristicKind FileType) {
   // We recognize the first include as a special main module header and want
   // to leave it in the top position.
   IncludeDirective ID = {HashLoc, FilenameRange, std::string(FileName),
diff --git a/clang-tools-extra/clang-tidy/llvmlibc/RestrictSystemLibcHeadersCheck.cpp b/clang-tools-extra/clang-tidy/llvmlibc/RestrictSystemLibcHeadersCheck.cpp
index 3451d3474fd906..b656917071a6ca 100644
--- a/clang-tools-extra/clang-tidy/llvmlibc/RestrictSystemLibcHeadersCheck.cpp
+++ b/clang-tools-extra/clang-tidy/llvmlibc/RestrictSystemLibcHeadersCheck.cpp
@@ -33,7 +33,8 @@ class RestrictedIncludesPPCallbacks
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override;
 
 private:
@@ -45,14 +46,14 @@ class RestrictedIncludesPPCallbacks
 void RestrictedIncludesPPCallbacks::InclusionDirective(
     SourceLocation HashLoc, const Token &IncludeTok, StringRef FileName,
     bool IsAngled, CharSourceRange FilenameRange, OptionalFileEntryRef File,
-    StringRef SearchPath, StringRef RelativePath, const Module *Imported,
-    SrcMgr::CharacteristicKind FileType) {
+    StringRef SearchPath, StringRef RelativePath, const Module *SuggestedModule,
+    bool ModuleImported, SrcMgr::CharacteristicKind FileType) {
   // Compiler provided headers are allowed (e.g stddef.h).
   if (SrcMgr::isSystem(FileType) && SearchPath == CompilerIncudeDir)
     return;
   portability::RestrictedIncludesPPCallbacks::InclusionDirective(
       HashLoc, IncludeTok, FileName, IsAngled, FilenameRange, File, SearchPath,
-      RelativePath, Imported, FileType);
+      RelativePath, SuggestedModule, ModuleImported, FileType);
 }
 
 void RestrictSystemLibcHeadersCheck::registerPPCallbacks(
diff --git a/clang-tools-extra/clang-tidy/misc/HeaderIncludeCycleCheck.cpp b/clang-tools-extra/clang-tidy/misc/HeaderIncludeCycleCheck.cpp
index bebd6e390ed53c..fadfdc869d37b0 100644
--- a/clang-tools-extra/clang-tidy/misc/HeaderIncludeCycleCheck.cpp
+++ b/clang-tools-extra/clang-tidy/misc/HeaderIncludeCycleCheck.cpp
@@ -83,7 +83,7 @@ class CyclicDependencyCallbacks : public PPCallbacks {
   void InclusionDirective(SourceLocation, const Token &, StringRef FilePath,
                           bool, CharSourceRange Range,
                           OptionalFileEntryRef File, StringRef, StringRef,
-                          const Module *,
+                          const Module *, bool,
                           SrcMgr::CharacteristicKind FileType) override {
     if (FileType != clang::SrcMgr::C_User)
       return;
diff --git a/clang-tools-extra/clang-tidy/modernize/DeprecatedHeadersCheck.cpp b/clang-tools-extra/clang-tidy/modernize/DeprecatedHeadersCheck.cpp
index 030a781e2099be..6d287eb3642dfa 100644
--- a/clang-tools-extra/clang-tidy/modernize/DeprecatedHeadersCheck.cpp
+++ b/clang-tools-extra/clang-tidy/modernize/DeprecatedHeadersCheck.cpp
@@ -32,7 +32,8 @@ class IncludeModernizePPCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override;
 
 private:
@@ -178,8 +179,8 @@ IncludeModernizePPCallbacks::IncludeModernizePPCallbacks(
 void IncludeModernizePPCallbacks::InclusionDirective(
     SourceLocation HashLoc, const Token &IncludeTok, StringRef FileName,
     bool IsAngled, CharSourceRange FilenameRange, OptionalFileEntryRef File,
-    StringRef SearchPath, StringRef RelativePath, const Module *Imported,
-    SrcMgr::CharacteristicKind FileType) {
+    StringRef SearchPath, StringRef RelativePath, const Module *SuggestedModule,
+    bool ModuleImported, SrcMgr::CharacteristicKind FileType) {
 
   // If we don't want to warn for non-main file reports and this is one, skip
   // it.
diff --git a/clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp b/clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp
index b197c22dca410e..0b47ed316ca271 100644
--- a/clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp
+++ b/clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp
@@ -117,7 +117,8 @@ class MacroToEnumCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override {
     clearCurrentEnum(HashLoc);
   }
diff --git a/clang-tools-extra/clang-tidy/portability/RestrictSystemIncludesCheck.cpp b/clang-tools-extra/clang-tidy/portability/RestrictSystemIncludesCheck.cpp
index 9ee0b4e6d3ccb8..db5693e3b7cb7d 100644
--- a/clang-tools-extra/clang-tidy/portability/RestrictSystemIncludesCheck.cpp
+++ b/clang-tools-extra/clang-tidy/portability/RestrictSystemIncludesCheck.cpp
@@ -21,8 +21,8 @@ namespace clang::tidy::portability {
 void RestrictedIncludesPPCallbacks::InclusionDirective(
     SourceLocation HashLoc, const Token &IncludeTok, StringRef FileName,
     bool IsAngled, CharSourceRange FilenameRange, OptionalFileEntryRef File,
-    StringRef SearchPath, StringRef RelativePath, const Module *Imported,
-    SrcMgr::CharacteristicKind FileType) {
+    StringRef SearchPath, StringRef RelativePath, const Module *SuggestedModule,
+    bool ModuleImported, SrcMgr::CharacteristicKind FileType) {
   if (!Check.contains(FileName) && SrcMgr::isSystem(FileType)) {
     SmallString<256> FullPath;
     llvm::sys::path::append(FullPath, SearchPath);
diff --git a/clang-tools-extra/clang-tidy/portability/RestrictSystemIncludesCheck.h b/clang-tools-extra/clang-tidy/portability/RestrictSystemIncludesCheck.h
index ad18e6f411dbbd..60fae5e73a6026 100644
--- a/clang-tools-extra/clang-tidy/portability/RestrictSystemIncludesCheck.h
+++ b/clang-tools-extra/clang-tidy/portability/RestrictSystemIncludesCheck.h
@@ -50,7 +50,8 @@ class RestrictedIncludesPPCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override;
   void EndOfMainFile() override;
 
diff --git a/clang-tools-extra/clang-tidy/readability/DuplicateIncludeCheck.cpp b/clang-tools-extra/clang-tidy/readability/DuplicateIncludeCheck.cpp
index d1f41e0ec79e21..67147164946ab4 100644
--- a/clang-tools-extra/clang-tidy/readability/DuplicateIncludeCheck.cpp
+++ b/clang-tools-extra/clang-tidy/readability/DuplicateIncludeCheck.cpp
@@ -47,7 +47,8 @@ class DuplicateIncludeCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override;
 
   void MacroDefined(const Token &MacroNameTok,
@@ -76,8 +77,8 @@ void DuplicateIncludeCallbacks::FileChanged(SourceLocation Loc,
 void DuplicateIncludeCallbacks::InclusionDirective(
     SourceLocation HashLoc, const Token &IncludeTok, StringRef FileName,
     bool IsAngled, CharSourceRange FilenameRange, OptionalFileEntryRef File,
-    StringRef SearchPath, StringRef RelativePath, const Module *Imported,
-    SrcMgr::CharacteristicKind FileType) {
+    StringRef SearchPath, StringRef RelativePath, const Module *SuggestedModule,
+    bool ModuleImported, SrcMgr::CharacteristicKind FileType) {
   if (llvm::is_contained(Files.back(), FileName)) {
     // We want to delete the entire line, so make sure that [Start,End] covers
     // everything.
diff --git a/clang-tools-extra/clang-tidy/utils/IncludeInserter.cpp b/clang-tools-extra/clang-tidy/utils/IncludeInserter.cpp
index d0b7474992abd0..b53016f331b793 100644
--- a/clang-tools-extra/clang-tidy/utils/IncludeInserter.cpp
+++ b/clang-tools-extra/clang-tidy/utils/IncludeInserter.cpp
@@ -25,7 +25,8 @@ class IncludeInserterCallback : public PPCallbacks {
                           bool IsAngled, CharSourceRange FileNameRange,
                           OptionalFileEntryRef /*IncludedFile*/,
                           StringRef /*SearchPath*/, StringRef /*RelativePath*/,
-                          const Module * /*ImportedModule*/,
+                          const Module * /*SuggestedModule*/,
+                          bool /*ModuleImported*/,
                           SrcMgr::CharacteristicKind /*FileType*/) override {
     Inserter->addInclude(FileNameRef, IsAngled, HashLocation,
                          IncludeToken.getEndLoc());
diff --git a/clang-tools-extra/clangd/Headers.cpp b/clang-tools-extra/clangd/Headers.cpp
index 076e636e0e2819..75f8668e7bef06 100644
--- a/clang-tools-extra/clangd/Headers.cpp
+++ b/clang-tools-extra/clangd/Headers.cpp
@@ -41,7 +41,8 @@ class IncludeStructure::RecordHeaders : public PPCallbacks {
                           OptionalFileEntryRef File,
                           llvm::StringRef /*SearchPath*/,
                           llvm::StringRef /*RelativePath*/,
-                          const clang::Module * /*Imported*/,
+                          const clang::Module * /*SuggestedModule*/,
+                          bool /*ModuleImported*/,
                           SrcMgr::CharacteristicKind FileKind) override {
     auto MainFID = SM.getMainFileID();
     // If an include is part of the preamble patch, translate #line directives.
diff --git a/clang-tools-extra/clangd/ParsedAST.cpp b/clang-tools-extra/clangd/ParsedAST.cpp
index 14a91797f4d2ea..bbb0e2c77b3f31 100644
--- a/clang-tools-extra/clangd/ParsedAST.cpp
+++ b/clang-tools-extra/clangd/ParsedAST.cpp
@@ -244,7 +244,7 @@ class ReplayPreamble : private PPCallbacks {
                             SynthesizedFilenameTok.getEndLoc())
               .toCharRange(SM),
           File, "SearchPath", "RelPath",
-          /*Imported=*/nullptr, Inc.FileKind);
+          /*SuggestedModule=*/nullptr, /*ModuleImported=*/false, Inc.FileKind);
       if (File)
         Delegate->FileSkipped(*File, SynthesizedFilenameTok, Inc.FileKind);
     }
diff --git a/clang-tools-extra/clangd/index/IndexAction.cpp b/clang-tools-extra/clangd/index/IndexAction.cpp
index 5d56285a839614..ed56c2a9d2e811 100644
--- a/clang-tools-extra/clangd/index/IndexAction.cpp
+++ b/clang-tools-extra/clangd/index/IndexAction.cpp
@@ -89,7 +89,8 @@ struct IncludeGraphCollector : public PPCallbacks {
                           llvm::StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, llvm::StringRef SearchPath,
-                          llvm::StringRef RelativePath, const Module *Imported,
+                          llvm::StringRef RelativePath,
+                          const Module *SuggestedModule, bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override {
     auto IncludeURI = toURI(File);
     if (!IncludeURI)
diff --git a/clang-tools-extra/clangd/unittests/ReplayPeambleTests.cpp b/clang-tools-extra/clangd/unittests/ReplayPeambleTests.cpp
index 472fe30ee46ed4..147d9abe691372 100644
--- a/clang-tools-extra/clangd/unittests/ReplayPeambleTests.cpp
+++ b/clang-tools-extra/clangd/unittests/ReplayPeambleTests.cpp
@@ -72,7 +72,7 @@ struct ReplayPreamblePPCallback : public PPCallbacks {
   void InclusionDirective(SourceLocation HashLoc, const Token &IncludeTok,
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange, OptionalFileEntryRef,
-                          StringRef, StringRef, const clang::Module *,
+                          StringRef, StringRef, const clang::Module *, bool,
                           SrcMgr::CharacteristicKind) override {
     Includes.emplace_back(SM, HashLoc, IncludeTok, FileName, IsAngled,
                           FilenameRange);
diff --git a/clang-tools-extra/include-cleaner/lib/Record.cpp b/clang-tools-extra/include-cleaner/lib/Record.cpp
index c93c56adf650d9..78a4df6cc40ea2 100644
--- a/clang-tools-extra/include-cleaner/lib/Record.cpp
+++ b/clang-tools-extra/include-cleaner/lib/Record.cpp
@@ -65,7 +65,8 @@ class PPRecorder : public PPCallbacks {
                           StringRef SpelledFilename, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind) override {
     if (!Active)
       return;
@@ -214,7 +215,8 @@ class PragmaIncludes::RecordPragma : public PPCallbacks, public CommentHandler {
                           OptionalFileEntryRef File,
                           llvm::StringRef /*SearchPath*/,
                           llvm::StringRef /*RelativePath*/,
-                          const clang::Module * /*Imported*/,
+                          const clang::Module * /*SuggestedModule*/,
+                          bool /*ModuleImported*/,
                           SrcMgr::CharacteristicKind FileKind) override {
     FileID HashFID = SM.getFileID(HashLoc);
     int HashLine = SM.getLineNumber(HashFID, SM.getFileOffset(HashLoc));
diff --git a/clang-tools-extra/modularize/CoverageChecker.cpp b/clang-tools-extra/modularize/CoverageChecker.cpp
index 1e8b0aa37ca309..0e76c539aa3c83 100644
--- a/clang-tools-extra/modularize/CoverageChecker.cpp
+++ b/clang-tools-extra/modularize/CoverageChecker.cpp
@@ -90,7 +90,8 @@ class CoverageCheckerCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override {
     Checker.collectUmbrellaHeaderHeader(File->getName());
   }
diff --git a/clang-tools-extra/modularize/PreprocessorTracker.cpp b/clang-tools-extra/modularize/PreprocessorTracker.cpp
index 7557fb177ceb48..85e3aab041e49d 100644
--- a/clang-tools-extra/modularize/PreprocessorTracker.cpp
+++ b/clang-tools-extra/modularize/PreprocessorTracker.cpp
@@ -730,15 +730,14 @@ class PreprocessorCallbacks : public clang::PPCallbacks {
   ~PreprocessorCallbacks() override {}
 
   // Overridden handlers.
-  void InclusionDirective(clang::SourceLocation HashLoc,
-                          const clang::Token &IncludeTok,
-                          llvm::StringRef FileName, bool IsAngled,
-                          clang::CharSourceRange FilenameRange,
-                          clang::OptionalFileEntryRef File,
-                          llvm::StringRef SearchPath,
-                          llvm::StringRef RelativePath,
-                          const clang::Module *Imported,
-                          clang::SrcMgr::CharacteristicKind FileType) override;
+  void
+  InclusionDirective(clang::SourceLocation HashLoc,
+                     const clang::Token &IncludeTok, llvm::StringRef FileName,
+                     bool IsAngled, clang::CharSourceRange FilenameRange,
+                     clang::OptionalFileEntryRef File,
+                     llvm::StringRef SearchPath, llvm::StringRef RelativePath,
+                     const clang::Module *SuggestedModule, bool ModuleImported,
+                     clang::SrcMgr::CharacteristicKind FileType) override;
   void FileChanged(clang::SourceLocation Loc,
                    clang::PPCallbacks::FileChangeReason Reason,
                    clang::SrcMgr::CharacteristicKind FileType,
@@ -1275,7 +1274,8 @@ void PreprocessorCallbacks::InclusionDirective(
     llvm::StringRef FileName, bool IsAngled,
     clang::CharSourceRange FilenameRange, clang::OptionalFileEntryRef File,
     llvm::StringRef SearchPath, llvm::StringRef RelativePath,
-    const clang::Module *Imported, clang::SrcMgr::CharacteristicKind FileType) {
+    const clang::Module *SuggestedModule, bool ModuleImported,
+    clang::SrcMgr::CharacteristicKind FileType) {
   int DirectiveLine, DirectiveColumn;
   std::string HeaderPath = getSourceLocationFile(PP, HashLoc);
   getSourceLocationLineAndColumn(PP, HashLoc, DirectiveLine, DirectiveColumn);
diff --git a/clang-tools-extra/pp-trace/PPCallbacksTracker.cpp b/clang-tools-extra/pp-trace/PPCallbacksTracker.cpp
index a59a8278682b23..3bb30fd15b2e1d 100644
--- a/clang-tools-extra/pp-trace/PPCallbacksTracker.cpp
+++ b/clang-tools-extra/pp-trace/PPCallbacksTracker.cpp
@@ -135,7 +135,8 @@ void PPCallbacksTracker::InclusionDirective(
     SourceLocation HashLoc, const Token &IncludeTok, llvm::StringRef FileName,
     bool IsAngled, CharSourceRange FilenameRange, OptionalFileEntryRef File,
     llvm::StringRef SearchPath, llvm::StringRef RelativePath,
-    const Module *Imported, SrcMgr::CharacteristicKind FileType) {
+    const Module *SuggestedModule, bool ModuleImported,
+    SrcMgr::CharacteristicKind FileType) {
   beginCallback("InclusionDirective");
   appendArgument("HashLoc", HashLoc);
   appendArgument("IncludeTok", IncludeTok);
@@ -145,7 +146,8 @@ void PPCallbacksTracker::InclusionDirective(
   appendArgument("File", File);
   appendFilePathArgument("SearchPath", SearchPath);
   appendFilePathArgument("RelativePath", RelativePath);
-  appendArgument("Imported", Imported);
+  appendArgument("SuggestedModule", SuggestedModule);
+  appendArgument("ModuleImported", ModuleImported);
 }
 
 // Callback invoked whenever there was an explicit module-import
diff --git a/clang-tools-extra/pp-trace/PPCallbacksTracker.h b/clang-tools-extra/pp-trace/PPCallbacksTracker.h
index c195a72b08c1aa..04590a919369ae 100644
--- a/clang-tools-extra/pp-trace/PPCallbacksTracker.h
+++ b/clang-tools-extra/pp-trace/PPCallbacksTracker.h
@@ -95,7 +95,8 @@ class PPCallbacksTracker : public PPCallbacks {
                           llvm::StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, llvm::StringRef SearchPath,
-                          llvm::StringRef RelativePath, const Module *Imported,
+                          llvm::StringRef RelativePath,
+                          const Module *SuggestedModule, bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override;
   void moduleImport(SourceLocation ImportLoc, ModuleIdPath Path,
                     const Module *Imported) override;
diff --git a/clang-tools-extra/test/pp-trace/pp-trace-include.cpp b/clang-tools-extra/test/pp-trace/pp-trace-include.cpp
index db0b2c89430a21..ea9896e1cfde25 100644
--- a/clang-tools-extra/test/pp-trace/pp-trace-include.cpp
+++ b/clang-tools-extra/test/pp-trace/pp-trace-include.cpp
@@ -59,7 +59,8 @@
 // CHECK-NEXT:   File: "{{.*}}{{[/\\]}}Inputs/Level1A.h"
 // CHECK-NEXT:   SearchPath: "{{.*}}{{[/\\]}}pp-trace"
 // CHECK-NEXT:   RelativePath: "Inputs/Level1A.h"
-// CHECK-NEXT:   Imported: (null)
+// CHECK-NEXT:   SuggestedModule: (null)
+// CHECK-NEXT:   ModuleImported: false
 // CHECK-NEXT: - Callback: FileChanged
 // CHECK-NEXT:   Loc: "{{.*}}{{[/\\]}}Inputs/Level1A.h:1:1"
 // CHECK-NEXT:   Reason: EnterFile
@@ -74,7 +75,8 @@
 // CHECK-NEXT:   File: "{{.*}}{{[/\\]}}Inputs/Level2A.h"
 // CHECK-NEXT:   SearchPath: "{{.*}}{{[/\\]}}Inputs"
 // CHECK-NEXT:   RelativePath: "Level2A.h"
-// CHECK-NEXT:   Imported: (null)
+// CHECK-NEXT:   SuggestedModule: (null)
+// CHECK-NEXT:   ModuleImported: false
 // CHECK-NEXT: - Callback: FileChanged
 // CHECK-NEXT:   Loc: "{{.*}}{{[/\\]}}Inputs/Level2A.h:1:1"
 // CHECK-NEXT:   Reason: EnterFile
@@ -105,7 +107,8 @@
 // CHECK-NEXT:   File: "{{.*}}{{[/\\]}}Inputs/Level1B.h"
 // CHECK-NEXT:   SearchPath: "{{.*}}{{[/\\]}}pp-trace"
 // CHECK-NEXT:   RelativePath: "Inputs/Level1B.h"
-// CHECK-NEXT:   Imported: (null)
+// CHECK-NEXT:   SuggestedModule: (null)
+// CHECK-NEXT:   ModuleImported: false
 // CHECK-NEXT: - Callback: FileChanged
 // CHECK-NEXT:   Loc: "{{.*}}{{[/\\]}}Inputs/Level1B.h:1:1"
 // CHECK-NEXT:   Reason: EnterFile
@@ -120,7 +123,8 @@
 // CHECK-NEXT:   File: "{{.*}}{{[/\\]}}Inputs/Level2B.h"
 // CHECK-NEXT:   SearchPath: "{{.*}}{{[/\\]}}Inputs"
 // CHECK-NEXT:   RelativePath: "Level2B.h"
-// CHECK-NEXT:   Imported: (null)
+// CHECK-NEXT:   SuggestedModule: (null)
+// CHECK-NEXT:   ModuleImported: false
 // CHECK-NEXT: - Callback: FileChanged
 // CHECK-NEXT:   Loc: "{{.*}}{{[/\\]}}Inputs/Level2B.h:1:1"
 // CHECK-NEXT:   Reason: EnterFile
diff --git a/clang/include/clang/Lex/PPCallbacks.h b/clang/include/clang/Lex/PPCallbacks.h
index e3942af7be2803..dfc74b52686f1e 100644
--- a/clang/include/clang/Lex/PPCallbacks.h
+++ b/clang/include/clang/Lex/PPCallbacks.h
@@ -127,8 +127,10 @@ class PPCallbacks {
   /// \param RelativePath The path relative to SearchPath, at which the include
   /// file was found. This is equal to FileName except for framework includes.
   ///
-  /// \param Imported The module, whenever an inclusion directive was
-  /// automatically turned into a module import or null otherwise.
+  /// \param SuggestedModule The module suggested for this header, if any.
+  ///
+  /// \param ModuleImported Whether this include was translated into an import
+  /// of \p SuggestedModule.
   ///
   /// \param FileType The characteristic kind, indicates whether a file or
   /// directory holds normal user code, system code, or system code which is
@@ -139,7 +141,8 @@ class PPCallbacks {
                                   bool IsAngled, CharSourceRange FilenameRange,
                                   OptionalFileEntryRef File,
                                   StringRef SearchPath, StringRef RelativePath,
-                                  const Module *Imported,
+                                  const Module *SuggestedModule,
+                                  bool ModuleImported,
                                   SrcMgr::CharacteristicKind FileType) {}
 
   /// Callback invoked whenever a submodule was entered.
@@ -473,14 +476,15 @@ class PPChainedCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override {
     First->InclusionDirective(HashLoc, IncludeTok, FileName, IsAngled,
                               FilenameRange, File, SearchPath, RelativePath,
-                              Imported, FileType);
+                              SuggestedModule, ModuleImported, FileType);
     Second->InclusionDirective(HashLoc, IncludeTok, FileName, IsAngled,
                                FilenameRange, File, SearchPath, RelativePath,
-                               Imported, FileType);
+                               SuggestedModule, ModuleImported, FileType);
   }
 
   void EnteredSubmodule(Module *M, SourceLocation ImportLoc,
diff --git a/clang/include/clang/Lex/PreprocessingRecord.h b/clang/include/clang/Lex/PreprocessingRecord.h
index 5ddf024186f865..437d8e4cc174ed 100644
--- a/clang/include/clang/Lex/PreprocessingRecord.h
+++ b/clang/include/clang/Lex/PreprocessingRecord.h
@@ -532,7 +532,8 @@ class Token;
                             StringRef FileName, bool IsAngled,
                             CharSourceRange FilenameRange,
                             OptionalFileEntryRef File, StringRef SearchPath,
-                            StringRef RelativePath, const Module *Imported,
+                            StringRef RelativePath,
+                            const Module *SuggestedModule, bool ModuleImported,
                             SrcMgr::CharacteristicKind FileType) override;
     void Ifdef(SourceLocation Loc, const Token &MacroNameTok,
                const MacroDefinition &MD) override;
diff --git a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
index 051363b075de99..13ad2530864927 100644
--- a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
+++ b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
@@ -166,7 +166,8 @@ class ModuleDepCollectorPP final : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override;
   void moduleImport(SourceLocation ImportLoc, ModuleIdPath Path,
                     const Module *Imported) override;
diff --git a/clang/lib/CodeGen/MacroPPCallbacks.cpp b/clang/lib/CodeGen/MacroPPCallbacks.cpp
index 8589869f6e2fb5..c5d1e3ad5a2054 100644
--- a/clang/lib/CodeGen/MacroPPCallbacks.cpp
+++ b/clang/lib/CodeGen/MacroPPCallbacks.cpp
@@ -168,8 +168,8 @@ void MacroPPCallbacks::FileChanged(SourceLocation Loc, FileChangeReason Reason,
 void MacroPPCallbacks::InclusionDirective(
     SourceLocation HashLoc, const Token &IncludeTok, StringRef FileName,
     bool IsAngled, CharSourceRange FilenameRange, OptionalFileEntryRef File,
-    StringRef SearchPath, StringRef RelativePath, const Module *Imported,
-    SrcMgr::CharacteristicKind FileType) {
+    StringRef SearchPath, StringRef RelativePath, const Module *SuggestedModule,
+    bool ModuleImported, SrcMgr::CharacteristicKind FileType) {
 
   // Record the line location of the current included file.
   LastHashLoc = HashLoc;
diff --git a/clang/lib/CodeGen/MacroPPCallbacks.h b/clang/lib/CodeGen/MacroPPCallbacks.h
index 5af177d0c3fa21..5f468648da0448 100644
--- a/clang/lib/CodeGen/MacroPPCallbacks.h
+++ b/clang/lib/CodeGen/MacroPPCallbacks.h
@@ -102,7 +102,8 @@ class MacroPPCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override;
 
   /// Hook called whenever a macro definition is seen.
diff --git a/clang/lib/Frontend/DependencyFile.cpp b/clang/lib/Frontend/DependencyFile.cpp
index 19abcac2befbdd..369816e89e1d6c 100644
--- a/clang/lib/Frontend/DependencyFile.cpp
+++ b/clang/lib/Frontend/DependencyFile.cpp
@@ -66,7 +66,8 @@ struct DepCollectorPPCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override {
     if (!File)
       DepCollector.maybeAddDependency(FileName, /*FromModule*/ false,
diff --git a/clang/lib/Frontend/DependencyGraph.cpp b/clang/lib/Frontend/DependencyGraph.cpp
index b471471f3528a7..20e5f233e224e2 100644
--- a/clang/lib/Frontend/DependencyGraph.cpp
+++ b/clang/lib/Frontend/DependencyGraph.cpp
@@ -49,7 +49,8 @@ class DependencyGraphCallback : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override;
 
   void EndOfMainFile() override {
@@ -68,8 +69,8 @@ void clang::AttachDependencyGraphGen(Preprocessor &PP, StringRef OutputFile,
 void DependencyGraphCallback::InclusionDirective(
     SourceLocation HashLoc, const Token &IncludeTok, StringRef FileName,
     bool IsAngled, CharSourceRange FilenameRange, OptionalFileEntryRef File,
-    StringRef SearchPath, StringRef RelativePath, const Module *Imported,
-    SrcMgr::CharacteristicKind FileType) {
+    StringRef SearchPath, StringRef RelativePath, const Module *SuggestedModule,
+    bool ModuleImported, SrcMgr::CharacteristicKind FileType) {
   if (!File)
     return;
 
diff --git a/clang/lib/Frontend/ModuleDependencyCollector.cpp b/clang/lib/Frontend/ModuleDependencyCollector.cpp
index 939e611e548998..b88cb60ebdd2a5 100644
--- a/clang/lib/Frontend/ModuleDependencyCollector.cpp
+++ b/clang/lib/Frontend/ModuleDependencyCollector.cpp
@@ -55,7 +55,8 @@ struct ModuleDependencyPPCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override {
     if (!File)
       return;
diff --git a/clang/lib/Frontend/PrecompiledPreamble.cpp b/clang/lib/Frontend/PrecompiledPreamble.cpp
index 62373b23b82efb..9b0ef30a14121b 100644
--- a/clang/lib/Frontend/PrecompiledPreamble.cpp
+++ b/clang/lib/Frontend/PrecompiledPreamble.cpp
@@ -98,7 +98,8 @@ class MissingFileCollector : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override {
     // File is std::nullopt if it wasn't found.
     // (We have some false negatives if PP recovered e.g. <foo> -> "foo")
diff --git a/clang/lib/Frontend/PrintPreprocessedOutput.cpp b/clang/lib/Frontend/PrintPreprocessedOutput.cpp
index 7f5f6690682300..a26d2c3ab8582b 100644
--- a/clang/lib/Frontend/PrintPreprocessedOutput.cpp
+++ b/clang/lib/Frontend/PrintPreprocessedOutput.cpp
@@ -153,7 +153,8 @@ class PrintPPOutputPPCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override;
   void Ident(SourceLocation Loc, StringRef str) override;
   void PragmaMessage(SourceLocation Loc, StringRef Namespace,
@@ -401,8 +402,8 @@ void PrintPPOutputPPCallbacks::FileChanged(SourceLocation Loc,
 void PrintPPOutputPPCallbacks::InclusionDirective(
     SourceLocation HashLoc, const Token &IncludeTok, StringRef FileName,
     bool IsAngled, CharSourceRange FilenameRange, OptionalFileEntryRef File,
-    StringRef SearchPath, StringRef RelativePath, const Module *Imported,
-    SrcMgr::CharacteristicKind FileType) {
+    StringRef SearchPath, StringRef RelativePath, const Module *SuggestedModule,
+    bool ModuleImported, SrcMgr::CharacteristicKind FileType) {
   // In -dI mode, dump #include directives prior to dumping their content or
   // interpretation. Similar for -fkeep-system-includes.
   if (DumpIncludeDirectives || (KeepSystemIncludes && isSystem(FileType))) {
@@ -418,14 +419,14 @@ void PrintPPOutputPPCallbacks::InclusionDirective(
   }
 
   // When preprocessing, turn implicit imports into module import pragmas.
-  if (Imported) {
+  if (ModuleImported) {
     switch (IncludeTok.getIdentifierInfo()->getPPKeywordID()) {
     case tok::pp_include:
     case tok::pp_import:
     case tok::pp_include_next:
       MoveToLine(HashLoc, /*RequireStartOfLine=*/true);
       *OS << "#pragma clang module import "
-          << Imported->getFullModuleName(true)
+          << SuggestedModule->getFullModuleName(true)
           << " /* clang -E: implicit import for "
           << "#" << PP.getSpelling(IncludeTok) << " "
           << (IsAngled ? '<' : '"') << FileName << (IsAngled ? '>' : '"')
diff --git a/clang/lib/Frontend/Rewrite/InclusionRewriter.cpp b/clang/lib/Frontend/Rewrite/InclusionRewriter.cpp
index b6b37461089e48..1462058003b3d4 100644
--- a/clang/lib/Frontend/Rewrite/InclusionRewriter.cpp
+++ b/clang/lib/Frontend/Rewrite/InclusionRewriter.cpp
@@ -75,7 +75,8 @@ class InclusionRewriter : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override;
   void If(SourceLocation Loc, SourceRange ConditionRange,
           ConditionValueKind ConditionValue) override;
@@ -189,9 +190,10 @@ void InclusionRewriter::InclusionDirective(
     StringRef /*FileName*/, bool /*IsAngled*/,
     CharSourceRange /*FilenameRange*/, OptionalFileEntryRef /*File*/,
     StringRef /*SearchPath*/, StringRef /*RelativePath*/,
-    const Module *Imported, SrcMgr::CharacteristicKind FileType) {
-  if (Imported) {
-    auto P = ModuleIncludes.insert(std::make_pair(HashLoc, Imported));
+    const Module *SuggestedModule, bool ModuleImported,
+    SrcMgr::CharacteristicKind FileType) {
+  if (ModuleImported) {
+    auto P = ModuleIncludes.insert(std::make_pair(HashLoc, SuggestedModule));
     (void)P;
     assert(P.second && "Unexpected revisitation of the same include directive");
   } else
diff --git a/clang/lib/Lex/PPDirectives.cpp b/clang/lib/Lex/PPDirectives.cpp
index a980f4bcbae124..97f9c0ada91de7 100644
--- a/clang/lib/Lex/PPDirectives.cpp
+++ b/clang/lib/Lex/PPDirectives.cpp
@@ -2253,26 +2253,27 @@ Preprocessor::ImportAction Preprocessor::HandleHeaderIncludeOrImport(
 
   // FIXME: We do not have a good way to disambiguate C++ clang modules from
   // C++ standard modules (other than use/non-use of Header Units).
-  Module *SM = SuggestedModule.getModule();
 
-  bool MaybeTranslateInclude =
-      Action == Enter && File && SM && !SM->isForBuilding(getLangOpts());
+  Module *ModuleToImport = SuggestedModule.getModule();
+
+  bool MaybeTranslateInclude = Action == Enter && File && ModuleToImport &&
+                               !ModuleToImport->isForBuilding(getLangOpts());
 
   // Maybe a usable Header Unit
   bool UsableHeaderUnit = false;
-  if (getLangOpts().CPlusPlusModules && SM && SM->isHeaderUnit()) {
+  if (getLangOpts().CPlusPlusModules && ModuleToImport &&
+      ModuleToImport->isHeaderUnit()) {
     if (TrackGMFState.inGMF() || IsImportDecl)
       UsableHeaderUnit = true;
     else if (!IsImportDecl) {
       // This is a Header Unit that we do not include-translate
-      SuggestedModule = ModuleMap::KnownHeader();
-      SM = nullptr;
+      ModuleToImport = nullptr;
     }
   }
   // Maybe a usable clang header module.
   bool UsableClangHeaderModule =
-      (getLangOpts().CPlusPlusModules || getLangOpts().Modules) && SM &&
-      !SM->isHeaderUnit();
+      (getLangOpts().CPlusPlusModules || getLangOpts().Modules) &&
+      ModuleToImport && !ModuleToImport->isHeaderUnit();
 
   // Determine whether we should try to import the module for this #include, if
   // there is one. Don't do so if precompiled module support is disabled or we
@@ -2282,12 +2283,11 @@ Preprocessor::ImportAction Preprocessor::HandleHeaderIncludeOrImport(
     // unavailable, diagnose the situation and bail out.
     // FIXME: Remove this; loadModule does the same check (but produces
     // slightly worse diagnostics).
-    if (checkModuleIsAvailable(getLangOpts(), getTargetInfo(),
-                               *SuggestedModule.getModule(),
+    if (checkModuleIsAvailable(getLangOpts(), getTargetInfo(), *ModuleToImport,
                                getDiagnostics())) {
       Diag(FilenameTok.getLocation(),
            diag::note_implicit_top_level_module_import_here)
-          << SuggestedModule.getModule()->getTopLevelModuleName();
+          << ModuleToImport->getTopLevelModuleName();
       return {ImportAction::None};
     }
 
@@ -2295,7 +2295,7 @@ Preprocessor::ImportAction Preprocessor::HandleHeaderIncludeOrImport(
     // FIXME: Should we have a second loadModule() overload to avoid this
     // extra lookup step?
     SmallVector<std::pair<IdentifierInfo *, SourceLocation>, 2> Path;
-    for (Module *Mod = SM; Mod; Mod = Mod->Parent)
+    for (Module *Mod = ModuleToImport; Mod; Mod = Mod->Parent)
       Path.push_back(std::make_pair(getIdentifierInfo(Mod->Name),
                                     FilenameTok.getLocation()));
     std::reverse(Path.begin(), Path.end());
@@ -2306,12 +2306,12 @@ Preprocessor::ImportAction Preprocessor::HandleHeaderIncludeOrImport(
 
     // Load the module to import its macros. We'll make the declarations
     // visible when the parser gets here.
-    // FIXME: Pass SuggestedModule in here rather than converting it to a path
-    // and making the module loader convert it back again.
+    // FIXME: Pass ModuleToImport in here rather than converting it to a path
+    // and making the module loader convert it back again.
     ModuleLoadResult Imported = TheModuleLoader.loadModule(
         IncludeTok.getLocation(), Path, Module::Hidden,
         /*IsInclusionDirective=*/true);
-    assert((Imported == nullptr || Imported == SuggestedModule.getModule()) &&
+    assert((Imported == nullptr || Imported == ModuleToImport) &&
            "the imported module is different than the suggested one");
 
     if (Imported) {
@@ -2323,8 +2323,7 @@ Preprocessor::ImportAction Preprocessor::HandleHeaderIncludeOrImport(
       // was in the directory of an umbrella header, for instance), but no
       // actual module containing it exists (because the umbrella header is
       // incomplete).  Treat this as a textual inclusion.
-      SuggestedModule = ModuleMap::KnownHeader();
-      SM = nullptr;
+      ModuleToImport = nullptr;
     } else if (Imported.isConfigMismatch()) {
       // On a configuration mismatch, enter the header textually. We still know
       // that it's part of the corresponding module.
@@ -2365,7 +2364,7 @@ Preprocessor::ImportAction Preprocessor::HandleHeaderIncludeOrImport(
   // this file will have no effect.
   if (Action == Enter && File &&
       !HeaderInfo.ShouldEnterIncludeFile(*this, *File, EnterOnce,
-                                         getLangOpts().Modules, SM,
+                                         getLangOpts().Modules, ModuleToImport,
                                          IsFirstIncludeOfFile)) {
     // C++ standard modules:
     // If we are not in the GMF, then we textually include only
@@ -2380,7 +2379,7 @@ Preprocessor::ImportAction Preprocessor::HandleHeaderIncludeOrImport(
     if (UsableHeaderUnit && !getLangOpts().CompilingPCH)
       Action = TrackGMFState.inGMF() ? Import : Skip;
     else
-      Action = (SuggestedModule && !getLangOpts().CompilingPCH) ? Import : Skip;
+      Action = (ModuleToImport && !getLangOpts().CompilingPCH) ? Import : Skip;
   }
 
   // Check for circular inclusion of the main file.
@@ -2400,8 +2399,7 @@ Preprocessor::ImportAction Preprocessor::HandleHeaderIncludeOrImport(
     // FIXME: Use a different callback for a pp-import?
     Callbacks->InclusionDirective(HashLoc, IncludeTok, LookupFilename, isAngled,
                                   FilenameRange, File, SearchPath, RelativePath,
-                                  Action == Import ? SuggestedModule.getModule()
-                                                   : nullptr,
+                                  SuggestedModule.getModule(), Action == Import,
                                   FileCharacter);
     if (Action == Skip && File)
       Callbacks->FileSkipped(*File, FilenameTok, FileCharacter);
@@ -2412,7 +2410,7 @@ Preprocessor::ImportAction Preprocessor::HandleHeaderIncludeOrImport(
 
   // If this is a C++20 pp-import declaration, diagnose if we didn't find any
   // module corresponding to the named header.
-  if (IsImportDecl && !SuggestedModule) {
+  if (IsImportDecl && !ModuleToImport) {
     Diag(FilenameTok, diag::err_header_import_not_header_unit)
       << OriginalFilename << File->getName();
     return {ImportAction::None};
@@ -2517,8 +2515,8 @@ Preprocessor::ImportAction Preprocessor::HandleHeaderIncludeOrImport(
   switch (Action) {
   case Skip:
     // If we don't need to enter the file, stop now.
-    if (SM)
-      return {ImportAction::SkippedModuleImport, SM};
+    if (ModuleToImport)
+      return {ImportAction::SkippedModuleImport, ModuleToImport};
     return {ImportAction::None};
 
   case IncludeLimitReached:
@@ -2530,13 +2528,13 @@ Preprocessor::ImportAction Preprocessor::HandleHeaderIncludeOrImport(
     // If this is a module import, make it visible if needed.
-    assert(SM && "no module to import");
+    assert(ModuleToImport && "no module to import");
 
-    makeModuleVisible(SM, EndLoc);
+    makeModuleVisible(ModuleToImport, EndLoc);
 
     if (IncludeTok.getIdentifierInfo()->getPPKeywordID() ==
         tok::pp___include_macros)
       return {ImportAction::None};
 
-    return {ImportAction::ModuleImport, SM};
+    return {ImportAction::ModuleImport, ModuleToImport};
   }
 
   case Enter:
@@ -2573,13 +2571,14 @@ Preprocessor::ImportAction Preprocessor::HandleHeaderIncludeOrImport(
 
   // Determine if we're switching to building a new submodule, and which one.
   // This does not apply for C++20 modules header units.
-  if (SM && !SM->isHeaderUnit()) {
-    if (SM->getTopLevelModule()->ShadowingModule) {
+  if (ModuleToImport && !ModuleToImport->isHeaderUnit()) {
+    if (ModuleToImport->getTopLevelModule()->ShadowingModule) {
       // We are building a submodule that belongs to a shadowed module. This
       // means we find header files in the shadowed module.
-      Diag(SM->DefinitionLoc, diag::err_module_build_shadowed_submodule)
-          << SM->getFullModuleName();
-      Diag(SM->getTopLevelModule()->ShadowingModule->DefinitionLoc,
+      Diag(ModuleToImport->DefinitionLoc,
+           diag::err_module_build_shadowed_submodule)
+          << ModuleToImport->getFullModuleName();
+      Diag(ModuleToImport->getTopLevelModule()->ShadowingModule->DefinitionLoc,
            diag::note_previous_definition);
       return {ImportAction::None};
     }
@@ -2591,21 +2590,22 @@ Preprocessor::ImportAction Preprocessor::HandleHeaderIncludeOrImport(
     // that behaves the same as the header would behave in a compilation using
     // that PCH, which means we should enter the submodule. We need to teach
     // the AST serialization layer to deal with the resulting AST.
-    if (getLangOpts().CompilingPCH && SM->isForBuilding(getLangOpts()))
+    if (getLangOpts().CompilingPCH &&
+        ModuleToImport->isForBuilding(getLangOpts()))
       return {ImportAction::None};
 
     assert(!CurLexerSubmodule && "should not have marked this as a module yet");
-    CurLexerSubmodule = SM;
+    CurLexerSubmodule = ModuleToImport;
 
     // Let the macro handling code know that any future macros are within
     // the new submodule.
-    EnterSubmodule(SM, EndLoc, /*ForPragma*/ false);
+    EnterSubmodule(ModuleToImport, EndLoc, /*ForPragma*/ false);
 
     // Let the parser know that any future declarations are within the new
     // submodule.
     // FIXME: There's no point doing this if we're handling a #__include_macros
     // directive.
-    return {ImportAction::ModuleBegin, SM};
+    return {ImportAction::ModuleBegin, ModuleToImport};
   }
 
   assert(!IsImportDecl && "failed to diagnose missing module for import decl");
diff --git a/clang/lib/Lex/PreprocessingRecord.cpp b/clang/lib/Lex/PreprocessingRecord.cpp
index aab6a2bed89d95..be5aac7ef31b88 100644
--- a/clang/lib/Lex/PreprocessingRecord.cpp
+++ b/clang/lib/Lex/PreprocessingRecord.cpp
@@ -472,8 +472,8 @@ void PreprocessingRecord::MacroUndefined(const Token &Id,
 void PreprocessingRecord::InclusionDirective(
     SourceLocation HashLoc, const Token &IncludeTok, StringRef FileName,
     bool IsAngled, CharSourceRange FilenameRange, OptionalFileEntryRef File,
-    StringRef SearchPath, StringRef RelativePath, const Module *Imported,
-    SrcMgr::CharacteristicKind FileType) {
+    StringRef SearchPath, StringRef RelativePath, const Module *SuggestedModule,
+    bool ModuleImported, SrcMgr::CharacteristicKind FileType) {
   InclusionDirective::InclusionKind Kind = InclusionDirective::Include;
 
   switch (IncludeTok.getIdentifierInfo()->getPPKeywordID()) {
@@ -506,10 +506,9 @@ void PreprocessingRecord::InclusionDirective(
       EndLoc = EndLoc.getLocWithOffset(-1); // the InclusionDirective expects
                                             // a token range.
   }
-  clang::InclusionDirective *ID =
-      new (*this) clang::InclusionDirective(*this, Kind, FileName, !IsAngled,
-                                            (bool)Imported, File,
-                                            SourceRange(HashLoc, EndLoc));
+  clang::InclusionDirective *ID = new (*this) clang::InclusionDirective(
+      *this, Kind, FileName, !IsAngled, ModuleImported, File,
+      SourceRange(HashLoc, EndLoc));
   addPreprocessedEntity(ID);
 }
 
diff --git a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
index 995d8b2899c8d0..5a9e563c2d5b26 100644
--- a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -430,14 +430,14 @@ void ModuleDepCollectorPP::LexedFileChanged(FileID FID,
 void ModuleDepCollectorPP::InclusionDirective(
     SourceLocation HashLoc, const Token &IncludeTok, StringRef FileName,
     bool IsAngled, CharSourceRange FilenameRange, OptionalFileEntryRef File,
-    StringRef SearchPath, StringRef RelativePath, const Module *Imported,
-    SrcMgr::CharacteristicKind FileType) {
-  if (!File && !Imported) {
+    StringRef SearchPath, StringRef RelativePath, const Module *SuggestedModule,
+    bool ModuleImported, SrcMgr::CharacteristicKind FileType) {
+  if (!File && !ModuleImported) {
     // This is a non-modular include that HeaderSearch failed to find. Add it
     // here as `FileChanged` will never see it.
     MDC.addFileDep(FileName);
   }
-  handleImport(Imported);
+  handleImport(SuggestedModule);
 }
 
 void ModuleDepCollectorPP::moduleImport(SourceLocation ImportLoc,
diff --git a/clang/tools/libclang/Indexing.cpp b/clang/tools/libclang/Indexing.cpp
index 17d393ef808425..05d88452209fb3 100644
--- a/clang/tools/libclang/Indexing.cpp
+++ b/clang/tools/libclang/Indexing.cpp
@@ -261,12 +261,13 @@ class IndexPPCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override {
     bool isImport = (IncludeTok.is(tok::identifier) &&
             IncludeTok.getIdentifierInfo()->getPPKeywordID() == tok::pp_import);
     DataConsumer.ppIncludedFile(HashLoc, FileName, File, isImport, IsAngled,
-                            Imported);
+                                ModuleImported);
   }
 
   /// MacroDefined - This hook is called whenever a macro definition is seen.
diff --git a/clang/unittests/Lex/PPCallbacksTest.cpp b/clang/unittests/Lex/PPCallbacksTest.cpp
index e0a27b5111821b..f3cdb1dfb28742 100644
--- a/clang/unittests/Lex/PPCallbacksTest.cpp
+++ b/clang/unittests/Lex/PPCallbacksTest.cpp
@@ -37,7 +37,8 @@ class InclusionDirectiveCallbacks : public PPCallbacks {
                           StringRef FileName, bool IsAngled,
                           CharSourceRange FilenameRange,
                           OptionalFileEntryRef File, StringRef SearchPath,
-                          StringRef RelativePath, const Module *Imported,
+                          StringRef RelativePath, const Module *SuggestedModule,
+                          bool ModuleImported,
                           SrcMgr::CharacteristicKind FileType) override {
     this->HashLoc = HashLoc;
     this->IncludeTok = IncludeTok;
@@ -47,7 +48,8 @@ class InclusionDirectiveCallbacks : public PPCallbacks {
     this->File = File;
     this->SearchPath = SearchPath.str();
     this->RelativePath = RelativePath.str();
-    this->Imported = Imported;
+    this->SuggestedModule = SuggestedModule;
+    this->ModuleImported = ModuleImported;
     this->FileType = FileType;
   }
 
@@ -59,7 +61,8 @@ class InclusionDirectiveCallbacks : public PPCallbacks {
   OptionalFileEntryRef File;
   SmallString<16> SearchPath;
   SmallString<16> RelativePath;
-  const Module* Imported;
+  const Module *SuggestedModule;
+  bool ModuleImported;
   SrcMgr::CharacteristicKind FileType;
 };
 

>From 016c2b507e090760e8c922a2f8810685a2f2576e Mon Sep 17 00:00:00 2001
From: Jeremy Morse <jeremy.morse at sony.com>
Date: Thu, 8 Feb 2024 18:14:07 +0000
Subject: [PATCH 66/72] Revert "[DebugInfo][RemoveDIs] Turn on non-instrinsic
 debug-info by default"

This reverts commit bdde5f9bea75e897bcc31a95b9c3376988c211cc.

Two situations that are tripping a few buildbots:

  https://lab.llvm.org/buildbot/#/builders/205/builds/25126

Here, polly is currently presenting a DebugLoc attached to a debugging
intrinsic as a "true" source location in a user report, something that's
unreliable.

  https://lab.llvm.org/buildbot/#/builders/184/builds/10242

These HWAsan failures are probably (97% confidence) because in
StackInfoBuilder::visit we're not observing DPValues attached to lifetime
intrinsics because they're dealt with higher up the function.

But it's late-o'clock here, so revert for now.
---
 llvm/lib/IR/BasicBlock.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/lib/IR/BasicBlock.cpp b/llvm/lib/IR/BasicBlock.cpp
index bf02eba9fb448d..fe9d0d08c5fe97 100644
--- a/llvm/lib/IR/BasicBlock.cpp
+++ b/llvm/lib/IR/BasicBlock.cpp
@@ -34,7 +34,7 @@ cl::opt<bool>
     UseNewDbgInfoFormat("experimental-debuginfo-iterators",
                         cl::desc("Enable communicating debuginfo positions "
                                  "through iterators, eliminating intrinsics"),
-                        cl::init(true));
+                        cl::init(false));
 
 DPMarker *BasicBlock::createMarker(Instruction *I) {
   assert(IsNewDbgInfoFormat &&

>From 844b0c4f88e6cdf3018d155ae5e744e6266baf93 Mon Sep 17 00:00:00 2001
From: Nikolas Klauser <nikolasklauser at berlin.de>
Date: Thu, 8 Feb 2024 19:22:16 +0100
Subject: [PATCH 67/72] [libc++] Use __is_pointer_in_range inside
 vector::insert (#80624)

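For context, the check touched by the hunk below matters when the value passed to `insert` is itself an element of the same vector: once the tail has been shifted by one slot, the reference points at the wrong element unless it is adjusted. A minimal sketch of that observable behaviour, using only the public `std::vector` API (the helper name `insert_self_element` is hypothetical and not part of libc++):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: inserts a copy of v[from] at index `at`.
// The argument handed to insert() aliases the vector's own storage.
inline std::vector<int> insert_self_element(std::vector<int> v,
                                            std::size_t at,
                                            std::size_t from) {
  // Reserve first so that insert() takes the "spare capacity" path and
  // shifts elements in place instead of reallocating.
  v.reserve(v.size() + 1);
  v.insert(v.begin() + static_cast<std::ptrdiff_t>(at), v[from]);
  return v;
}

// Small self-check: the inserted value is the one v[from] held *before*
// the shift, regardless of where it sat relative to the insertion point.
inline void self_check() {
  assert((insert_self_element({1, 2, 3}, 0, 2) == std::vector<int>{3, 1, 2, 3}));
  assert((insert_self_element({1, 2, 3}, 3, 1) == std::vector<int>{1, 2, 3, 2}));
}
```

The sketch only exercises the behaviour; it does not show how `__is_pointer_in_range` is implemented.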
---
 libcxx/include/vector | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libcxx/include/vector b/libcxx/include/vector
index 3934361e98cf69..ce7df7a9f04207 100644
--- a/libcxx/include/vector
+++ b/libcxx/include/vector
@@ -351,6 +351,7 @@ template<class T, class charT> requires is-vector-bool-reference<T> // Since C++
 #include <__type_traits/type_identity.h>
 #include <__utility/exception_guard.h>
 #include <__utility/forward.h>
+#include <__utility/is_pointer_in_range.h>
 #include <__utility/move.h>
 #include <__utility/pair.h>
 #include <__utility/swap.h>
@@ -1580,14 +1581,13 @@ template <class _Tp, class _Allocator>
 _LIBCPP_CONSTEXPR_SINCE_CXX20 typename vector<_Tp, _Allocator>::iterator
 vector<_Tp, _Allocator>::insert(const_iterator __position, const_reference __x) {
   pointer __p = this->__begin_ + (__position - begin());
-  // We can't compare unrelated pointers inside constant expressions
-  if (!__libcpp_is_constant_evaluated() && this->__end_ < this->__end_cap()) {
+  if (this->__end_ < this->__end_cap()) {
     if (__p == this->__end_) {
       __construct_one_at_end(__x);
     } else {
       __move_range(__p, this->__end_, __p + 1);
       const_pointer __xr = pointer_traits<const_pointer>::pointer_to(__x);
-      if (__p <= __xr && __xr < this->__end_)
+      if (std::__is_pointer_in_range(std::__to_address(__p), std::__to_address(__end_), std::addressof(__x)))
         ++__xr;
       *__p = *__xr;
     }

>From ecd36b4743152a481a86e5cc51e9752799d0f423 Mon Sep 17 00:00:00 2001
From: Nikolas Klauser <nikolasklauser at berlin.de>
Date: Thu, 8 Feb 2024 19:22:49 +0100
Subject: [PATCH 68/72] [libc++][NFC] Simplify the implementation of
 `numeric_limits` (#80425)

The cv specializations for `numeric_limits` inherited privately for some
reason. We can simplify the implementation by inheriting publicly and
removing the members that just replicate the values from the base class.
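
As an illustration of the pattern described above, here is a minimal sketch in which the cv-qualified specializations inherit publicly from the unqualified one and restate no members. `my_limits` is a hypothetical stand-in, not libc++'s actual class, and it is specialized only for `int` to keep the example short:

```cpp
#include <cassert>
#include <limits>

// Hypothetical primary template: "not specialized" defaults.
template <class T>
struct my_limits {
  static constexpr bool is_specialized = false;
  static constexpr T max() noexcept { return T(); }
};

// Hypothetical specialization carrying the real values.
template <>
struct my_limits<int> {
  static constexpr bool is_specialized = true;
  static constexpr int max() noexcept { return std::numeric_limits<int>::max(); }
};

// cv-qualified forms: public inheritance exposes every base member as-is.
template <class T> struct my_limits<const T> : public my_limits<T> {};
template <class T> struct my_limits<volatile T> : public my_limits<T> {};
template <class T> struct my_limits<const volatile T> : public my_limits<T> {};

static_assert(my_limits<const int>::is_specialized, "inherited from my_limits<int>");
static_assert(my_limits<const volatile int>::max() == my_limits<int>::max(),
              "same value through the cv-qualified form");

// The standard requires the same equivalence of std::numeric_limits itself.
static_assert(std::numeric_limits<const int>::max() == std::numeric_limits<int>::max(),
              "cv-qualified numeric_limits matches the unqualified one");
```

With private inheritance, each member would have to be redeclared in the derived class to stay accessible, which is the duplication the patch removes.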
---
 libcxx/include/limits | 283 +-----------------------------------------
 1 file changed, 5 insertions(+), 278 deletions(-)

diff --git a/libcxx/include/limits b/libcxx/include/limits
index a240580c0132f0..c704b4dddaf8e2 100644
--- a/libcxx/include/limits
+++ b/libcxx/include/limits
@@ -436,8 +436,8 @@ protected:
 };
 
 template <class _Tp>
-class _LIBCPP_TEMPLATE_VIS numeric_limits : private __libcpp_numeric_limits<__remove_cv_t<_Tp> > {
-  typedef __libcpp_numeric_limits<__remove_cv_t<_Tp> > __base;
+class _LIBCPP_TEMPLATE_VIS numeric_limits : private __libcpp_numeric_limits<_Tp> {
+  typedef __libcpp_numeric_limits<_Tp> __base;
   typedef typename __base::type type;
 
 public:
@@ -530,286 +530,13 @@ template <class _Tp>
 _LIBCPP_CONSTEXPR const float_round_style numeric_limits<_Tp>::round_style;
 
 template <class _Tp>
-class _LIBCPP_TEMPLATE_VIS numeric_limits<const _Tp> : private numeric_limits<_Tp> {
-  typedef numeric_limits<_Tp> __base;
-  typedef _Tp type;
-
-public:
-  static _LIBCPP_CONSTEXPR const bool is_specialized = __base::is_specialized;
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type min() _NOEXCEPT { return __base::min(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type max() _NOEXCEPT { return __base::max(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type lowest() _NOEXCEPT { return __base::lowest(); }
-
-  static _LIBCPP_CONSTEXPR const int digits       = __base::digits;
-  static _LIBCPP_CONSTEXPR const int digits10     = __base::digits10;
-  static _LIBCPP_CONSTEXPR const int max_digits10 = __base::max_digits10;
-  static _LIBCPP_CONSTEXPR const bool is_signed   = __base::is_signed;
-  static _LIBCPP_CONSTEXPR const bool is_integer  = __base::is_integer;
-  static _LIBCPP_CONSTEXPR const bool is_exact    = __base::is_exact;
-  static _LIBCPP_CONSTEXPR const int radix        = __base::radix;
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type epsilon() _NOEXCEPT { return __base::epsilon(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type round_error() _NOEXCEPT { return __base::round_error(); }
+class _LIBCPP_TEMPLATE_VIS numeric_limits<const _Tp> : public numeric_limits<_Tp> {};
 
-  static _LIBCPP_CONSTEXPR const int min_exponent   = __base::min_exponent;
-  static _LIBCPP_CONSTEXPR const int min_exponent10 = __base::min_exponent10;
-  static _LIBCPP_CONSTEXPR const int max_exponent   = __base::max_exponent;
-  static _LIBCPP_CONSTEXPR const int max_exponent10 = __base::max_exponent10;
-
-  static _LIBCPP_CONSTEXPR const bool has_infinity      = __base::has_infinity;
-  static _LIBCPP_CONSTEXPR const bool has_quiet_NaN     = __base::has_quiet_NaN;
-  static _LIBCPP_CONSTEXPR const bool has_signaling_NaN = __base::has_signaling_NaN;
-  _LIBCPP_SUPPRESS_DEPRECATED_PUSH
-  static _LIBCPP_DEPRECATED_IN_CXX23 _LIBCPP_CONSTEXPR const float_denorm_style has_denorm = __base::has_denorm;
-  static _LIBCPP_DEPRECATED_IN_CXX23 _LIBCPP_CONSTEXPR const bool has_denorm_loss          = __base::has_denorm_loss;
-  _LIBCPP_SUPPRESS_DEPRECATED_POP
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type infinity() _NOEXCEPT { return __base::infinity(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type quiet_NaN() _NOEXCEPT { return __base::quiet_NaN(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type signaling_NaN() _NOEXCEPT { return __base::signaling_NaN(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type denorm_min() _NOEXCEPT { return __base::denorm_min(); }
-
-  static _LIBCPP_CONSTEXPR const bool is_iec559  = __base::is_iec559;
-  static _LIBCPP_CONSTEXPR const bool is_bounded = __base::is_bounded;
-  static _LIBCPP_CONSTEXPR const bool is_modulo  = __base::is_modulo;
-
-  static _LIBCPP_CONSTEXPR const bool traps                    = __base::traps;
-  static _LIBCPP_CONSTEXPR const bool tinyness_before          = __base::tinyness_before;
-  static _LIBCPP_CONSTEXPR const float_round_style round_style = __base::round_style;
-};
-
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const _Tp>::is_specialized;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const _Tp>::digits;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const _Tp>::digits10;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const _Tp>::max_digits10;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const _Tp>::is_signed;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const _Tp>::is_integer;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const _Tp>::is_exact;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const _Tp>::radix;
 template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const _Tp>::min_exponent;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const _Tp>::min_exponent10;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const _Tp>::max_exponent;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const _Tp>::max_exponent10;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const _Tp>::has_infinity;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const _Tp>::has_quiet_NaN;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const _Tp>::has_signaling_NaN;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const float_denorm_style numeric_limits<const _Tp>::has_denorm;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const _Tp>::has_denorm_loss;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const _Tp>::is_iec559;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const _Tp>::is_bounded;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const _Tp>::is_modulo;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const _Tp>::traps;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const _Tp>::tinyness_before;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const float_round_style numeric_limits<const _Tp>::round_style;
-
-template <class _Tp>
-class _LIBCPP_TEMPLATE_VIS numeric_limits<volatile _Tp> : private numeric_limits<_Tp> {
-  typedef numeric_limits<_Tp> __base;
-  typedef _Tp type;
-
-public:
-  static _LIBCPP_CONSTEXPR const bool is_specialized = __base::is_specialized;
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type min() _NOEXCEPT { return __base::min(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type max() _NOEXCEPT { return __base::max(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type lowest() _NOEXCEPT { return __base::lowest(); }
-
-  static _LIBCPP_CONSTEXPR const int digits       = __base::digits;
-  static _LIBCPP_CONSTEXPR const int digits10     = __base::digits10;
-  static _LIBCPP_CONSTEXPR const int max_digits10 = __base::max_digits10;
-  static _LIBCPP_CONSTEXPR const bool is_signed   = __base::is_signed;
-  static _LIBCPP_CONSTEXPR const bool is_integer  = __base::is_integer;
-  static _LIBCPP_CONSTEXPR const bool is_exact    = __base::is_exact;
-  static _LIBCPP_CONSTEXPR const int radix        = __base::radix;
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type epsilon() _NOEXCEPT { return __base::epsilon(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type round_error() _NOEXCEPT { return __base::round_error(); }
-
-  static _LIBCPP_CONSTEXPR const int min_exponent   = __base::min_exponent;
-  static _LIBCPP_CONSTEXPR const int min_exponent10 = __base::min_exponent10;
-  static _LIBCPP_CONSTEXPR const int max_exponent   = __base::max_exponent;
-  static _LIBCPP_CONSTEXPR const int max_exponent10 = __base::max_exponent10;
-
-  static _LIBCPP_CONSTEXPR const bool has_infinity      = __base::has_infinity;
-  static _LIBCPP_CONSTEXPR const bool has_quiet_NaN     = __base::has_quiet_NaN;
-  static _LIBCPP_CONSTEXPR const bool has_signaling_NaN = __base::has_signaling_NaN;
-  _LIBCPP_SUPPRESS_DEPRECATED_PUSH
-  static _LIBCPP_DEPRECATED_IN_CXX23 _LIBCPP_CONSTEXPR const float_denorm_style has_denorm = __base::has_denorm;
-  static _LIBCPP_DEPRECATED_IN_CXX23 _LIBCPP_CONSTEXPR const bool has_denorm_loss          = __base::has_denorm_loss;
-  _LIBCPP_SUPPRESS_DEPRECATED_POP
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type infinity() _NOEXCEPT { return __base::infinity(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type quiet_NaN() _NOEXCEPT { return __base::quiet_NaN(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type signaling_NaN() _NOEXCEPT { return __base::signaling_NaN(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type denorm_min() _NOEXCEPT { return __base::denorm_min(); }
+class _LIBCPP_TEMPLATE_VIS numeric_limits<volatile _Tp> : public numeric_limits<_Tp> {};
 
-  static _LIBCPP_CONSTEXPR const bool is_iec559  = __base::is_iec559;
-  static _LIBCPP_CONSTEXPR const bool is_bounded = __base::is_bounded;
-  static _LIBCPP_CONSTEXPR const bool is_modulo  = __base::is_modulo;
-
-  static _LIBCPP_CONSTEXPR const bool traps                    = __base::traps;
-  static _LIBCPP_CONSTEXPR const bool tinyness_before          = __base::tinyness_before;
-  static _LIBCPP_CONSTEXPR const float_round_style round_style = __base::round_style;
-};
-
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<volatile _Tp>::is_specialized;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<volatile _Tp>::digits;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<volatile _Tp>::digits10;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<volatile _Tp>::max_digits10;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<volatile _Tp>::is_signed;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<volatile _Tp>::is_integer;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<volatile _Tp>::is_exact;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<volatile _Tp>::radix;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<volatile _Tp>::min_exponent;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<volatile _Tp>::min_exponent10;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<volatile _Tp>::max_exponent;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<volatile _Tp>::max_exponent10;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<volatile _Tp>::has_infinity;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<volatile _Tp>::has_quiet_NaN;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<volatile _Tp>::has_signaling_NaN;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const float_denorm_style numeric_limits<volatile _Tp>::has_denorm;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<volatile _Tp>::has_denorm_loss;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<volatile _Tp>::is_iec559;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<volatile _Tp>::is_bounded;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<volatile _Tp>::is_modulo;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<volatile _Tp>::traps;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<volatile _Tp>::tinyness_before;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const float_round_style numeric_limits<volatile _Tp>::round_style;
-
-template <class _Tp>
-class _LIBCPP_TEMPLATE_VIS numeric_limits<const volatile _Tp> : private numeric_limits<_Tp> {
-  typedef numeric_limits<_Tp> __base;
-  typedef _Tp type;
-
-public:
-  static _LIBCPP_CONSTEXPR const bool is_specialized = __base::is_specialized;
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type min() _NOEXCEPT { return __base::min(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type max() _NOEXCEPT { return __base::max(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type lowest() _NOEXCEPT { return __base::lowest(); }
-
-  static _LIBCPP_CONSTEXPR const int digits       = __base::digits;
-  static _LIBCPP_CONSTEXPR const int digits10     = __base::digits10;
-  static _LIBCPP_CONSTEXPR const int max_digits10 = __base::max_digits10;
-  static _LIBCPP_CONSTEXPR const bool is_signed   = __base::is_signed;
-  static _LIBCPP_CONSTEXPR const bool is_integer  = __base::is_integer;
-  static _LIBCPP_CONSTEXPR const bool is_exact    = __base::is_exact;
-  static _LIBCPP_CONSTEXPR const int radix        = __base::radix;
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type epsilon() _NOEXCEPT { return __base::epsilon(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type round_error() _NOEXCEPT { return __base::round_error(); }
-
-  static _LIBCPP_CONSTEXPR const int min_exponent   = __base::min_exponent;
-  static _LIBCPP_CONSTEXPR const int min_exponent10 = __base::min_exponent10;
-  static _LIBCPP_CONSTEXPR const int max_exponent   = __base::max_exponent;
-  static _LIBCPP_CONSTEXPR const int max_exponent10 = __base::max_exponent10;
-
-  static _LIBCPP_CONSTEXPR const bool has_infinity      = __base::has_infinity;
-  static _LIBCPP_CONSTEXPR const bool has_quiet_NaN     = __base::has_quiet_NaN;
-  static _LIBCPP_CONSTEXPR const bool has_signaling_NaN = __base::has_signaling_NaN;
-  _LIBCPP_SUPPRESS_DEPRECATED_PUSH
-  static _LIBCPP_DEPRECATED_IN_CXX23 _LIBCPP_CONSTEXPR const float_denorm_style has_denorm = __base::has_denorm;
-  static _LIBCPP_DEPRECATED_IN_CXX23 _LIBCPP_CONSTEXPR const bool has_denorm_loss          = __base::has_denorm_loss;
-  _LIBCPP_SUPPRESS_DEPRECATED_POP
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type infinity() _NOEXCEPT { return __base::infinity(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type quiet_NaN() _NOEXCEPT { return __base::quiet_NaN(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type signaling_NaN() _NOEXCEPT { return __base::signaling_NaN(); }
-  _LIBCPP_HIDE_FROM_ABI static _LIBCPP_CONSTEXPR type denorm_min() _NOEXCEPT { return __base::denorm_min(); }
-
-  static _LIBCPP_CONSTEXPR const bool is_iec559  = __base::is_iec559;
-  static _LIBCPP_CONSTEXPR const bool is_bounded = __base::is_bounded;
-  static _LIBCPP_CONSTEXPR const bool is_modulo  = __base::is_modulo;
-
-  static _LIBCPP_CONSTEXPR const bool traps                    = __base::traps;
-  static _LIBCPP_CONSTEXPR const bool tinyness_before          = __base::tinyness_before;
-  static _LIBCPP_CONSTEXPR const float_round_style round_style = __base::round_style;
-};
-
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const volatile _Tp>::is_specialized;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const volatile _Tp>::digits;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const volatile _Tp>::digits10;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const volatile _Tp>::max_digits10;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const volatile _Tp>::is_signed;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const volatile _Tp>::is_integer;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const volatile _Tp>::is_exact;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const volatile _Tp>::radix;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const volatile _Tp>::min_exponent;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const volatile _Tp>::min_exponent10;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const volatile _Tp>::max_exponent;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const int numeric_limits<const volatile _Tp>::max_exponent10;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const volatile _Tp>::has_infinity;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const volatile _Tp>::has_quiet_NaN;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const volatile _Tp>::has_signaling_NaN;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const float_denorm_style numeric_limits<const volatile _Tp>::has_denorm;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const volatile _Tp>::has_denorm_loss;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const volatile _Tp>::is_iec559;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const volatile _Tp>::is_bounded;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const volatile _Tp>::is_modulo;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const volatile _Tp>::traps;
-template <class _Tp>
-_LIBCPP_CONSTEXPR const bool numeric_limits<const volatile _Tp>::tinyness_before;
 template <class _Tp>
-_LIBCPP_CONSTEXPR const float_round_style numeric_limits<const volatile _Tp>::round_style;
+class _LIBCPP_TEMPLATE_VIS numeric_limits<const volatile _Tp> : public numeric_limits<_Tp> {};
 
 _LIBCPP_END_NAMESPACE_STD
 

>From ea624dcd6c392f86ed9cce5a49e3e1913849b4f3 Mon Sep 17 00:00:00 2001
From: Nikolas Klauser <nikolasklauser at berlin.de>
Date: Thu, 8 Feb 2024 19:23:10 +0100
Subject: [PATCH 69/72] [libc++] Avoid including <cmath> in <compare> (#80418)

This reduces the time to include `<compare>` from 84ms to 36ms.
---
 libcxx/include/__compare/strong_order.h       | 23 +++++++++++--------
 libcxx/include/__compare/weak_order.h         | 12 ++++++----
 libcxx/include/compare                        |  1 +
 .../test/libcxx/transitive_includes/cxx23.csv |  1 -
 .../test/libcxx/transitive_includes/cxx26.csv |  1 -
 5 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/libcxx/include/__compare/strong_order.h b/libcxx/include/__compare/strong_order.h
index 5f6ade5aef8e4a..3dc819e642515c 100644
--- a/libcxx/include/__compare/strong_order.h
+++ b/libcxx/include/__compare/strong_order.h
@@ -13,11 +13,14 @@
 #include <__compare/compare_three_way.h>
 #include <__compare/ordering.h>
 #include <__config>
+#include <__math/exponential_functions.h>
+#include <__math/traits.h>
 #include <__type_traits/conditional.h>
 #include <__type_traits/decay.h>
+#include <__type_traits/is_floating_point.h>
+#include <__type_traits/is_same.h>
 #include <__utility/forward.h>
 #include <__utility/priority_tag.h>
-#include <cmath>
 #include <cstdint>
 #include <limits>
 
@@ -66,27 +69,27 @@ struct __fn {
       return strong_ordering::greater;
     } else if (__t == __u) {
       if constexpr (numeric_limits<_Dp>::radix == 2) {
-        return std::signbit(__u) <=> std::signbit(__t);
+        return __math::signbit(__u) <=> __math::signbit(__t);
       } else {
         // This is bullet 3 of the IEEE754 algorithm, relevant
         // only for decimal floating-point;
         // see https://stackoverflow.com/questions/69068075/
-        if (__t == 0 || std::isinf(__t)) {
-          return std::signbit(__u) <=> std::signbit(__t);
+        if (__t == 0 || __math::isinf(__t)) {
+          return __math::signbit(__u) <=> __math::signbit(__t);
         } else {
           int __texp, __uexp;
-          (void)std::frexp(__t, &__texp);
-          (void)std::frexp(__u, &__uexp);
+          (void)__math::frexp(__t, &__texp);
+          (void)__math::frexp(__u, &__uexp);
           return (__t < 0) ? (__texp <=> __uexp) : (__uexp <=> __texp);
         }
       }
     } else {
       // They're unordered, so one of them must be a NAN.
       // The order is -QNAN, -SNAN, numbers, +SNAN, +QNAN.
-      bool __t_is_nan      = std::isnan(__t);
-      bool __u_is_nan      = std::isnan(__u);
-      bool __t_is_negative = std::signbit(__t);
-      bool __u_is_negative = std::signbit(__u);
+      bool __t_is_nan      = __math::isnan(__t);
+      bool __u_is_nan      = __math::isnan(__u);
+      bool __t_is_negative = __math::signbit(__t);
+      bool __u_is_negative = __math::signbit(__u);
       using _IntType =
           conditional_t< sizeof(__t) == sizeof(int32_t),
                          int32_t,
diff --git a/libcxx/include/__compare/weak_order.h b/libcxx/include/__compare/weak_order.h
index 9f719eb64bbca3..b82a708c29a146 100644
--- a/libcxx/include/__compare/weak_order.h
+++ b/libcxx/include/__compare/weak_order.h
@@ -13,10 +13,12 @@
 #include <__compare/ordering.h>
 #include <__compare/strong_order.h>
 #include <__config>
+#include <__math/traits.h>
 #include <__type_traits/decay.h>
+#include <__type_traits/is_floating_point.h>
+#include <__type_traits/is_same.h>
 #include <__utility/forward.h>
 #include <__utility/priority_tag.h>
-#include <cmath>
 
 #ifndef _LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER
 #  pragma GCC system_header
@@ -51,10 +53,10 @@ struct __fn {
       return weak_ordering::greater;
     } else {
       // Otherwise, at least one of them is a NaN.
-      bool __t_is_nan      = std::isnan(__t);
-      bool __u_is_nan      = std::isnan(__u);
-      bool __t_is_negative = std::signbit(__t);
-      bool __u_is_negative = std::signbit(__u);
+      bool __t_is_nan      = __math::isnan(__t);
+      bool __u_is_nan      = __math::isnan(__u);
+      bool __t_is_negative = __math::signbit(__t);
+      bool __u_is_negative = __math::signbit(__u);
       if (__t_is_nan && __u_is_nan) {
         return (__u_is_negative <=> __t_is_negative);
       } else if (__t_is_nan) {
diff --git a/libcxx/include/compare b/libcxx/include/compare
index 626c7435f5fd09..cc0cae8a544d62 100644
--- a/libcxx/include/compare
+++ b/libcxx/include/compare
@@ -162,6 +162,7 @@ namespace std {
 #endif
 
 #if !defined(_LIBCPP_REMOVE_TRANSITIVE_INCLUDES) && _LIBCPP_STD_VER <= 20
+#  include <cmath>
 #  include <type_traits>
 #endif
 
diff --git a/libcxx/test/libcxx/transitive_includes/cxx23.csv b/libcxx/test/libcxx/transitive_includes/cxx23.csv
index 7c7099d176f18b..bd8241118f4b91 100644
--- a/libcxx/test/libcxx/transitive_includes/cxx23.csv
+++ b/libcxx/test/libcxx/transitive_includes/cxx23.csv
@@ -105,7 +105,6 @@ codecvt string
 codecvt tuple
 codecvt typeinfo
 codecvt version
-compare cmath
 compare cstddef
 compare cstdint
 compare limits
diff --git a/libcxx/test/libcxx/transitive_includes/cxx26.csv b/libcxx/test/libcxx/transitive_includes/cxx26.csv
index 7c7099d176f18b..bd8241118f4b91 100644
--- a/libcxx/test/libcxx/transitive_includes/cxx26.csv
+++ b/libcxx/test/libcxx/transitive_includes/cxx26.csv
@@ -105,7 +105,6 @@ codecvt string
 codecvt tuple
 codecvt typeinfo
 codecvt version
-compare cmath
 compare cstddef
 compare cstdint
 compare limits

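Note on the weak_order.h hunk above: the NaN branch can be sketched on its own as follows. This is only an illustration, assuming plain `double` inputs. It is not the libc++ implementation, and the name `weak_order_sketch` and its int return convention (-1, 0, 1) are hypothetical. It uses the public `std::isnan` and `std::signbit` rather than the internal `__math::` helpers the patch switches to.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper (not part of libc++): returns -1, 0 or 1 for
// "less", "equivalent" and "greater", following the branch structure
// of the weak_order.h hunk.
inline int weak_order_sketch(double t, double u) {
  if (t < u) return -1;
  if (t > u) return 1;
  if (t == u) return 0;  // also treats +0.0 and -0.0 as equivalent

  // Otherwise the operands are unordered, so at least one is a NaN.
  const bool t_is_nan      = std::isnan(t);
  const bool u_is_nan      = std::isnan(u);
  const bool t_is_negative = std::signbit(t);
  const bool u_is_negative = std::signbit(u);

  if (t_is_nan && u_is_nan) {
    // Mirrors (u_is_negative <=> t_is_negative): a negative NaN sorts
    // before a positive NaN, and NaNs of the same sign are equivalent.
    return static_cast<int>(u_is_negative) - static_cast<int>(t_is_negative);
  }
  if (t_is_nan) {
    // Negative NaN sorts below every number, positive NaN above.
    return t_is_negative ? -1 : 1;
  }
  // Only u is a NaN: the result is the mirror image of the case above.
  return u_is_negative ? 1 : -1;
}
```

With this sketch, a negative NaN compares below negative infinity, a positive NaN compares above positive infinity, and two NaNs of the same sign compare as equivalent regardless of payload.
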
>From ecf04b07a6b840c37e7ffc5e208b4e6fa6869591 Mon Sep 17 00:00:00 2001
From: Valentin Clement <clementval at gmail.com>
Date: Thu, 8 Feb 2024 10:23:20 -0800
Subject: [PATCH 70/72] [flang][cuda] Fix warning in switch

---
 flang/lib/Lower/ConvertVariable.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/flang/lib/Lower/ConvertVariable.cpp b/flang/lib/Lower/ConvertVariable.cpp
index f761e14e64794d..d57bdd448da3f6 100644
--- a/flang/lib/Lower/ConvertVariable.cpp
+++ b/flang/lib/Lower/ConvertVariable.cpp
@@ -1603,7 +1603,7 @@ fir::CUDAAttributeAttr Fortran::lower::translateSymbolCUDAAttribute(
       break;
     case Fortran::common::CUDADataAttr::Texture:
       // Obsolete attribute
-      break;
+      return {};
     }
 
     return fir::CUDAAttributeAttr::get(mlirContext, attr);

>From 0afe5c583b537fffd5b77dcdf433f75b2fb9b497 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nicolai=20H=C3=A4hnle?= <nicolai.haehnle at amd.com>
Date: Thu, 8 Feb 2024 19:24:55 +0100
Subject: [PATCH 71/72] docs/GettingStarted: document linker-related cmake
 options (#80932)

Both LLVM_LINK_LLVM_DYLIB and LLVM_PARALLEL_LINK_JOBS help with some
common gotchas. It seems worth documenting them here explicitly.

Based on a review comment, also "refactor" the documentation to avoid duplication.
---
 llvm/docs/CMake.rst          |  2 +
 llvm/docs/GettingStarted.rst | 86 +++++++-----------------------------
 2 files changed, 19 insertions(+), 69 deletions(-)

diff --git a/llvm/docs/CMake.rst b/llvm/docs/CMake.rst
index 13d1912ceb2ab8..20f73c99bff89d 100644
--- a/llvm/docs/CMake.rst
+++ b/llvm/docs/CMake.rst
@@ -178,6 +178,8 @@ variable and type on the CMake command line:
 
   $ cmake -DVARIABLE:TYPE=value path/to/llvm/source
 
+.. _cmake_frequently_used_variables:
+
 Frequently-used CMake variables
 -------------------------------
 
diff --git a/llvm/docs/GettingStarted.rst b/llvm/docs/GettingStarted.rst
index 316fc6ad86b848..687d1f29b5a1fe 100644
--- a/llvm/docs/GettingStarted.rst
+++ b/llvm/docs/GettingStarted.rst
@@ -540,75 +540,23 @@ Variables are passed to ``cmake`` on the command line using the format
 ``-D<variable name>=<value>``. The following variables are some common options
 used by people developing LLVM.
 
-+-------------------------+----------------------------------------------------+
-| Variable                | Purpose                                            |
-+=========================+====================================================+
-| CMAKE_C_COMPILER        | Tells ``cmake`` which C compiler to use. By        |
-|                         | default, this will be /usr/bin/cc.                 |
-+-------------------------+----------------------------------------------------+
-| CMAKE_CXX_COMPILER      | Tells ``cmake`` which C++ compiler to use. By      |
-|                         | default, this will be /usr/bin/c++.                |
-+-------------------------+----------------------------------------------------+
-| CMAKE_BUILD_TYPE        | Tells ``cmake`` what type of build you are trying  |
-|                         | to generate files for. Valid options are Debug,    |
-|                         | Release, RelWithDebInfo, and MinSizeRel. Default   |
-|                         | is Debug.                                          |
-+-------------------------+----------------------------------------------------+
-| CMAKE_INSTALL_PREFIX    | Specifies the install directory to target when     |
-|                         | running the install action of the build files.     |
-+-------------------------+----------------------------------------------------+
-| Python3_EXECUTABLE      | Forces CMake to use a specific Python version by   |
-|                         | passing a path to a Python interpreter. By default |
-|                         | the Python version of the interpreter in your PATH |
-|                         | is used.                                           |
-+-------------------------+----------------------------------------------------+
-| LLVM_TARGETS_TO_BUILD   | A semicolon delimited list controlling which       |
-|                         | targets will be built and linked into llvm.        |
-|                         | The default list is defined as                     |
-|                         | ``LLVM_ALL_TARGETS``, and can be set to include    |
-|                         | out-of-tree targets. The default value includes:   |
-|                         | ``AArch64, AMDGPU, ARM, AVR, BPF, Hexagon, Lanai,  |
-|                         | Mips, MSP430, NVPTX, PowerPC, RISCV, Sparc,        |
-|                         | SystemZ, WebAssembly, X86, XCore``. Setting this   |
-|                         | to ``"host"`` will only compile the host           |
-|                         | architecture (e.g. equivalent to specifying ``X86``|
-|                         | on an x86 host machine) can                        |
-|                         | significantly speed up compile and test times.     |
-+-------------------------+----------------------------------------------------+
-| LLVM_ENABLE_DOXYGEN     | Build doxygen-based documentation from the source  |
-|                         | code This is disabled by default because it is     |
-|                         | slow and generates a lot of output.                |
-+-------------------------+----------------------------------------------------+
-| LLVM_ENABLE_PROJECTS    | A semicolon-delimited list selecting which of the  |
-|                         | other LLVM subprojects to additionally build. (Only|
-|                         | effective when using a side-by-side project layout |
-|                         | e.g. via git). The default list is empty. Can      |
-|                         | include: clang, clang-tools-extra,                 |
-|                         | cross-project-tests, flang, libc, libclc, lld,     |
-|                         | lldb, mlir, openmp, polly, or pstl.                |
-+-------------------------+----------------------------------------------------+
-| LLVM_ENABLE_RUNTIMES    | A semicolon-delimited list selecting which of the  |
-|                         | runtimes to build. (Only effective when using the  |
-|                         | full monorepo layout). The default list is empty.  |
-|                         | Can include: compiler-rt, libc, libcxx, libcxxabi, |
-|                         | libunwind, or openmp.                              |
-+-------------------------+----------------------------------------------------+
-| LLVM_ENABLE_SPHINX      | Build sphinx-based documentation from the source   |
-|                         | code. This is disabled by default because it is    |
-|                         | slow and generates a lot of output. Sphinx version |
-|                         | 1.5 or later recommended.                          |
-+-------------------------+----------------------------------------------------+
-| LLVM_BUILD_LLVM_DYLIB   | Generate libLLVM.so. This library contains a       |
-|                         | default set of LLVM components that can be         |
-|                         | overridden with ``LLVM_DYLIB_COMPONENTS``. The     |
-|                         | default contains most of LLVM and is defined in    |
-|                         | ``tools/llvm-shlib/CMakelists.txt``. This option is|
-|                         | not available on Windows.                          |
-+-------------------------+----------------------------------------------------+
-| LLVM_OPTIMIZED_TABLEGEN | Builds a release tablegen that gets used during    |
-|                         | the LLVM build. This can dramatically speed up     |
-|                         | debug builds.                                      |
-+-------------------------+----------------------------------------------------+
+* ``CMAKE_C_COMPILER``
+* ``CMAKE_CXX_COMPILER``
+* ``CMAKE_BUILD_TYPE``
+* ``CMAKE_INSTALL_PREFIX``
+* ``Python3_EXECUTABLE``
+* ``LLVM_TARGETS_TO_BUILD``
+* ``LLVM_ENABLE_PROJECTS``
+* ``LLVM_ENABLE_RUNTIMES``
+* ``LLVM_ENABLE_DOXYGEN``
+* ``LLVM_ENABLE_SPHINX``
+* ``LLVM_BUILD_LLVM_DYLIB``
+* ``LLVM_LINK_LLVM_DYLIB``
+* ``LLVM_PARALLEL_LINK_JOBS``
+* ``LLVM_OPTIMIZED_TABLEGEN``
+
+See :ref:`the list of frequently-used CMake variables <cmake_frequently_used_variables>`
+for more information.
 
 To configure LLVM, follow these steps:
 

>From 8375a4c2efc79afad234cfa33ac320dcd0b00122 Mon Sep 17 00:00:00 2001
From: mahtohappy <Happy.Kumar at windriver.com>
Date: Thu, 8 Feb 2024 00:55:58 -0800
Subject: [PATCH 72/72] [Clang][Sema] Diagnosis for constexpr constructor not
 initializing a union member

---
 clang/docs/ReleaseNotes.rst | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 0072495354b8eb..7a10b52527cf8c 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -157,6 +157,9 @@ Improvements to Clang's diagnostics
 - The ``-Wshorten-64-to-32`` diagnostic is now grouped under ``-Wimplicit-int-conversion`` instead
    of ``-Wconversion``. Fixes `#69444 <https://github.com/llvm/llvm-project/issues/69444>`_.
 
+- Clang now diagnoses a constexpr constructor that does not initialize at least one member of a union.
+  Fixes #46689 (constexpr constructor not initializing a union member is not diagnosed).
+
 Improvements to Clang's time-trace
 ----------------------------------
 


