[libcxx-commits] [clang] [compiler-rt] [flang] [libc] [libcxx] [libcxxabi] [lld] [llvm] release note is nullptr removal (PR #101638)

Fri Aug 2 01:54:46 PDT 2024

https://github.com/philnik777 created https://github.com/llvm/llvm-project/pull/101638

- **Bump version to 19.1.0git**
- **[Infra] Fix version-check workflow (#100090)**
- **[LV] Disable VPlan-based cost model for 19.x release.**
- **Revert " [LICM] Fold associative binary ops to promote code hoisting  (#81608)"**
- **[clang][test] Add function type discrimination tests to static destructor tests (#99604)**
- **[PAC][compiler-rt][UBSan] Strip signed vptr instead of authenticating it (#100153)**
- **[LoongArch] Fix codegen for ISD::ROTR (#100292)**
- **[NVPTX] Fix internal indirect call prototypes not obeying the ABI (#100131)**
- **[PowerPC] Add builtin_cpu_is P11 support (#99550)**
- **[libc++][math] Fix undue overflowing of `std::hypot(x,y,z)` (#93350)**
- **[libc++][vector<bool>] Tests shrink_to_fit requirement. (#98009)**
- **[libc++][string] Fixes shrink_to_fit. (#97961)**
- **[PowerPC] Add support for -mcpu=pwr11 / -mtune=pwr11 (#99511)**
- **[clang][OpenMP] Propoagate debug location to OMPIRBuilder reduction codegen (#100358)**
- **[clang] Define `ATOMIC_FLAG_INIT` correctly for C++. (#97534)**
- **Precommit vscale-fixups.ll test (NFC)**
- **[LSR] Fix matching vscale immediates (#100080)**
- **[ValueTracking] Don't use CondContext in dataflow analysis of phi nodes (#100316)**
- **[PAC] Define __builtin_ptrauth_type_discriminator (#100204)**
- **[flang] fix C_PTR function result lowering (#100082)**
- **[RISCV] Fix InsnCI register type (#100113)**
- **[ARM] Create mapping symbols with non-unique names**
- **[libc++][doc] Update the release notes for LLVM 19. (#99061)**
- **[clang][headers] Including stddef.h always redefines NULL (#99727)**
- **[LLVM] [MC] Update frame layout & CFI generation to handle frames larger than 2gb (#99263)**
- **[Clang] Fix an assertion failure introduced by #93430 (#100313)**
- **[Clang][NFC] Simplify initialization of `OverloadCandidate` objects. (#100318)**
- **[libc++] Improve behavior when using relative path for LIBCXX_ASSERTION_HANDLER_FILE (#100157)**
- **[libc++][spaceship] Implements X::iterator container requirements. (#99343)**
- **[ExprConstant] Handle shift overflow the same way as other kinds of overflow (#99579)**
- **[AArch64] Implement INIT/ADJUST_TRAMPOLINE (#70267)**
- **[Flang][Docs] Update information about AArch64 trampolines (#100391)**
- **[PAC][clang] Enable `-fptrauth-indirect-gotos` as part of pauthtest ABI (#100480)**
- **[libc] Only add '-fno-builtin-*' on the entrypoints that use them (#100481)**
- **[Flang][Driver] Enable config file options (#100343)**
- **[AArch64][SME] Rewrite __arm_get_current_vg to preserve required registers (#100143)**
- **[clang] Remove `__is_layout_compatible` from revertible type traits list (#100572)**
- **[libc++] Add missing xlocale.h include on Apple and FreeBSD (#99689)**
- **Normalize ptrauth handling in sanitizer runtime (#100483)**
- **[flang][OpenMP] Initialize privatised derived type variables (#100417)**
- **[compiler-rt][ubsan][nfc-ish] Fix a type conversion bug (#100665)**
- **[BasicAA] Fix handling of indirect assumption based results (#100130)**
- **[PAC] Sign LR with B key for non-leaf functions with ptrauth-returns attr (#100552)**
- **[flang][debug] Set scope of internal functions correctly. (#99531)**
- **[Utils] Updates to bump-version.py (#100089)**
- **[MLGO][Infra] Add mlgo-utils to bump-version script (#100186)**
- **Set version to 19.1.0-rc1**
- **Revert "[clang-format] Fix a bug in annotating `*` in `#define`s (#99433)"**
- **[LoongArch][MC] Support %[ld_/gd_/desc_]pcrel_20**
- **[libc++] Fix bug in atomic_ref's calculation of lock_free-ness (#99570)**
- **[RISCV] Don't crash in RISCVMergeBaseOffset if INLINE_ASM uses address register in a non-memory constraint. (#100790)**
- **[libc++][libc++abi] Minor follow-up changes after ptrauth upstreaming (#87481)**
- **Fix lifetimebound for field access (#100197)**
- **[ELF] Remove obsoleted comment after #99567**
- **[ELF,test] Improve negative linker script tests**
- **[ELF] Add Relocs and invokeOnRelocs. NFC**
- **[ELF] Use invokeOnRelocs. NFC**
- **[llvm-exegesis] Use correct rseq struct size (#100804)**
- **[lld][ELF][LoongArch] Support R_LARCH_TLS_{LD,GD,DESC}_PCREL_S2**
- **[compiler-rt][test] Disable lld tests on SPARC (#100533)**
- **[asan][cmake][test] Fix finding dynamic asan runtime lib (#100083)**
- **[libc] Fix leftover debug commandline argument**
- **Update libc/docs/configure.rst**
- **[StackFrameLayoutAnalysis] Use target-specific hook for SP offsets (#100386)**
- **[StackFrameLayoutAnalysis] Support more SlotTypes (#100562)**
- **[PAC][test] Add tests against Linux triples for auth/resign lowering (#100744)**
- **[PAC][clang][test] Implement missing tests for some PAuth features (#100206)**
- **[compiler-rt] Fix format string warnings in FreeBSD DumpAllRegisters (#101072)**
- **[nsan] Remove mallopt from nsan_interceptors (#101055)**
- **[clang-format] Fix misannotations of `<` in ternary expressions (#100980)**
- **[NVPTX] Fix DwarfFrameBase construction (#101000)**
- **[clang][ARM64EC] Add support for hybrid_patchable attribute. (#99478)**
- **[sanitizer_common][test] Always skip select allocator tests on SPARC V9 (#100530)**
- **[libc++][spaceship] Marks P1614 as complete. (#99375)**
- **[RegisterCoalescer] Fix SUBREG_TO_REG handling in the RegisterCoalescer. (#96839)**
- **[libunwind][AIX] Fix the wrong traceback from signal handler (#101069)**
- **[CodeGen][ARM64EC] Use alias symbol for exporting hybrid_patchable functions. (#100872)**
- **ReleaseNotes.rst: Fix typo "my" for "may"**
- **[clang][FMV][AArch64] Improve streaming mode compatibility.**
- **[Sanitizers] Avoid overload ambiguity for interceptors (#100986)**
- **Revert "[MC] Compute fragment offsets eagerly"**
- **Revert "[compiler-rt][RISCV] Implement __init_riscv_feature_bits (#85790)"**
- **[libc++] Revert "Use GCC type traits builtins for remove_cv and remove_cvref (#81386)"**
- **[Support] Silence warnings when retrieving exported functions (#97905)**
- **[InstrProf] Remove duplicate definition of IntPtrT**
- **workflows: Fix libclc-tests (#101524)**
- **[lldb][FreeBSD] Fix NativeRegisterContextFreeBSD_{arm,mips64,powerpc} declarations (#101403)**
- **[libc++] Increase atomic_ref's required alignment for small types (#99654)**
- **[NFC][libc++][libc++abi][libunwind][test] Fix/unify AIX triples used in LIT tests (#101196)**
- **[ELF] Support relocatable files using CREL with explicit addends**
- **[Clang] Add a release note deprecating __is_nullptr**


From c2dbaeb91a45aeb6d26f22efef318b5f5a0eb629 Mon Sep 17 00:00:00 2001
From: Tobias Hieta <tobias at hieta.se>
Date: Tue, 23 Jul 2024 11:06:16 +0200
Subject: [PATCH 01/91] Bump version to 19.1.0git

---
 cmake/Modules/LLVMVersion.cmake          | 2 +-
 libcxx/include/__config                  | 2 +-
 llvm/utils/gn/secondary/llvm/version.gni | 2 +-
 llvm/utils/lit/lit/__init__.py           | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)
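
For readers checking the bump by hand: the `__config` comment quoted in this patch says `_LIBCPP_VERSION` is `XXYYZZ` for a release LLVM XX.YY.ZZ. A minimal Python sketch of that encoding follows. The helper name `libcpp_version` is hypothetical and is not part of libc++ or the LLVM tooling.

```python
def libcpp_version(major: int, minor: int, patch: int) -> int:
    """Encode XX.YY.ZZ as the integer XXYYZZ (two decimal digits for YY and ZZ)."""
    if not (0 <= minor < 100 and 0 <= patch < 100):
        raise ValueError("minor and patch must each fit in two decimal digits")
    return major * 10000 + minor * 100 + patch


# The old and new values of the macro in this patch.
old_value = libcpp_version(19, 0, 0)
new_value = libcpp_version(19, 1, 0)
print(old_value, new_value)
```

Under this rule, 19.0.0 and 19.1.0 map to the two values that appear on the `-` and `+` lines of the `__config` hunk.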

diff --git a/cmake/Modules/LLVMVersion.cmake b/cmake/Modules/LLVMVersion.cmake
index 5e28283fbc1c6..aea9b880180ab 100644
--- a/cmake/Modules/LLVMVersion.cmake
+++ b/cmake/Modules/LLVMVersion.cmake
@@ -4,7 +4,7 @@ if(NOT DEFINED LLVM_VERSION_MAJOR)
   set(LLVM_VERSION_MAJOR 19)
 endif()
 if(NOT DEFINED LLVM_VERSION_MINOR)
-  set(LLVM_VERSION_MINOR 0)
+  set(LLVM_VERSION_MINOR 1)
 endif()
 if(NOT DEFINED LLVM_VERSION_PATCH)
   set(LLVM_VERSION_PATCH 0)
diff --git a/libcxx/include/__config b/libcxx/include/__config
index 108f700823cbf..661af5be3c225 100644
--- a/libcxx/include/__config
+++ b/libcxx/include/__config
@@ -27,7 +27,7 @@
 // _LIBCPP_VERSION represents the version of libc++, which matches the version of LLVM.
 // Given a LLVM release LLVM XX.YY.ZZ (e.g. LLVM 17.0.1 == 17.00.01), _LIBCPP_VERSION is
 // defined to XXYYZZ.
-#  define _LIBCPP_VERSION 190000
+#  define _LIBCPP_VERSION 190100
 
 #  define _LIBCPP_CONCAT_IMPL(_X, _Y) _X##_Y
 #  define _LIBCPP_CONCAT(_X, _Y) _LIBCPP_CONCAT_IMPL(_X, _Y)
diff --git a/llvm/utils/gn/secondary/llvm/version.gni b/llvm/utils/gn/secondary/llvm/version.gni
index 7c02ed396db5f..3f44a4645acf6 100644
--- a/llvm/utils/gn/secondary/llvm/version.gni
+++ b/llvm/utils/gn/secondary/llvm/version.gni
@@ -1,4 +1,4 @@
 llvm_version_major = 19
-llvm_version_minor = 0
+llvm_version_minor = 1
 llvm_version_patch = 0
 llvm_version = "$llvm_version_major.$llvm_version_minor.$llvm_version_patch"
diff --git a/llvm/utils/lit/lit/__init__.py b/llvm/utils/lit/lit/__init__.py
index a5a1ff66bf417..03edfc3360972 100644
--- a/llvm/utils/lit/lit/__init__.py
+++ b/llvm/utils/lit/lit/__init__.py
@@ -2,7 +2,7 @@
 
 __author__ = "Daniel Dunbar"
 __email__ = "daniel at minormatter.com"
-__versioninfo__ = (19, 0, 0)
+__versioninfo__ = (19, 1, 0)
 __version__ = ".".join(str(v) for v in __versioninfo__) + "dev"
 
 __all__ = []

From fc9f6b0e0d8d8f876883883227da3cbd9ab2eb53 Mon Sep 17 00:00:00 2001
From: Tobias Hieta <tobias at hieta.se>
Date: Tue, 23 Jul 2024 13:03:27 +0200
Subject: [PATCH 02/91] [Infra] Fix version-check workflow (#100090)

---
 .github/workflows/version-check.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
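
The workflow step builds a dotted version string with a `grep | cut | tr | sed` pipeline, and this patch only changes which file it reads. As a rough illustration of what the pipeline extracts, here is a Python sketch run on the `LLVMVersion.cmake` lines shown as context in patch 01. The function name `extract_version` and the inline sample text are assumptions made for the example; the real workflow uses the shell pipeline.

```python
import re

# Sample input assembled from the LLVMVersion.cmake context lines in patch 01.
CMAKE_TEXT = """\
if(NOT DEFINED LLVM_VERSION_MAJOR)
  set(LLVM_VERSION_MAJOR 19)
endif()
if(NOT DEFINED LLVM_VERSION_MINOR)
  set(LLVM_VERSION_MINOR 1)
endif()
if(NOT DEFINED LLVM_VERSION_PATCH)
  set(LLVM_VERSION_PATCH 0)
endif()
"""


def extract_version(text: str) -> str:
    # Mirrors grep -o plus cut: keep only the digits that follow
    # "LLVM_VERSION_<PART> ". The "if(NOT DEFINED ...)" lines do not match
    # because no space and number follow the variable name there.
    numbers = re.findall(r"LLVM_VERSION_(?:MAJOR|MINOR|PATCH) ([0-9]+)", text)
    # Mirrors tr "\n" "." followed by stripping the trailing dot.
    return ".".join(numbers)


version = extract_version(CMAKE_TEXT)
print(version)
```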

diff --git a/.github/workflows/version-check.yml b/.github/workflows/version-check.yml
index 4ce6119a407f5..894e07d323ca9 100644
--- a/.github/workflows/version-check.yml
+++ b/.github/workflows/version-check.yml
@@ -27,5 +27,5 @@ jobs:
 
       - name: Version Check
         run: |
-          version=$(grep -o 'LLVM_VERSION_\(MAJOR\|MINOR\|PATCH\) [0-9]\+' llvm/CMakeLists.txt  | cut -d ' ' -f 2 | tr "\n" "." | sed 's/.$//g')
+          version=$(grep -o 'LLVM_VERSION_\(MAJOR\|MINOR\|PATCH\) [0-9]\+' cmake/Modules/LLVMVersion.cmake  | cut -d ' ' -f 2 | tr "\n" "." | sed 's/.$//g')
           .github/workflows/version-check.py "$version"

From 183e8ecc97a996c24e920e7e9668bc65a0d19439 Mon Sep 17 00:00:00 2001
From: Florian Hahn <flo at fhahn.com>
Date: Tue, 23 Jul 2024 11:15:26 +0100
Subject: [PATCH 03/91] [LV] Disable VPlan-based cost model for 19.x release.

As discussed in https://github.com/llvm/llvm-project/pull/92555 flip
the default for the option added in
https://github.com/llvm/llvm-project/pull/99536 to true.

This restores the original behavior for the release branch to give the
VPlan-based cost model more time to mature on main.
---
 llvm/lib/Transforms/Vectorize/LoopVectorize.cpp                 | 2 +-
 .../test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll | 2 --
 .../Inputs/x86-loopvectorize-costmodel.ll.expected              | 1 -
 3 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 6d28b8fabe42e..68363abdb817a 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -206,7 +206,7 @@ static cl::opt<unsigned> VectorizeMemoryCheckThreshold(
     cl::desc("The maximum allowed number of runtime memory checks"));
 
 static cl::opt<bool> UseLegacyCostModel(
-    "vectorize-use-legacy-cost-model", cl::init(false), cl::Hidden,
+    "vectorize-use-legacy-cost-model", cl::init(true), cl::Hidden,
     cl::desc("Use the legacy cost model instead of the VPlan-based cost model. "
              "This option will be removed in the future."));
 
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll b/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
index fc310f4163082..1a78eaf644723 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
@@ -135,7 +135,6 @@ define void @vector_reverse_i64(ptr nocapture noundef writeonly %A, ptr nocaptur
 ; CHECK-NEXT:  LV: Interleaving is not beneficial.
 ; CHECK-NEXT:  LV: Found a vectorizable loop (vscale x 4) in <stdin>
 ; CHECK-NEXT:  LEV: Epilogue vectorization is not profitable for this loop
-; CHECK-NEXT:  VF picked by VPlan cost model: vscale x 4
 ; CHECK-NEXT:  Executing best plan with VF=vscale x 4, UF=1
 ; CHECK-NEXT:  VPlan 'Final VPlan for VF={vscale x 4},UF>=1' {
 ; CHECK-NEXT:  Live-in vp<%0> = VF * UF
@@ -339,7 +338,6 @@ define void @vector_reverse_f32(ptr nocapture noundef writeonly %A, ptr nocaptur
 ; CHECK-NEXT:  LV: Interleaving is not beneficial.
 ; CHECK-NEXT:  LV: Found a vectorizable loop (vscale x 4) in <stdin>
 ; CHECK-NEXT:  LEV: Epilogue vectorization is not profitable for this loop
-; CHECK-NEXT:  VF picked by VPlan cost model: vscale x 4
 ; CHECK-NEXT:  Executing best plan with VF=vscale x 4, UF=1
 ; CHECK-NEXT:  VPlan 'Final VPlan for VF={vscale x 4},UF>=1' {
 ; CHECK-NEXT:  Live-in vp<%0> = VF * UF
diff --git a/llvm/test/tools/UpdateTestChecks/update_analyze_test_checks/Inputs/x86-loopvectorize-costmodel.ll.expected b/llvm/test/tools/UpdateTestChecks/update_analyze_test_checks/Inputs/x86-loopvectorize-costmodel.ll.expected
index 5aa270e76f4c8..e862bf87d265c 100644
--- a/llvm/test/tools/UpdateTestChecks/update_analyze_test_checks/Inputs/x86-loopvectorize-costmodel.ll.expected
+++ b/llvm/test/tools/UpdateTestChecks/update_analyze_test_checks/Inputs/x86-loopvectorize-costmodel.ll.expected
@@ -17,7 +17,6 @@ define void @test() {
 ; CHECK:  LV: Found an estimated cost of 5 for VF 16 For instruction: %v0 = load float, ptr %in0, align 4
 ; CHECK:  LV: Found an estimated cost of 22 for VF 32 For instruction: %v0 = load float, ptr %in0, align 4
 ; CHECK:  LV: Found an estimated cost of 92 for VF 64 For instruction: %v0 = load float, ptr %in0, align 4
-; CHECK:  LV: Found an estimated cost of 1 for VF 1 For instruction: %v0 = load float, ptr %in0, align 4
 ;
 entry:
   br label %for.body

From c404c7ce7e917a42b8c7d677606c1e3dd476fbc7 Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Tue, 23 Jul 2024 12:00:53 +0200
Subject: [PATCH 04/91] Revert " [LICM] Fold associative binary ops to promote
 code hoisting  (#81608)"

This reverts commit f2ccf80136a01ca69f766becafb329db6c54c0c8.

The flag propagation code is incorrect.

(cherry picked from commit b48819dbcdb48fc737dc22304ac343e4fdbae9ff)
---
 llvm/lib/Transforms/Scalar/LICM.cpp           |  62 ----
 llvm/test/CodeGen/PowerPC/common-chain.ll     | 315 +++++++++---------
 llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll   |  16 +-
 llvm/test/Transforms/LICM/hoist-binop.ll      |  99 ------
 llvm/test/Transforms/LICM/sink-foldable.ll    |   4 +-
 .../LICM/update-scev-after-hoist.ll           |   2 +-
 6 files changed, 163 insertions(+), 335 deletions(-)
 delete mode 100644 llvm/test/Transforms/LICM/hoist-binop.ll
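
The commit message only states that the flag propagation is incorrect. As a hedged illustration of why reusing flags after the "(LV op C1) op C2" ==> "LV op (C1 op C2)" rewrite can be unsafe in general (an assumption about the failure mode, not something the commit spells out), the Python sketch below models a signed 8-bit add. The helper names `wrap_i8` and `add_i8` are hypothetical.

```python
def wrap_i8(x: int) -> int:
    """Wrap an integer into the signed 8-bit range [-128, 127]."""
    return (x + 128) % 256 - 128


def add_i8(a: int, b: int):
    """Return (wrapped result, signed_overflow) for an 8-bit add."""
    exact = a + b
    return wrap_i8(exact), not (-128 <= exact <= 127)


LV, C1, C2 = -100, 100, 100

# Original form: (LV op C1) op C2
tmp, ovf_inner = add_i8(LV, C1)
orig, ovf_outer = add_i8(tmp, C2)

# Reassociated form: LV op (C1 op C2), with (C1 op C2) as the hoisted invariant
inv, ovf_inv = add_i8(C1, C2)
reass, ovf_reass = add_i8(LV, inv)

print(orig, reass, ovf_inner, ovf_outer, ovf_inv, ovf_reass)
```

For these inputs the wrapped results agree, so the rewrite preserves the value. The original adds never overflow, but both adds in the reassociated form do, so a no-signed-wrap flag copied from the original operation would not be justified on the new one.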

diff --git a/llvm/lib/Transforms/Scalar/LICM.cpp b/llvm/lib/Transforms/Scalar/LICM.cpp
index fe264503dee9e..91ef2b4b7c183 100644
--- a/llvm/lib/Transforms/Scalar/LICM.cpp
+++ b/llvm/lib/Transforms/Scalar/LICM.cpp
@@ -113,8 +113,6 @@ STATISTIC(NumFPAssociationsHoisted, "Number of invariant FP expressions "
 STATISTIC(NumIntAssociationsHoisted,
           "Number of invariant int expressions "
           "reassociated and hoisted out of the loop");
-STATISTIC(NumBOAssociationsHoisted, "Number of invariant BinaryOp expressions "
-                                    "reassociated and hoisted out of the loop");
 
 /// Memory promotion is enabled by default.
 static cl::opt<bool>
@@ -2781,60 +2779,6 @@ static bool hoistMulAddAssociation(Instruction &I, Loop &L,
   return true;
 }
 
-/// Reassociate general associative binary expressions of the form
-///
-/// 1. "(LV op C1) op C2" ==> "LV op (C1 op C2)"
-///
-/// where op is an associative binary op, LV is a loop variant, and C1 and C2
-/// are loop invariants that we want to hoist.
-///
-/// TODO: This can be extended to more cases such as
-/// 2. "C1 op (C2 op LV)" ==> "(C1 op C2) op LV"
-/// 3. "(C1 op LV) op C2" ==> "LV op (C1 op C2)" if op is commutative
-/// 4. "C1 op (LV op C2)" ==> "(C1 op C2) op LV" if op is commutative
-static bool hoistBOAssociation(Instruction &I, Loop &L,
-                               ICFLoopSafetyInfo &SafetyInfo,
-                               MemorySSAUpdater &MSSAU, AssumptionCache *AC,
-                               DominatorTree *DT) {
-  BinaryOperator *BO = dyn_cast<BinaryOperator>(&I);
-  if (!BO || !BO->isAssociative())
-    return false;
-
-  Instruction::BinaryOps Opcode = BO->getOpcode();
-  BinaryOperator *Op0 = dyn_cast<BinaryOperator>(BO->getOperand(0));
-
-  // Transform: "(LV op C1) op C2" ==> "LV op (C1 op C2)"
-  if (Op0 && Op0->getOpcode() == Opcode) {
-    Value *LV = Op0->getOperand(0);
-    Value *C1 = Op0->getOperand(1);
-    Value *C2 = BO->getOperand(1);
-
-    if (L.isLoopInvariant(LV) || !L.isLoopInvariant(C1) ||
-        !L.isLoopInvariant(C2))
-      return false;
-
-    auto *Preheader = L.getLoopPreheader();
-    assert(Preheader && "Loop is not in simplify form?");
-    IRBuilder<> Builder(Preheader->getTerminator());
-    Value *Inv = Builder.CreateBinOp(Opcode, C1, C2, "invariant.op");
-
-    auto *NewBO =
-        BinaryOperator::Create(Opcode, LV, Inv, BO->getName() + ".reass", BO);
-    NewBO->copyIRFlags(BO);
-    BO->replaceAllUsesWith(NewBO);
-    eraseInstruction(*BO, SafetyInfo, MSSAU);
-
-    // Note: (LV op C1) might not be erased if it has more uses than the one we
-    //       just replaced.
-    if (Op0->use_empty())
-      eraseInstruction(*Op0, SafetyInfo, MSSAU);
-
-    return true;
-  }
-
-  return false;
-}
-
 static bool hoistArithmetics(Instruction &I, Loop &L,
                              ICFLoopSafetyInfo &SafetyInfo,
                              MemorySSAUpdater &MSSAU, AssumptionCache *AC,
@@ -2872,12 +2816,6 @@ static bool hoistArithmetics(Instruction &I, Loop &L,
     return true;
   }
 
-  if (hoistBOAssociation(I, L, SafetyInfo, MSSAU, AC, DT)) {
-    ++NumHoisted;
-    ++NumBOAssociationsHoisted;
-    return true;
-  }
-
   return false;
 }
 
diff --git a/llvm/test/CodeGen/PowerPC/common-chain.ll b/llvm/test/CodeGen/PowerPC/common-chain.ll
index ccf0e4520f468..5f8c21e30f8fd 100644
--- a/llvm/test/CodeGen/PowerPC/common-chain.ll
+++ b/llvm/test/CodeGen/PowerPC/common-chain.ll
@@ -642,8 +642,8 @@ define i64 @two_chain_two_bases_succ(ptr %p, i64 %offset, i64 %base1, i64 %base2
 ; CHECK-NEXT:    cmpdi r7, 0
 ; CHECK-NEXT:    ble cr0, .LBB6_4
 ; CHECK-NEXT:  # %bb.1: # %for.body.preheader
-; CHECK-NEXT:    add r5, r5, r4
 ; CHECK-NEXT:    add r6, r6, r4
+; CHECK-NEXT:    add r5, r5, r4
 ; CHECK-NEXT:    mtctr r7
 ; CHECK-NEXT:    sldi r4, r4, 1
 ; CHECK-NEXT:    add r5, r3, r5
@@ -743,219 +743,214 @@ define signext i32 @spill_reduce_succ(ptr %input1, ptr %input2, ptr %output, i64
 ; CHECK-NEXT:    std r9, -184(r1) # 8-byte Folded Spill
 ; CHECK-NEXT:    std r8, -176(r1) # 8-byte Folded Spill
 ; CHECK-NEXT:    std r7, -168(r1) # 8-byte Folded Spill
-; CHECK-NEXT:    std r4, -160(r1) # 8-byte Folded Spill
+; CHECK-NEXT:    std r3, -160(r1) # 8-byte Folded Spill
 ; CHECK-NEXT:    ble cr0, .LBB7_7
 ; CHECK-NEXT:  # %bb.1: # %for.body.preheader
-; CHECK-NEXT:    sldi r4, r6, 2
-; CHECK-NEXT:    li r6, 1
-; CHECK-NEXT:    mr r0, r10
-; CHECK-NEXT:    std r10, -192(r1) # 8-byte Folded Spill
-; CHECK-NEXT:    cmpdi r4, 1
-; CHECK-NEXT:    iselgt r4, r4, r6
-; CHECK-NEXT:    addi r7, r4, -1
-; CHECK-NEXT:    clrldi r6, r4, 63
-; CHECK-NEXT:    cmpldi r7, 3
+; CHECK-NEXT:    sldi r6, r6, 2
+; CHECK-NEXT:    li r7, 1
+; CHECK-NEXT:    mr r30, r10
+; CHECK-NEXT:    cmpdi r6, 1
+; CHECK-NEXT:    iselgt r7, r6, r7
+; CHECK-NEXT:    addi r8, r7, -1
+; CHECK-NEXT:    clrldi r6, r7, 63
+; CHECK-NEXT:    cmpldi r8, 3
 ; CHECK-NEXT:    blt cr0, .LBB7_4
 ; CHECK-NEXT:  # %bb.2: # %for.body.preheader.new
-; CHECK-NEXT:    ld r0, -192(r1) # 8-byte Folded Reload
-; CHECK-NEXT:    ld r30, -184(r1) # 8-byte Folded Reload
-; CHECK-NEXT:    ld r8, -176(r1) # 8-byte Folded Reload
-; CHECK-NEXT:    rldicl r7, r4, 62, 2
-; CHECK-NEXT:    ld r9, -168(r1) # 8-byte Folded Reload
-; CHECK-NEXT:    add r11, r0, r30
-; CHECK-NEXT:    add r4, r0, r0
-; CHECK-NEXT:    mulli r23, r0, 24
-; CHECK-NEXT:    add r14, r0, r8
-; CHECK-NEXT:    sldi r12, r0, 5
-; CHECK-NEXT:    add r31, r0, r9
-; CHECK-NEXT:    sldi r9, r9, 3
-; CHECK-NEXT:    sldi r18, r0, 4
-; CHECK-NEXT:    sldi r8, r8, 3
-; CHECK-NEXT:    add r10, r4, r4
-; CHECK-NEXT:    sldi r4, r30, 3
-; CHECK-NEXT:    sldi r11, r11, 3
-; CHECK-NEXT:    add r26, r12, r9
-; CHECK-NEXT:    add r16, r18, r9
-; CHECK-NEXT:    add r29, r12, r8
-; CHECK-NEXT:    add r19, r18, r8
-; CHECK-NEXT:    add r30, r12, r4
-; CHECK-NEXT:    mr r20, r4
-; CHECK-NEXT:    std r4, -200(r1) # 8-byte Folded Spill
-; CHECK-NEXT:    ld r4, -160(r1) # 8-byte Folded Reload
-; CHECK-NEXT:    add r15, r5, r11
-; CHECK-NEXT:    sldi r11, r14, 3
-; CHECK-NEXT:    add r29, r5, r29
-; CHECK-NEXT:    add r28, r3, r26
-; CHECK-NEXT:    add r19, r5, r19
-; CHECK-NEXT:    add r21, r23, r9
-; CHECK-NEXT:    add r24, r23, r8
-; CHECK-NEXT:    add r14, r5, r11
-; CHECK-NEXT:    sldi r11, r31, 3
-; CHECK-NEXT:    add r25, r23, r20
-; CHECK-NEXT:    add r20, r18, r20
-; CHECK-NEXT:    add r30, r5, r30
-; CHECK-NEXT:    add r18, r3, r16
-; CHECK-NEXT:    add r24, r5, r24
-; CHECK-NEXT:    add r23, r3, r21
-; CHECK-NEXT:    add r27, r4, r26
-; CHECK-NEXT:    add r22, r4, r21
-; CHECK-NEXT:    add r17, r4, r16
-; CHECK-NEXT:    add r2, r4, r11
-; CHECK-NEXT:    rldicl r4, r7, 2, 1
-; CHECK-NEXT:    sub r7, r8, r9
-; CHECK-NEXT:    ld r8, -200(r1) # 8-byte Folded Reload
+; CHECK-NEXT:    ld r14, -168(r1) # 8-byte Folded Reload
+; CHECK-NEXT:    mulli r24, r30, 24
+; CHECK-NEXT:    ld r16, -184(r1) # 8-byte Folded Reload
+; CHECK-NEXT:    ld r15, -176(r1) # 8-byte Folded Reload
+; CHECK-NEXT:    ld r3, -160(r1) # 8-byte Folded Reload
+; CHECK-NEXT:    rldicl r0, r7, 62, 2
+; CHECK-NEXT:    sldi r11, r30, 5
+; CHECK-NEXT:    sldi r19, r30, 4
+; CHECK-NEXT:    sldi r7, r14, 3
+; CHECK-NEXT:    add r14, r30, r14
+; CHECK-NEXT:    sldi r10, r16, 3
+; CHECK-NEXT:    sldi r12, r15, 3
+; CHECK-NEXT:    add r16, r30, r16
+; CHECK-NEXT:    add r15, r30, r15
+; CHECK-NEXT:    add r27, r11, r7
+; CHECK-NEXT:    add r22, r24, r7
+; CHECK-NEXT:    add r17, r19, r7
+; CHECK-NEXT:    sldi r2, r14, 3
+; CHECK-NEXT:    add r26, r24, r10
+; CHECK-NEXT:    add r25, r24, r12
+; CHECK-NEXT:    add r21, r19, r10
+; CHECK-NEXT:    add r20, r19, r12
+; CHECK-NEXT:    add r8, r11, r10
+; CHECK-NEXT:    sldi r16, r16, 3
+; CHECK-NEXT:    add r29, r5, r27
+; CHECK-NEXT:    add r28, r4, r27
+; CHECK-NEXT:    add r27, r3, r27
+; CHECK-NEXT:    add r24, r5, r22
+; CHECK-NEXT:    add r23, r4, r22
+; CHECK-NEXT:    add r22, r3, r22
+; CHECK-NEXT:    add r19, r5, r17
+; CHECK-NEXT:    add r18, r4, r17
+; CHECK-NEXT:    add r17, r3, r17
+; CHECK-NEXT:    add r14, r5, r2
+; CHECK-NEXT:    add r31, r4, r2
+; CHECK-NEXT:    add r2, r3, r2
+; CHECK-NEXT:    add r9, r5, r8
+; CHECK-NEXT:    add r8, r11, r12
 ; CHECK-NEXT:    add r26, r5, r26
 ; CHECK-NEXT:    add r25, r5, r25
 ; CHECK-NEXT:    add r21, r5, r21
 ; CHECK-NEXT:    add r20, r5, r20
 ; CHECK-NEXT:    add r16, r5, r16
-; CHECK-NEXT:    add r31, r5, r11
-; CHECK-NEXT:    add r11, r3, r11
-; CHECK-NEXT:    addi r4, r4, -4
-; CHECK-NEXT:    rldicl r4, r4, 62, 2
-; CHECK-NEXT:    sub r8, r8, r9
-; CHECK-NEXT:    li r9, 0
-; CHECK-NEXT:    addi r4, r4, 1
-; CHECK-NEXT:    mtctr r4
+; CHECK-NEXT:    add r8, r5, r8
+; CHECK-NEXT:    rldicl r3, r0, 2, 1
+; CHECK-NEXT:    addi r3, r3, -4
+; CHECK-NEXT:    sub r0, r12, r7
+; CHECK-NEXT:    sub r12, r10, r7
+; CHECK-NEXT:    li r7, 0
+; CHECK-NEXT:    mr r10, r30
+; CHECK-NEXT:    sldi r15, r15, 3
+; CHECK-NEXT:    add r15, r5, r15
+; CHECK-NEXT:    rldicl r3, r3, 62, 2
+; CHECK-NEXT:    addi r3, r3, 1
+; CHECK-NEXT:    mtctr r3
 ; CHECK-NEXT:    .p2align 4
 ; CHECK-NEXT:  .LBB7_3: # %for.body
 ; CHECK-NEXT:    #
-; CHECK-NEXT:    lfd f0, 0(r11)
-; CHECK-NEXT:    lfd f1, 0(r2)
-; CHECK-NEXT:    add r0, r0, r10
-; CHECK-NEXT:    xsmuldp f0, f0, f1
+; CHECK-NEXT:    lfd f0, 0(r2)
 ; CHECK-NEXT:    lfd f1, 0(r31)
+; CHECK-NEXT:    add r3, r10, r30
+; CHECK-NEXT:    add r3, r3, r30
+; CHECK-NEXT:    xsmuldp f0, f0, f1
+; CHECK-NEXT:    lfd f1, 0(r14)
+; CHECK-NEXT:    add r3, r3, r30
+; CHECK-NEXT:    add r10, r3, r30
 ; CHECK-NEXT:    xsadddp f0, f1, f0
-; CHECK-NEXT:    stfd f0, 0(r31)
-; CHECK-NEXT:    add r31, r31, r12
-; CHECK-NEXT:    lfdx f0, r11, r7
-; CHECK-NEXT:    lfdx f1, r2, r7
+; CHECK-NEXT:    stfd f0, 0(r14)
+; CHECK-NEXT:    add r14, r14, r11
+; CHECK-NEXT:    lfdx f0, r2, r0
+; CHECK-NEXT:    lfdx f1, r31, r0
 ; CHECK-NEXT:    xsmuldp f0, f0, f1
-; CHECK-NEXT:    lfdx f1, r14, r9
+; CHECK-NEXT:    lfdx f1, r15, r7
 ; CHECK-NEXT:    xsadddp f0, f1, f0
-; CHECK-NEXT:    stfdx f0, r14, r9
-; CHECK-NEXT:    lfdx f0, r11, r8
-; CHECK-NEXT:    lfdx f1, r2, r8
-; CHECK-NEXT:    add r11, r11, r12
-; CHECK-NEXT:    add r2, r2, r12
+; CHECK-NEXT:    stfdx f0, r15, r7
+; CHECK-NEXT:    lfdx f0, r2, r12
+; CHECK-NEXT:    lfdx f1, r31, r12
+; CHECK-NEXT:    add r2, r2, r11
+; CHECK-NEXT:    add r31, r31, r11
 ; CHECK-NEXT:    xsmuldp f0, f0, f1
-; CHECK-NEXT:    lfdx f1, r15, r9
+; CHECK-NEXT:    lfdx f1, r16, r7
 ; CHECK-NEXT:    xsadddp f0, f1, f0
-; CHECK-NEXT:    stfdx f0, r15, r9
-; CHECK-NEXT:    lfd f0, 0(r18)
-; CHECK-NEXT:    lfd f1, 0(r17)
+; CHECK-NEXT:    stfdx f0, r16, r7
+; CHECK-NEXT:    lfd f0, 0(r17)
+; CHECK-NEXT:    lfd f1, 0(r18)
 ; CHECK-NEXT:    xsmuldp f0, f0, f1
-; CHECK-NEXT:    lfdx f1, r16, r9
+; CHECK-NEXT:    lfdx f1, r19, r7
 ; CHECK-NEXT:    xsadddp f0, f1, f0
-; CHECK-NEXT:    stfdx f0, r16, r9
-; CHECK-NEXT:    lfdx f0, r18, r7
-; CHECK-NEXT:    lfdx f1, r17, r7
+; CHECK-NEXT:    stfdx f0, r19, r7
+; CHECK-NEXT:    lfdx f0, r17, r0
+; CHECK-NEXT:    lfdx f1, r18, r0
 ; CHECK-NEXT:    xsmuldp f0, f0, f1
-; CHECK-NEXT:    lfdx f1, r19, r9
+; CHECK-NEXT:    lfdx f1, r20, r7
 ; CHECK-NEXT:    xsadddp f0, f1, f0
-; CHECK-NEXT:    stfdx f0, r19, r9
-; CHECK-NEXT:    lfdx f0, r18, r8
-; CHECK-NEXT:    lfdx f1, r17, r8
-; CHECK-NEXT:    add r18, r18, r12
-; CHECK-NEXT:    add r17, r17, r12
+; CHECK-NEXT:    stfdx f0, r20, r7
+; CHECK-NEXT:    lfdx f0, r17, r12
+; CHECK-NEXT:    lfdx f1, r18, r12
+; CHECK-NEXT:    add r17, r17, r11
+; CHECK-NEXT:    add r18, r18, r11
 ; CHECK-NEXT:    xsmuldp f0, f0, f1
-; CHECK-NEXT:    lfdx f1, r20, r9
+; CHECK-NEXT:    lfdx f1, r21, r7
 ; CHECK-NEXT:    xsadddp f0, f1, f0
-; CHECK-NEXT:    stfdx f0, r20, r9
-; CHECK-NEXT:    lfd f0, 0(r23)
-; CHECK-NEXT:    lfd f1, 0(r22)
+; CHECK-NEXT:    stfdx f0, r21, r7
+; CHECK-NEXT:    lfd f0, 0(r22)
+; CHECK-NEXT:    lfd f1, 0(r23)
 ; CHECK-NEXT:    xsmuldp f0, f0, f1
-; CHECK-NEXT:    lfdx f1, r21, r9
+; CHECK-NEXT:    lfdx f1, r24, r7
 ; CHECK-NEXT:    xsadddp f0, f1, f0
-; CHECK-NEXT:    stfdx f0, r21, r9
-; CHECK-NEXT:    lfdx f0, r23, r7
-; CHECK-NEXT:    lfdx f1, r22, r7
+; CHECK-NEXT:    stfdx f0, r24, r7
+; CHECK-NEXT:    lfdx f0, r22, r0
+; CHECK-NEXT:    lfdx f1, r23, r0
 ; CHECK-NEXT:    xsmuldp f0, f0, f1
-; CHECK-NEXT:    lfdx f1, r24, r9
+; CHECK-NEXT:    lfdx f1, r25, r7
 ; CHECK-NEXT:    xsadddp f0, f1, f0
-; CHECK-NEXT:    stfdx f0, r24, r9
-; CHECK-NEXT:    lfdx f0, r23, r8
-; CHECK-NEXT:    lfdx f1, r22, r8
-; CHECK-NEXT:    add r23, r23, r12
-; CHECK-NEXT:    add r22, r22, r12
+; CHECK-NEXT:    stfdx f0, r25, r7
+; CHECK-NEXT:    lfdx f0, r22, r12
+; CHECK-NEXT:    lfdx f1, r23, r12
+; CHECK-NEXT:    add r22, r22, r11
+; CHECK-NEXT:    add r23, r23, r11
 ; CHECK-NEXT:    xsmuldp f0, f0, f1
-; CHECK-NEXT:    lfdx f1, r25, r9
+; CHECK-NEXT:    lfdx f1, r26, r7
 ; CHECK-NEXT:    xsadddp f0, f1, f0
-; CHECK-NEXT:    stfdx f0, r25, r9
-; CHECK-NEXT:    lfd f0, 0(r28)
-; CHECK-NEXT:    lfd f1, 0(r27)
+; CHECK-NEXT:    stfdx f0, r26, r7
+; CHECK-NEXT:    lfd f0, 0(r27)
+; CHECK-NEXT:    lfd f1, 0(r28)
 ; CHECK-NEXT:    xsmuldp f0, f0, f1
-; CHECK-NEXT:    lfdx f1, r26, r9
+; CHECK-NEXT:    lfdx f1, r29, r7
 ; CHECK-NEXT:    xsadddp f0, f1, f0
-; CHECK-NEXT:    stfdx f0, r26, r9
-; CHECK-NEXT:    lfdx f0, r28, r7
-; CHECK-NEXT:    lfdx f1, r27, r7
+; CHECK-NEXT:    stfdx f0, r29, r7
+; CHECK-NEXT:    lfdx f0, r27, r0
+; CHECK-NEXT:    lfdx f1, r28, r0
 ; CHECK-NEXT:    xsmuldp f0, f0, f1
-; CHECK-NEXT:    lfdx f1, r29, r9
+; CHECK-NEXT:    lfdx f1, r8, r7
 ; CHECK-NEXT:    xsadddp f0, f1, f0
-; CHECK-NEXT:    stfdx f0, r29, r9
-; CHECK-NEXT:    lfdx f0, r28, r8
-; CHECK-NEXT:    lfdx f1, r27, r8
-; CHECK-NEXT:    add r28, r28, r12
-; CHECK-NEXT:    add r27, r27, r12
+; CHECK-NEXT:    stfdx f0, r8, r7
+; CHECK-NEXT:    lfdx f0, r27, r12
+; CHECK-NEXT:    lfdx f1, r28, r12
+; CHECK-NEXT:    add r27, r27, r11
+; CHECK-NEXT:    add r28, r28, r11
 ; CHECK-NEXT:    xsmuldp f0, f0, f1
-; CHECK-NEXT:    lfdx f1, r30, r9
+; CHECK-NEXT:    lfdx f1, r9, r7
 ; CHECK-NEXT:    xsadddp f0, f1, f0
-; CHECK-NEXT:    stfdx f0, r30, r9
-; CHECK-NEXT:    add r9, r9, r12
+; CHECK-NEXT:    stfdx f0, r9, r7
+; CHECK-NEXT:    add r7, r7, r11
 ; CHECK-NEXT:    bdnz .LBB7_3
 ; CHECK-NEXT:  .LBB7_4: # %for.cond.cleanup.loopexit.unr-lcssa
-; CHECK-NEXT:    ld r7, -192(r1) # 8-byte Folded Reload
 ; CHECK-NEXT:    cmpldi r6, 0
 ; CHECK-NEXT:    beq cr0, .LBB7_7
 ; CHECK-NEXT:  # %bb.5: # %for.body.epil.preheader
-; CHECK-NEXT:    ld r4, -184(r1) # 8-byte Folded Reload
-; CHECK-NEXT:    ld r29, -160(r1) # 8-byte Folded Reload
-; CHECK-NEXT:    mr r30, r3
-; CHECK-NEXT:    sldi r7, r7, 3
-; CHECK-NEXT:    add r4, r0, r4
-; CHECK-NEXT:    sldi r4, r4, 3
-; CHECK-NEXT:    add r3, r5, r4
-; CHECK-NEXT:    add r8, r29, r4
-; CHECK-NEXT:    add r9, r30, r4
-; CHECK-NEXT:    ld r4, -176(r1) # 8-byte Folded Reload
-; CHECK-NEXT:    add r4, r0, r4
-; CHECK-NEXT:    sldi r4, r4, 3
-; CHECK-NEXT:    add r10, r5, r4
-; CHECK-NEXT:    add r11, r29, r4
-; CHECK-NEXT:    add r12, r30, r4
-; CHECK-NEXT:    ld r4, -168(r1) # 8-byte Folded Reload
-; CHECK-NEXT:    add r4, r0, r4
-; CHECK-NEXT:    sldi r0, r4, 3
-; CHECK-NEXT:    add r5, r5, r0
-; CHECK-NEXT:    add r4, r29, r0
-; CHECK-NEXT:    add r30, r30, r0
-; CHECK-NEXT:    li r0, 0
+; CHECK-NEXT:    ld r3, -184(r1) # 8-byte Folded Reload
+; CHECK-NEXT:    ld r0, -160(r1) # 8-byte Folded Reload
+; CHECK-NEXT:    sldi r8, r30, 3
+; CHECK-NEXT:    add r3, r10, r3
+; CHECK-NEXT:    sldi r3, r3, 3
+; CHECK-NEXT:    add r7, r5, r3
+; CHECK-NEXT:    add r9, r4, r3
+; CHECK-NEXT:    add r11, r0, r3
+; CHECK-NEXT:    ld r3, -176(r1) # 8-byte Folded Reload
+; CHECK-NEXT:    add r3, r10, r3
+; CHECK-NEXT:    sldi r3, r3, 3
+; CHECK-NEXT:    add r12, r5, r3
+; CHECK-NEXT:    add r30, r4, r3
+; CHECK-NEXT:    add r29, r0, r3
+; CHECK-NEXT:    ld r3, -168(r1) # 8-byte Folded Reload
+; CHECK-NEXT:    add r3, r10, r3
+; CHECK-NEXT:    li r10, 0
+; CHECK-NEXT:    sldi r3, r3, 3
+; CHECK-NEXT:    add r5, r5, r3
+; CHECK-NEXT:    add r4, r4, r3
+; CHECK-NEXT:    add r3, r0, r3
 ; CHECK-NEXT:    .p2align 4
 ; CHECK-NEXT:  .LBB7_6: # %for.body.epil
 ; CHECK-NEXT:    #
-; CHECK-NEXT:    lfdx f0, r30, r0
-; CHECK-NEXT:    lfdx f1, r4, r0
+; CHECK-NEXT:    lfdx f0, r3, r10
+; CHECK-NEXT:    lfdx f1, r4, r10
 ; CHECK-NEXT:    addi r6, r6, -1
 ; CHECK-NEXT:    cmpldi r6, 0
 ; CHECK-NEXT:    xsmuldp f0, f0, f1
 ; CHECK-NEXT:    lfd f1, 0(r5)
 ; CHECK-NEXT:    xsadddp f0, f1, f0
 ; CHECK-NEXT:    stfd f0, 0(r5)
-; CHECK-NEXT:    add r5, r5, r7
-; CHECK-NEXT:    lfdx f0, r12, r0
-; CHECK-NEXT:    lfdx f1, r11, r0
+; CHECK-NEXT:    add r5, r5, r8
+; CHECK-NEXT:    lfdx f0, r29, r10
+; CHECK-NEXT:    lfdx f1, r30, r10
 ; CHECK-NEXT:    xsmuldp f0, f0, f1
-; CHECK-NEXT:    lfdx f1, r10, r0
+; CHECK-NEXT:    lfdx f1, r12, r10
 ; CHECK-NEXT:    xsadddp f0, f1, f0
-; CHECK-NEXT:    stfdx f0, r10, r0
-; CHECK-NEXT:    lfdx f0, r9, r0
-; CHECK-NEXT:    lfdx f1, r8, r0
+; CHECK-NEXT:    stfdx f0, r12, r10
+; CHECK-NEXT:    lfdx f0, r11, r10
+; CHECK-NEXT:    lfdx f1, r9, r10
 ; CHECK-NEXT:    xsmuldp f0, f0, f1
-; CHECK-NEXT:    lfdx f1, r3, r0
+; CHECK-NEXT:    lfdx f1, r7, r10
 ; CHECK-NEXT:    xsadddp f0, f1, f0
-; CHECK-NEXT:    stfdx f0, r3, r0
-; CHECK-NEXT:    add r0, r0, r7
+; CHECK-NEXT:    stfdx f0, r7, r10
+; CHECK-NEXT:    add r10, r10, r8
 ; CHECK-NEXT:    bne cr0, .LBB7_6
 ; CHECK-NEXT:  .LBB7_7: # %for.cond.cleanup
 ; CHECK-NEXT:    ld r2, -152(r1) # 8-byte Folded Reload
diff --git a/llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll b/llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll
index c733a01950603..4b032781c3764 100644
--- a/llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll
+++ b/llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll
@@ -30,16 +30,14 @@ define dso_local void @P10_Spill_CR_LT() local_unnamed_addr {
 ; CHECK-NEXT:    mflr r0
 ; CHECK-NEXT:    std r0, 16(r1)
 ; CHECK-NEXT:    stw r12, 8(r1)
-; CHECK-NEXT:    stdu r1, -64(r1)
-; CHECK-NEXT:    .cfi_def_cfa_offset 64
+; CHECK-NEXT:    stdu r1, -48(r1)
+; CHECK-NEXT:    .cfi_def_cfa_offset 48
 ; CHECK-NEXT:    .cfi_offset lr, 16
-; CHECK-NEXT:    .cfi_offset r29, -24
 ; CHECK-NEXT:    .cfi_offset r30, -16
 ; CHECK-NEXT:    .cfi_offset cr2, 8
 ; CHECK-NEXT:    .cfi_offset cr3, 8
 ; CHECK-NEXT:    .cfi_offset cr4, 8
-; CHECK-NEXT:    std r29, 40(r1) # 8-byte Folded Spill
-; CHECK-NEXT:    std r30, 48(r1) # 8-byte Folded Spill
+; CHECK-NEXT:    std r30, 32(r1) # 8-byte Folded Spill
 ; CHECK-NEXT:    bl call_2@notoc
 ; CHECK-NEXT:    bc 12, 4*cr5+lt, .LBB0_13
 ; CHECK-NEXT:  # %bb.1: # %bb
@@ -67,11 +65,10 @@ define dso_local void @P10_Spill_CR_LT() local_unnamed_addr {
 ; CHECK-NEXT:    bc 12, 4*cr3+eq, .LBB0_11
 ; CHECK-NEXT:  # %bb.6: # %bb32
 ; CHECK-NEXT:    #
+; CHECK-NEXT:    rlwinm r30, r30, 0, 24, 22
 ; CHECK-NEXT:    andi. r3, r30, 2
-; CHECK-NEXT:    rlwinm r29, r30, 0, 24, 22
 ; CHECK-NEXT:    mcrf cr2, cr0
 ; CHECK-NEXT:    bl call_4@notoc
-; CHECK-NEXT:    mr r30, r29
 ; CHECK-NEXT:    beq+ cr2, .LBB0_3
 ; CHECK-NEXT:  # %bb.7: # %bb37
 ; CHECK-NEXT:  .LBB0_8: # %bb22
@@ -92,13 +89,11 @@ define dso_local void @P10_Spill_CR_LT() local_unnamed_addr {
 ; CHECK-BE-NEXT:    stdu r1, -144(r1)
 ; CHECK-BE-NEXT:    .cfi_def_cfa_offset 144
 ; CHECK-BE-NEXT:    .cfi_offset lr, 16
-; CHECK-BE-NEXT:    .cfi_offset r28, -32
 ; CHECK-BE-NEXT:    .cfi_offset r29, -24
 ; CHECK-BE-NEXT:    .cfi_offset r30, -16
 ; CHECK-BE-NEXT:    .cfi_offset cr2, 8
 ; CHECK-BE-NEXT:    .cfi_offset cr2, 8
 ; CHECK-BE-NEXT:    .cfi_offset cr2, 8
-; CHECK-BE-NEXT:    std r28, 112(r1) # 8-byte Folded Spill
 ; CHECK-BE-NEXT:    std r29, 120(r1) # 8-byte Folded Spill
 ; CHECK-BE-NEXT:    std r30, 128(r1) # 8-byte Folded Spill
 ; CHECK-BE-NEXT:    bl call_2
@@ -131,12 +126,11 @@ define dso_local void @P10_Spill_CR_LT() local_unnamed_addr {
 ; CHECK-BE-NEXT:    bc 12, 4*cr3+eq, .LBB0_11
 ; CHECK-BE-NEXT:  # %bb.6: # %bb32
 ; CHECK-BE-NEXT:    #
+; CHECK-BE-NEXT:    rlwinm r29, r29, 0, 24, 22
 ; CHECK-BE-NEXT:    andi. r3, r29, 2
-; CHECK-BE-NEXT:    rlwinm r28, r29, 0, 24, 22
 ; CHECK-BE-NEXT:    mcrf cr2, cr0
 ; CHECK-BE-NEXT:    bl call_4
 ; CHECK-BE-NEXT:    nop
-; CHECK-BE-NEXT:    mr r29, r28
 ; CHECK-BE-NEXT:    beq+ cr2, .LBB0_3
 ; CHECK-BE-NEXT:  # %bb.7: # %bb37
 ; CHECK-BE-NEXT:  .LBB0_8: # %bb22
diff --git a/llvm/test/Transforms/LICM/hoist-binop.ll b/llvm/test/Transforms/LICM/hoist-binop.ll
deleted file mode 100644
index 1fae3561e7809..0000000000000
--- a/llvm/test/Transforms/LICM/hoist-binop.ll
+++ /dev/null
@@ -1,99 +0,0 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt -S -passes=licm < %s | FileCheck %s
-
-; Adapted from:
-;   for(long i = 0; i < n; ++i)
-;     a[i] = (i*k) * v;
-define void @test(i64 %n, i64 %k) {
-; CHECK-LABEL: @test(
-; CHECK-NEXT:  entry:
-; CHECK-NEXT:    br label [[FOR_PH:%.*]]
-; CHECK:       for.ph:
-; CHECK-NEXT:    [[K_2:%.*]] = shl nuw nsw i64 [[K:%.*]], 1
-; CHECK-NEXT:    [[VEC_INIT:%.*]] = insertelement <2 x i64> zeroinitializer, i64 [[K]], i64 1
-; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <2 x i64> poison, i64 [[K_2]], i64 0
-; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <2 x i64> [[DOTSPLATINSERT]], <2 x i64> poison, <2 x i32> zeroinitializer
-; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = add <2 x i64> [[DOTSPLAT]], [[DOTSPLAT]]
-; CHECK-NEXT:    br label [[FOR_BODY:%.*]]
-; CHECK:       for.body:
-; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[FOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[FOR_BODY]] ]
-; CHECK-NEXT:    [[VEC_IND:%.*]] = phi <2 x i64> [ [[VEC_INIT]], [[FOR_PH]] ], [ [[VEC_IND_NEXT_REASS:%.*]], [[FOR_BODY]] ]
-; CHECK-NEXT:    [[STEP_ADD:%.*]] = add <2 x i64> [[VEC_IND]], [[DOTSPLAT]]
-; CHECK-NEXT:    call void @use(<2 x i64> [[STEP_ADD]])
-; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
-; CHECK-NEXT:    [[VEC_IND_NEXT_REASS]] = add <2 x i64> [[VEC_IND]], [[INVARIANT_OP]]
-; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N:%.*]]
-; CHECK-NEXT:    br i1 [[CMP]], label [[FOR_END:%.*]], label [[FOR_BODY]]
-; CHECK:       for.end:
-; CHECK-NEXT:    ret void
-;
-entry:
-  br label %for.ph
-
-for.ph:
-  %k.2 = shl nuw nsw i64 %k, 1
-  %vec.init = insertelement <2 x i64> zeroinitializer, i64 %k, i64 1
-  %.splatinsert = insertelement <2 x i64> poison, i64 %k.2, i64 0
-  %.splat = shufflevector <2 x i64> %.splatinsert, <2 x i64> poison, <2 x i32> zeroinitializer
-  br label %for.body
-
-for.body:
-  %index = phi i64 [ 0, %for.ph ], [ %index.next, %for.body ]
-  %vec.ind = phi <2 x i64> [ %vec.init, %for.ph ], [ %vec.ind.next, %for.body ]
-  %step.add = add <2 x i64> %vec.ind, %.splat
-  call void @use(<2 x i64> %step.add)
-  %index.next = add nuw i64 %index, 4
-  %vec.ind.next = add <2 x i64> %step.add, %.splat
-  %cmp = icmp eq i64 %index.next, %n
-  br i1 %cmp, label %for.end, label %for.body
-
-for.end:
-  ret void
-}
-
-; Same as above but `%step.add` is unused and thus removed.
-define void @test_single_use(i64 %n, i64 %k) {
-; CHECK-LABEL: @test_single_use(
-; CHECK-NEXT:  entry:
-; CHECK-NEXT:    br label [[FOR_PH:%.*]]
-; CHECK:       for.ph:
-; CHECK-NEXT:    [[K_2:%.*]] = shl nuw nsw i64 [[K:%.*]], 1
-; CHECK-NEXT:    [[VEC_INIT:%.*]] = insertelement <2 x i64> zeroinitializer, i64 [[K]], i64 1
-; CHECK-NEXT:    [[DOTSPLATINSERT:%.*]] = insertelement <2 x i64> poison, i64 [[K_2]], i64 0
-; CHECK-NEXT:    [[DOTSPLAT:%.*]] = shufflevector <2 x i64> [[DOTSPLATINSERT]], <2 x i64> poison, <2 x i32> zeroinitializer
-; CHECK-NEXT:    [[INVARIANT_OP:%.*]] = add <2 x i64> [[DOTSPLAT]], [[DOTSPLAT]]
-; CHECK-NEXT:    br label [[FOR_BODY:%.*]]
-; CHECK:       for.body:
-; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[FOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[FOR_BODY]] ]
-; CHECK-NEXT:    [[VEC_IND:%.*]] = phi <2 x i64> [ [[VEC_INIT]], [[FOR_PH]] ], [ [[VEC_IND_NEXT_REASS:%.*]], [[FOR_BODY]] ]
-; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
-; CHECK-NEXT:    [[VEC_IND_NEXT_REASS]] = add <2 x i64> [[VEC_IND]], [[INVARIANT_OP]]
-; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N:%.*]]
-; CHECK-NEXT:    br i1 [[CMP]], label [[FOR_END:%.*]], label [[FOR_BODY]]
-; CHECK:       for.end:
-; CHECK-NEXT:    ret void
-;
-entry:
-  br label %for.ph
-
-for.ph:
-  %k.2 = shl nuw nsw i64 %k, 1
-  %vec.init = insertelement <2 x i64> zeroinitializer, i64 %k, i64 1
-  %.splatinsert = insertelement <2 x i64> poison, i64 %k.2, i64 0
-  %.splat = shufflevector <2 x i64> %.splatinsert, <2 x i64> poison, <2 x i32> zeroinitializer
-  br label %for.body
-
-for.body:
-  %index = phi i64 [ 0, %for.ph ], [ %index.next, %for.body ]
-  %vec.ind = phi <2 x i64> [ %vec.init, %for.ph ], [ %vec.ind.next, %for.body ]
-  %step.add = add <2 x i64> %vec.ind, %.splat
-  %index.next = add nuw i64 %index, 4
-  %vec.ind.next = add <2 x i64> %step.add, %.splat
-  %cmp = icmp eq i64 %index.next, %n
-  br i1 %cmp, label %for.end, label %for.body
-
-for.end:
-  ret void
-}
-
-declare void @use(<2 x i64>)
diff --git a/llvm/test/Transforms/LICM/sink-foldable.ll b/llvm/test/Transforms/LICM/sink-foldable.ll
index b0130dfbb0713..38577a5a12563 100644
--- a/llvm/test/Transforms/LICM/sink-foldable.ll
+++ b/llvm/test/Transforms/LICM/sink-foldable.ll
@@ -79,7 +79,7 @@ define ptr @test2(i32 %j, ptr readonly %P, ptr readnone %Q) {
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    br label [[FOR_BODY:%.*]]
 ; CHECK:       for.cond:
-; CHECK-NEXT:    [[I_ADDR_0:%.*]] = phi i32 [ [[ADD_REASS:%.*]], [[IF_END:%.*]] ]
+; CHECK-NEXT:    [[I_ADDR_0:%.*]] = phi i32 [ [[ADD:%.*]], [[IF_END:%.*]] ]
 ; CHECK-NEXT:    [[P_ADDR_0:%.*]] = phi ptr [ [[ADD_PTR:%.*]], [[IF_END]] ]
 ; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i32 [[I_ADDR_0]], [[J:%.*]]
 ; CHECK-NEXT:    br i1 [[CMP]], label [[FOR_BODY]], label [[LOOPEXIT0:%.*]]
@@ -97,7 +97,7 @@ define ptr @test2(i32 %j, ptr readonly %P, ptr readnone %Q) {
 ; CHECK-NEXT:    [[ARRAYIDX2:%.*]] = getelementptr inbounds ptr, ptr [[ADD_PTR]], i64 [[IDX2_EXT]]
 ; CHECK-NEXT:    [[L1:%.*]] = load ptr, ptr [[ARRAYIDX2]], align 8
 ; CHECK-NEXT:    [[CMP2:%.*]] = icmp ugt ptr [[L1]], [[Q]]
-; CHECK-NEXT:    [[ADD_REASS]] = add nsw i32 [[I_ADDR]], 2
+; CHECK-NEXT:    [[ADD]] = add nsw i32 [[ADD_I]], 1
 ; CHECK-NEXT:    br i1 [[CMP2]], label [[LOOPEXIT2:%.*]], label [[FOR_COND]]
 ; CHECK:       loopexit0:
 ; CHECK-NEXT:    [[P0:%.*]] = phi ptr [ null, [[FOR_COND]] ]
diff --git a/llvm/test/Transforms/LICM/update-scev-after-hoist.ll b/llvm/test/Transforms/LICM/update-scev-after-hoist.ll
index f01008036e9da..fc45b8fce1766 100644
--- a/llvm/test/Transforms/LICM/update-scev-after-hoist.ll
+++ b/llvm/test/Transforms/LICM/update-scev-after-hoist.ll
@@ -2,7 +2,7 @@
 
 define i16 @main() {
 ; SCEV-EXPR:      Classifying expressions for: @main
-; SCEV-EXPR-NEXT:  %mul = phi i16 [ 1, %entry ], [ %mul.n.3.reass, %loop ]
+; SCEV-EXPR-NEXT:  %mul = phi i16 [ 1, %entry ], [ %mul.n.3, %loop ]
 ; SCEV-EXPR-NEXT:  -->  %mul U: [0,-15) S: [-32768,32753)		Exits: 4096		LoopDispositions: { %loop: Variant }
 ; SCEV-EXPR-NEXT:  %div = phi i16 [ 32767, %entry ], [ %div.n.3, %loop ]
 ; SCEV-EXPR-NEXT:  -->  %div U: [-2048,-32768) S: [-2048,-32768)		Exits: 7		LoopDispositions: { %loop: Variant }

>From 843276aa2c97f46581309a441a7d55a45dd3d6c0 Mon Sep 17 00:00:00 2001
From: Oliver Hunt <oliver at apple.com>
Date: Tue, 23 Jul 2024 14:18:53 -0700
Subject: [PATCH 05/91] [clang][test] Add function type discrimination tests to
 static destructor tests (#99604)

I accidentally omitted the tests for setting up runtime calls when compiling with -fptrauth-function-pointer-type-discrimination.

(cherry picked from commit 8be1325cb1903797ba3dce67087e395f9e080576)
---
 .../CodeGenCXX/ptrauth-static-destructors.cpp | 37 ++++++++++++++++---
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/clang/test/CodeGenCXX/ptrauth-static-destructors.cpp b/clang/test/CodeGenCXX/ptrauth-static-destructors.cpp
index 1240f26d329da..634450bf62ea9 100644
--- a/clang/test/CodeGenCXX/ptrauth-static-destructors.cpp
+++ b/clang/test/CodeGenCXX/ptrauth-static-destructors.cpp
@@ -2,13 +2,27 @@
 // RUN:  | FileCheck %s --check-prefix=CXAATEXIT
 
 // RUN: %clang_cc1 -triple arm64-apple-ios -fptrauth-calls -emit-llvm -std=c++11 %s -o - \
-// RUN:    -fno-use-cxa-atexit | FileCheck %s --check-prefixes=ATEXIT,DARWIN
+// RUN:    -fno-use-cxa-atexit | FileCheck %s --check-prefixes=ATEXIT,ATEXIT_DARWIN
 
 // RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-calls -emit-llvm -std=c++11 %s -o - \
 // RUN:  | FileCheck %s --check-prefix=CXAATEXIT
 
 // RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-calls -emit-llvm -std=c++11 %s -o - \
-// RUN:    -fno-use-cxa-atexit | FileCheck %s --check-prefixes=ATEXIT,ELF
+// RUN:    -fno-use-cxa-atexit | FileCheck %s --check-prefixes=ATEXIT,ATEXIT_ELF
+
+// RUN: %clang_cc1 -triple arm64-apple-ios -fptrauth-calls -emit-llvm -std=c++11 %s \
+// RUN:  -fptrauth-function-pointer-type-discrimination  -o - | FileCheck %s --check-prefix=CXAATEXIT_DISC
+
+// RUN: %clang_cc1 -triple arm64-apple-ios -fptrauth-calls -emit-llvm -std=c++11 %s -o - \
+// RUN:   -fptrauth-function-pointer-type-discrimination  -fno-use-cxa-atexit \
+// RUN:  | FileCheck %s --check-prefixes=ATEXIT_DISC,ATEXIT_DISC_DARWIN
+
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-calls -emit-llvm -std=c++11 %s \
+// RUN:  -fptrauth-function-pointer-type-discrimination  -o - | FileCheck %s --check-prefix=CXAATEXIT_DISC
+
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-calls -emit-llvm -std=c++11 %s -o - \
+// RUN:   -fptrauth-function-pointer-type-discrimination -fno-use-cxa-atexit \
+// RUN:  | FileCheck %s --check-prefixes=ATEXIT_DISC,ATEXIT_DISC_ELF
 
 class Foo {
  public:
@@ -21,11 +35,22 @@ Foo global;
 // CXAATEXIT: define internal void @__cxx_global_var_init()
 // CXAATEXIT:   call i32 @__cxa_atexit(ptr ptrauth (ptr @_ZN3FooD1Ev, i32 0), ptr @global, ptr @__dso_handle)
 
+// CXAATEXIT_DISC: define internal void @__cxx_global_var_init()
+// CXAATEXIT_DISC:   call i32 @__cxa_atexit(ptr ptrauth (ptr @_ZN3FooD1Ev, i32 0, i64 10942), ptr @global, ptr @__dso_handle)
 
 // ATEXIT: define internal void @__cxx_global_var_init()
 // ATEXIT:   %{{.*}} = call i32 @atexit(ptr ptrauth (ptr @__dtor_global, i32 0))
 
-// DARWIN: define internal void @__dtor_global() {{.*}} section "__TEXT,__StaticInit,regular,pure_instructions" {
-// ELF:    define internal void @__dtor_global() {{.*}} section ".text.startup" {
-// DARWIN:   %{{.*}} = call ptr @_ZN3FooD1Ev(ptr @global)
-// ELF:      call void @_ZN3FooD1Ev(ptr @global)
+// ATEXIT_DARWIN: define internal void @__dtor_global() {{.*}} section "__TEXT,__StaticInit,regular,pure_instructions" {
+// ATEXIT_ELF:    define internal void @__dtor_global() {{.*}} section ".text.startup" {
+// ATEXIT_DARWIN:   %{{.*}} = call ptr @_ZN3FooD1Ev(ptr @global)
+// ATEXIT_ELF:      call void @_ZN3FooD1Ev(ptr @global)
+
+// ATEXIT_DISC: define internal void @__cxx_global_var_init()
+// ATEXIT_DISC:   %{{.*}} = call i32 @atexit(ptr ptrauth (ptr @__dtor_global, i32 0, i64 10942))
+
+
+// ATEXIT_DISC_DARWIN: define internal void @__dtor_global() {{.*}} section "__TEXT,__StaticInit,regular,pure_instructions" {
+// ATEXIT_DISC_ELF:    define internal void @__dtor_global() {{.*}} section ".text.startup" {
+// ATEXIT_DISC_DARWIN:   %{{.*}} = call ptr @_ZN3FooD1Ev(ptr @global)
+// ATEXIT_DISC_ELF:      call void @_ZN3FooD1Ev(ptr @global)

>From 95ed2d007951374c8ae905b2cce4be262865e442 Mon Sep 17 00:00:00 2001
From: Akira Hatanaka <ahatanak at gmail.com>
Date: Tue, 23 Jul 2024 14:39:58 -0700
Subject: [PATCH 06/91] [PAC][compiler-rt][UBSan] Strip signed vptr instead of
 authenticating it (#100153)

vptr cannot be authenticated without knowing the class type if it was
signed with type discrimination.

Co-authored-by: Oliver Hunt <oliver at apple.com>
(cherry picked from commit 0a6a3c152faf56e07dd4f9e89e534d2b97eeab56)
---
 compiler-rt/lib/ubsan/ubsan_type_hash_itanium.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compiler-rt/lib/ubsan/ubsan_type_hash_itanium.cpp b/compiler-rt/lib/ubsan/ubsan_type_hash_itanium.cpp
index 468a8fcd603f0..15788574dd995 100644
--- a/compiler-rt/lib/ubsan/ubsan_type_hash_itanium.cpp
+++ b/compiler-rt/lib/ubsan/ubsan_type_hash_itanium.cpp
@@ -207,7 +207,7 @@ struct VtablePrefix {
   std::type_info *TypeInfo;
 };
 VtablePrefix *getVtablePrefix(void *Vtable) {
-  Vtable = ptrauth_auth_data(Vtable, ptrauth_key_cxx_vtable_pointer, 0);
+  Vtable = ptrauth_strip(Vtable, ptrauth_key_cxx_vtable_pointer);
   VtablePrefix *Vptr = reinterpret_cast<VtablePrefix*>(Vtable);
   VtablePrefix *Prefix = Vptr - 1;
   if (!IsAccessibleMemoryRange((uptr)Prefix, sizeof(VtablePrefix)))

>From aa425eb0e2f5174e50ec84766861ae3f6186d39b Mon Sep 17 00:00:00 2001
From: hev <wangrui at loongson.cn>
Date: Wed, 24 Jul 2024 12:08:43 +0800
Subject: [PATCH 07/91] [LoongArch] Fix codegen for ISD::ROTR (#100292)

This patch fixes the code generation for IR:

sext i32 (trunc i64 (rotr i64 %x, i64 %y) to i32) to i64
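
A minimal C++ sketch of why this pattern matters, assuming the removed
pattern effectively selected a 32-bit rotate of the low word for this IR.
The function names are hypothetical and only model the two semantics; they
are not part of the patch:

```cpp
#include <cstdint>

// Models the IR: rotate the full 64-bit value, truncate to 32 bits,
// then sign-extend back to 64 bits.
inline std::int64_t rotr64_trunc_sext(std::uint64_t x, std::uint64_t y) {
  const unsigned s = static_cast<unsigned>(y & 63);
  const std::uint64_t r = s ? (x >> s) | (x << (64 - s)) : x;
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(r));
}

// Models a 32-bit rotate applied to the low word only (assumed behaviour
// of a rotr.w-style selection); bits in the high word never rotate in.
inline std::int64_t rotr32_low_word(std::uint64_t x, std::uint64_t y) {
  const std::uint32_t lo = static_cast<std::uint32_t>(x);
  const unsigned s = static_cast<unsigned>(y & 31);
  const std::uint32_t r = s ? (lo >> s) | (lo << (32 - s)) : lo;
  return static_cast<std::int32_t>(r);
}
```

With x = 0x100000000 and y = 32 the two functions disagree, since the set
bit lives in the high word and only the 64-bit rotate brings it down.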

(cherry picked from commit e386aacb747b4512dedf481ad83e054d3dd641e6)
---
 .../Target/LoongArch/LoongArchInstrInfo.td    |  1 -
 llvm/test/CodeGen/LoongArch/rotl-rotr.ll      | 36 +++++++++++++++++++
 2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td b/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
index 97f0e8d6a10c7..ec0d071453c3f 100644
--- a/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
+++ b/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
@@ -1144,7 +1144,6 @@ def : PatGprGpr<urem, MOD_DU>;
 def : PatGprGpr<loongarch_mod_wu, MOD_WU>;
 def : PatGprGpr<rotr, ROTR_D>;
 def : PatGprGpr<loongarch_rotr_w, ROTR_W>;
-def : PatGprGpr_32<rotr, ROTR_W>;
 def : PatGprImm<rotr, ROTRI_D, uimm6>;
 def : PatGprImm_32<rotr, ROTRI_W, uimm5>;
 def : PatGprImm<loongarch_rotr_w, ROTRI_W, uimm5>;
diff --git a/llvm/test/CodeGen/LoongArch/rotl-rotr.ll b/llvm/test/CodeGen/LoongArch/rotl-rotr.ll
index b2d46f5c088ba..75461f5820984 100644
--- a/llvm/test/CodeGen/LoongArch/rotl-rotr.ll
+++ b/llvm/test/CodeGen/LoongArch/rotl-rotr.ll
@@ -504,6 +504,42 @@ define i64 @rotr_64_mask_or_128_or_64(i64 %x, i64 %y) nounwind {
   ret i64 %f
 }
 
+define signext i32 @rotr_64_trunc_32(i64 %x, i64 %y) nounwind {
+; LA32-LABEL: rotr_64_trunc_32:
+; LA32:       # %bb.0:
+; LA32-NEXT:    srl.w $a3, $a0, $a2
+; LA32-NEXT:    xori $a4, $a2, 31
+; LA32-NEXT:    slli.w $a5, $a1, 1
+; LA32-NEXT:    sll.w $a4, $a5, $a4
+; LA32-NEXT:    or $a3, $a3, $a4
+; LA32-NEXT:    addi.w $a4, $a2, -32
+; LA32-NEXT:    slti $a5, $a4, 0
+; LA32-NEXT:    maskeqz $a3, $a3, $a5
+; LA32-NEXT:    srl.w $a1, $a1, $a4
+; LA32-NEXT:    masknez $a1, $a1, $a5
+; LA32-NEXT:    or $a1, $a3, $a1
+; LA32-NEXT:    sub.w $a3, $zero, $a2
+; LA32-NEXT:    sll.w $a0, $a0, $a3
+; LA32-NEXT:    ori $a3, $zero, 32
+; LA32-NEXT:    sub.w $a2, $a3, $a2
+; LA32-NEXT:    srai.w $a2, $a2, 31
+; LA32-NEXT:    and $a0, $a2, $a0
+; LA32-NEXT:    or $a0, $a1, $a0
+; LA32-NEXT:    ret
+;
+; LA64-LABEL: rotr_64_trunc_32:
+; LA64:       # %bb.0:
+; LA64-NEXT:    rotr.d $a0, $a0, $a1
+; LA64-NEXT:    addi.w $a0, $a0, 0
+; LA64-NEXT:    ret
+  %z = sub i64 64, %y
+  %b = lshr i64 %x, %y
+  %c = shl i64 %x, %z
+  %d = or i64 %b, %c
+  %e = trunc i64 %d to i32
+  ret i32 %e
+}
+
 define signext i32 @rotri_i32(i32 signext %a) nounwind {
 ; LA32-LABEL: rotri_i32:
 ; LA32:       # %bb.0:

>From dcc22f984454ef3e390b6a9183b3f79ac4b860e7 Mon Sep 17 00:00:00 2001
From: Joseph Huber <huberjn at outlook.com>
Date: Tue, 23 Jul 2024 12:54:00 -0500
Subject: [PATCH 08/91] [NVPTX] Fix internal indirect call prototypes not
 obeying the ABI (#100131)

Summary:
The NVPTX backend optimizes the ABI for functions that are internal;
however, this is not legal for indirect call prototypes. Previously, we
would modify the ABI on an aggregate byval type passed to an indirect
call prototype, which would cause PTXAS to report an error. This patch
passes nullptr as the function to force strict ABI compliance, without
modifying the helper function.

Fixes https://github.com/llvm/llvm-project/issues/100055

(cherry picked from commit e0649a5dfc6b859d652318f578bc3d49674787a4)
---
 libc/config/gpu/entrypoints.txt             | 15 +---
 llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp |  5 +-
 llvm/test/CodeGen/NVPTX/indirect_byval.ll   | 94 +++++++++++++++++++++
 3 files changed, 101 insertions(+), 13 deletions(-)
 create mode 100644 llvm/test/CodeGen/NVPTX/indirect_byval.ll

diff --git a/libc/config/gpu/entrypoints.txt b/libc/config/gpu/entrypoints.txt
index 42909cec55890..fa878d8999227 100644
--- a/libc/config/gpu/entrypoints.txt
+++ b/libc/config/gpu/entrypoints.txt
@@ -1,13 +1,3 @@
-if(LIBC_TARGET_ARCHITECTURE_IS_AMDGPU)
-  set(extra_entrypoints
-      # stdio.h entrypoints
-      libc.src.stdio.snprintf
-      libc.src.stdio.sprintf
-      libc.src.stdio.vsnprintf
-      libc.src.stdio.vsprintf
-  )
-endif()
-
 set(TARGET_LIBC_ENTRYPOINTS
     # assert.h entrypoints
     libc.src.assert.__assert_fail
@@ -186,13 +176,16 @@ set(TARGET_LIBC_ENTRYPOINTS
     libc.src.errno.errno
 
     # stdio.h entrypoints
-    ${extra_entrypoints}
     libc.src.stdio.clearerr
     libc.src.stdio.fclose
     libc.src.stdio.printf
     libc.src.stdio.vprintf
     libc.src.stdio.fprintf
     libc.src.stdio.vfprintf
+    libc.src.stdio.snprintf
+    libc.src.stdio.sprintf
+    libc.src.stdio.vsnprintf
+    libc.src.stdio.vsprintf
     libc.src.stdio.feof
     libc.src.stdio.ferror
     libc.src.stdio.fflush
diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index 44c1a2e50486c..6975412ce5d35 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -1429,7 +1429,6 @@ std::string NVPTXTargetLowering::getPrototype(
 
   bool first = true;
 
-  const Function *F = CB.getFunction();
   unsigned NumArgs = VAInfo ? VAInfo->first : Args.size();
   for (unsigned i = 0, OIdx = 0; i != NumArgs; ++i, ++OIdx) {
     Type *Ty = Args[i].Ty;
@@ -1471,10 +1470,12 @@ std::string NVPTXTargetLowering::getPrototype(
       continue;
     }
 
+    // Indirect calls need strict ABI alignment so we disable optimizations by
+    // not providing a function to optimize.
     Type *ETy = Args[i].IndirectType;
     Align InitialAlign = Outs[OIdx].Flags.getNonZeroByValAlign();
     Align ParamByValAlign =
-        getFunctionByValParamAlign(F, ETy, InitialAlign, DL);
+        getFunctionByValParamAlign(/*F=*/nullptr, ETy, InitialAlign, DL);
 
     O << ".param .align " << ParamByValAlign.value() << " .b8 ";
     O << "_";
diff --git a/llvm/test/CodeGen/NVPTX/indirect_byval.ll b/llvm/test/CodeGen/NVPTX/indirect_byval.ll
new file mode 100644
index 0000000000000..ac6c4e262fd60
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/indirect_byval.ll
@@ -0,0 +1,94 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_52 -mattr=+ptx64 | FileCheck %s
+; RUN: %if ptxas %{ llc < %s -march=nvptx64 -mcpu=sm_52 -mattr=+ptx64 | %ptxas-verify %}
+
+target triple = "nvptx64-nvidia-cuda"
+
+%struct.S = type { i8 }
+%struct.U = type { i64 }
+
+@ptr = external global ptr, align 8
+
+define internal i32 @foo() {
+; CHECK-LABEL: foo(
+; CHECK:       {
+; CHECK-NEXT:    .local .align 1 .b8 __local_depot0[2];
+; CHECK-NEXT:    .reg .b64 %SP;
+; CHECK-NEXT:    .reg .b64 %SPL;
+; CHECK-NEXT:    .reg .b16 %rs<2>;
+; CHECK-NEXT:    .reg .b32 %r<3>;
+; CHECK-NEXT:    .reg .b64 %rd<3>;
+; CHECK-EMPTY:
+; CHECK-NEXT:  // %bb.0: // %entry
+; CHECK-NEXT:    mov.u64 %SPL, __local_depot0;
+; CHECK-NEXT:    cvta.local.u64 %SP, %SPL;
+; CHECK-NEXT:    ld.global.u64 %rd1, [ptr];
+; CHECK-NEXT:    ld.u8 %rs1, [%SP+1];
+; CHECK-NEXT:    add.u64 %rd2, %SP, 0;
+; CHECK-NEXT:    { // callseq 0, 0
+; CHECK-NEXT:    .param .align 1 .b8 param0[1];
+; CHECK-NEXT:    st.param.b8 [param0+0], %rs1;
+; CHECK-NEXT:    .param .b64 param1;
+; CHECK-NEXT:    st.param.b64 [param1+0], %rd2;
+; CHECK-NEXT:    .param .b32 retval0;
+; CHECK-NEXT:    prototype_0 : .callprototype (.param .b32 _) _ (.param .align 1 .b8 _[1], .param .b64 _);
+; CHECK-NEXT:    call (retval0),
+; CHECK-NEXT:    %rd1,
+; CHECK-NEXT:    (
+; CHECK-NEXT:    param0,
+; CHECK-NEXT:    param1
+; CHECK-NEXT:    )
+; CHECK-NEXT:    , prototype_0;
+; CHECK-NEXT:    ld.param.b32 %r1, [retval0+0];
+; CHECK-NEXT:    } // callseq 0
+; CHECK-NEXT:    st.param.b32 [func_retval0+0], %r1;
+; CHECK-NEXT:    ret;
+entry:
+  %s = alloca %struct.S, align 1
+  %agg.tmp = alloca %struct.S, align 1
+  %0 = load ptr, ptr @ptr, align 8
+  %call = call i32 %0(ptr byval(%struct.S) align 1 %agg.tmp, ptr noundef %s)
+  ret i32 %call
+}
+
+define internal i32 @bar() {
+; CHECK-LABEL: bar(
+; CHECK:         // @bar
+; CHECK-NEXT:  {
+; CHECK-NEXT:    .local .align 8 .b8 __local_depot1[16];
+; CHECK-NEXT:    .reg .b64 %SP;
+; CHECK-NEXT:    .reg .b64 %SPL;
+; CHECK-NEXT:    .reg .b32 %r<3>;
+; CHECK-NEXT:    .reg .b64 %rd<4>;
+; CHECK-EMPTY:
+; CHECK-NEXT:  // %bb.0: // %entry
+; CHECK-NEXT:    mov.u64 %SPL, __local_depot1;
+; CHECK-NEXT:    cvta.local.u64 %SP, %SPL;
+; CHECK-NEXT:    ld.global.u64 %rd1, [ptr];
+; CHECK-NEXT:    ld.u64 %rd2, [%SP+8];
+; CHECK-NEXT:    add.u64 %rd3, %SP, 0;
+; CHECK-NEXT:    { // callseq 1, 0
+; CHECK-NEXT:    .param .align 8 .b8 param0[8];
+; CHECK-NEXT:    st.param.b64 [param0+0], %rd2;
+; CHECK-NEXT:    .param .b64 param1;
+; CHECK-NEXT:    st.param.b64 [param1+0], %rd3;
+; CHECK-NEXT:    .param .b32 retval0;
+; CHECK-NEXT:    prototype_1 : .callprototype (.param .b32 _) _ (.param .align 8 .b8 _[8], .param .b64 _);
+; CHECK-NEXT:    call (retval0),
+; CHECK-NEXT:    %rd1,
+; CHECK-NEXT:    (
+; CHECK-NEXT:    param0,
+; CHECK-NEXT:    param1
+; CHECK-NEXT:    )
+; CHECK-NEXT:    , prototype_1;
+; CHECK-NEXT:    ld.param.b32 %r1, [retval0+0];
+; CHECK-NEXT:    } // callseq 1
+; CHECK-NEXT:    st.param.b32 [func_retval0+0], %r1;
+; CHECK-NEXT:    ret;
+entry:
+  %s = alloca %struct.U, align 8
+  %agg.tmp = alloca %struct.U, align 8
+  %0 = load ptr, ptr @ptr, align 8
+  %call = call noundef i32 %0(ptr byval(%struct.U) align 8 %agg.tmp, ptr %s)
+  ret i32 %call
+}

>From f1472feccdd0f626112f77882e580a79b385b184 Mon Sep 17 00:00:00 2001
From: azhan92 <alisonxzhang at gmail.com>
Date: Tue, 23 Jul 2024 09:51:13 -0400
Subject: [PATCH 09/91] [PowerPC] Add builtin_cpu_is P11 support (#99550)

This PR adds support for __builtin_cpu_is ("power11")
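
A hedged usage sketch, assuming a PowerPC target and a compiler that already
accepts the "power11" name (older compilers are expected to reject it). The
function name is hypothetical, and the preprocessor guard is only there so
the snippet also builds on other targets:

```cpp
// Returns a short description of the running CPU. The PowerPC branch
// relies on __builtin_cpu_is accepting "power11", which is what this
// change is meant to enable.
inline const char *describe_power_cpu() {
#if defined(__powerpc__) || defined(__powerpc64__) || defined(_ARCH_PPC)
  if (__builtin_cpu_is("power11"))
    return "power11";
  if (__builtin_cpu_is("power10"))
    return "power10";
  return "older-power";
#else
  return "not-powerpc";
#endif
}
```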

(cherry picked from commit 63b382bbde5994e8f2cec75883320e3ad9fd618f)
---
 clang/test/CodeGen/aix-builtin-cpu-is.c       |  4 ++
 clang/test/CodeGen/builtin-cpu-supports.c     | 72 ++++++++++++++++---
 .../llvm/TargetParser/PPCTargetParser.def     |  3 +
 3 files changed, 69 insertions(+), 10 deletions(-)

diff --git a/clang/test/CodeGen/aix-builtin-cpu-is.c b/clang/test/CodeGen/aix-builtin-cpu-is.c
index e17cf7353511a..04644dd7020e0 100644
--- a/clang/test/CodeGen/aix-builtin-cpu-is.c
+++ b/clang/test/CodeGen/aix-builtin-cpu-is.c
@@ -50,6 +50,10 @@
 // RUN: %clang_cc1 -triple powerpc-ibm-aix7.2.0.0 -emit-llvm -o - %t.c | FileCheck %s -DVALUE=262144 \
 // RUN:   --check-prefix=CHECKOP
 
+// RUN: echo "int main() { return __builtin_cpu_is(\"power11\");}" > %t.c
+// RUN: %clang_cc1 -triple powerpc-ibm-aix7.2.0.0 -emit-llvm -o - %t.c | FileCheck %s -DVALUE=524288 \
+// RUN:   --check-prefix=CHECKOP
+
 // CHECK:     define i32 @main() #0 {
 // CHECK-NEXT: entry:
 // CHECK-NEXT:   %retval = alloca i32, align 4
diff --git a/clang/test/CodeGen/builtin-cpu-supports.c b/clang/test/CodeGen/builtin-cpu-supports.c
index 88eb7b0fa786e..f960040ab094b 100644
--- a/clang/test/CodeGen/builtin-cpu-supports.c
+++ b/clang/test/CodeGen/builtin-cpu-supports.c
@@ -129,25 +129,69 @@ int v4() { return __builtin_cpu_supports("x86-64-v4"); }
 // CHECK-PPC:       if.else3:
 // CHECK-PPC-NEXT:    [[CPU_IS:%.*]] = call i32 @llvm.ppc.fixed.addr.ld(i32 3)
 // CHECK-PPC-NEXT:    [[TMP6:%.*]] = icmp eq i32 [[CPU_IS]], 39
-// CHECK-PPC-NEXT:    br i1 [[TMP6]], label [[IF_THEN4:%.*]], label [[IF_END:%.*]]
+// CHECK-PPC-NEXT:    br i1 [[TMP6]], label [[IF_THEN4:%.*]], label [[IF_ELSE5:%.*]]
 // CHECK-PPC:       if.then4:
 // CHECK-PPC-NEXT:    [[TMP7:%.*]] = load i32, ptr [[A_ADDR]], align 4
 // CHECK-PPC-NEXT:    [[TMP8:%.*]] = load i32, ptr [[A_ADDR]], align 4
 // CHECK-PPC-NEXT:    [[ADD:%.*]] = add nsw i32 [[TMP7]], [[TMP8]]
 // CHECK-PPC-NEXT:    store i32 [[ADD]], ptr [[RETVAL]], align 4
 // CHECK-PPC-NEXT:    br label [[RETURN]]
+// CHECK-PPC:       if.else5:
+// CHECK-PPC-NEXT:    [[CPU_IS6:%.*]] = call i32 @llvm.ppc.fixed.addr.ld(i32 3)
+// CHECK-PPC-NEXT:    [[TMP9:%.*]] = icmp eq i32 [[CPU_IS6]], 45
+// CHECK-PPC-NEXT:    br i1 [[TMP9]], label [[IF_THEN7:%.*]], label [[IF_ELSE9:%.*]]
+// CHECK-PPC:       if.then7:
+// CHECK-PPC-NEXT:    [[TMP10:%.*]] = load i32, ptr [[A_ADDR]], align 4
+// CHECK-PPC-NEXT:    [[ADD8:%.*]] = add nsw i32 [[TMP10]], 3
+// CHECK-PPC-NEXT:    store i32 [[ADD8]], ptr [[RETVAL]], align 4
+// CHECK-PPC-NEXT:    br label [[RETURN]]
+// CHECK-PPC:       if.else9:
+// CHECK-PPC-NEXT:    [[CPU_IS10:%.*]] = call i32 @llvm.ppc.fixed.addr.ld(i32 3)
+// CHECK-PPC-NEXT:    [[TMP11:%.*]] = icmp eq i32 [[CPU_IS10]], 46
+// CHECK-PPC-NEXT:    br i1 [[TMP11]], label [[IF_THEN11:%.*]], label [[IF_ELSE13:%.*]]
+// CHECK-PPC:       if.then11:
+// CHECK-PPC-NEXT:    [[TMP12:%.*]] = load i32, ptr [[A_ADDR]], align 4
+// CHECK-PPC-NEXT:    [[SUB12:%.*]] = sub nsw i32 [[TMP12]], 3
+// CHECK-PPC-NEXT:    store i32 [[SUB12]], ptr [[RETVAL]], align 4
+// CHECK-PPC-NEXT:    br label [[RETURN]]
+// CHECK-PPC:       if.else13:
+// CHECK-PPC-NEXT:    [[CPU_IS14:%.*]] = call i32 @llvm.ppc.fixed.addr.ld(i32 3)
+// CHECK-PPC-NEXT:    [[TMP13:%.*]] = icmp eq i32 [[CPU_IS14]], 47
+// CHECK-PPC-NEXT:    br i1 [[TMP13]], label [[IF_THEN15:%.*]], label [[IF_ELSE17:%.*]]
+// CHECK-PPC:       if.then15:
+// CHECK-PPC-NEXT:    [[TMP14:%.*]] = load i32, ptr [[A_ADDR]], align 4
+// CHECK-PPC-NEXT:    [[ADD16:%.*]] = add nsw i32 [[TMP14]], 7
+// CHECK-PPC-NEXT:    store i32 [[ADD16]], ptr [[RETVAL]], align 4
+// CHECK-PPC-NEXT:    br label [[RETURN]]
+// CHECK-PPC:       if.else17:
+// CHECK-PPC-NEXT:    [[CPU_IS18:%.*]] = call i32 @llvm.ppc.fixed.addr.ld(i32 3)
+// CHECK-PPC-NEXT:    [[TMP15:%.*]] = icmp eq i32 [[CPU_IS18]], 48
+// CHECK-PPC-NEXT:    br i1 [[TMP15]], label [[IF_THEN19:%.*]], label [[IF_END:%.*]]
+// CHECK-PPC:       if.then19:
+// CHECK-PPC-NEXT:    [[TMP16:%.*]] = load i32, ptr [[A_ADDR]], align 4
+// CHECK-PPC-NEXT:    [[SUB20:%.*]] = sub nsw i32 [[TMP16]], 7
+// CHECK-PPC-NEXT:    store i32 [[SUB20]], ptr [[RETVAL]], align 4
+// CHECK-PPC-NEXT:    br label [[RETURN]]
 // CHECK-PPC:       if.end:
-// CHECK-PPC-NEXT:    br label [[IF_END5:%.*]]
-// CHECK-PPC:       if.end5:
-// CHECK-PPC-NEXT:    br label [[IF_END6:%.*]]
-// CHECK-PPC:       if.end6:
-// CHECK-PPC-NEXT:    [[TMP9:%.*]] = load i32, ptr [[A_ADDR]], align 4
-// CHECK-PPC-NEXT:    [[ADD7:%.*]] = add nsw i32 [[TMP9]], 5
-// CHECK-PPC-NEXT:    store i32 [[ADD7]], ptr [[RETVAL]], align 4
+// CHECK-PPC-NEXT:    br label [[IF_END21:%.*]]
+// CHECK-PPC:       if.end21:
+// CHECK-PPC-NEXT:    br label [[IF_END22:%.*]]
+// CHECK-PPC:       if.end22:
+// CHECK-PPC-NEXT:    br label [[IF_END23:%.*]]
+// CHECK-PPC:       if.end23:
+// CHECK-PPC-NEXT:    br label [[IF_END24:%.*]]
+// CHECK-PPC:       if.end24:
+// CHECK-PPC-NEXT:    br label [[IF_END25:%.*]]
+// CHECK-PPC:       if.end25:
+// CHECK-PPC-NEXT:    br label [[IF_END26:%.*]]
+// CHECK-PPC:       if.end26:
+// CHECK-PPC-NEXT:    [[TMP17:%.*]] = load i32, ptr [[A_ADDR]], align 4
+// CHECK-PPC-NEXT:    [[ADD27:%.*]] = add nsw i32 [[TMP17]], 5
+// CHECK-PPC-NEXT:    store i32 [[ADD27]], ptr [[RETVAL]], align 4
 // CHECK-PPC-NEXT:    br label [[RETURN]]
 // CHECK-PPC:       return:
-// CHECK-PPC-NEXT:    [[TMP10:%.*]] = load i32, ptr [[RETVAL]], align 4
-// CHECK-PPC-NEXT:    ret i32 [[TMP10]]
+// CHECK-PPC-NEXT:    [[TMP18:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-PPC-NEXT:    ret i32 [[TMP18]]
 //
 int test(int a) {
   if (__builtin_cpu_supports("arch_3_00")) // HWCAP2
@@ -156,6 +200,14 @@ int test(int a) {
     return a - 5;
   else if (__builtin_cpu_is("power7"))     // CPUID
     return a + a;
+  else if (__builtin_cpu_is("power8"))
+    return a + 3;
+  else if (__builtin_cpu_is("power9"))
+    return a - 3;
+  else if (__builtin_cpu_is("power10"))
+    return a + 7;
+  else if (__builtin_cpu_is("power11"))
+    return a - 7;
   return a + 5;
 }
 #endif
diff --git a/llvm/include/llvm/TargetParser/PPCTargetParser.def b/llvm/include/llvm/TargetParser/PPCTargetParser.def
index 44e97d56a059c..df956a68d75d6 100644
--- a/llvm/include/llvm/TargetParser/PPCTargetParser.def
+++ b/llvm/include/llvm/TargetParser/PPCTargetParser.def
@@ -40,6 +40,7 @@
 #undef AIX_PPC8_VALUE
 #undef AIX_PPC9_VALUE
 #undef AIX_PPC10_VALUE
+#undef AIX_PPC11_VALUE
 #else
 #ifndef PPC_LNX_FEATURE
 #define PPC_LNX_FEATURE(NAME, DESC, ENUMNAME, ENUMVAL, HWCAPN)
@@ -84,6 +85,7 @@
 #define AIX_PPC8_VALUE 0x00010000
 #define AIX_PPC9_VALUE 0x00020000
 #define AIX_PPC10_VALUE 0x00040000
+#define AIX_PPC11_VALUE 0x00080000
 
 // __builtin_cpu_is() and __builtin_cpu_supports() are supported only on Power7 and up on AIX.
 // PPC_CPU(Name, Linux_SUPPORT_METHOD, LinuxID, AIX_SUPPORT_METHOD, AIXID)
@@ -103,6 +105,7 @@ PPC_CPU("ppc476",SYS_CALL,44,BUILTIN_PPC_FALSE,0)
 PPC_CPU("power8",SYS_CALL,45,USE_SYS_CONF,AIX_PPC8_VALUE)
 PPC_CPU("power9",SYS_CALL,46,USE_SYS_CONF,AIX_PPC9_VALUE)
 PPC_CPU("power10",SYS_CALL,47,USE_SYS_CONF,AIX_PPC10_VALUE)
+PPC_CPU("power11",SYS_CALL,48,USE_SYS_CONF,AIX_PPC11_VALUE)
 #undef PPC_CPU
 
 // PPC features on Linux:

>From 40af7ee9c176ad96465a6055369646e574522501 Mon Sep 17 00:00:00 2001
From: PaulXiCao <paulxicao7 at gmail.com>
Date: Tue, 23 Jul 2024 15:11:44 +0000
Subject: [PATCH 10/91] [libc++][math] Fix undue overflowing of
 `std::hypot(x,y,z)` (#93350)

The three-dimensional `std::hypot(x,y,z)` was sub-optimally implemented.
This led to possible over-/underflows in (intermediate) results, which
this change avoids.

The idea is to scale the arguments (see linked issue for full
discussion).
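
As a rough illustration of the scaling idea, here is a minimal standalone
sketch for `double` only. The names `scaled_hypot3` and `scaled_hypot3_demo`
are hypothetical and are not part of libc++; the threshold and scale constants
are the ones this patch uses for `double`. It assumes C++17 for the
hexadecimal floating-point literals.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical free function for illustration; not the libc++ entry point.
// The two constants are the values the patch uses for `double`.
inline double scaled_hypot3(double x, double y, double z) {
  const double threshold = 0x1.0p+510; // above this, squaring might overflow
  const double down      = 0x1.0p-600; // a power of two, so scaling normally only adjusts the exponent
  const double max_abs   = std::max({std::fabs(x), std::fabs(y), std::fabs(z)});

  double scale = 1.0;
  if (max_abs > threshold)
    scale = down; // shrink large inputs before squaring
  else if (max_abs < 1.0 / threshold)
    scale = 1.0 / down; // enlarge tiny inputs before squaring

  x *= scale;
  y *= scale;
  z *= scale;
  // Undo the scaling on the result.
  return std::sqrt(x * x + y * y + z * z) / scale;
}

// Small self-check covering the overflow range, the underflow range,
// and an ordinary case that needs no scaling.
inline void scaled_hypot3_demo() {
  assert(std::isfinite(scaled_hypot3(1e300, 1e300, 1e300)));
  assert(scaled_hypot3(1e-300, 1e-300, 1e-300) > 0.0);
  assert(scaled_hypot3(2.0, 3.0, 6.0) == 7.0);
}
```

With the naive `sqrt(x*x + y*y + z*z)`, the first case overflows to infinity
and the second underflows to zero, because the squares leave the range of
`double` even though the true result is representable.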

Tests have been added for problematic over- and underflows.

Closes #92782

(cherry picked from commit 9628777479a970db5d0c2d0b456dac6633864760)
---
 libcxx/include/__math/hypot.h                 | 89 ++++++++++++++++++
 libcxx/include/cmath                          | 25 +----
 .../test/libcxx/transitive_includes/cxx17.csv |  3 +
 .../test/libcxx/transitive_includes/cxx20.csv |  3 +
 .../test/libcxx/transitive_includes/cxx23.csv |  3 +
 .../test/libcxx/transitive_includes/cxx26.csv |  3 +
 .../test/std/numerics/c.math/cmath.pass.cpp   | 91 +++++++++++++++----
 libcxx/test/support/fp_compare.h              | 45 ++++-----
 8 files changed, 197 insertions(+), 65 deletions(-)

diff --git a/libcxx/include/__math/hypot.h b/libcxx/include/__math/hypot.h
index 1bf193a9ab7ee..61fd260c59409 100644
--- a/libcxx/include/__math/hypot.h
+++ b/libcxx/include/__math/hypot.h
@@ -15,10 +15,21 @@
 #include <__type_traits/is_same.h>
 #include <__type_traits/promote.h>
 
+#if _LIBCPP_STD_VER >= 17
+#  include <__algorithm/max.h>
+#  include <__math/abs.h>
+#  include <__math/roots.h>
+#  include <__utility/pair.h>
+#  include <limits>
+#endif
+
 #if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
 #  pragma GCC system_header
 #endif
 
+_LIBCPP_PUSH_MACROS
+#include <__undef_macros>
+
 _LIBCPP_BEGIN_NAMESPACE_STD
 
 namespace __math {
@@ -41,8 +52,86 @@ inline _LIBCPP_HIDE_FROM_ABI typename __promote<_A1, _A2>::type hypot(_A1 __x, _
   return __math::hypot((__result_type)__x, (__result_type)__y);
 }
 
+#if _LIBCPP_STD_VER >= 17
+// Factors needed to determine if over-/underflow might happen for `std::hypot(x,y,z)`.
+// returns [overflow_threshold, overflow_scale]
+template <class _Real>
+_LIBCPP_HIDE_FROM_ABI std::pair<_Real, _Real> __hypot_factors() {
+  static_assert(std::numeric_limits<_Real>::is_iec559);
+
+  if constexpr (std::is_same_v<_Real, float>) {
+    static_assert(-125 == std::numeric_limits<_Real>::min_exponent);
+    static_assert(+128 == std::numeric_limits<_Real>::max_exponent);
+    return {0x1.0p+62f, 0x1.0p-70f};
+  } else if constexpr (std::is_same_v<_Real, double>) {
+    static_assert(-1021 == std::numeric_limits<_Real>::min_exponent);
+    static_assert(+1024 == std::numeric_limits<_Real>::max_exponent);
+    return {0x1.0p+510, 0x1.0p-600};
+  } else { // long double
+    static_assert(std::is_same_v<_Real, long double>);
+
+    // preprocessor guard necessary, otherwise literals (e.g. `0x1.0p+8'190l`) throw warnings even when shielded by `if
+    // constexpr`
+#  if __DBL_MAX_EXP__ == __LDBL_MAX_EXP__
+    static_assert(sizeof(_Real) == sizeof(double));
+    return static_cast<std::pair<_Real, _Real>>(__math::__hypot_factors<double>());
+#  else
+    static_assert(sizeof(_Real) > sizeof(double));
+    static_assert(-16381 == std::numeric_limits<_Real>::min_exponent);
+    static_assert(+16384 == std::numeric_limits<_Real>::max_exponent);
+    return {0x1.0p+8190l, 0x1.0p-9000l};
+#  endif
+  }
+}
+
+// Computes the three-dimensional hypotenuse: `std::hypot(x,y,z)`.
+// The naive implementation might over-/underflow which is why this implementation is more involved:
+//    If the square of an argument might run into issues, we scale the arguments appropriately.
+// See https://github.com/llvm/llvm-project/issues/92782 for a detailed discussion and summary.
+template <class _Real>
+_LIBCPP_HIDE_FROM_ABI _Real __hypot(_Real __x, _Real __y, _Real __z) {
+  const _Real __max_abs = std::max(__math::fabs(__x), std::max(__math::fabs(__y), __math::fabs(__z)));
+  const auto [__overflow_threshold, __overflow_scale] = __math::__hypot_factors<_Real>();
+  _Real __scale;
+  if (__max_abs > __overflow_threshold) { // x*x + y*y + z*z might overflow
+    __scale = __overflow_scale;
+    __x *= __scale;
+    __y *= __scale;
+    __z *= __scale;
+  } else if (__max_abs < 1 / __overflow_threshold) { // x*x + y*y + z*z might underflow
+    __scale = 1 / __overflow_scale;
+    __x *= __scale;
+    __y *= __scale;
+    __z *= __scale;
+  } else
+    __scale = 1;
+  return __math::sqrt(__x * __x + __y * __y + __z * __z) / __scale;
+}
+
+inline _LIBCPP_HIDE_FROM_ABI float hypot(float __x, float __y, float __z) { return __math::__hypot(__x, __y, __z); }
+
+inline _LIBCPP_HIDE_FROM_ABI double hypot(double __x, double __y, double __z) { return __math::__hypot(__x, __y, __z); }
+
+inline _LIBCPP_HIDE_FROM_ABI long double hypot(long double __x, long double __y, long double __z) {
+  return __math::__hypot(__x, __y, __z);
+}
+
+template <class _A1,
+          class _A2,
+          class _A3,
+          std::enable_if_t< is_arithmetic_v<_A1> && is_arithmetic_v<_A2> && is_arithmetic_v<_A3>, int> = 0 >
+_LIBCPP_HIDE_FROM_ABI typename __promote<_A1, _A2, _A3>::type hypot(_A1 __x, _A2 __y, _A3 __z) _NOEXCEPT {
+  using __result_type = typename __promote<_A1, _A2, _A3>::type;
+  static_assert(!(
+      std::is_same_v<_A1, __result_type> && std::is_same_v<_A2, __result_type> && std::is_same_v<_A3, __result_type>));
+  return __math::__hypot(
+      static_cast<__result_type>(__x), static_cast<__result_type>(__y), static_cast<__result_type>(__z));
+}
+#endif
+
 } // namespace __math
 
 _LIBCPP_END_NAMESPACE_STD
+_LIBCPP_POP_MACROS
 
 #endif // _LIBCPP___MATH_HYPOT_H
diff --git a/libcxx/include/cmath b/libcxx/include/cmath
index 3c22604a683c3..6480c4678ce33 100644
--- a/libcxx/include/cmath
+++ b/libcxx/include/cmath
@@ -313,6 +313,7 @@ constexpr long double lerp(long double a, long double b, long double t) noexcept
 */
 
 #include <__config>
+#include <__math/hypot.h>
 #include <__type_traits/enable_if.h>
 #include <__type_traits/is_arithmetic.h>
 #include <__type_traits/is_constant_evaluated.h>
@@ -553,30 +554,6 @@ using ::scalbnl _LIBCPP_USING_IF_EXISTS;
 using ::tgammal _LIBCPP_USING_IF_EXISTS;
 using ::truncl _LIBCPP_USING_IF_EXISTS;
 
-#if _LIBCPP_STD_VER >= 17
-inline _LIBCPP_HIDE_FROM_ABI float hypot(float __x, float __y, float __z) {
-  return sqrt(__x * __x + __y * __y + __z * __z);
-}
-inline _LIBCPP_HIDE_FROM_ABI double hypot(double __x, double __y, double __z) {
-  return sqrt(__x * __x + __y * __y + __z * __z);
-}
-inline _LIBCPP_HIDE_FROM_ABI long double hypot(long double __x, long double __y, long double __z) {
-  return sqrt(__x * __x + __y * __y + __z * __z);
-}
-
-template <class _A1, class _A2, class _A3>
-inline _LIBCPP_HIDE_FROM_ABI
-typename enable_if_t< is_arithmetic<_A1>::value && is_arithmetic<_A2>::value && is_arithmetic<_A3>::value,
-                      __promote<_A1, _A2, _A3> >::type
-hypot(_A1 __lcpp_x, _A2 __lcpp_y, _A3 __lcpp_z) _NOEXCEPT {
-  typedef typename __promote<_A1, _A2, _A3>::type __result_type;
-  static_assert(
-      !(is_same<_A1, __result_type>::value && is_same<_A2, __result_type>::value && is_same<_A3, __result_type>::value),
-      "");
-  return std::hypot((__result_type)__lcpp_x, (__result_type)__lcpp_y, (__result_type)__lcpp_z);
-}
-#endif
-
 template <class _A1, __enable_if_t<is_floating_point<_A1>::value, int> = 0>
 _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR bool __constexpr_isnan(_A1 __lcpp_x) _NOEXCEPT {
 #if __has_builtin(__builtin_isnan)
diff --git a/libcxx/test/libcxx/transitive_includes/cxx17.csv b/libcxx/test/libcxx/transitive_includes/cxx17.csv
index 2c028462144ee..8099d2b79c4be 100644
--- a/libcxx/test/libcxx/transitive_includes/cxx17.csv
+++ b/libcxx/test/libcxx/transitive_includes/cxx17.csv
@@ -130,6 +130,9 @@ chrono type_traits
 chrono vector
 chrono version
 cinttypes cstdint
+cmath cstddef
+cmath cstdint
+cmath initializer_list
 cmath limits
 cmath type_traits
 cmath version
diff --git a/libcxx/test/libcxx/transitive_includes/cxx20.csv b/libcxx/test/libcxx/transitive_includes/cxx20.csv
index 982c2013e3417..384e51b101f31 100644
--- a/libcxx/test/libcxx/transitive_includes/cxx20.csv
+++ b/libcxx/test/libcxx/transitive_includes/cxx20.csv
@@ -135,6 +135,9 @@ chrono type_traits
 chrono vector
 chrono version
 cinttypes cstdint
+cmath cstddef
+cmath cstdint
+cmath initializer_list
 cmath limits
 cmath type_traits
 cmath version
diff --git a/libcxx/test/libcxx/transitive_includes/cxx23.csv b/libcxx/test/libcxx/transitive_includes/cxx23.csv
index 8ffb71d8b566b..46b833d143f39 100644
--- a/libcxx/test/libcxx/transitive_includes/cxx23.csv
+++ b/libcxx/test/libcxx/transitive_includes/cxx23.csv
@@ -83,6 +83,9 @@ chrono string_view
 chrono vector
 chrono version
 cinttypes cstdint
+cmath cstddef
+cmath cstdint
+cmath initializer_list
 cmath limits
 cmath version
 codecvt cctype
diff --git a/libcxx/test/libcxx/transitive_includes/cxx26.csv b/libcxx/test/libcxx/transitive_includes/cxx26.csv
index 8ffb71d8b566b..46b833d143f39 100644
--- a/libcxx/test/libcxx/transitive_includes/cxx26.csv
+++ b/libcxx/test/libcxx/transitive_includes/cxx26.csv
@@ -83,6 +83,9 @@ chrono string_view
 chrono vector
 chrono version
 cinttypes cstdint
+cmath cstddef
+cmath cstdint
+cmath initializer_list
 cmath limits
 cmath version
 codecvt cctype
diff --git a/libcxx/test/std/numerics/c.math/cmath.pass.cpp b/libcxx/test/std/numerics/c.math/cmath.pass.cpp
index 9379084499792..19b5fd0cf8996 100644
--- a/libcxx/test/std/numerics/c.math/cmath.pass.cpp
+++ b/libcxx/test/std/numerics/c.math/cmath.pass.cpp
@@ -12,14 +12,17 @@
 
 // <cmath>
 
+#include <array>
 #include <cmath>
 #include <limits>
 #include <type_traits>
 #include <cassert>
 
+#include "fp_compare.h"
 #include "test_macros.h"
 #include "hexfloat.h"
 #include "truncate_fp.h"
+#include "type_algorithms.h"
 
 // convertible to int/float/double/etc
 template <class T, int N=0>
@@ -1113,6 +1116,56 @@ void test_fmin()
     assert(std::fmin(1,0) == 0);
 }
 
+#if TEST_STD_VER >= 17
+struct TestHypot3 {
+  template <class Real>
+  void operator()() const {
+    const auto check = [](Real elem, Real abs_tol) {
+      assert(std::isfinite(std::hypot(elem, Real(0), Real(0))));
+      assert(fptest_close(std::hypot(elem, Real(0), Real(0)), elem, abs_tol));
+      assert(std::isfinite(std::hypot(elem, elem, Real(0))));
+      assert(fptest_close(std::hypot(elem, elem, Real(0)), std::sqrt(Real(2)) * elem, abs_tol));
+      assert(std::isfinite(std::hypot(elem, elem, elem)));
+      assert(fptest_close(std::hypot(elem, elem, elem), std::sqrt(Real(3)) * elem, abs_tol));
+    };
+
+    { // check for overflow
+      const auto [elem, abs_tol] = []() -> std::array<Real, 2> {
+        if constexpr (std::is_same_v<Real, float>)
+          return {1e20f, 1e16f};
+        else if constexpr (std::is_same_v<Real, double>)
+          return {1e300, 1e287};
+        else { // long double
+#  if __DBL_MAX_EXP__ == __LDBL_MAX_EXP__
+          return {1e300l, 1e287l}; // 64-bit
+#  else
+          return {1e4000l, 1e3985l}; // 80- or 128-bit
+#  endif
+        }
+      }();
+      check(elem, abs_tol);
+    }
+
+    { // check for underflow
+      const auto [elem, abs_tol] = []() -> std::array<Real, 2> {
+        if constexpr (std::is_same_v<Real, float>)
+          return {1e-20f, 1e-24f};
+        else if constexpr (std::is_same_v<Real, double>)
+          return {1e-287, 1e-300};
+        else { // long double
+#  if __DBL_MAX_EXP__ == __LDBL_MAX_EXP__
+          return {1e-287l, 1e-300l}; // 64-bit
+#  else
+          return {1e-3985l, 1e-4000l}; // 80- or 128-bit
+#  endif
+        }
+      }();
+      check(elem, abs_tol);
+    }
+  }
+};
+#endif
+
 void test_hypot()
 {
     static_assert((std::is_same<decltype(std::hypot((float)0, (float)0)), float>::value), "");
@@ -1135,25 +1188,31 @@ void test_hypot()
     static_assert((std::is_same<decltype(hypot(Ambiguous(), Ambiguous())), Ambiguous>::value), "");
     assert(std::hypot(3,4) == 5);
 
-#if TEST_STD_VER > 14
-    static_assert((std::is_same<decltype(std::hypot((float)0, (float)0, (float)0)), float>::value), "");
-    static_assert((std::is_same<decltype(std::hypot((float)0, (bool)0, (float)0)), double>::value), "");
-    static_assert((std::is_same<decltype(std::hypot((float)0, (unsigned short)0, (double)0)), double>::value), "");
-    static_assert((std::is_same<decltype(std::hypot((float)0, (int)0, (long double)0)), long double>::value), "");
-    static_assert((std::is_same<decltype(std::hypot((float)0, (double)0, (long)0)), double>::value), "");
-    static_assert((std::is_same<decltype(std::hypot((float)0, (long double)0, (unsigned long)0)), long double>::value), "");
-    static_assert((std::is_same<decltype(std::hypot((float)0, (int)0, (long long)0)), double>::value), "");
-    static_assert((std::is_same<decltype(std::hypot((float)0, (int)0, (unsigned long long)0)), double>::value), "");
-    static_assert((std::is_same<decltype(std::hypot((float)0, (double)0, (double)0)), double>::value), "");
-    static_assert((std::is_same<decltype(std::hypot((float)0, (long double)0, (long double)0)), long double>::value), "");
-    static_assert((std::is_same<decltype(std::hypot((float)0, (float)0, (double)0)), double>::value), "");
-    static_assert((std::is_same<decltype(std::hypot((float)0, (float)0, (long double)0)), long double>::value), "");
-    static_assert((std::is_same<decltype(std::hypot((float)0, (double)0, (long double)0)), long double>::value), "");
-    static_assert((std::is_same<decltype(std::hypot((int)0, (int)0, (int)0)), double>::value), "");
-    static_assert((std::is_same<decltype(hypot(Ambiguous(), Ambiguous(), Ambiguous())), Ambiguous>::value), "");
+#if TEST_STD_VER >= 17
+    // clang-format off
+    static_assert((std::is_same_v<decltype(std::hypot((float)0, (float)0,          (float)0)),              float>));
+    static_assert((std::is_same_v<decltype(std::hypot((float)0, (bool)0,           (float)0)),              double>));
+    static_assert((std::is_same_v<decltype(std::hypot((float)0, (unsigned short)0, (double)0)),             double>));
+    static_assert((std::is_same_v<decltype(std::hypot((float)0, (int)0,            (long double)0)),        long double>));
+    static_assert((std::is_same_v<decltype(std::hypot((float)0, (double)0,         (long)0)),               double>));
+    static_assert((std::is_same_v<decltype(std::hypot((float)0, (long double)0,    (unsigned long)0)),      long double>));
+    static_assert((std::is_same_v<decltype(std::hypot((float)0, (int)0,            (long long)0)),          double>));
+    static_assert((std::is_same_v<decltype(std::hypot((float)0, (int)0,            (unsigned long long)0)), double>));
+    static_assert((std::is_same_v<decltype(std::hypot((float)0, (double)0,         (double)0)),             double>));
+    static_assert((std::is_same_v<decltype(std::hypot((float)0, (long double)0,    (long double)0)),        long double>));
+    static_assert((std::is_same_v<decltype(std::hypot((float)0, (float)0,          (double)0)),             double>));
+    static_assert((std::is_same_v<decltype(std::hypot((float)0, (float)0,          (long double)0)),        long double>));
+    static_assert((std::is_same_v<decltype(std::hypot((float)0, (double)0,         (long double)0)),        long double>));
+    static_assert((std::is_same_v<decltype(std::hypot((int)0,   (int)0,            (int)0)),                double>));
+    static_assert((std::is_same_v<decltype(hypot(Ambiguous(), Ambiguous(), Ambiguous())), Ambiguous>));
+    // clang-format on
 
     assert(std::hypot(2,3,6) == 7);
     assert(std::hypot(1,4,8) == 9);
+
+    // Check for undue over-/underflows of intermediate results.
+    // See discussion at https://github.com/llvm/llvm-project/issues/92782.
+    types::for_each(types::floating_point_types(), TestHypot3());
 #endif
 }
 
diff --git a/libcxx/test/support/fp_compare.h b/libcxx/test/support/fp_compare.h
index 1d1933b0bcd81..3088a211dadc3 100644
--- a/libcxx/test/support/fp_compare.h
+++ b/libcxx/test/support/fp_compare.h
@@ -9,39 +9,34 @@
 #ifndef SUPPORT_FP_COMPARE_H
 #define SUPPORT_FP_COMPARE_H
 
-#include <cmath>      // for std::abs
-#include <algorithm>  // for std::max
+#include <cmath>     // for std::abs
+#include <algorithm> // for std::max
 #include <cassert>
+#include <__config>
 
 // See https://www.boost.org/doc/libs/1_70_0/libs/test/doc/html/boost_test/testing_tools/extended_comparison/floating_point/floating_points_comparison_theory.html
 
-template<typename T>
-bool fptest_close(T val, T expected, T eps)
-{
-    constexpr T zero = T(0);
-    assert(eps >= zero);
+template <typename T>
+bool fptest_close(T val, T expected, T eps) {
+  _LIBCPP_CONSTEXPR T zero = T(0);
+  assert(eps >= zero);
 
-    // Handle the zero cases
-    if (eps      == zero) return val == expected;
-    if (val      == zero) return std::abs(expected) <= eps;
-    if (expected == zero) return std::abs(val)      <= eps;
+  // Handle the zero cases
+  if (eps == zero)
+    return val == expected;
+  if (val == zero)
+    return std::abs(expected) <= eps;
+  if (expected == zero)
+    return std::abs(val) <= eps;
 
-    return std::abs(val - expected) < eps
-        && std::abs(val - expected)/std::abs(val) < eps;
+  return std::abs(val - expected) < eps && std::abs(val - expected) / std::abs(val) < eps;
 }
 
-template<typename T>
-bool fptest_close_pct(T val, T expected, T percent)
-{
-    constexpr T zero = T(0);
-    assert(percent >= zero);
-
-    // Handle the zero cases
-    if (percent == zero) return val == expected;
-    T eps = (percent / T(100)) * std::max(std::abs(val), std::abs(expected));
-
-    return fptest_close(val, expected, eps);
+template <typename T>
+bool fptest_close_pct(T val, T expected, T percent) {
+  assert(percent >= T(0));
+  T eps = (percent / T(100)) * std::max(std::abs(val), std::abs(expected));
+  return fptest_close(val, expected, eps);
 }
 
-
 #endif // SUPPORT_FP_COMPARE_H

>From a930d1f322674d03d9d88638a89fc4adcf3178b6 Mon Sep 17 00:00:00 2001
From: Mark de Wever <koraq at xs4all.nl>
Date: Tue, 23 Jul 2024 18:03:28 +0200
Subject: [PATCH 11/91] [libc++][vector<bool>] Tests shrink_to_fit requirement.
 (#98009)

`vector<bool>`'s shrink_to_fit implementation uses the
"swap-to-free-container-resources-trick", which only shrinks when the
input vector is empty. Since the request to shrink_to_fit is
non-binding, this is a valid implementation, though not a high-quality
one. Since `vector<bool>` is not a very popular container, the
implementation has not been changed; only a test validating the
non-growing property has been added.
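
For readers unfamiliar with the trick named above, the generic idiom can be
sketched as follows. This is only the general copy-and-swap pattern, shown on
`std::vector<int>`; it does not reproduce how libc++ applies it inside
`vector<bool>`, and `shrink_by_swap` and `shrink_by_swap_demo` are
hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy the elements into a fresh container, then swap
// storage with the original. The temporary takes the old buffer with it
// when it is destroyed.
template <class Container>
void shrink_by_swap(Container& c) {
  Container(c.begin(), c.end(), c.get_allocator()).swap(c);
}

inline void shrink_by_swap_demo() {
  std::vector<int> v;
  v.reserve(1000);
  v.assign({1, 2, 3});
  const std::size_t before = v.capacity();

  shrink_by_swap(v);

  // Contents are preserved.
  assert(v.size() == 3 && v[0] == 1 && v[1] == 2 && v[2] == 3);
  // The idiom itself does not guarantee this; it depends on what the
  // allocator hands back, which is what the added test checks.
  assert(v.capacity() <= before);
}
```

Whether the new buffer is actually smaller is up to the allocator, so a test
with an allocator that returns ever larger blocks is a reasonable way to
exercise the non-growing property.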

This was discovered while investigating #95161.

(cherry picked from commit c2e438675754b83c31d7d5ba40cb13fe77e795de)
---
 .../vector.bool/shrink_to_fit.pass.cpp        | 45 ++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/libcxx/test/std/containers/sequences/vector.bool/shrink_to_fit.pass.cpp b/libcxx/test/std/containers/sequences/vector.bool/shrink_to_fit.pass.cpp
index b39245cab7bf4..f8bcee31964bb 100644
--- a/libcxx/test/std/containers/sequences/vector.bool/shrink_to_fit.pass.cpp
+++ b/libcxx/test/std/containers/sequences/vector.bool/shrink_to_fit.pass.cpp
@@ -39,11 +39,54 @@ TEST_CONSTEXPR_CXX20 bool tests()
     return true;
 }
 
+#if TEST_STD_VER >= 23
+template <typename T>
+struct increasing_allocator {
+  using value_type         = T;
+  std::size_t min_elements = 1000;
+  increasing_allocator()   = default;
+
+  template <typename U>
+  constexpr increasing_allocator(const increasing_allocator<U>& other) noexcept : min_elements(other.min_elements) {}
+
+  constexpr std::allocation_result<T*> allocate_at_least(std::size_t n) {
+    if (n < min_elements)
+      n = min_elements;
+    min_elements += 1000;
+    return std::allocator<T>{}.allocate_at_least(n);
+  }
+  constexpr T* allocate(std::size_t n) { return allocate_at_least(n).ptr; }
+  constexpr void deallocate(T* p, std::size_t n) noexcept { std::allocator<T>{}.deallocate(p, n); }
+};
+
+template <typename T, typename U>
+bool operator==(increasing_allocator<T>, increasing_allocator<U>) {
+  return true;
+}
+
+// https://github.com/llvm/llvm-project/issues/95161
+constexpr bool test_increasing_allocator() {
+  std::vector<bool, increasing_allocator<bool>> v;
+  v.push_back(1);
+  std::size_t capacity = v.capacity();
+  v.shrink_to_fit();
+  assert(v.capacity() <= capacity);
+  assert(v.size() == 1);
+
+  return true;
+}
+#endif // TEST_STD_VER >= 23
+
 int main(int, char**)
 {
-    tests();
+  tests();
 #if TEST_STD_VER > 17
     static_assert(tests());
 #endif
+#if TEST_STD_VER >= 23
+    test_increasing_allocator();
+    static_assert(test_increasing_allocator());
+#endif // TEST_STD_VER >= 23
+
     return 0;
 }

>From c5cd826c05b6ed1760a5be2d0996fbb912bad6c7 Mon Sep 17 00:00:00 2001
From: Mark de Wever <koraq at xs4all.nl>
Date: Tue, 23 Jul 2024 18:13:22 +0200
Subject: [PATCH 12/91] [libc++][string] Fixes shrink_to_fit. (#97961)

This ensures that shrink_to_fit does not increase the allocated size.
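
The guard can be pictured with a toy buffer. This is a simplified sketch, not
libc++'s `basic_string`: `Allocation`, `allocate_at_least`, `Buffer`, and
`demo_capacity_after_shrink` are hypothetical names, and the sketch compares
the new allocation against the current capacity, whereas the patch compares it
against the requested target capacity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the result of an allocator's allocate_at_least.
struct Allocation {
  char* ptr;
  std::size_t count;
};

// Hypothetical allocator that never returns fewer than `floor` elements.
inline Allocation allocate_at_least(std::size_t n, std::size_t floor) {
  const std::size_t count = std::max(n, floor);
  return {new char[count](), count};
}

// Toy buffer for illustration only.
struct Buffer {
  char* data;
  std::size_t size;
  std::size_t capacity;

  // Tries to reduce capacity to `size`. If the allocator over-delivers and
  // the new block would not be smaller, the new block is released and the
  // existing storage is kept.
  void shrink_to_fit(std::size_t allocator_floor) {
    if (capacity == size)
      return;
    Allocation a = allocate_at_least(size, allocator_floor);
    if (a.count >= capacity) {
      delete[] a.ptr; // no gain: discard the new block
      return;
    }
    std::memcpy(a.ptr, data, size);
    delete[] data;
    data     = a.ptr;
    capacity = a.count;
  }
};

// Returns the capacity after shrinking a buffer of size 10 and capacity 64.
inline std::size_t demo_capacity_after_shrink(std::size_t allocator_floor) {
  Buffer b{new char[64](), 10, 64};
  b.shrink_to_fit(allocator_floor);
  const std::size_t result = b.capacity;
  delete[] b.data;
  return result;
}
```

With a floor of 0 the buffer shrinks to 10 elements; with a floor of 1000 the
oversized block is discarded and the capacity stays at 64.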

Partly addresses #95161

(cherry picked from commit d0ca9f23e8f25b0509c3ff34ed215508b39ea6e7)
---
 libcxx/include/string                         | 17 ++++++--
 .../string.capacity/shrink_to_fit.pass.cpp    | 41 +++++++++++++++++++
 2 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/libcxx/include/string b/libcxx/include/string
index ba86a32090825..9fa979e3a5178 100644
--- a/libcxx/include/string
+++ b/libcxx/include/string
@@ -3358,23 +3358,34 @@ basic_string<_CharT, _Traits, _Allocator>::__shrink_or_extend(size_type __target
     __p        = __get_long_pointer();
   } else {
     if (__target_capacity > __cap) {
+      // Extend
+      // - called from reserve should propagate the exception thrown.
       auto __allocation = std::__allocate_at_least(__alloc(), __target_capacity + 1);
       __new_data        = __allocation.ptr;
       __target_capacity = __allocation.count - 1;
     } else {
+      // Shrink
+      // - called from shrink_to_fit should not throw.
+      // - called from reserve may throw but is not required to.
 #ifndef _LIBCPP_HAS_NO_EXCEPTIONS
       try {
 #endif // _LIBCPP_HAS_NO_EXCEPTIONS
         auto __allocation = std::__allocate_at_least(__alloc(), __target_capacity + 1);
+
+        // The Standard mandates shrink_to_fit() does not increase the capacity.
+        // With equal capacity keep the existing buffer. This avoids extra work
+        // due to swapping the elements.
+        if (__allocation.count - 1 > __target_capacity) {
+          __alloc_traits::deallocate(__alloc(), __allocation.ptr, __allocation.count);
+          __annotate_new(__sz); // Undoes the __annotate_delete()
+          return;
+        }
         __new_data        = __allocation.ptr;
         __target_capacity = __allocation.count - 1;
 #ifndef _LIBCPP_HAS_NO_EXCEPTIONS
       } catch (...) {
         return;
       }
-#else  // _LIBCPP_HAS_NO_EXCEPTIONS
-      if (__new_data == nullptr)
-        return;
 #endif // _LIBCPP_HAS_NO_EXCEPTIONS
     }
     __begin_lifetime(__new_data, __target_capacity + 1);
diff --git a/libcxx/test/std/strings/basic.string/string.capacity/shrink_to_fit.pass.cpp b/libcxx/test/std/strings/basic.string/string.capacity/shrink_to_fit.pass.cpp
index 057050cdcf7fa..6f5e43d1341f5 100644
--- a/libcxx/test/std/strings/basic.string/string.capacity/shrink_to_fit.pass.cpp
+++ b/libcxx/test/std/strings/basic.string/string.capacity/shrink_to_fit.pass.cpp
@@ -63,8 +63,49 @@ TEST_CONSTEXPR_CXX20 bool test() {
   return true;
 }
 
+#if TEST_STD_VER >= 23
+std::size_t min_bytes = 1000;
+
+template <typename T>
+struct increasing_allocator {
+  using value_type       = T;
+  increasing_allocator() = default;
+  template <typename U>
+  increasing_allocator(const increasing_allocator<U>&) noexcept {}
+  std::allocation_result<T*> allocate_at_least(std::size_t n) {
+    std::size_t allocation_amount = n * sizeof(T);
+    if (allocation_amount < min_bytes)
+      allocation_amount = min_bytes;
+    min_bytes += 1000;
+    return {static_cast<T*>(::operator new(allocation_amount)), allocation_amount / sizeof(T)};
+  }
+  T* allocate(std::size_t n) { return allocate_at_least(n).ptr; }
+  void deallocate(T* p, std::size_t) noexcept { ::operator delete(static_cast<void*>(p)); }
+};
+
+template <typename T, typename U>
+bool operator==(increasing_allocator<T>, increasing_allocator<U>) {
+  return true;
+}
+
+// https://github.com/llvm/llvm-project/issues/95161
+void test_increasing_allocator() {
+  std::basic_string<char, std::char_traits<char>, increasing_allocator<char>> s{
+      "String does not fit in the internal buffer"};
+  std::size_t capacity = s.capacity();
+  std::size_t size     = s.size();
+  s.shrink_to_fit();
+  assert(s.capacity() <= capacity);
+  assert(s.size() == size);
+  LIBCPP_ASSERT(is_string_asan_correct(s));
+}
+#endif // TEST_STD_VER >= 23
+
 int main(int, char**) {
   test();
+#if TEST_STD_VER >= 23
+  test_increasing_allocator();
+#endif
 #if TEST_STD_VER > 17
   static_assert(test());
 #endif

>From 518cef7f38ea539f14aedbf3a08a9989960fc355 Mon Sep 17 00:00:00 2001
From: azhan92 <alisonxzhang at gmail.com>
Date: Tue, 23 Jul 2024 09:49:41 -0400
Subject: [PATCH 13/91] [PowerPC] Add support for -mcpu=pwr11 / -mtune=pwr11
 (#99511)

This PR adds support for -mcpu=pwr11/power11 and -mtune=pwr11/power11 in
clang and llvm.

(cherry picked from commit 1df4d866cca51eeab8f012a97cc50957b45971fe)
---
 clang/lib/Basic/Targets/PPC.cpp               | 39 ++++++++++++-------
 clang/lib/Basic/Targets/PPC.h                 | 19 ++++++---
 clang/lib/Driver/ToolChains/Arch/PPC.cpp      |  3 ++
 clang/test/Misc/target-invalid-cpu-note.c     |  2 +-
 clang/test/Preprocessor/init-ppc64.c          | 22 +++++++++++
 llvm/lib/Target/PowerPC/PPC.td                | 20 ++++++++--
 llvm/lib/Target/PowerPC/PPCISelLowering.cpp   |  3 ++
 llvm/lib/Target/PowerPC/PPCInstrInfo.cpp      |  1 +
 llvm/lib/Target/PowerPC/PPCSubtarget.h        |  1 +
 .../Target/PowerPC/PPCTargetTransformInfo.cpp |  4 +-
 llvm/lib/TargetParser/Host.cpp                |  7 ++++
 llvm/test/CodeGen/PowerPC/check-cpu.ll        |  6 ++-
 llvm/test/CodeGen/PowerPC/mma-acc-spill.ll    |  7 ++++
 ...{p10-constants.ll => p10-p11-constants.ll} | 12 +++++-
 llvm/unittests/TargetParser/Host.cpp          |  1 +
 15 files changed, 120 insertions(+), 27 deletions(-)
 rename llvm/test/CodeGen/PowerPC/{p10-constants.ll => p10-p11-constants.ll} (94%)

diff --git a/clang/lib/Basic/Targets/PPC.cpp b/clang/lib/Basic/Targets/PPC.cpp
index 4ba4a49311d36..9ff54083c923b 100644
--- a/clang/lib/Basic/Targets/PPC.cpp
+++ b/clang/lib/Basic/Targets/PPC.cpp
@@ -385,6 +385,8 @@ void PPCTargetInfo::getTargetDefines(const LangOptions &Opts,
     Builder.defineMacro("_ARCH_PWR9");
   if (ArchDefs & ArchDefinePwr10)
     Builder.defineMacro("_ARCH_PWR10");
+  if (ArchDefs & ArchDefinePwr11)
+    Builder.defineMacro("_ARCH_PWR11");
   if (ArchDefs & ArchDefineA2)
     Builder.defineMacro("_ARCH_A2");
   if (ArchDefs & ArchDefineE500)
@@ -622,10 +624,17 @@ bool PPCTargetInfo::initFeatureMap(
     addP10SpecificFeatures(Features);
   }
 
-  // Future CPU should include all of the features of Power 10 as well as any
+  // Power11 includes all the same features as Power10 plus any features
+  // specific to the Power11 core.
+  if (CPU == "pwr11" || CPU == "power11") {
+    initFeatureMap(Features, Diags, "pwr10", FeaturesVec);
+    addP11SpecificFeatures(Features);
+  }
+
+  // Future CPU should include all of the features of Power 11 as well as any
   // additional features (yet to be determined) specific to it.
   if (CPU == "future") {
-    initFeatureMap(Features, Diags, "pwr10", FeaturesVec);
+    initFeatureMap(Features, Diags, "pwr11", FeaturesVec);
     addFutureSpecificFeatures(Features);
   }
 
@@ -696,6 +705,10 @@ void PPCTargetInfo::addP10SpecificFeatures(
   Features["isa-v31-instructions"] = true;
 }
 
+// Add any Power11 specific features.
+void PPCTargetInfo::addP11SpecificFeatures(
+    llvm::StringMap<bool> &Features) const {}
+
 // Add features specific to the "Future" CPU.
 void PPCTargetInfo::addFutureSpecificFeatures(
     llvm::StringMap<bool> &Features) const {}
@@ -870,17 +883,17 @@ ArrayRef<TargetInfo::AddlRegName> PPCTargetInfo::getGCCAddlRegNames() const {
 }
 
 static constexpr llvm::StringLiteral ValidCPUNames[] = {
-    {"generic"},     {"440"},     {"450"},    {"601"},       {"602"},
-    {"603"},         {"603e"},    {"603ev"},  {"604"},       {"604e"},
-    {"620"},         {"630"},     {"g3"},     {"7400"},      {"g4"},
-    {"7450"},        {"g4+"},     {"750"},    {"8548"},      {"970"},
-    {"g5"},          {"a2"},      {"e500"},   {"e500mc"},    {"e5500"},
-    {"power3"},      {"pwr3"},    {"power4"}, {"pwr4"},      {"power5"},
-    {"pwr5"},        {"power5x"}, {"pwr5x"},  {"power6"},    {"pwr6"},
-    {"power6x"},     {"pwr6x"},   {"power7"}, {"pwr7"},      {"power8"},
-    {"pwr8"},        {"power9"},  {"pwr9"},   {"power10"},   {"pwr10"},
-    {"powerpc"},     {"ppc"},     {"ppc32"},  {"powerpc64"}, {"ppc64"},
-    {"powerpc64le"}, {"ppc64le"}, {"future"}};
+    {"generic"},   {"440"},     {"450"},         {"601"},     {"602"},
+    {"603"},       {"603e"},    {"603ev"},       {"604"},     {"604e"},
+    {"620"},       {"630"},     {"g3"},          {"7400"},    {"g4"},
+    {"7450"},      {"g4+"},     {"750"},         {"8548"},    {"970"},
+    {"g5"},        {"a2"},      {"e500"},        {"e500mc"},  {"e5500"},
+    {"power3"},    {"pwr3"},    {"power4"},      {"pwr4"},    {"power5"},
+    {"pwr5"},      {"power5x"}, {"pwr5x"},       {"power6"},  {"pwr6"},
+    {"power6x"},   {"pwr6x"},   {"power7"},      {"pwr7"},    {"power8"},
+    {"pwr8"},      {"power9"},  {"pwr9"},        {"power10"}, {"pwr10"},
+    {"power11"},   {"pwr11"},   {"powerpc"},     {"ppc"},     {"ppc32"},
+    {"powerpc64"}, {"ppc64"},   {"powerpc64le"}, {"ppc64le"}, {"future"}};
 
 bool PPCTargetInfo::isValidCPUName(StringRef Name) const {
   return llvm::is_contained(ValidCPUNames, Name);
diff --git a/clang/lib/Basic/Targets/PPC.h b/clang/lib/Basic/Targets/PPC.h
index b15ab6fbcf492..6d5d8dd54d013 100644
--- a/clang/lib/Basic/Targets/PPC.h
+++ b/clang/lib/Basic/Targets/PPC.h
@@ -44,8 +44,9 @@ class LLVM_LIBRARY_VISIBILITY PPCTargetInfo : public TargetInfo {
     ArchDefinePwr8 = 1 << 12,
     ArchDefinePwr9 = 1 << 13,
     ArchDefinePwr10 = 1 << 14,
-    ArchDefineFuture = 1 << 15,
-    ArchDefineA2 = 1 << 16,
+    ArchDefinePwr11 = 1 << 15,
+    ArchDefineFuture = 1 << 16,
+    ArchDefineA2 = 1 << 17,
     ArchDefineE500 = 1 << 18
   } ArchDefineTypes;
 
@@ -166,11 +167,16 @@ class LLVM_LIBRARY_VISIBILITY PPCTargetInfo : public TargetInfo {
                          ArchDefinePwr7 | ArchDefinePwr6 | ArchDefinePwr5x |
                          ArchDefinePwr5 | ArchDefinePwr4 | ArchDefinePpcgr |
                          ArchDefinePpcsq)
+              .Cases("power11", "pwr11",
+                     ArchDefinePwr11 | ArchDefinePwr10 | ArchDefinePwr9 |
+                         ArchDefinePwr8 | ArchDefinePwr7 | ArchDefinePwr6 |
+                         ArchDefinePwr5x | ArchDefinePwr5 | ArchDefinePwr4 |
+                         ArchDefinePpcgr | ArchDefinePpcsq)
               .Case("future",
-                    ArchDefineFuture | ArchDefinePwr10 | ArchDefinePwr9 |
-                        ArchDefinePwr8 | ArchDefinePwr7 | ArchDefinePwr6 |
-                        ArchDefinePwr5x | ArchDefinePwr5 | ArchDefinePwr4 |
-                        ArchDefinePpcgr | ArchDefinePpcsq)
+                    ArchDefineFuture | ArchDefinePwr11 | ArchDefinePwr10 |
+                        ArchDefinePwr9 | ArchDefinePwr8 | ArchDefinePwr7 |
+                        ArchDefinePwr6 | ArchDefinePwr5x | ArchDefinePwr5 |
+                        ArchDefinePwr4 | ArchDefinePpcgr | ArchDefinePpcsq)
               .Cases("8548", "e500", ArchDefineE500)
               .Default(ArchDefineNone);
     }
@@ -192,6 +198,7 @@ class LLVM_LIBRARY_VISIBILITY PPCTargetInfo : public TargetInfo {
                  const std::vector<std::string> &FeaturesVec) const override;
 
   void addP10SpecificFeatures(llvm::StringMap<bool> &Features) const;
+  void addP11SpecificFeatures(llvm::StringMap<bool> &Features) const;
   void addFutureSpecificFeatures(llvm::StringMap<bool> &Features) const;
 
   bool handleTargetFeatures(std::vector<std::string> &Features,
diff --git a/clang/lib/Driver/ToolChains/Arch/PPC.cpp b/clang/lib/Driver/ToolChains/Arch/PPC.cpp
index 634c096523319..acd5757d6ea97 100644
--- a/clang/lib/Driver/ToolChains/Arch/PPC.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/PPC.cpp
@@ -70,6 +70,7 @@ static std::string normalizeCPUName(StringRef CPUName, const llvm::Triple &T) {
       .Case("power8", "pwr8")
       .Case("power9", "pwr9")
       .Case("power10", "pwr10")
+      .Case("power11", "pwr11")
       .Case("future", "future")
       .Case("powerpc", "ppc")
       .Case("powerpc64", "ppc64")
@@ -103,6 +104,8 @@ const char *ppc::getPPCAsmModeForCPU(StringRef Name) {
       .Case("power9", "-mpower9")
       .Case("pwr10", "-mpower10")
       .Case("power10", "-mpower10")
+      .Case("pwr11", "-mpower11")
+      .Case("power11", "-mpower11")
       .Default("-many");
 }
 
diff --git a/clang/test/Misc/target-invalid-cpu-note.c b/clang/test/Misc/target-invalid-cpu-note.c
index a5f9ffa21220a..4d6759dd81537 100644
--- a/clang/test/Misc/target-invalid-cpu-note.c
+++ b/clang/test/Misc/target-invalid-cpu-note.c
@@ -57,7 +57,7 @@
 
 // RUN: not %clang_cc1 -triple powerpc--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix PPC
 // PPC: error: unknown target CPU 'not-a-cpu'
-// PPC-NEXT: note: valid target CPU values are: generic, 440, 450, 601, 602, 603, 603e, 603ev, 604, 604e, 620, 630, g3, 7400, g4, 7450, g4+, 750, 8548, 970, g5, a2, e500, e500mc, e5500, power3, pwr3, power4, pwr4, power5, pwr5, power5x, pwr5x, power6, pwr6, power6x, pwr6x, power7, pwr7, power8, pwr8, power9, pwr9, power10, pwr10, powerpc, ppc, ppc32, powerpc64, ppc64, powerpc64le, ppc64le, future{{$}}
+// PPC-NEXT: note: valid target CPU values are: generic, 440, 450, 601, 602, 603, 603e, 603ev, 604, 604e, 620, 630, g3, 7400, g4, 7450, g4+, 750, 8548, 970, g5, a2, e500, e500mc, e5500, power3, pwr3, power4, pwr4, power5, pwr5, power5x, pwr5x, power6, pwr6, power6x, pwr6x, power7, pwr7, power8, pwr8, power9, pwr9, power10, pwr10, power11, pwr11, powerpc, ppc, ppc32, powerpc64, ppc64, powerpc64le, ppc64le, future{{$}}
 
 // RUN: not %clang_cc1 -triple mips--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix MIPS
 // MIPS: error: unknown target CPU 'not-a-cpu'
diff --git a/clang/test/Preprocessor/init-ppc64.c b/clang/test/Preprocessor/init-ppc64.c
index 42e5232824de7..56164beb913d5 100644
--- a/clang/test/Preprocessor/init-ppc64.c
+++ b/clang/test/Preprocessor/init-ppc64.c
@@ -632,6 +632,27 @@
 // PPCPOWER10:#define __PCREL__ 1
 // PPCPOWER10-NOT:#define __ROP_PROTECT__ 1
 //
+// RUN: %clang_cc1 -E -dM -ffreestanding -triple=powerpc64-none-none -target-cpu pwr11 -fno-signed-char < /dev/null | FileCheck -match-full-lines -check-prefix PPCPOWER11 %s
+// RUN: %clang_cc1 -E -dM -ffreestanding -triple=powerpc64-none-none -target-cpu power11 -fno-signed-char < /dev/null | FileCheck -match-full-lines -check-prefix PPCPOWER11 %s
+//
+// PPCPOWER11:#define _ARCH_PPC 1
+// PPCPOWER11:#define _ARCH_PPC64 1
+// PPCPOWER11:#define _ARCH_PPCGR 1
+// PPCPOWER11:#define _ARCH_PPCSQ 1
+// PPCPOWER11:#define _ARCH_PWR10 1
+// PPCPOWER11:#define _ARCH_PWR11 1
+// PPCPOWER11:#define _ARCH_PWR4 1
+// PPCPOWER11:#define _ARCH_PWR5 1
+// PPCPOWER11:#define _ARCH_PWR5X 1
+// PPCPOWER11:#define _ARCH_PWR6 1
+// PPCPOWER11-NOT:#define _ARCH_PWR6X 1
+// PPCPOWER11:#define _ARCH_PWR7 1
+// PPCPOWER11:#define _ARCH_PWR8 1
+// PPCPOWER11:#define _ARCH_PWR9 1
+// PPCPOWER11:#define __MMA__ 1
+// PPCPOWER11:#define __PCREL__ 1
+// PPCPOWER11-NOT:#define __ROP_PROTECT__ 1
+//
 // RUN: %clang_cc1 -E -dM -ffreestanding -triple=powerpc64-none-none -target-cpu future -fno-signed-char < /dev/null | FileCheck -match-full-lines -check-prefix PPCFUTURE %s
 //
 // PPCFUTURE:#define _ARCH_PPC 1
@@ -639,6 +660,7 @@
 // PPCFUTURE:#define _ARCH_PPCGR 1
 // PPCFUTURE:#define _ARCH_PPCSQ 1
 // PPCFUTURE:#define _ARCH_PWR10 1
+// PPCFUTURE:#define _ARCH_PWR11 1
 // PPCFUTURE:#define _ARCH_PWR4 1
 // PPCFUTURE:#define _ARCH_PWR5 1
 // PPCFUTURE:#define _ARCH_PWR5X 1
diff --git a/llvm/lib/Target/PowerPC/PPC.td b/llvm/lib/Target/PowerPC/PPC.td
index 84ef582c029d3..da31a993b9c69 100644
--- a/llvm/lib/Target/PowerPC/PPC.td
+++ b/llvm/lib/Target/PowerPC/PPC.td
@@ -52,6 +52,7 @@ def DirectivePwr7: SubtargetFeature<"", "CPUDirective", "PPC::DIR_PWR7", "">;
 def DirectivePwr8: SubtargetFeature<"", "CPUDirective", "PPC::DIR_PWR8", "">;
 def DirectivePwr9: SubtargetFeature<"", "CPUDirective", "PPC::DIR_PWR9", "">;
 def DirectivePwr10: SubtargetFeature<"", "CPUDirective", "PPC::DIR_PWR10", "">;
+def DirectivePwr11: SubtargetFeature<"", "CPUDirective", "PPC::DIR_PWR11", "">;
 def DirectivePwrFuture
     : SubtargetFeature<"", "CPUDirective", "PPC::DIR_PWR_FUTURE", "">;
 
@@ -467,13 +468,25 @@ def ProcessorFeatures {
   list<SubtargetFeature> P10Features =
     !listconcat(P10InheritableFeatures, P10SpecificFeatures);
 
-  // Future
-  // For future CPU we assume that all of the existing features from Power10
+  // Power11
+  // For P11 CPU we assume that all the existing features from Power10
   // still exist with the exception of those we know are Power10 specific.
+  list<SubtargetFeature> P11AdditionalFeatures =
+    [DirectivePwr11];
+  list<SubtargetFeature> P11SpecificFeatures =
+    [];
+  list<SubtargetFeature> P11InheritableFeatures =
+    !listconcat(P10InheritableFeatures, P11AdditionalFeatures);
+  list<SubtargetFeature> P11Features =
+    !listconcat(P11InheritableFeatures, P11SpecificFeatures);
+
+  // Future
+  // For future CPU we assume that all of the existing features from Power11
+  // still exist with the exception of those we know are Power11 specific.
   list<SubtargetFeature> FutureAdditionalFeatures = [FeatureISAFuture];
   list<SubtargetFeature> FutureSpecificFeatures = [];
   list<SubtargetFeature> FutureInheritableFeatures =
-    !listconcat(P10InheritableFeatures, FutureAdditionalFeatures);
+    !listconcat(P11InheritableFeatures, FutureAdditionalFeatures);
   list<SubtargetFeature> FutureFeatures =
     !listconcat(FutureInheritableFeatures, FutureSpecificFeatures);
 }
@@ -672,6 +685,7 @@ def : ProcessorModel<"pwr7", P7Model, ProcessorFeatures.P7Features>;
 def : ProcessorModel<"pwr8", P8Model, ProcessorFeatures.P8Features>;
 def : ProcessorModel<"pwr9", P9Model, ProcessorFeatures.P9Features>;
 def : ProcessorModel<"pwr10", P10Model, ProcessorFeatures.P10Features>;
+def : ProcessorModel<"pwr11", P10Model, ProcessorFeatures.P11Features>;
 // No scheduler model for future CPU.
 def : ProcessorModel<"future", NoSchedModel,
                   ProcessorFeatures.FutureFeatures>;
diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
index 898d1f80d0564..aaf0449a55387 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -1469,6 +1469,7 @@ PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
   case PPC::DIR_PWR8:
   case PPC::DIR_PWR9:
   case PPC::DIR_PWR10:
+  case PPC::DIR_PWR11:
   case PPC::DIR_PWR_FUTURE:
     setPrefLoopAlignment(Align(16));
     setPrefFunctionAlignment(Align(16));
@@ -16664,6 +16665,7 @@ Align PPCTargetLowering::getPrefLoopAlignment(MachineLoop *ML) const {
   case PPC::DIR_PWR8:
   case PPC::DIR_PWR9:
   case PPC::DIR_PWR10:
+  case PPC::DIR_PWR11:
   case PPC::DIR_PWR_FUTURE: {
     if (!ML)
       break;
@@ -18046,6 +18048,7 @@ SDValue PPCTargetLowering::combineMUL(SDNode *N, DAGCombinerInfo &DCI) const {
       return true;
     case PPC::DIR_PWR9:
     case PPC::DIR_PWR10:
+    case PPC::DIR_PWR11:
     case PPC::DIR_PWR_FUTURE:
       //  type        mul     add    shl
       // scalar        5       2      2
diff --git a/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp b/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
index 2d3c520429f2a..81f16eb1a905b 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
+++ b/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
@@ -3485,6 +3485,7 @@ unsigned PPCInstrInfo::getSpillTarget() const {
   // With P10, we may need to spill paired vector registers or accumulator
   // registers. MMA implies paired vectors, so we can just check that.
   bool IsP10Variant = Subtarget.isISA3_1() || Subtarget.pairedVectorMemops();
+  // P11 uses the P10 target.
   return Subtarget.isISAFuture() ? 3 : IsP10Variant ?
                                    2 : Subtarget.hasP9Vector() ?
                                    1 : 0;
diff --git a/llvm/lib/Target/PowerPC/PPCSubtarget.h b/llvm/lib/Target/PowerPC/PPCSubtarget.h
index bf35f8ec151b1..2079dc0acc3cf 100644
--- a/llvm/lib/Target/PowerPC/PPCSubtarget.h
+++ b/llvm/lib/Target/PowerPC/PPCSubtarget.h
@@ -61,6 +61,7 @@ enum {
   DIR_PWR8,
   DIR_PWR9,
   DIR_PWR10,
+  DIR_PWR11,
   DIR_PWR_FUTURE,
   DIR_64
 };
diff --git a/llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp b/llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
index 3fa35efc2d159..b7bdbeb535d52 100644
--- a/llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
+++ b/llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
@@ -504,7 +504,7 @@ unsigned PPCTTIImpl::getCacheLineSize() const {
   // Assume that Future CPU has the same cache line size as the others.
   if (Directive == PPC::DIR_PWR7 || Directive == PPC::DIR_PWR8 ||
       Directive == PPC::DIR_PWR9 || Directive == PPC::DIR_PWR10 ||
-      Directive == PPC::DIR_PWR_FUTURE)
+      Directive == PPC::DIR_PWR11 || Directive == PPC::DIR_PWR_FUTURE)
     return 128;
 
   // On other processors return a default of 64 bytes.
@@ -538,7 +538,7 @@ unsigned PPCTTIImpl::getMaxInterleaveFactor(ElementCount VF) {
   // Assume that future is the same as the others.
   if (Directive == PPC::DIR_PWR7 || Directive == PPC::DIR_PWR8 ||
       Directive == PPC::DIR_PWR9 || Directive == PPC::DIR_PWR10 ||
-      Directive == PPC::DIR_PWR_FUTURE)
+      Directive == PPC::DIR_PWR11 || Directive == PPC::DIR_PWR_FUTURE)
     return 12;
 
   // For most things, modern systems have two execution units (and
diff --git a/llvm/lib/TargetParser/Host.cpp b/llvm/lib/TargetParser/Host.cpp
index fda085f880096..7e637cba4cfbc 100644
--- a/llvm/lib/TargetParser/Host.cpp
+++ b/llvm/lib/TargetParser/Host.cpp
@@ -150,6 +150,7 @@ StringRef sys::detail::getHostCPUNameForPowerPC(StringRef ProcCpuinfoContent) {
       .Case("POWER8NVL", "pwr8")
       .Case("POWER9", "pwr9")
       .Case("POWER10", "pwr10")
+      .Case("POWER11", "pwr11")
       // FIXME: If we get a simulator or machine with the capabilities of
       // mcpu=future, we should revisit this and add the name reported by the
       // simulator/machine.
@@ -1549,6 +1550,12 @@ StringRef sys::getHostCPUName() {
   case 0x40000:
 #endif
     return "pwr10";
+#ifdef POWER_11
+  case POWER_11:
+#else
+  case 0x80000:
+#endif
+    return "pwr11";
   default:
     return "generic";
   }
diff --git a/llvm/test/CodeGen/PowerPC/check-cpu.ll b/llvm/test/CodeGen/PowerPC/check-cpu.ll
index e1a201427a410..1dc532cb428f4 100644
--- a/llvm/test/CodeGen/PowerPC/check-cpu.ll
+++ b/llvm/test/CodeGen/PowerPC/check-cpu.ll
@@ -3,6 +3,10 @@
 ; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu \
 ; RUN:     -mcpu=future < %s 2>&1 | FileCheck %s
 ; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu \
+; RUN:     -mcpu=pwr11 < %s 2>&1 | FileCheck %s
+; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu \
+; RUN:     -mcpu=pwr11 < %s 2>&1 | FileCheck %s
+; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu \
 ; RUN:     -mcpu=pwr10 < %s 2>&1 | FileCheck %s
 ; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu \
 ; RUN:     -mcpu=pwr10 < %s 2>&1 | FileCheck %s
@@ -13,7 +17,7 @@
 
 
 
-; Test -mcpu=[pwr9|pwr10|future] is recognized on PowerPC.
+; Test -mcpu=[pwr9|pwr10|pwr11|future] is recognized on PowerPC.
 
 ; CHECK-NOT: is not a recognized processor for this target
 ; CHECK:     .text
diff --git a/llvm/test/CodeGen/PowerPC/mma-acc-spill.ll b/llvm/test/CodeGen/PowerPC/mma-acc-spill.ll
index 8d03594fe1bfd..681f81d74794d 100644
--- a/llvm/test/CodeGen/PowerPC/mma-acc-spill.ll
+++ b/llvm/test/CodeGen/PowerPC/mma-acc-spill.ll
@@ -6,6 +6,13 @@
 ; RUN:   -mcpu=pwr10 -ppc-asm-full-reg-names -disable-auto-paired-vec-st=false \
 ; RUN:   -ppc-vsr-nums-as-vr < %s | FileCheck %s --check-prefix=CHECK-BE
 
+; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu \
+; RUN:   -mcpu=pwr11 -ppc-asm-full-reg-names -disable-auto-paired-vec-st=false \
+; RUN:   -ppc-vsr-nums-as-vr < %s | FileCheck %s
+; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu \
+; RUN:   -mcpu=pwr11 -ppc-asm-full-reg-names -disable-auto-paired-vec-st=false \
+; RUN:   -ppc-vsr-nums-as-vr < %s | FileCheck %s --check-prefix=CHECK-BE
+
 declare <512 x i1> @llvm.ppc.mma.xvf16ger2pp(<512 x i1>, <16 x i8>, <16 x i8>)
 declare <512 x i1> @llvm.ppc.mma.assemble.acc(<16 x i8>, <16 x i8>, <16 x i8>, <16 x i8>)
 declare void @foo()
diff --git a/llvm/test/CodeGen/PowerPC/p10-constants.ll b/llvm/test/CodeGen/PowerPC/p10-p11-constants.ll
similarity index 94%
rename from llvm/test/CodeGen/PowerPC/p10-constants.ll
rename to llvm/test/CodeGen/PowerPC/p10-p11-constants.ll
index 77472afd9c3d4..5f6a345bdd938 100644
--- a/llvm/test/CodeGen/PowerPC/p10-constants.ll
+++ b/llvm/test/CodeGen/PowerPC/p10-p11-constants.ll
@@ -8,7 +8,17 @@
 ; RUN:   -mcpu=pwr10 -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr < %s | \
 ; RUN:   FileCheck %s --check-prefix=CHECK32
 
-; These test cases aim to test constant materialization using the pli instruction on Power10.
+; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu \
+; RUN:   -mcpu=pwr11 -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr < %s | \
+; RUN:   FileCheck %s
+; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu \
+; RUN:   -mcpu=pwr11 -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr < %s | \
+; RUN:   FileCheck %s
+; RUN: llc -verify-machineinstrs -mtriple=powerpc-unknown-linux-gnu \
+; RUN:   -mcpu=pwr11 -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr < %s | \
+; RUN:   FileCheck %s --check-prefix=CHECK32
+
+; These test cases aim to test constant materialization using the pli instruction on Power10 and Power11.
 
 define  signext i32 @t_16BitsMinRequiring34Bits() {
 ; CHECK-LABEL: t_16BitsMinRequiring34Bits:
diff --git a/llvm/unittests/TargetParser/Host.cpp b/llvm/unittests/TargetParser/Host.cpp
index 61921a99e1711..f8dd1d3a60a00 100644
--- a/llvm/unittests/TargetParser/Host.cpp
+++ b/llvm/unittests/TargetParser/Host.cpp
@@ -536,6 +536,7 @@ TEST(HostTest, AIXHostCPUDetect) {
                        .Case("POWER 8\n", "pwr8")
                        .Case("POWER 9\n", "pwr9")
                        .Case("POWER 10\n", "pwr10")
+                       .Case("POWER 11\n", "pwr11")
                        .Default("unknown");
 
   StringRef HostCPU = sys::getHostCPUName();

>From 43cec9d5512692315a749ccb080dcd04561453f3 Mon Sep 17 00:00:00 2001
From: Jan Leyonberg <jan_sjodin at yahoo.com>
Date: Wed, 24 Jul 2024 09:57:39 -0400
Subject: [PATCH 14/91] [clang][OpenMP] Propagate debug location to
 OMPIRBuilder reduction codegen (#100358)

This patch propagates the debug location from Clang to the
OpenMPIRBuilder.

Fixes https://github.com/llvm/llvm-project/issues/97458

(cherry picked from commit 5b15d9c441810121c23f9f421bbb007fd4c448e8)
---
 clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index f5bd4a141cc2d..8965a14d88a6f 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -1695,7 +1695,8 @@ void CGOpenMPRuntimeGPU::emitReduction(
                          CGF.AllocaInsertPt->getIterator());
   InsertPointTy CodeGenIP(CGF.Builder.GetInsertBlock(),
                           CGF.Builder.GetInsertPoint());
-  llvm::OpenMPIRBuilder::LocationDescription OmpLoc(CodeGenIP);
+  llvm::OpenMPIRBuilder::LocationDescription OmpLoc(
+      CodeGenIP, CGF.SourceLocToDebugLoc(Loc));
   llvm::SmallVector<llvm::OpenMPIRBuilder::ReductionInfo> ReductionInfos;
 
   CodeGenFunction::OMPPrivateScope Scope(CGF);

>From 3651ae013e43482ea587067e575c89e9495b0f33 Mon Sep 17 00:00:00 2001
From: Chris Copeland <chris at chrisnc.net>
Date: Wed, 24 Jul 2024 05:53:39 -0700
Subject: [PATCH 15/91] [clang] Define `ATOMIC_FLAG_INIT` correctly for C++.
 (#97534)

(cherry picked from commit 4bb3a1e16f3a854d05bc0b8c5b6f8f78effb1d93)
---
 clang/docs/ReleaseNotes.rst    | 3 +++
 clang/lib/Headers/stdatomic.h  | 4 ++++
 clang/test/Headers/stdatomic.c | 5 +++++
 3 files changed, 12 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 24d88ed6edf00..5b6ee9830b507 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -891,6 +891,9 @@ Bug Fixes in This Version
 - Fixed an assertion failure when a template non-type parameter contains
   an invalid expression.
 
+- Fixed the definition of ``ATOMIC_FLAG_INIT`` in ``<stdatomic.h>`` so it can
+  be used in C++.
+
 Bug Fixes to Compiler Builtins
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/clang/lib/Headers/stdatomic.h b/clang/lib/Headers/stdatomic.h
index 2027055f38796..1991351f9e9ef 100644
--- a/clang/lib/Headers/stdatomic.h
+++ b/clang/lib/Headers/stdatomic.h
@@ -172,7 +172,11 @@ typedef _Atomic(uintmax_t)          atomic_uintmax_t;
 
 typedef struct atomic_flag { atomic_bool _Value; } atomic_flag;
 
+#ifdef __cplusplus
+#define ATOMIC_FLAG_INIT {false}
+#else
 #define ATOMIC_FLAG_INIT { 0 }
+#endif
 
 /* These should be provided by the libc implementation. */
 #ifdef __cplusplus
diff --git a/clang/test/Headers/stdatomic.c b/clang/test/Headers/stdatomic.c
index 3643fd4245b31..9afd531a9ed9b 100644
--- a/clang/test/Headers/stdatomic.c
+++ b/clang/test/Headers/stdatomic.c
@@ -1,5 +1,8 @@
 // RUN: %clang_cc1 -std=c11 -E %s | FileCheck %s
 // RUN: %clang_cc1 -std=c11 -fms-compatibility -E %s | FileCheck %s
+// RUN: %clang_cc1 -std=c11 %s -verify
+// RUN: %clang_cc1 -x c++ -std=c++11 %s -verify
+// expected-no-diagnostics
 #include <stdatomic.h>
 
 int bool_lock_free = ATOMIC_BOOL_LOCK_FREE;
@@ -31,3 +34,5 @@ int llong_lock_free = ATOMIC_LLONG_LOCK_FREE;
 
 int pointer_lock_free = ATOMIC_POINTER_LOCK_FREE;
 // CHECK: pointer_lock_free = {{ *[012] *;}}
+
+atomic_flag f = ATOMIC_FLAG_INIT;

>From 0934f6d441183ed58c9b4197d29668bbe2d3ea7d Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell <benjamin.maxwell at arm.com>
Date: Tue, 23 Jul 2024 07:30:02 +0000
Subject: [PATCH 16/91] Precommit vscale-fixups.ll test (NFC)

Precommit test for #100080.

(cherry picked from commit c1b70fa5bfea973d4141e27cf9668e9325609e19)
---
 .../AArch64/vscale-fixups.ll                  | 47 +++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/llvm/test/Transforms/LoopStrengthReduce/AArch64/vscale-fixups.ll b/llvm/test/Transforms/LoopStrengthReduce/AArch64/vscale-fixups.ll
index 483955c1c57a0..56b59012eef40 100644
--- a/llvm/test/Transforms/LoopStrengthReduce/AArch64/vscale-fixups.ll
+++ b/llvm/test/Transforms/LoopStrengthReduce/AArch64/vscale-fixups.ll
@@ -384,4 +384,51 @@ for.exit:
   ret void
 }
 
+;; This test demonstrates an incorrect MUL VL address calculation. Here there
+;; are two writes that should be `16 * vscale * vscale` apart, however,
+;; loop-strength-reduce has ignored the second `vscale` and offset the second
+;; write by `#4, mul vl` which is an offset of `16 * vscale` dropping a vscale.
+define void @vscale_squared_offset(ptr %alloc) #0 {
+; COMMON-LABEL: vscale_squared_offset:
+; COMMON:       // %bb.0: // %entry
+; COMMON-NEXT:    fmov z0.s, #4.00000000
+; COMMON-NEXT:    mov x8, xzr
+; COMMON-NEXT:    cntw x9
+; COMMON-NEXT:    fmov z1.s, #8.00000000
+; COMMON-NEXT:    ptrue p0.s, vl1
+; COMMON-NEXT:    cmp x8, x9
+; COMMON-NEXT:    b.ge .LBB6_2
+; COMMON-NEXT:  .LBB6_1: // %for.body
+; COMMON-NEXT:    // =>This Inner Loop Header: Depth=1
+; COMMON-NEXT:    st1w { z0.s }, p0, [x0]
+; COMMON-NEXT:    add x8, x8, #1
+; COMMON-NEXT:    st1w { z1.s }, p0, [x0, #4, mul vl]
+; COMMON-NEXT:    addvl x0, x0, #1
+; COMMON-NEXT:    cmp x8, x9
+; COMMON-NEXT:    b.lt .LBB6_1
+; COMMON-NEXT:  .LBB6_2: // %for.exit
+; COMMON-NEXT:    ret
+entry:
+  %vscale = call i64 @llvm.vscale.i64()
+  %c4_vscale = mul i64 %vscale, 4
+  br label %for.check
+for.check:
+  %i = phi i64 [ %next_i, %for.body ], [ 0, %entry ]
+  %is_lt = icmp slt i64 %i, %c4_vscale
+  br i1 %is_lt, label %for.body, label %for.exit
+for.body:
+  %mask = call <vscale x 4 x i1> @llvm.aarch64.sve.whilelt.nxv4i1.i64(i64 0, i64 1)
+  %upper_offset = mul i64 %i, %c4_vscale
+  %upper_ptr = getelementptr float, ptr %alloc, i64 %upper_offset
+  call void @llvm.masked.store.nxv4f32.p0(<vscale x 4 x float> shufflevector (<vscale x 4 x float> insertelement (<vscale x 4 x float> poison, float 4.000000e+00, i64 0), <vscale x 4 x float> poison, <vscale x 4 x i32> zeroinitializer), ptr %upper_ptr, i32 4, <vscale x 4 x i1> %mask)
+  %lower_i = add i64 %i, %c4_vscale
+  %lower_offset = mul i64 %lower_i, %c4_vscale
+  %lower_ptr = getelementptr float, ptr %alloc, i64 %lower_offset
+  call void @llvm.masked.store.nxv4f32.p0(<vscale x 4 x float> shufflevector (<vscale x 4 x float> insertelement (<vscale x 4 x float> poison, float 8.000000e+00, i64 0), <vscale x 4 x float> poison, <vscale x 4 x i32> zeroinitializer), ptr %lower_ptr, i32 4, <vscale x 4 x i1> %mask)
+  %next_i = add i64 %i, 1
+  br label %for.check
+for.exit:
+  ret void
+}
+
 attributes #0 = { "target-features"="+sve2" vscale_range(1,16) }

>From 75642a00e15b722bdfb90726be31f1c8adaeb0c5 Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell <benjamin.maxwell at arm.com>
Date: Wed, 24 Jul 2024 10:06:34 +0100
Subject: [PATCH 17/91] [LSR] Fix matching vscale immediates (#100080)

Somewhat confusingly, a `SCEVMulExpr` is a `SCEVNAryExpr`, so it can have
more than two operands. Previously, the vscale immediate matching did not
check the number of operands of the `SCEVMulExpr`, so it would ignore any
operands after the first two.

This led to incorrect codegen (and results) for ArmSME in IREE
(https://github.com/iree-org/iree), which sometimes addresses things
that are a `vscale * vscale` multiple away. The test added with this
change shows an example reduced from IREE. The second write should
be offset from the first by `16 * vscale * vscale` (* 4 bytes). However,
LSR previously dropped the second vscale and instead offset the write by
`#4, mul vl`, which is an offset of `16 * vscale` (* 4 bytes).
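
As a rough illustration of the operand-count check, here is a toy model
of the matching. It is only a sketch: `Constant`, `VScale`, `MulExpr` and
`matchScalableImmediate` are hypothetical stand-ins invented for this
example, not LLVM's SCEV classes or the actual `ExtractImmediate` code.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <variant>
#include <vector>

// Hypothetical stand-ins for SCEV operands; these are not LLVM classes.
struct Constant { int64_t Value; };
struct VScale {};
using Operand = std::variant<Constant, VScale>;

// A toy n-ary multiply: the product of all of its operands.
struct MulExpr { std::vector<Operand> Ops; };

// Returns C only when M is exactly `C * vscale`. Without the size check,
// `C * vscale * vscale` would also match and the trailing vscale would be
// silently dropped.
std::optional<int64_t> matchScalableImmediate(const MulExpr &M) {
  if (M.Ops.size() != 2)
    return std::nullopt;
  const Constant *C = std::get_if<Constant>(&M.Ops[0]);
  if (!C || !std::holds_alternative<VScale>(M.Ops[1]))
    return std::nullopt;
  return C->Value;
}
```

In this model `16 * vscale` yields the scalable immediate 16, while
`16 * vscale * vscale` is rejected and left for ordinary address
computation.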

(cherry picked from commit 7fad04e94b7b594389111ae7eca0883ef18dc90b)
---
 .../Transforms/Scalar/LoopStrengthReduce.cpp  |  6 ++++--
 .../AArch64/vscale-fixups.ll                  | 20 +++++++++++--------
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp b/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
index 11f9f7822a15c..91461d1ed2759 100644
--- a/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
@@ -946,13 +946,15 @@ static Immediate ExtractImmediate(const SCEV *&S, ScalarEvolution &SE) {
                            // FIXME: AR->getNoWrapFlags(SCEV::FlagNW)
                            SCEV::FlagAnyWrap);
     return Result;
-  } else if (EnableVScaleImmediates)
-    if (const SCEVMulExpr *M = dyn_cast<SCEVMulExpr>(S))
+  } else if (const SCEVMulExpr *M = dyn_cast<SCEVMulExpr>(S)) {
+    if (EnableVScaleImmediates && M->getNumOperands() == 2) {
       if (const SCEVConstant *C = dyn_cast<SCEVConstant>(M->getOperand(0)))
         if (isa<SCEVVScale>(M->getOperand(1))) {
           S = SE.getConstant(M->getType(), 0);
           return Immediate::getScalable(C->getValue()->getSExtValue());
         }
+    }
+  }
   return Immediate::getZero();
 }
 
diff --git a/llvm/test/Transforms/LoopStrengthReduce/AArch64/vscale-fixups.ll b/llvm/test/Transforms/LoopStrengthReduce/AArch64/vscale-fixups.ll
index 56b59012eef40..588696d20227f 100644
--- a/llvm/test/Transforms/LoopStrengthReduce/AArch64/vscale-fixups.ll
+++ b/llvm/test/Transforms/LoopStrengthReduce/AArch64/vscale-fixups.ll
@@ -384,27 +384,31 @@ for.exit:
   ret void
 }
 
-;; This test demonstrates an incorrect MUL VL address calculation. Here there
-;; are two writes that should be `16 * vscale * vscale` apart, however,
-;; loop-strength-reduce has ignored the second `vscale` and offset the second
-;; write by `#4, mul vl` which is an offset of `16 * vscale` dropping a vscale.
+;; Here are two writes that should be `16 * vscale * vscale` apart, so MUL VL
+;; addressing cannot be used to offset the second write, as for example,
+;; `#4, mul vl` would only be an offset of `16 * vscale` (dropping a vscale).
 define void @vscale_squared_offset(ptr %alloc) #0 {
 ; COMMON-LABEL: vscale_squared_offset:
 ; COMMON:       // %bb.0: // %entry
+; COMMON-NEXT:    rdvl x9, #1
 ; COMMON-NEXT:    fmov z0.s, #4.00000000
 ; COMMON-NEXT:    mov x8, xzr
-; COMMON-NEXT:    cntw x9
+; COMMON-NEXT:    lsr x9, x9, #4
 ; COMMON-NEXT:    fmov z1.s, #8.00000000
+; COMMON-NEXT:    cntw x10
 ; COMMON-NEXT:    ptrue p0.s, vl1
-; COMMON-NEXT:    cmp x8, x9
+; COMMON-NEXT:    umull x9, w9, w9
+; COMMON-NEXT:    lsl x9, x9, #6
+; COMMON-NEXT:    cmp x8, x10
 ; COMMON-NEXT:    b.ge .LBB6_2
 ; COMMON-NEXT:  .LBB6_1: // %for.body
 ; COMMON-NEXT:    // =>This Inner Loop Header: Depth=1
+; COMMON-NEXT:    add x11, x0, x9
 ; COMMON-NEXT:    st1w { z0.s }, p0, [x0]
 ; COMMON-NEXT:    add x8, x8, #1
-; COMMON-NEXT:    st1w { z1.s }, p0, [x0, #4, mul vl]
+; COMMON-NEXT:    st1w { z1.s }, p0, [x11]
 ; COMMON-NEXT:    addvl x0, x0, #1
-; COMMON-NEXT:    cmp x8, x9
+; COMMON-NEXT:    cmp x8, x10
 ; COMMON-NEXT:    b.lt .LBB6_1
 ; COMMON-NEXT:  .LBB6_2: // %for.exit
 ; COMMON-NEXT:    ret

>From a87fbeb3a77a53ded341277c5b326f7696d47594 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng <dtcxzyw2333 at gmail.com>
Date: Wed, 24 Jul 2024 20:06:36 +0800
Subject: [PATCH 18/91] [ValueTracking] Don't use CondContext in dataflow
 analysis of phi nodes (#100316)

See the following case:
```
define i16 @pr100298() {
entry:
  br label %for.inc

for.inc:
  %indvar = phi i32 [ -15, %entry ], [ %mask, %for.inc ]
  %add = add nsw i32 %indvar, 9
  %mask = and i32 %add, 65535
  %cmp1 = icmp ugt i32 %mask, 5
  br i1 %cmp1, label %for.inc, label %for.end

for.end:
  %conv = trunc i32 %add to i16
  %cmp2 = icmp ugt i32 %mask, 3
  %shl = shl nuw i16 %conv, 14
  %res = select i1 %cmp2, i16 %conv, i16 %shl
  ret i16 %res
}
```

When computing the known bits of `%shl` with `%cmp2=false`, we cannot use
this condition in the analysis of `%mask` along the back edge
(`%for.inc -> %for.inc`).
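
To see why, it may help to trace the loop by hand. The sketch below
mirrors the IR above in plain C++; `Trace` and `simulatePr100298` are
names made up for this illustration. It records every value `%mask`
takes, including the one that flows around the back edge.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hand trace of @pr100298; the names here are illustrative only.
struct Trace {
  std::vector<uint32_t> MaskValues; // every value %mask takes, in order
  uint16_t Result;                  // value of %res
};

Trace simulatePr100298() {
  Trace T;
  uint32_t IndVar = static_cast<uint32_t>(-15); // phi: [ -15, %entry ]
  uint32_t Add = 0, Mask = 0;
  do {
    Add = IndVar + 9;    // %add (no signed wrap for the values seen here)
    Mask = Add & 65535u; // %mask
    T.MaskValues.push_back(Mask);
    IndVar = Mask;       // phi: [ %mask, %for.inc ]
  } while (Mask > 5);    // %cmp1
  uint16_t Conv = static_cast<uint16_t>(Add);       // %conv
  bool Cmp2 = Mask > 3;                             // %cmp2
  uint16_t Shl = static_cast<uint16_t>(Conv << 14); // %shl
  T.Result = Cmp2 ? Conv : Shl;                     // %res
  return T;
}
```

`%mask` is 65530 on the first iteration and 3 on the second. The fact
implied by `%cmp2=false` (`%mask <= 3`) only holds for the final value;
the value that reaches `%indvar` through the back edge is 65530, so
applying that condition while analysing the phi gives wrong known bits.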

Fixes https://github.com/llvm/llvm-project/issues/100298.

(cherry picked from commit 59eae919c938f890e9b9b4be8a3fa3cb1b11ed89)
---
 llvm/include/llvm/Analysis/SimplifyQuery.h   |  6 +++
 llvm/lib/Analysis/ValueTracking.cpp          | 22 +++++------
 llvm/test/Transforms/InstCombine/pr100298.ll | 39 ++++++++++++++++++++
 3 files changed, 56 insertions(+), 11 deletions(-)
 create mode 100644 llvm/test/Transforms/InstCombine/pr100298.ll

diff --git a/llvm/include/llvm/Analysis/SimplifyQuery.h b/llvm/include/llvm/Analysis/SimplifyQuery.h
index a560744f01222..e8f43c8c2e91f 100644
--- a/llvm/include/llvm/Analysis/SimplifyQuery.h
+++ b/llvm/include/llvm/Analysis/SimplifyQuery.h
@@ -130,6 +130,12 @@ struct SimplifyQuery {
     Copy.CC = &CC;
     return Copy;
   }
+
+  SimplifyQuery getWithoutCondContext() const {
+    SimplifyQuery Copy(*this);
+    Copy.CC = nullptr;
+    return Copy;
+  }
 };
 
 } // end namespace llvm
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 40fe1ffe13f1b..4b77c0046cc70 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -1435,7 +1435,7 @@ static void computeKnownBitsFromOperator(const Operator *I,
         // inferred hold at original context instruction.  TODO: It may be
         // correct to use the original context.  IF warranted, explore and
         // add sufficient tests to cover.
-        SimplifyQuery RecQ = Q;
+        SimplifyQuery RecQ = Q.getWithoutCondContext();
         RecQ.CxtI = P;
         computeKnownBits(R, DemandedElts, Known2, Depth + 1, RecQ);
         switch (Opcode) {
@@ -1468,7 +1468,7 @@ static void computeKnownBitsFromOperator(const Operator *I,
         // phi. This is important because that is where the value is actually
         // "evaluated" even though it is used later somewhere else. (see also
         // D69571).
-        SimplifyQuery RecQ = Q;
+        SimplifyQuery RecQ = Q.getWithoutCondContext();
 
         unsigned OpNum = P->getOperand(0) == R ? 0 : 1;
         Instruction *RInst = P->getIncomingBlock(OpNum)->getTerminator();
@@ -1546,7 +1546,7 @@ static void computeKnownBitsFromOperator(const Operator *I,
         // phi. This is important because that is where the value is actually
         // "evaluated" even though it is used later somewhere else. (see also
         // D69571).
-        SimplifyQuery RecQ = Q;
+        SimplifyQuery RecQ = Q.getWithoutCondContext();
         RecQ.CxtI = P->getIncomingBlock(u)->getTerminator();
 
         Known2 = KnownBits(BitWidth);
@@ -2329,7 +2329,7 @@ bool isKnownToBeAPowerOfTwo(const Value *V, bool OrZero, unsigned Depth,
     // it is an induction variable where in each step its value is a power of
     // two.
     auto *PN = cast<PHINode>(I);
-    SimplifyQuery RecQ = Q;
+    SimplifyQuery RecQ = Q.getWithoutCondContext();
 
     // Check if it is an induction variable and always power of two.
     if (isPowerOfTwoRecurrence(PN, OrZero, Depth, RecQ))
@@ -2943,7 +2943,7 @@ static bool isKnownNonZeroFromOperator(const Operator *I,
       return true;
 
     // Check if all incoming values are non-zero using recursion.
-    SimplifyQuery RecQ = Q;
+    SimplifyQuery RecQ = Q.getWithoutCondContext();
     unsigned NewDepth = std::max(Depth, MaxAnalysisRecursionDepth - 1);
     return llvm::all_of(PN->operands(), [&](const Use &U) {
       if (U.get() == PN)
@@ -3509,7 +3509,7 @@ static bool isNonEqualPHIs(const PHINode *PN1, const PHINode *PN2,
     if (UsedFullRecursion)
       return false;
 
-    SimplifyQuery RecQ = Q;
+    SimplifyQuery RecQ = Q.getWithoutCondContext();
     RecQ.CxtI = IncomBB->getTerminator();
     if (!isKnownNonEqual(IV1, IV2, DemandedElts, Depth + 1, RecQ))
       return false;
@@ -4001,7 +4001,7 @@ static unsigned ComputeNumSignBitsImpl(const Value *V,
 
       // Take the minimum of all incoming values.  This can't infinitely loop
       // because of our depth threshold.
-      SimplifyQuery RecQ = Q;
+      SimplifyQuery RecQ = Q.getWithoutCondContext();
       Tmp = TyBits;
       for (unsigned i = 0, e = NumIncomingValues; i != e; ++i) {
         if (Tmp == 1) return Tmp;
@@ -5909,10 +5909,10 @@ void computeKnownFPClass(const Value *V, const APInt &DemandedElts,
         // Recurse, but cap the recursion to two levels, because we don't want
         // to waste time spinning around in loops. We need at least depth 2 to
         // detect known sign bits.
-        computeKnownFPClass(
-            IncValue, DemandedElts, InterestedClasses, KnownSrc,
-            PhiRecursionLimit,
-            Q.getWithInstruction(P->getIncomingBlock(U)->getTerminator()));
+        computeKnownFPClass(IncValue, DemandedElts, InterestedClasses, KnownSrc,
+                            PhiRecursionLimit,
+                            Q.getWithoutCondContext().getWithInstruction(
+                                P->getIncomingBlock(U)->getTerminator()));
 
         if (First) {
           Known = KnownSrc;
diff --git a/llvm/test/Transforms/InstCombine/pr100298.ll b/llvm/test/Transforms/InstCombine/pr100298.ll
new file mode 100644
index 0000000000000..6cf2a71bb916e
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/pr100298.ll
@@ -0,0 +1,39 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=instcombine < %s | FileCheck %s
+
+; Make sure that the result of computeKnownBits for %indvar is correct.
+
+define i16 @pr100298() {
+; CHECK-LABEL: define i16 @pr100298() {
+; CHECK-NEXT:  [[ENTRY:.*]]:
+; CHECK-NEXT:    br label %[[FOR_INC:.*]]
+; CHECK:       [[FOR_INC]]:
+; CHECK-NEXT:    [[INDVAR:%.*]] = phi i32 [ -15, %[[ENTRY]] ], [ [[MASK:%.*]], %[[FOR_INC]] ]
+; CHECK-NEXT:    [[ADD:%.*]] = add nsw i32 [[INDVAR]], 9
+; CHECK-NEXT:    [[MASK]] = and i32 [[ADD]], 65535
+; CHECK-NEXT:    [[CMP1:%.*]] = icmp ugt i32 [[MASK]], 5
+; CHECK-NEXT:    br i1 [[CMP1]], label %[[FOR_INC]], label %[[FOR_END:.*]]
+; CHECK:       [[FOR_END]]:
+; CHECK-NEXT:    [[CONV:%.*]] = trunc i32 [[ADD]] to i16
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp ugt i32 [[MASK]], 3
+; CHECK-NEXT:    [[SHL:%.*]] = shl nuw i16 [[CONV]], 14
+; CHECK-NEXT:    [[RES:%.*]] = select i1 [[CMP2]], i16 [[CONV]], i16 [[SHL]]
+; CHECK-NEXT:    ret i16 [[RES]]
+;
+entry:
+  br label %for.inc
+
+for.inc:
+  %indvar = phi i32 [ -15, %entry ], [ %mask, %for.inc ]
+  %add = add nsw i32 %indvar, 9
+  %mask = and i32 %add, 65535
+  %cmp1 = icmp ugt i32 %mask, 5
+  br i1 %cmp1, label %for.inc, label %for.end
+
+for.end:
+  %conv = trunc i32 %add to i16
+  %cmp2 = icmp ugt i32 %mask, 3
+  %shl = shl nuw i16 %conv, 14
+  %res = select i1 %cmp2, i16 %conv, i16 %shl
+  ret i16 %res
+}

From fc0b1ce075a570a7631eab23ab865023346b574e Mon Sep 17 00:00:00 2001
From: Akira Hatanaka <ahatanak at gmail.com>
Date: Wed, 24 Jul 2024 02:04:37 -0700
Subject: [PATCH 19/91] [PAC] Define __builtin_ptrauth_type_discriminator
 (#100204)

The builtin computes the discriminator for a type, which can be used to
sign/authenticate function pointers and member function pointers.

If the type passed to the builtin is a C++ member function pointer type,
the result is the discriminator used to sign member function pointers
of that type. If the type is a function, function pointer, or function
reference type, the result is the discriminator used to sign functions
of that type. It is ill-formed to use this builtin with any other type.

A call to this function is an integer constant expression.

Co-Authored-By: John McCall rjmccall at apple.com
(cherry picked from commit 666e3326fedfb6a033494c36c36aa95c4124d642)
---
 .../clang/Basic/DiagnosticSemaKinds.td        |  3 +
 clang/include/clang/Basic/TokenKinds.def      |  2 +
 clang/include/clang/Parse/Parser.h            |  2 +
 clang/include/clang/Sema/Sema.h               |  2 +
 clang/lib/AST/ExprConstant.cpp                |  6 ++
 clang/lib/AST/ItaniumMangle.cpp               |  8 +++
 clang/lib/Headers/ptrauth.h                   | 19 ++++++
 clang/lib/Parse/ParseExpr.cpp                 | 23 +++++++
 clang/lib/Sema/SemaChecking.cpp               | 10 ++-
 clang/lib/Sema/SemaExpr.cpp                   | 19 ++++++
 clang/test/AST/ast-dump-ptrauth-json.cpp      |  5 ++
 clang/test/CodeGenCXX/mangle-fail.cpp         | 14 ++++
 clang/test/Sema/ptrauth-intrinsics-macro.c    |  5 ++
 .../SemaCXX/ptrauth-type-discriminator.cpp    | 68 +++++++++++++++++++
 14 files changed, 183 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/AST/ast-dump-ptrauth-json.cpp
 create mode 100644 clang/test/SemaCXX/ptrauth-type-discriminator.cpp

diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index b8d97a6b14fe6..eb0506e71fe3f 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -942,6 +942,9 @@ def warn_ptrauth_auth_null_pointer :
   InGroup<PtrAuthNullPointers>;
 def err_ptrauth_string_not_literal : Error<
   "argument must be a string literal%select{| of char type}0">;
+def err_ptrauth_type_disc_undiscriminated : Error<
+  "cannot pass undiscriminated type %0 to "
+  "'__builtin_ptrauth_type_discriminator'">;
 
 def note_ptrauth_virtual_function_pointer_incomplete_arg_ret :
   Note<"cannot take an address of a virtual member function if its return or "
diff --git a/clang/include/clang/Basic/TokenKinds.def b/clang/include/clang/Basic/TokenKinds.def
index 7f4912b9bcd96..8c54661e65cf4 100644
--- a/clang/include/clang/Basic/TokenKinds.def
+++ b/clang/include/clang/Basic/TokenKinds.def
@@ -596,6 +596,8 @@ ALIAS("__is_same_as", __is_same, KEYCXX)
 KEYWORD(__private_extern__          , KEYALL)
 KEYWORD(__module_private__          , KEYALL)
 
+UNARY_EXPR_OR_TYPE_TRAIT(__builtin_ptrauth_type_discriminator, PtrAuthTypeDiscriminator, KEYALL)
+
 // Extension that will be enabled for Microsoft, Borland and PS4, but can be
 // disabled via '-fno-declspec'.
 KEYWORD(__declspec                  , 0)
diff --git a/clang/include/clang/Parse/Parser.h b/clang/include/clang/Parse/Parser.h
index 613bab9120dfc..35bb1a19d40f0 100644
--- a/clang/include/clang/Parse/Parser.h
+++ b/clang/include/clang/Parse/Parser.h
@@ -3890,6 +3890,8 @@ class Parser : public CodeCompletionHandler {
   ExprResult ParseArrayTypeTrait();
   ExprResult ParseExpressionTrait();
 
+  ExprResult ParseBuiltinPtrauthTypeDiscriminator();
+
   //===--------------------------------------------------------------------===//
   // Preprocessor code-completion pass-through
   void CodeCompleteDirective(bool InConditional) override;
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index d638d31e050dc..7bfdaaae45a93 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -3456,6 +3456,8 @@ class Sema final : public SemaBase {
                                     TemplateIdAnnotation *TemplateId,
                                     bool IsMemberSpecialization);
 
+  bool checkPointerAuthEnabled(SourceLocation Loc, SourceRange Range);
+
   bool checkConstantPointerAuthKey(Expr *keyExpr, unsigned &key);
 
   /// Diagnose function specifiers on a declaration of an identifier that
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index fcb382474ea62..03a606102a77e 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -14054,6 +14054,12 @@ bool IntExprEvaluator::VisitUnaryExprOrTypeTraitExpr(
                      E);
   }
 
+  case UETT_PtrAuthTypeDiscriminator: {
+    if (E->getArgumentType()->isDependentType())
+      return false;
+    return Success(
+        Info.Ctx.getPointerAuthTypeDiscriminator(E->getArgumentType()), E);
+  }
   case UETT_VecStep: {
     QualType Ty = E->getTypeOfArgument();
 
diff --git a/clang/lib/AST/ItaniumMangle.cpp b/clang/lib/AST/ItaniumMangle.cpp
index 40ef82785f454..d46d621d4c7d4 100644
--- a/clang/lib/AST/ItaniumMangle.cpp
+++ b/clang/lib/AST/ItaniumMangle.cpp
@@ -5179,6 +5179,14 @@ void CXXNameMangler::mangleExpression(const Expr *E, unsigned Arity,
       Diags.Report(DiagID);
       return;
     }
+    case UETT_PtrAuthTypeDiscriminator: {
+      DiagnosticsEngine &Diags = Context.getDiags();
+      unsigned DiagID = Diags.getCustomDiagID(
+          DiagnosticsEngine::Error,
+          "cannot yet mangle __builtin_ptrauth_type_discriminator expression");
+      Diags.Report(E->getExprLoc(), DiagID);
+      return;
+    }
     case UETT_VecStep: {
       DiagnosticsEngine &Diags = Context.getDiags();
       unsigned DiagID = Diags.getCustomDiagID(DiagnosticsEngine::Error,
diff --git a/clang/lib/Headers/ptrauth.h b/clang/lib/Headers/ptrauth.h
index e0bc8c4f9acf7..4724155b0dc79 100644
--- a/clang/lib/Headers/ptrauth.h
+++ b/clang/lib/Headers/ptrauth.h
@@ -202,6 +202,23 @@ typedef __UINTPTR_TYPE__ ptrauth_generic_signature_t;
 #define ptrauth_string_discriminator(__string)                                 \
   __builtin_ptrauth_string_discriminator(__string)
 
+/* Compute a constant discriminator from the given type.
+
+   The result can be used as the second argument to
+   ptrauth_blend_discriminator or the third argument to the
+   __ptrauth qualifier.  It has type size_t.
+
+   If the type is a C++ member function pointer type, the result is
+   the discriminator used to signed member function pointers of that
+   type.  If the type is a function, function pointer, or function
+   reference type, the result is the discriminator used to sign
+   functions of that type.  It is ill-formed to use this macro with any
+   other type.
+
+   A call to this function is an integer constant expression. */
+#define ptrauth_type_discriminator(__type)                                     \
+  __builtin_ptrauth_type_discriminator(__type)
+
 /* Compute a signature for the given pair of pointer-sized values.
    The order of the arguments is significant.
 
@@ -289,6 +306,8 @@ typedef __UINTPTR_TYPE__ ptrauth_generic_signature_t;
     ((ptrauth_extra_data_t)0);                                                 \
   })
 
+#define ptrauth_type_discriminator(__type) ((ptrauth_extra_data_t)0)
+
 #define ptrauth_sign_generic_data(__value, __data)                             \
   ({                                                                           \
     (void)__value;                                                             \
diff --git a/clang/lib/Parse/ParseExpr.cpp b/clang/lib/Parse/ParseExpr.cpp
index a12c375c8d48c..0a017ae79de75 100644
--- a/clang/lib/Parse/ParseExpr.cpp
+++ b/clang/lib/Parse/ParseExpr.cpp
@@ -841,6 +841,26 @@ bool Parser::isRevertibleTypeTrait(const IdentifierInfo *II,
   return false;
 }
 
+ExprResult Parser::ParseBuiltinPtrauthTypeDiscriminator() {
+  SourceLocation Loc = ConsumeToken();
+
+  BalancedDelimiterTracker T(*this, tok::l_paren);
+  if (T.expectAndConsume())
+    return ExprError();
+
+  TypeResult Ty = ParseTypeName();
+  if (Ty.isInvalid()) {
+    SkipUntil(tok::r_paren, StopAtSemi);
+    return ExprError();
+  }
+
+  SourceLocation EndLoc = Tok.getLocation();
+  T.consumeClose();
+  return Actions.ActOnUnaryExprOrTypeTraitExpr(
+      Loc, UETT_PtrAuthTypeDiscriminator,
+      /*isType=*/true, Ty.get().getAsOpaquePtr(), SourceRange(Loc, EndLoc));
+}
+
 /// Parse a cast-expression, or, if \pisUnaryExpression is true, parse
 /// a unary-expression.
 ///
@@ -1806,6 +1826,9 @@ ExprResult Parser::ParseCastExpression(CastParseKind ParseKind,
     Res = ParseArrayTypeTrait();
     break;
 
+  case tok::kw___builtin_ptrauth_type_discriminator:
+    return ParseBuiltinPtrauthTypeDiscriminator();
+
   case tok::kw___is_lvalue_expr:
   case tok::kw___is_rvalue_expr:
     if (NotPrimaryExpression)
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 45b9bbb23dbf7..cf1196ad23c21 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -1489,14 +1489,18 @@ enum PointerAuthOpKind {
 };
 }
 
-static bool checkPointerAuthEnabled(Sema &S, Expr *E) {
-  if (S.getLangOpts().PointerAuthIntrinsics)
+bool Sema::checkPointerAuthEnabled(SourceLocation Loc, SourceRange Range) {
+  if (getLangOpts().PointerAuthIntrinsics)
     return false;
 
-  S.Diag(E->getExprLoc(), diag::err_ptrauth_disabled) << E->getSourceRange();
+  Diag(Loc, diag::err_ptrauth_disabled) << Range;
   return true;
 }
 
+static bool checkPointerAuthEnabled(Sema &S, Expr *E) {
+  return S.checkPointerAuthEnabled(E->getExprLoc(), E->getSourceRange());
+}
+
 static bool checkPointerAuthKey(Sema &S, Expr *&Arg) {
   // Convert it to type 'int'.
   if (convertArgumentToType(S, Arg, S.Context.IntTy))
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 439db55668cc6..9207bf7a41349 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -4117,6 +4117,21 @@ static bool CheckVectorElementsTraitOperandType(Sema &S, QualType T,
   return false;
 }
 
+static bool checkPtrAuthTypeDiscriminatorOperandType(Sema &S, QualType T,
+                                                     SourceLocation Loc,
+                                                     SourceRange ArgRange) {
+  if (S.checkPointerAuthEnabled(Loc, ArgRange))
+    return true;
+
+  if (!T->isFunctionType() && !T->isFunctionPointerType() &&
+      !T->isFunctionReferenceType() && !T->isMemberFunctionPointerType()) {
+    S.Diag(Loc, diag::err_ptrauth_type_disc_undiscriminated) << T << ArgRange;
+    return true;
+  }
+
+  return false;
+}
+
 static bool CheckExtensionTraitOperandType(Sema &S, QualType T,
                                            SourceLocation Loc,
                                            SourceRange ArgRange,
@@ -4511,6 +4526,10 @@ bool Sema::CheckUnaryExprOrTypeTraitOperand(QualType ExprType,
     return CheckVectorElementsTraitOperandType(*this, ExprType, OpLoc,
                                                ExprRange);
 
+  if (ExprKind == UETT_PtrAuthTypeDiscriminator)
+    return checkPtrAuthTypeDiscriminatorOperandType(*this, ExprType, OpLoc,
+                                                    ExprRange);
+
   // Explicitly list some types as extensions.
   if (!CheckExtensionTraitOperandType(*this, ExprType, OpLoc, ExprRange,
                                       ExprKind))
diff --git a/clang/test/AST/ast-dump-ptrauth-json.cpp b/clang/test/AST/ast-dump-ptrauth-json.cpp
new file mode 100644
index 0000000000000..125cda0cff53a
--- /dev/null
+++ b/clang/test/AST/ast-dump-ptrauth-json.cpp
@@ -0,0 +1,5 @@
+// RUN: %clang_cc1 -triple arm64-apple-ios -fptrauth-calls -fptrauth-intrinsics -std=c++11 -ast-dump=json %s | FileCheck %s
+
+// CHECK: "name": "__builtin_ptrauth_type_discriminator",
+
+int d = __builtin_ptrauth_type_discriminator(int());
diff --git a/clang/test/CodeGenCXX/mangle-fail.cpp b/clang/test/CodeGenCXX/mangle-fail.cpp
index b588d57749fa3..f3b50cfb54dbd 100644
--- a/clang/test/CodeGenCXX/mangle-fail.cpp
+++ b/clang/test/CodeGenCXX/mangle-fail.cpp
@@ -1,5 +1,6 @@
 // RUN: %clang_cc1 -emit-llvm-only -x c++ -std=c++11 -triple %itanium_abi_triple -verify %s -DN=1
 // RUN: %clang_cc1 -emit-llvm-only -x c++ -std=c++11 -triple %itanium_abi_triple -verify %s -DN=2
+// RUN: %clang_cc1 -emit-llvm-only -x c++ -std=c++11 -triple aarch64-linux-gnu -fptrauth-intrinsics -verify %s -DN=3
 
 struct A { int a; };
 
@@ -13,6 +14,19 @@ template void test<int>(int (&)[sizeof(int)]);
 template<class T> void test(int (&)[sizeof((A){}, T())]) {} // expected-error {{cannot yet mangle}}
 template void test<int>(int (&)[sizeof(A)]);
 
+#elif N == 3
+// __builtin_ptrauth_type_discriminator
+template <class T, unsigned disc>
+struct S1 {};
+
+template<class T>
+void func(S1<T, __builtin_ptrauth_type_discriminator(T)> s1) { // expected-error {{cannot yet mangle __builtin_ptrauth_type_discriminator expression}}
+}
+
+void testfunc1() {
+  func(S1<int(), __builtin_ptrauth_type_discriminator(int())>());
+}
+
 // FIXME: There are several more cases we can't yet mangle.
 
 #else
diff --git a/clang/test/Sema/ptrauth-intrinsics-macro.c b/clang/test/Sema/ptrauth-intrinsics-macro.c
index f76f677315dd3..adbb71a9d6e50 100644
--- a/clang/test/Sema/ptrauth-intrinsics-macro.c
+++ b/clang/test/Sema/ptrauth-intrinsics-macro.c
@@ -38,6 +38,11 @@ void test_string_discriminator(int *dp) {
   (void)t0;
 }
 
+void test_type_discriminator(int *dp) {
+  ptrauth_extra_data_t t0 = ptrauth_type_discriminator(int (*)(int));
+  (void)t0;
+}
+
 void test_sign_constant(int *dp) {
   dp = ptrauth_sign_constant(&dv, VALID_DATA_KEY, 0);
 }
diff --git a/clang/test/SemaCXX/ptrauth-type-discriminator.cpp b/clang/test/SemaCXX/ptrauth-type-discriminator.cpp
new file mode 100644
index 0000000000000..685ca1f03fddd
--- /dev/null
+++ b/clang/test/SemaCXX/ptrauth-type-discriminator.cpp
@@ -0,0 +1,68 @@
+// RUN: %clang_cc1 -triple arm64-apple-ios -std=c++17 -Wno-vla -fsyntax-only -verify -fptrauth-intrinsics %s
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -std=c++17 -Wno-vla -fsyntax-only -verify -fptrauth-intrinsics %s
+
+// RUN: not %clang_cc1 -triple arm64-apple-ios -std=c++17 -Wno-vla -fsyntax-only %s 2>&1 | FileCheck %s
+// CHECK: this target does not support pointer authentication
+
+struct S {
+  virtual int foo();
+};
+
+template <class T>
+constexpr unsigned dependentOperandDisc() {
+  return __builtin_ptrauth_type_discriminator(T);
+}
+
+void test_builtin_ptrauth_type_discriminator(unsigned s) {
+  typedef int (S::*MemFnTy)();
+  MemFnTy memFnPtr;
+  int (S::*memFnPtr2)();
+  constexpr unsigned d0 = __builtin_ptrauth_type_discriminator(MemFnTy);
+  static_assert(d0 == __builtin_ptrauth_string_discriminator("_ZTSM1SFivE"));
+  static_assert(d0 == 60844);
+  static_assert(__builtin_ptrauth_type_discriminator(int (S::*)()) == d0);
+  static_assert(__builtin_ptrauth_type_discriminator(decltype(memFnPtr)) == d0);
+  static_assert(__builtin_ptrauth_type_discriminator(decltype(memFnPtr2)) == d0);
+  static_assert(__builtin_ptrauth_type_discriminator(decltype(&S::foo)) == d0);
+  static_assert(dependentOperandDisc<decltype(&S::foo)>() == d0);
+
+  constexpr unsigned d1 = __builtin_ptrauth_type_discriminator(void (S::*)(int));
+  static_assert(__builtin_ptrauth_string_discriminator("_ZTSM1SFviE") == d1);
+  static_assert(d1 == 39121);
+
+  constexpr unsigned d2 = __builtin_ptrauth_type_discriminator(void (S::*)(float));
+  static_assert(__builtin_ptrauth_string_discriminator("_ZTSM1SFvfE") == d2);
+  static_assert(d2 == 52453);
+
+  constexpr unsigned d3 = __builtin_ptrauth_type_discriminator(int (*())[s]);
+  static_assert(__builtin_ptrauth_string_discriminator("FPE") == d3);
+  static_assert(d3 == 34128);
+
+  int f4(float);
+  constexpr unsigned d4 = __builtin_ptrauth_type_discriminator(decltype(f4));
+  static_assert(__builtin_ptrauth_type_discriminator(int (*)(float)) == d4);
+  static_assert(__builtin_ptrauth_string_discriminator("FifE") == d4);
+  static_assert(d4 == 48468);
+
+  int f5(int);
+  constexpr unsigned d5 = __builtin_ptrauth_type_discriminator(decltype(f5));
+  static_assert(__builtin_ptrauth_type_discriminator(int (*)(int)) == d5);
+  static_assert(__builtin_ptrauth_type_discriminator(short (*)(short)) == d5);
+  static_assert(__builtin_ptrauth_type_discriminator(char (*)(char)) == d5);
+  static_assert(__builtin_ptrauth_type_discriminator(long (*)(long)) == d5);
+  static_assert(__builtin_ptrauth_type_discriminator(unsigned int (*)(unsigned int)) == d5);
+  static_assert(__builtin_ptrauth_type_discriminator(int (&)(int)) == d5);
+  static_assert(__builtin_ptrauth_string_discriminator("FiiE") == d5);
+  static_assert(d5 == 2981);
+
+  int t;
+  int vmarray[s];
+  (void)__builtin_ptrauth_type_discriminator(t); // expected-error {{unknown type name 't'}}
+  (void)__builtin_ptrauth_type_discriminator(&t); // expected-error {{expected a type}}
+  (void)__builtin_ptrauth_type_discriminator(decltype(vmarray)); // expected-error {{cannot pass undiscriminated type 'decltype(vmarray)' (aka 'int[s]')}}
+  (void)__builtin_ptrauth_type_discriminator(int *); // expected-error {{cannot pass undiscriminated type 'int *' to '__builtin_ptrauth_type_discriminator'}}
+  (void)__builtin_ptrauth_type_discriminator(); // expected-error {{expected a type}}
+  (void)__builtin_ptrauth_type_discriminator(int (*)(int), int (*)(int));
+  // expected-error@-1 {{expected ')'}}
+  // expected-note@-2 {{to match this '('}}
+}

From 5f8189c47d94f2cf909ca3c618e72475ab1166c4 Mon Sep 17 00:00:00 2001
From: jeanPerier <jperier at nvidia.com>
Date: Wed, 24 Jul 2024 10:24:04 +0200
Subject: [PATCH 20/91] [flang] fix C_PTR function result lowering (#100082)

Functions returning C_PTR were lowered to functions returning intptr (i64
on 64-bit architectures). This caused conflicts when these functions were
defined as returning !fir.ref<none>/llvm.ptr in other compiler-generated
contexts (e.g., malloc).

Lower them to return !fir.ref<none>.

This should deal with https://github.com/llvm/llvm-project/issues/97325
and https://github.com/llvm/llvm-project/issues/98644.

(cherry picked from commit 1ead51a86c6c746a1b9948ca1ee142df223ffebd)
---
 flang/lib/Optimizer/Builder/FIRBuilder.cpp    |  54 +++++----
 .../Optimizer/Transforms/AbstractResult.cpp   | 108 +++++++++---------
 flang/test/Fir/abstract-results.fir           |  36 +++---
 3 files changed, 110 insertions(+), 88 deletions(-)

diff --git a/flang/lib/Optimizer/Builder/FIRBuilder.cpp b/flang/lib/Optimizer/Builder/FIRBuilder.cpp
index 2961df96b3cab..fbe79d0e45e5a 100644
--- a/flang/lib/Optimizer/Builder/FIRBuilder.cpp
+++ b/flang/lib/Optimizer/Builder/FIRBuilder.cpp
@@ -1541,21 +1541,44 @@ mlir::Value fir::factory::genMaxWithZero(fir::FirOpBuilder &builder,
                                                zero);
 }
 
+static std::pair<mlir::Value, mlir::Type>
+genCPtrOrCFunptrFieldIndex(fir::FirOpBuilder &builder, mlir::Location loc,
+                           mlir::Type cptrTy) {
+  auto recTy = mlir::cast<fir::RecordType>(cptrTy);
+  assert(recTy.getTypeList().size() == 1);
+  auto addrFieldName = recTy.getTypeList()[0].first;
+  mlir::Type addrFieldTy = recTy.getTypeList()[0].second;
+  auto fieldIndexType = fir::FieldType::get(cptrTy.getContext());
+  mlir::Value addrFieldIndex = builder.create<fir::FieldIndexOp>(
+      loc, fieldIndexType, addrFieldName, recTy,
+      /*typeParams=*/mlir::ValueRange{});
+  return {addrFieldIndex, addrFieldTy};
+}
+
 mlir::Value fir::factory::genCPtrOrCFunptrAddr(fir::FirOpBuilder &builder,
                                                mlir::Location loc,
                                                mlir::Value cPtr,
                                                mlir::Type ty) {
-  assert(mlir::isa<fir::RecordType>(ty));
-  auto recTy = mlir::dyn_cast<fir::RecordType>(ty);
-  assert(recTy.getTypeList().size() == 1);
-  auto fieldName = recTy.getTypeList()[0].first;
-  mlir::Type fieldTy = recTy.getTypeList()[0].second;
-  auto fieldIndexType = fir::FieldType::get(ty.getContext());
-  mlir::Value field =
-      builder.create<fir::FieldIndexOp>(loc, fieldIndexType, fieldName, recTy,
-                                        /*typeParams=*/mlir::ValueRange{});
-  return builder.create<fir::CoordinateOp>(loc, builder.getRefType(fieldTy),
-                                           cPtr, field);
+  auto [addrFieldIndex, addrFieldTy] =
+      genCPtrOrCFunptrFieldIndex(builder, loc, ty);
+  return builder.create<fir::CoordinateOp>(loc, builder.getRefType(addrFieldTy),
+                                           cPtr, addrFieldIndex);
+}
+
+mlir::Value fir::factory::genCPtrOrCFunptrValue(fir::FirOpBuilder &builder,
+                                                mlir::Location loc,
+                                                mlir::Value cPtr) {
+  mlir::Type cPtrTy = fir::unwrapRefType(cPtr.getType());
+  if (fir::isa_ref_type(cPtr.getType())) {
+    mlir::Value cPtrAddr =
+        fir::factory::genCPtrOrCFunptrAddr(builder, loc, cPtr, cPtrTy);
+    return builder.create<fir::LoadOp>(loc, cPtrAddr);
+  }
+  auto [addrFieldIndex, addrFieldTy] =
+      genCPtrOrCFunptrFieldIndex(builder, loc, cPtrTy);
+  auto arrayAttr =
+      builder.getArrayAttr({builder.getIntegerAttr(builder.getIndexType(), 0)});
+  return builder.create<fir::ExtractValueOp>(loc, addrFieldTy, cPtr, arrayAttr);
 }
 
 fir::BoxValue fir::factory::createBoxValue(fir::FirOpBuilder &builder,
@@ -1596,15 +1619,6 @@ fir::BoxValue fir::factory::createBoxValue(fir::FirOpBuilder &builder,
   return fir::BoxValue(box, lbounds, explicitTypeParams);
 }
 
-mlir::Value fir::factory::genCPtrOrCFunptrValue(fir::FirOpBuilder &builder,
-                                                mlir::Location loc,
-                                                mlir::Value cPtr) {
-  mlir::Type cPtrTy = fir::unwrapRefType(cPtr.getType());
-  mlir::Value cPtrAddr =
-      fir::factory::genCPtrOrCFunptrAddr(builder, loc, cPtr, cPtrTy);
-  return builder.create<fir::LoadOp>(loc, cPtrAddr);
-}
-
 mlir::Value fir::factory::createNullBoxProc(fir::FirOpBuilder &builder,
                                             mlir::Location loc,
                                             mlir::Type boxType) {
diff --git a/flang/lib/Optimizer/Transforms/AbstractResult.cpp b/flang/lib/Optimizer/Transforms/AbstractResult.cpp
index 3906aa553cb34..ff37310224e85 100644
--- a/flang/lib/Optimizer/Transforms/AbstractResult.cpp
+++ b/flang/lib/Optimizer/Transforms/AbstractResult.cpp
@@ -59,14 +59,16 @@ static mlir::FunctionType getNewFunctionType(mlir::FunctionType funcTy,
                                  /*resultTypes=*/{});
 }
 
+static mlir::Type getVoidPtrType(mlir::MLIRContext *context) {
+  return fir::ReferenceType::get(mlir::NoneType::get(context));
+}
+
 /// This is for function result types that are of type C_PTR from ISO_C_BINDING.
 /// Follow the ABI for interoperability with C.
 static mlir::FunctionType getCPtrFunctionType(mlir::FunctionType funcTy) {
-  auto resultType = funcTy.getResult(0);
-  assert(fir::isa_builtin_cptr_type(resultType));
-  llvm::SmallVector<mlir::Type> outputTypes;
-  auto recTy = mlir::dyn_cast<fir::RecordType>(resultType);
-  outputTypes.emplace_back(recTy.getTypeList()[0].second);
+  assert(fir::isa_builtin_cptr_type(funcTy.getResult(0)));
+  llvm::SmallVector<mlir::Type> outputTypes{
+      getVoidPtrType(funcTy.getContext())};
   return mlir::FunctionType::get(funcTy.getContext(), funcTy.getInputs(),
                                  outputTypes);
 }
@@ -109,15 +111,11 @@ class CallConversion : public mlir::OpRewritePattern<Op> {
           saveResult.getTypeparams());
 
     llvm::SmallVector<mlir::Type> newResultTypes;
-    // TODO: This should be generalized for derived types, and it is
-    // architecture and OS dependent.
     bool isResultBuiltinCPtr = fir::isa_builtin_cptr_type(result.getType());
-    Op newOp;
-    if (isResultBuiltinCPtr) {
-      auto recTy = mlir::dyn_cast<fir::RecordType>(result.getType());
-      newResultTypes.emplace_back(recTy.getTypeList()[0].second);
-    }
+    if (isResultBuiltinCPtr)
+      newResultTypes.emplace_back(getVoidPtrType(result.getContext()));
 
+    Op newOp;
     // fir::CallOp specific handling.
     if constexpr (std::is_same_v<Op, fir::CallOp>) {
       if (op.getCallee()) {
@@ -175,7 +173,7 @@ class CallConversion : public mlir::OpRewritePattern<Op> {
       FirOpBuilder builder(rewriter, module);
       mlir::Value saveAddr = fir::factory::genCPtrOrCFunptrAddr(
           builder, loc, save, result.getType());
-      rewriter.create<fir::StoreOp>(loc, newOp->getResult(0), saveAddr);
+      builder.createStoreWithConvert(loc, newOp->getResult(0), saveAddr);
     }
     op->dropAllReferences();
     rewriter.eraseOp(op);
@@ -210,42 +208,52 @@ class ReturnOpConversion : public mlir::OpRewritePattern<mlir::func::ReturnOp> {
                   mlir::PatternRewriter &rewriter) const override {
     auto loc = ret.getLoc();
     rewriter.setInsertionPoint(ret);
-    auto returnedValue = ret.getOperand(0);
-    bool replacedStorage = false;
-    if (auto *op = returnedValue.getDefiningOp())
-      if (auto load = mlir::dyn_cast<fir::LoadOp>(op)) {
-        auto resultStorage = load.getMemref();
-        // The result alloca may be behind a fir.declare, if any.
-        if (auto declare = mlir::dyn_cast_or_null<fir::DeclareOp>(
-                resultStorage.getDefiningOp()))
-          resultStorage = declare.getMemref();
-        // TODO: This should be generalized for derived types, and it is
-        // architecture and OS dependent.
-        if (fir::isa_builtin_cptr_type(returnedValue.getType())) {
-          rewriter.eraseOp(load);
-          auto module = ret->getParentOfType<mlir::ModuleOp>();
-          FirOpBuilder builder(rewriter, module);
-          mlir::Value retAddr = fir::factory::genCPtrOrCFunptrAddr(
-              builder, loc, resultStorage, returnedValue.getType());
-          mlir::Value retValue = rewriter.create<fir::LoadOp>(
-              loc, fir::unwrapRefType(retAddr.getType()), retAddr);
-          rewriter.replaceOpWithNewOp<mlir::func::ReturnOp>(
-              ret, mlir::ValueRange{retValue});
-          return mlir::success();
-        }
-        resultStorage.replaceAllUsesWith(newArg);
-        replacedStorage = true;
-        if (auto *alloc = resultStorage.getDefiningOp())
-          if (alloc->use_empty())
-            rewriter.eraseOp(alloc);
+    mlir::Value resultValue = ret.getOperand(0);
+    fir::LoadOp resultLoad;
+    mlir::Value resultStorage;
+    // Identify result local storage.
+    if (auto load = resultValue.getDefiningOp<fir::LoadOp>()) {
+      resultLoad = load;
+      resultStorage = load.getMemref();
+      // The result alloca may be behind a fir.declare, if any.
+      if (auto declare = resultStorage.getDefiningOp<fir::DeclareOp>())
+        resultStorage = declare.getMemref();
+    }
+    // Replace old local storage with new storage argument, unless
+    // the derived type is C_PTR/C_FUN_PTR, in which case the return
+    // type is updated to return void* (no new argument is passed).
+    if (fir::isa_builtin_cptr_type(resultValue.getType())) {
+      auto module = ret->getParentOfType<mlir::ModuleOp>();
+      FirOpBuilder builder(rewriter, module);
+      mlir::Value cptr = resultValue;
+      if (resultLoad) {
+        // Replace whole derived type load by component load.
+        cptr = resultLoad.getMemref();
+        rewriter.setInsertionPoint(resultLoad);
       }
-    // The result storage may have been optimized out by a memory to
-    // register pass, this is possible for fir.box results, or fir.record
-    // with no length parameters. Simply store the result in the result storage.
-    // at the return point.
-    if (!replacedStorage)
-      rewriter.create<fir::StoreOp>(loc, returnedValue, newArg);
-    rewriter.replaceOpWithNewOp<mlir::func::ReturnOp>(ret);
+      mlir::Value newResultValue =
+          fir::factory::genCPtrOrCFunptrValue(builder, loc, cptr);
+      newResultValue = builder.createConvert(
+          loc, getVoidPtrType(ret.getContext()), newResultValue);
+      rewriter.setInsertionPoint(ret);
+      rewriter.replaceOpWithNewOp<mlir::func::ReturnOp>(
+          ret, mlir::ValueRange{newResultValue});
+    } else if (resultStorage) {
+      resultStorage.replaceAllUsesWith(newArg);
+      rewriter.replaceOpWithNewOp<mlir::func::ReturnOp>(ret);
+    } else {
+      // The result storage may have been optimized out by a memory to
+      // register pass, this is possible for fir.box results, or fir.record
+      // with no length parameters. Simply store the result in the result
+      // storage. at the return point.
+      rewriter.create<fir::StoreOp>(loc, resultValue, newArg);
+      rewriter.replaceOpWithNewOp<mlir::func::ReturnOp>(ret);
+    }
+    // Delete result old local storage if unused.
+    if (resultStorage)
+      if (auto alloc = resultStorage.getDefiningOp<fir::AllocaOp>())
+        if (alloc->use_empty())
+          rewriter.eraseOp(alloc);
     return mlir::success();
   }
 
@@ -263,8 +271,6 @@ class AddrOfOpConversion : public mlir::OpRewritePattern<fir::AddrOfOp> {
                   mlir::PatternRewriter &rewriter) const override {
     auto oldFuncTy = mlir::cast<mlir::FunctionType>(addrOf.getType());
     mlir::FunctionType newFuncTy;
-    // TODO: This should be generalized for derived types, and it is
-    // architecture and OS dependent.
     if (oldFuncTy.getNumResults() != 0 &&
         fir::isa_builtin_cptr_type(oldFuncTy.getResult(0)))
       newFuncTy = getCPtrFunctionType(oldFuncTy);
@@ -298,8 +304,6 @@ class AbstractResultOpt
     // Convert function type itself if it has an abstract result.
     auto funcTy = mlir::cast<mlir::FunctionType>(func.getFunctionType());
     if (hasAbstractResult(funcTy)) {
-      // TODO: This should be generalized for derived types, and it is
-      // architecture and OS dependent.
       if (fir::isa_builtin_cptr_type(funcTy.getResult(0))) {
         func.setType(getCPtrFunctionType(funcTy));
         patterns.insert<ReturnOpConversion>(context, mlir::Value{});
diff --git a/flang/test/Fir/abstract-results.fir b/flang/test/Fir/abstract-results.fir
index 82f1cd33073fd..93e63dc657f0c 100644
--- a/flang/test/Fir/abstract-results.fir
+++ b/flang/test/Fir/abstract-results.fir
@@ -87,8 +87,8 @@ func.func @boxfunc_callee() -> !fir.box<!fir.heap<f64>> {
   // FUNC-BOX: return
 }
 
-// FUNC-REF-LABEL: func @retcptr() -> i64
-// FUNC-BOX-LABEL: func @retcptr() -> i64
+// FUNC-REF-LABEL: func @retcptr() -> !fir.ref<none>
+// FUNC-BOX-LABEL: func @retcptr() -> !fir.ref<none>
 func.func @retcptr() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}> {
   %0 = fir.alloca !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}> {bindc_name = "rec", uniq_name = "_QFrecErec"}
   %1 = fir.load %0 : !fir.ref<!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>>
@@ -98,12 +98,14 @@ func.func @retcptr() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__addres
   // FUNC-REF: %[[FIELD:.*]] = fir.field_index __address, !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>
   // FUNC-REF: %[[ADDR:.*]] = fir.coordinate_of %[[ALLOC]], %[[FIELD]] : (!fir.ref<!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>>, !fir.field) -> !fir.ref<i64>
   // FUNC-REF: %[[VAL:.*]] = fir.load %[[ADDR]] : !fir.ref<i64>
-  // FUNC-REF: return %[[VAL]] : i64
+  // FUNC-REF: %[[CAST:.*]] = fir.convert %[[VAL]] : (i64) -> !fir.ref<none>
+  // FUNC-REF: return %[[CAST]] : !fir.ref<none>
   // FUNC-BOX: %[[ALLOC:.*]] = fir.alloca !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}> {bindc_name = "rec", uniq_name = "_QFrecErec"}
   // FUNC-BOX: %[[FIELD:.*]] = fir.field_index __address, !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>
   // FUNC-BOX: %[[ADDR:.*]] = fir.coordinate_of %[[ALLOC]], %[[FIELD]] : (!fir.ref<!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>>, !fir.field) -> !fir.ref<i64>
   // FUNC-BOX: %[[VAL:.*]] = fir.load %[[ADDR]] : !fir.ref<i64>
-  // FUNC-BOX: return %[[VAL]] : i64
+  // FUNC-BOX: %[[CAST:.*]] = fir.convert %[[VAL]] : (i64) -> !fir.ref<none>
+  // FUNC-BOX: return %[[CAST]] : !fir.ref<none>
 }
 
 // FUNC-REF-LABEL:  func private @arrayfunc_callee_declare(
@@ -311,8 +313,8 @@ func.func @test_address_of() {
 
 }
 
-// FUNC-REF-LABEL: func.func private @returns_null() -> i64
-// FUNC-BOX-LABEL: func.func private @returns_null() -> i64
+// FUNC-REF-LABEL: func.func private @returns_null() -> !fir.ref<none>
+// FUNC-BOX-LABEL: func.func private @returns_null() -> !fir.ref<none>
 func.func private @returns_null() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>
 
 // FUNC-REF-LABEL: func @test_address_of_cptr
@@ -323,12 +325,12 @@ func.func @test_address_of_cptr() {
   fir.call @_QMtest_c_func_modPsubr(%1) : (() -> ()) -> ()
   return
 
-  // FUNC-REF: %[[VAL_0:.*]] = fir.address_of(@returns_null) : () -> i64
-  // FUNC-REF: %[[VAL_1:.*]] = fir.convert %[[VAL_0]] : (() -> i64) -> (() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>)
+  // FUNC-REF: %[[VAL_0:.*]] = fir.address_of(@returns_null) : () -> !fir.ref<none>
+  // FUNC-REF: %[[VAL_1:.*]] = fir.convert %[[VAL_0]] : (() -> !fir.ref<none>) -> (() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>)
   // FUNC-REF: %[[VAL_2:.*]] = fir.convert %[[VAL_1]] : (() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>) -> (() -> ())
   // FUNC-REF: fir.call @_QMtest_c_func_modPsubr(%[[VAL_2]]) : (() -> ()) -> ()
-  // FUNC-BOX: %[[VAL_0:.*]] = fir.address_of(@returns_null) : () -> i64
-  // FUNC-BOX: %[[VAL_1:.*]] = fir.convert %[[VAL_0]] : (() -> i64) -> (() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>)
+  // FUNC-BOX: %[[VAL_0:.*]] = fir.address_of(@returns_null) : () -> !fir.ref<none>
+  // FUNC-BOX: %[[VAL_1:.*]] = fir.convert %[[VAL_0]] : (() -> !fir.ref<none>) -> (() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>)
   // FUNC-BOX: %[[VAL_2:.*]] = fir.convert %[[VAL_1]] : (() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>) -> (() -> ())
   // FUNC-BOX: fir.call @_QMtest_c_func_modPsubr(%[[VAL_2]]) : (() -> ()) -> ()
 }
@@ -380,18 +382,20 @@ func.func @test_indirect_calls_return_cptr(%arg0: () -> ()) {
 
   // FUNC-REF: %[[VAL_0:.*]] = fir.alloca !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}> {bindc_name = ".result"}
   // FUNC-REF: %[[VAL_1:.*]] = fir.convert %[[ARG0]] : (() -> ()) -> (() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>)
-  // FUNC-REF: %[[VAL_2:.*]] = fir.convert %[[VAL_1]] : (() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>) -> (() -> i64)
-  // FUNC-REF: %[[VAL_3:.*]] = fir.call %[[VAL_2]]() : () -> i64
+  // FUNC-REF: %[[VAL_2:.*]] = fir.convert %[[VAL_1]] : (() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>) -> (() -> !fir.ref<none>)
+  // FUNC-REF: %[[VAL_3:.*]] = fir.call %[[VAL_2]]() : () -> !fir.ref<none>
   // FUNC-REF: %[[VAL_4:.*]] = fir.field_index __address, !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>
   // FUNC-REF: %[[VAL_5:.*]] = fir.coordinate_of %[[VAL_0]], %[[VAL_4]] : (!fir.ref<!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>>, !fir.field) -> !fir.ref<i64>
-  // FUNC-REF: fir.store %[[VAL_3]] to %[[VAL_5]] : !fir.ref<i64>
+  // FUNC-REF: %[[CAST:.*]] = fir.convert %[[VAL_3]] : (!fir.ref<none>) -> i64
+  // FUNC-REF: fir.store %[[CAST]] to %[[VAL_5]] : !fir.ref<i64>
   // FUNC-BOX: %[[VAL_0:.*]] = fir.alloca !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}> {bindc_name = ".result"}
   // FUNC-BOX: %[[VAL_1:.*]] = fir.convert %[[ARG0]] : (() -> ()) -> (() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>)
-  // FUNC-BOX: %[[VAL_2:.*]] = fir.convert %[[VAL_1]] : (() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>) -> (() -> i64)
-  // FUNC-BOX: %[[VAL_3:.*]] = fir.call %[[VAL_2]]() : () -> i64
+  // FUNC-BOX: %[[VAL_2:.*]] = fir.convert %[[VAL_1]] : (() -> !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>) -> (() -> !fir.ref<none>)
+  // FUNC-BOX: %[[VAL_3:.*]] = fir.call %[[VAL_2]]() : () -> !fir.ref<none>
   // FUNC-BOX: %[[VAL_4:.*]] = fir.field_index __address, !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>
   // FUNC-BOX: %[[VAL_5:.*]] = fir.coordinate_of %[[VAL_0]], %[[VAL_4]] : (!fir.ref<!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>>, !fir.field) -> !fir.ref<i64>
-  // FUNC-BOX: fir.store %[[VAL_3]] to %[[VAL_5]] : !fir.ref<i64>
+  // FUNC-BOX: %[[CAST:.*]] = fir.convert %[[VAL_3]] : (!fir.ref<none>) -> i64
+  // FUNC-BOX: fir.store %[[CAST]] to %[[VAL_5]] : !fir.ref<i64>
 }
 
 // ----------------------- Test GlobalOp rewrite ------------------------

>From 411bb691fe4c69bf167b3a03b052c094687f99ce Mon Sep 17 00:00:00 2001
From: Sudharsan Veeravalli <quic_svs at quicinc.com>
Date: Tue, 23 Jul 2024 18:49:57 +0530
Subject: [PATCH 21/91] [RISCV] Fix InsnCI register type (#100113)

According to the spec the CI type instructions can take any of the 32
RVI registers.
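
As a minimal sketch of why the register class matters (this is not part of
the patch; `encodeCI` is a hypothetical helper written only for
illustration): the CI format carries a full 5-bit rd field, so a register
such as a6 (x16), which lies outside the x8-x15 range that the 3-bit
compressed register fields can address, is still encodable. The sketch
assumes the standard CI layout: funct3 in bits 15:13, imm[5] in bit 12, rd
in bits 11:7, imm[4:0] in bits 6:2, and the opcode in bits 1:0.

```cpp
#include <cstdint>

// Hypothetical helper: pack the fields of a 16-bit CI-format instruction.
// rd is a plain register number (0-31); imm6 is a signed 6-bit immediate.
constexpr std::uint16_t encodeCI(unsigned opcode, unsigned funct3,
                                 unsigned rd, int imm6) {
  return static_cast<std::uint16_t>(
      ((funct3 & 0x7u) << 13) |                            // funct3
      (((static_cast<unsigned>(imm6) >> 5) & 0x1u) << 12) | // imm[5]
      ((rd & 0x1fu) << 7) |                                // 5-bit rd
      ((static_cast<unsigned>(imm6) & 0x1fu) << 2) |       // imm[4:0]
      (opcode & 0x3u));                                    // opcode
}

// ".insn ci 1, 0, a6, 13" with a6 == x16 packs to 0x0835, which is the
// little-endian byte pair [0x35,0x08] checked by the new test lines.
static_assert(encodeCI(1, 0, 16, 13) == 0x0835, "a6 fits in the rd field");
```

The low byte comes first in the emitted encoding, so 0x0835 appears as
[0x35,0x08] in the CHECK-ASM lines.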

Fixes #100112

(cherry picked from commit 1ebfc81a91194c000ac70b4ea53891cc956aa6eb)
---
 llvm/lib/Target/RISCV/RISCVInstrInfoC.td |  8 ++++----
 llvm/test/MC/RISCV/insn_c.s              | 10 ++++++++++
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoC.td b/llvm/lib/Target/RISCV/RISCVInstrInfoC.td
index 9257ee5a09a8e..3f279b7a58ca6 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoC.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoC.td
@@ -764,9 +764,9 @@ def InsnCR : DirectiveInsnCR<(outs AnyReg:$rd), (ins uimm2_opcode:$opcode,
                                                      uimm4:$funct4,
                                                      AnyReg:$rs2),
                              "$opcode, $funct4, $rd, $rs2">;
-def InsnCI : DirectiveInsnCI<(outs AnyRegC:$rd), (ins uimm2_opcode:$opcode,
-                                                      uimm3:$funct3,
-                                                      simm6:$imm6),
+def InsnCI : DirectiveInsnCI<(outs AnyReg:$rd), (ins uimm2_opcode:$opcode,
+                                                     uimm3:$funct3,
+                                                     simm6:$imm6),
                              "$opcode, $funct3, $rd, $imm6">;
 def InsnCIW : DirectiveInsnCIW<(outs AnyRegC:$rd), (ins uimm2_opcode:$opcode,
                                                         uimm3:$funct3,
@@ -818,7 +818,7 @@ def : InstAlias<".insn_cr $opcode, $funct4, $rd, $rs2",
                 (InsnCR AnyReg:$rd, uimm2_opcode:$opcode, uimm4:$funct4,
                         AnyReg:$rs2)>;
 def : InstAlias<".insn_ci $opcode, $funct3, $rd, $imm6",
-                (InsnCI AnyRegC:$rd, uimm2_opcode:$opcode, uimm3:$funct3,
+                (InsnCI AnyReg:$rd, uimm2_opcode:$opcode, uimm3:$funct3,
                         simm6:$imm6)>;
 def : InstAlias<".insn_ciw $opcode, $funct3, $rd, $imm8",
                 (InsnCIW AnyRegC:$rd, uimm2_opcode:$opcode, uimm3:$funct3,
diff --git a/llvm/test/MC/RISCV/insn_c.s b/llvm/test/MC/RISCV/insn_c.s
index 19169e8b08c94..c63e8ab33aef9 100644
--- a/llvm/test/MC/RISCV/insn_c.s
+++ b/llvm/test/MC/RISCV/insn_c.s
@@ -31,6 +31,16 @@ target:
 # CHECK-OBJ: c.addi a0, 0xd
 .insn ci C1, 0, a0, 13
 
+# CHECK-ASM: .insn ci  1, 0, a6, 13
+# CHECK-ASM: encoding: [0x35,0x08]
+# CHECK-OBJ: c.addi a6, 0xd
+.insn ci  1, 0, a6, 13
+
+# CHECK-ASM: .insn ci  1, 0, a6, 13
+# CHECK-ASM: encoding: [0x35,0x08]
+# CHECK-OBJ: c.addi a6, 0xd
+.insn ci C1, 0, a6, 13
+
 # CHECK-ASM: .insn ciw  0, 0, a0, 13
 # CHECK-ASM: encoding: [0xa8,0x01]
 # CHECK-OBJ: c.addi4spn a0, sp, 0xc8

>From 7af27be6633a18dd3d568594175feb6353320795 Mon Sep 17 00:00:00 2001
From: Fangrui Song <i at maskray.me>
Date: Tue, 23 Jul 2024 09:44:00 -0700
Subject: [PATCH 22/91] [ARM] Create mapping symbols with non-unique names

Similar to #99836 for AArch64.

Non-unique names save .strtab space and match GNU assembler.
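
To make the .strtab saving concrete, here is a rough sketch (not part of the
patch; `strtabBytes` is a hypothetical helper). It assumes a simplified
string table that stores each distinct name once with a NUL terminator, and
it ignores suffix merging and the table's leading NUL byte, so real numbers
will differ somewhat.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: bytes used by a string table that keeps one
// NUL-terminated copy of every distinct symbol name.
std::size_t strtabBytes(const std::vector<std::string> &names) {
  const std::set<std::string> distinct(names.begin(), names.end());
  std::size_t bytes = 0;
  for (const std::string &name : distinct)
    bytes += name.size() + 1; // +1 for the NUL terminator
  return bytes;
}
```

Under this model, the eight mapping symbols in thumb-types.s take 40 bytes
when named $t.0, $d.1, ..., $d.7 (eight distinct 4-character names), and 9
bytes when they collapse to the three names $t, $d, and $a.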

Pull Request: https://github.com/llvm/llvm-project/pull/99906

(cherry picked from commit 298a9223a57c50cb0d24b82687ad1bc2f7a022e6)
---
 lld/test/ELF/arm-cmse-implib.s                   |  8 ++++----
 .../Target/ARM/MCTargetDesc/ARMELFStreamer.cpp   |  8 ++------
 .../DebugInfo/Symbolize/ELF/arm-mapping-symbol.s | 16 ++++++++--------
 llvm/test/MC/ARM/CheckDataSymbol.s               |  2 +-
 llvm/test/MC/ARM/data-in-code.ll                 |  2 +-
 llvm/test/MC/ARM/directive-arm-thumb-alignment.s |  8 ++++----
 llvm/test/MC/ARM/multi-section-mapping.s         | 12 ++++++------
 llvm/test/MC/ARM/thumb-function-address.s        |  4 ++--
 llvm/test/MC/ARM/thumb-types.s                   | 16 ++++++++--------
 llvm/test/MC/ARM/thumb_set.s                     |  4 ++--
 llvm/test/MC/ELF/ARM/execute-only-section.s      |  6 +++---
 llvm/test/tools/llvm-objdump/multiple-symbols.s  |  4 ++--
 12 files changed, 43 insertions(+), 47 deletions(-)

diff --git a/lld/test/ELF/arm-cmse-implib.s b/lld/test/ELF/arm-cmse-implib.s
index 581bff9dd8536..60a68b0226c3d 100644
--- a/lld/test/ELF/arm-cmse-implib.s
+++ b/lld/test/ELF/arm-cmse-implib.s
@@ -53,8 +53,8 @@ secure_entry:
 // CHECK1-NEXT:    Num:    Value  Size Type    Bind   Vis       Ndx Name
 // CHECK1-NEXT:      0: 00000000     0 NOTYPE  LOCAL  DEFAULT   UND
 // CHECK1-NEXT:      1: 00020000     0 NOTYPE  LOCAL  DEFAULT     2 $t
-// CHECK1-NEXT:      2: 00008000     0 NOTYPE  LOCAL  DEFAULT     1 $t.0
-// CHECK1-NEXT:      3: 00008004     0 NOTYPE  LOCAL  DEFAULT     1 $t.0
+// CHECK1-NEXT:      2: 00008000     0 NOTYPE  LOCAL  DEFAULT     1 $t
+// CHECK1-NEXT:      3: 00008004     0 NOTYPE  LOCAL  DEFAULT     1 $t
 // CHECK1-NEXT:      4: 00008001     2 FUNC    GLOBAL DEFAULT     1 secure_entry
 // CHECK1-NEXT:      5: 00020001     8 FUNC    GLOBAL DEFAULT     2 foo
 // CHECK1-NEXT:      6: 00008005     2 FUNC    GLOBAL DEFAULT     1 __acle_se_foo
@@ -82,8 +82,8 @@ secure_entry:
 // CHECK2-NEXT:    Num:    Value  Size Type    Bind   Vis       Ndx Name
 // CHECK2-NEXT:      0: 00000000     0 NOTYPE  LOCAL  DEFAULT   UND
 // CHECK2-NEXT:      1: 00020000     0 NOTYPE  LOCAL  DEFAULT     2 $t
-// CHECK2-NEXT:      2: 00008000     0 NOTYPE  LOCAL  DEFAULT     1 $t.0
-// CHECK2-NEXT:      3: 00008004     0 NOTYPE  LOCAL  DEFAULT     1 $t.0
+// CHECK2-NEXT:      2: 00008000     0 NOTYPE  LOCAL  DEFAULT     1 $t
+// CHECK2-NEXT:      3: 00008004     0 NOTYPE  LOCAL  DEFAULT     1 $t
 // CHECK2-NEXT:      4: 00008001     2 FUNC    GLOBAL DEFAULT     1 secure_entry
 // CHECK2-NEXT:      5: 00020011     8 FUNC    WEAK   DEFAULT     2 baz
 // CHECK2-NEXT:      6: 00008005     2 FUNC    GLOBAL DEFAULT     1 __acle_se_baz
diff --git a/llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp b/llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp
index 3182fecffecf4..de343a7d72ad9 100644
--- a/llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp
+++ b/llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp
@@ -670,8 +670,7 @@ class ARMELFStreamer : public MCELFStreamer {
   }
 
   void EmitMappingSymbol(StringRef Name) {
-    auto *Symbol = cast<MCSymbolELF>(getContext().getOrCreateSymbol(
-        Name + "." + Twine(MappingSymbolCounter++)));
+    auto *Symbol = cast<MCSymbolELF>(getContext().createLocalSymbol(Name));
     emitLabel(Symbol);
 
     Symbol->setType(ELF::STT_NOTYPE);
@@ -679,8 +678,7 @@ class ARMELFStreamer : public MCELFStreamer {
   }
 
   void emitMappingSymbol(StringRef Name, MCDataFragment &F, uint64_t Offset) {
-    auto *Symbol = cast<MCSymbolELF>(getContext().getOrCreateSymbol(
-        Name + "." + Twine(MappingSymbolCounter++)));
+    auto *Symbol = cast<MCSymbolELF>(getContext().createLocalSymbol(Name));
     emitLabelAtPos(Symbol, SMLoc(), F, Offset);
     Symbol->setType(ELF::STT_NOTYPE);
     Symbol->setBinding(ELF::STB_LOCAL);
@@ -710,7 +708,6 @@ class ARMELFStreamer : public MCELFStreamer {
 
   bool IsThumb;
   bool IsAndroid;
-  int64_t MappingSymbolCounter = 0;
 
   DenseMap<const MCSection *, std::unique_ptr<ElfMappingSymbolInfo>>
       LastMappingSymbols;
@@ -1121,7 +1118,6 @@ void ARMELFStreamer::reset() {
   MCTargetStreamer &TS = *getTargetStreamer();
   ARMTargetStreamer &ATS = static_cast<ARMTargetStreamer &>(TS);
   ATS.reset();
-  MappingSymbolCounter = 0;
   MCELFStreamer::reset();
   LastMappingSymbols.clear();
   LastEMSInfo.reset();
diff --git a/llvm/test/DebugInfo/Symbolize/ELF/arm-mapping-symbol.s b/llvm/test/DebugInfo/Symbolize/ELF/arm-mapping-symbol.s
index 6e17ef29ae577..27310c09fb07c 100644
--- a/llvm/test/DebugInfo/Symbolize/ELF/arm-mapping-symbol.s
+++ b/llvm/test/DebugInfo/Symbolize/ELF/arm-mapping-symbol.s
@@ -5,19 +5,19 @@
 
 ## Verify that mapping symbols are actually present in the object at expected
 ## addresses.
-# RUN: llvm-nm --special-syms %t | FileCheck %s -check-prefix MAPPING_A
+# RUN: llvm-nm --special-syms %t | FileCheck %s --check-prefix=MAPPING_A --match-full-lines
 
-# MAPPING_A:      00000004 t $a.1
-# MAPPING_A-NEXT: 00000000 t $d.0
-# MAPPING_A-NEXT: 00000008 t $d.2
+# MAPPING_A:      00000004 t $a
+# MAPPING_A-NEXT: 00000000 t $d
+# MAPPING_A-NEXT: 00000008 t $d
 # MAPPING_A-NEXT: 00000000 T foo
 
 # RUN: llvm-mc -filetype=obj -triple=thumbv7-none-linux %s -o %tthumb
-# RUN: llvm-nm --special-syms %tthumb | FileCheck %s -check-prefix MAPPING_T
+# RUN: llvm-nm --special-syms %tthumb | FileCheck %s --check-prefix=MAPPING_T --match-full-lines
 
-# MAPPING_T:      00000000 t $d.0
-# MAPPING_T-NEXT: 00000006 t $d.2
-# MAPPING_T-NEXT: 00000004 t $t.1
+# MAPPING_T:      00000000 t $d
+# MAPPING_T-NEXT: 00000006 t $d
+# MAPPING_T-NEXT: 00000004 t $t
 # MAPPING_T-NEXT: 00000000 T foo
 
 # RUN: llvm-symbolizer --obj=%t 4 8 | FileCheck %s -check-prefix SYMBOL
diff --git a/llvm/test/MC/ARM/CheckDataSymbol.s b/llvm/test/MC/ARM/CheckDataSymbol.s
index 14ea92a943a1e..ec421f51395af 100644
--- a/llvm/test/MC/ARM/CheckDataSymbol.s
+++ b/llvm/test/MC/ARM/CheckDataSymbol.s
@@ -1,7 +1,7 @@
 # RUN: llvm-mc -filetype=obj -assemble \
 # RUN: -triple=arm-arm-none-eabi -mcpu=cortex-a9 %s -o - \
 # RUN: | llvm-readobj -S --symbols - | FileCheck %s
-# CHECK:     Name: $d.1 ({{[1-9][0-9]+}})
+# CHECK:     Name: $d
 # CHECK-NEXT:     Value: 0x4
 # CHECK-NEXT:     Size: 0
 # CHECK-NEXT:     Binding: Local (0x0)
diff --git a/llvm/test/MC/ARM/data-in-code.ll b/llvm/test/MC/ARM/data-in-code.ll
index 2e107f250e05d..b755c3bb5cad4 100644
--- a/llvm/test/MC/ARM/data-in-code.ll
+++ b/llvm/test/MC/ARM/data-in-code.ll
@@ -72,7 +72,7 @@ exit:
 ;; TMB-NEXT:     Section: [[MIXED_SECT:[^ ]+]]
 
 ;; TMB:        Symbol {
-;; TMB:          Name: $d.1
+;; TMB:          Name: $d
 ;; TMB-NEXT:     Value: 0x{{[0-9A-F]+}}
 ;; TMB-NEXT:     Size: 0
 ;; TMB-NEXT:     Binding: Local
diff --git a/llvm/test/MC/ARM/directive-arm-thumb-alignment.s b/llvm/test/MC/ARM/directive-arm-thumb-alignment.s
index b90c76d2b121c..0e798f67b48aa 100644
--- a/llvm/test/MC/ARM/directive-arm-thumb-alignment.s
+++ b/llvm/test/MC/ARM/directive-arm-thumb-alignment.s
@@ -10,12 +10,12 @@
 @ CHECK:      Num:    Value  Size Type    Bind   Vis      Ndx Name
 @ CHECK-NEXT:   0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
 @ CHECK-NEXT:   1: 00000001     0 FUNC    LOCAL  DEFAULT    2 aligned_thumb
-@ CHECK-NEXT:   2: 00000000     0 NOTYPE  LOCAL  DEFAULT    2 $t.0
+@ CHECK-NEXT:   2: 00000000     0 NOTYPE  LOCAL  DEFAULT    2 $t
 @ CHECK-NEXT:   3: 00000004     0 FUNC    LOCAL  DEFAULT    2 thumb_to_arm
-@ CHECK-NEXT:   4: 00000004     0 NOTYPE  LOCAL  DEFAULT    2 $a.1
-@ CHECK-NEXT:   5: 00000008     0 NOTYPE  LOCAL  DEFAULT    2 $d.2
+@ CHECK-NEXT:   4: 00000004     0 NOTYPE  LOCAL  DEFAULT    2 $a
+@ CHECK-NEXT:   5: 00000008     0 NOTYPE  LOCAL  DEFAULT    2 $d
 @ CHECK-NEXT:   6: 0000000b     0 FUNC    LOCAL  DEFAULT    2 unaligned_arm_to_thumb
-@ CHECK-NEXT:   7: 0000000a     0 NOTYPE  LOCAL  DEFAULT    2 $t.3
+@ CHECK-NEXT:   7: 0000000a     0 NOTYPE  LOCAL  DEFAULT    2 $t
 
 .thumb
 
diff --git a/llvm/test/MC/ARM/multi-section-mapping.s b/llvm/test/MC/ARM/multi-section-mapping.s
index 6107f262b0b8c..ed531306042aa 100644
--- a/llvm/test/MC/ARM/multi-section-mapping.s
+++ b/llvm/test/MC/ARM/multi-section-mapping.s
@@ -1,4 +1,4 @@
-@ RUN: llvm-mc -triple=armv7-linux-gnueabi -filetype=obj < %s | llvm-objdump -t - | FileCheck %s
+@ RUN: llvm-mc -triple=armv7-linux-gnueabi -filetype=obj < %s | llvm-objdump -t - | FileCheck %s --match-full-lines
 
         .text
         add r0, r0, r0
@@ -42,10 +42,10 @@
 @   + .starts_thumb to have $t at 0
 @   + .starts_data to have $d at 0
 
-@ CHECK:      00000000 l .text 00000000 $a.0
-@ CHECK-NEXT: 00000000 l .wibble 00000000 $a.1
-@ CHECK-NEXT: 00000000 l .starts_thumb 00000000 $t.2
-@ CHECK-NEXT: 00000008 l .text 00000000 $t.3
-@ CHECK-NEXT: 0000000a l .text 00000000 $d.4
+@ CHECK:      00000000 l .text 00000000 $a
+@ CHECK-NEXT: 00000000 l .wibble 00000000 $a
+@ CHECK-NEXT: 00000000 l .starts_thumb 00000000 $t
+@ CHECK-NEXT: 00000008 l .text 00000000 $t
+@ CHECK-NEXT: 0000000a l .text 00000000 $d
 @ CHECK-NOT: ${{[adt]}}
 
diff --git a/llvm/test/MC/ARM/thumb-function-address.s b/llvm/test/MC/ARM/thumb-function-address.s
index 753a049137bbf..d69dcb6724019 100644
--- a/llvm/test/MC/ARM/thumb-function-address.s
+++ b/llvm/test/MC/ARM/thumb-function-address.s
@@ -35,8 +35,8 @@ label:
 @ CHECK-NEXT: 00000000 0 NOTYPE LOCAL DEFAULT     UND
 @ CHECK-NEXT: 00000001 0 FUNC   LOCAL DEFAULT 2   func_label
 @ CHECK-NEXT: 00000001 0 FUNC   LOCAL DEFAULT 2   foo_impl
-@ CHECK-NEXT: 00000000 0 NOTYPE LOCAL DEFAULT 2   $t.0
+@ CHECK-NEXT: 00000000 0 NOTYPE LOCAL DEFAULT 2   $t
 @ CHECK-NEXT: 00000003 0 FUNC   LOCAL DEFAULT 2   foo_resolver
 @ CHECK-NEXT: 00000003 0 IFUNC  LOCAL DEFAULT 2   foo
 @ CHECK-NEXT: 00000004 0 FUNC   LOCAL DEFAULT 2   label
-@ CHECK-NEXT: 00000008 0 NOTYPE LOCAL DEFAULT 2   $a.1
+@ CHECK-NEXT: 00000008 0 NOTYPE LOCAL DEFAULT 2   $a
diff --git a/llvm/test/MC/ARM/thumb-types.s b/llvm/test/MC/ARM/thumb-types.s
index cb1b47e1fa7fb..b965cd8accf05 100644
--- a/llvm/test/MC/ARM/thumb-types.s
+++ b/llvm/test/MC/ARM/thumb-types.s
@@ -3,22 +3,22 @@
 @ CHECK:      Num:    Value  Size Type    Bind   Vis      Ndx Name
 @ CHECK-NEXT:   0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
 @ CHECK-NEXT:   1: 00000001     0 FUNC    LOCAL  DEFAULT    2 implicit_function
-@ CHECK-NEXT:   2: 00000000     0 NOTYPE  LOCAL  DEFAULT    2 $t.0
+@ CHECK-NEXT:   2: 00000000     0 NOTYPE  LOCAL  DEFAULT    2 $t
 @ CHECK-NEXT:   3: 00000002     0 OBJECT  LOCAL  DEFAULT    2 implicit_data
-@ CHECK-NEXT:   4: 00000002     0 NOTYPE  LOCAL  DEFAULT    2 $d.1
+@ CHECK-NEXT:   4: 00000002     0 NOTYPE  LOCAL  DEFAULT    2 $d
 @ CHECK-NEXT:   5: 00000008     0 FUNC    LOCAL  DEFAULT    2 arm_function
-@ CHECK-NEXT:   6: 00000008     0 NOTYPE  LOCAL  DEFAULT    2 $a.2
+@ CHECK-NEXT:   6: 00000008     0 NOTYPE  LOCAL  DEFAULT    2 $a
 @ CHECK-NEXT:   7: 0000000c     0 NOTYPE  LOCAL  DEFAULT    2 untyped_text_label
-@ CHECK-NEXT:   8: 0000000c     0 NOTYPE  LOCAL  DEFAULT    2 $t.3
+@ CHECK-NEXT:   8: 0000000c     0 NOTYPE  LOCAL  DEFAULT    2 $t
 @ CHECK-NEXT:   9: 0000000f     0 FUNC    LOCAL  DEFAULT    2 explicit_function
-@ CHECK-NEXT:  10: 00000010     0 NOTYPE  LOCAL  DEFAULT    2 $d.4
+@ CHECK-NEXT:  10: 00000010     0 NOTYPE  LOCAL  DEFAULT    2 $d
 @ CHECK-NEXT:  11: 00000000     4 TLS     LOCAL  DEFAULT    5 tls
 @ CHECK-NEXT:  12: 00000015     0 IFUNC   LOCAL  DEFAULT    2 indirect_function
-@ CHECK-NEXT:  13: 00000014     0 NOTYPE  LOCAL  DEFAULT    2 $t.5
+@ CHECK-NEXT:  13: 00000014     0 NOTYPE  LOCAL  DEFAULT    2 $t
 @ CHECK-NEXT:  14: 00000000     0 NOTYPE  LOCAL  DEFAULT    4 untyped_data_label
-@ CHECK-NEXT:  15: 00000000     0 NOTYPE  LOCAL  DEFAULT    4 $t.6
+@ CHECK-NEXT:  15: 00000000     0 NOTYPE  LOCAL  DEFAULT    4 $t
 @ CHECK-NEXT:  16: 00000002     0 OBJECT  LOCAL  DEFAULT    4 explicit_data
-@ CHECK-NEXT:  17: 00000002     0 NOTYPE  LOCAL  DEFAULT    4 $d.7
+@ CHECK-NEXT:  17: 00000002     0 NOTYPE  LOCAL  DEFAULT    4 $d
 
 
 	.syntax unified
diff --git a/llvm/test/MC/ARM/thumb_set.s b/llvm/test/MC/ARM/thumb_set.s
index 4bb7b599aaf11..836eb0b62e0fa 100644
--- a/llvm/test/MC/ARM/thumb_set.s
+++ b/llvm/test/MC/ARM/thumb_set.s
@@ -6,12 +6,12 @@
 @ CHECK:      Num:    Value  Size Type    Bind   Vis      Ndx Name
 @ CHECK-NEXT:   0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
 @ CHECK-NEXT:   1: 00000000     0 FUNC    LOCAL  DEFAULT    2 arm_func
-@ CHECK-NEXT:   2: 00000000     0 NOTYPE  LOCAL  DEFAULT    2 $a.0
+@ CHECK-NEXT:   2: 00000000     0 NOTYPE  LOCAL  DEFAULT    2 $a
 @ CHECK-NEXT:   3: 00000001     0 FUNC    LOCAL  DEFAULT    2 alias_arm_func
 @ CHECK-NEXT:   4: 00000001     0 FUNC    LOCAL  DEFAULT    2 alias_arm_func2
 @ CHECK-NEXT:   5: 00000001     0 FUNC    LOCAL  DEFAULT    2 alias_arm_func3
 @ CHECK-NEXT:   6: 00000005     0 FUNC    LOCAL  DEFAULT    2 thumb_func
-@ CHECK-NEXT:   7: 00000004     0 NOTYPE  LOCAL  DEFAULT    2 $t.1
+@ CHECK-NEXT:   7: 00000004     0 NOTYPE  LOCAL  DEFAULT    2 $t
 @ CHECK-NEXT:   8: 00000005     0 FUNC    LOCAL  DEFAULT    2 alias_thumb_func
 @ CHECK-NEXT:   9: 5eed1e55     0 FUNC    LOCAL  DEFAULT  ABS seedless
 @ CHECK-NEXT:  10: e665a1ad     0 FUNC    LOCAL  DEFAULT  ABS eggsalad
diff --git a/llvm/test/MC/ELF/ARM/execute-only-section.s b/llvm/test/MC/ELF/ARM/execute-only-section.s
index 12020e030cc04..ac5e31f70dba0 100644
--- a/llvm/test/MC/ELF/ARM/execute-only-section.s
+++ b/llvm/test/MC/ELF/ARM/execute-only-section.s
@@ -18,7 +18,7 @@ foo:
 
 
 // CHECK:      Section {
-// CHECK:        Name: .text (16)
+// CHECK:        Name: .text
 // CHECK-NEXT:   Type: SHT_PROGBITS (0x1)
 // CHECK-NEXT:   Flags [ (0x20000006)
 // CHECK-NEXT:     SHF_ALLOC (0x2)
@@ -29,7 +29,7 @@ foo:
 // CHECK:      }
 
 // CHECK:      Section {
-// CHECK:        Name: .text (16)
+// CHECK:        Name: .text
 // CHECK-NEXT:   Type: SHT_PROGBITS (0x1)
 // CHECK-NEXT:   Flags [ (0x20000006)
 // CHECK-NEXT:     SHF_ALLOC (0x2)
@@ -40,6 +40,6 @@ foo:
 // CHECK:      }
 
 // CHECK: Symbol {
-// CHECK:   Name: foo (22)
+// CHECK:   Name: foo
 // CHECK:   Section: .text (0x3)
 // CHECK: }
diff --git a/llvm/test/tools/llvm-objdump/multiple-symbols.s b/llvm/test/tools/llvm-objdump/multiple-symbols.s
index 24c169e32147b..1b13f099ae98c 100644
--- a/llvm/test/tools/llvm-objdump/multiple-symbols.s
+++ b/llvm/test/tools/llvm-objdump/multiple-symbols.s
@@ -26,13 +26,13 @@
 
 @ HEAD:          Disassembly of section .text:
 @ HEAD-EMPTY:
-@ AMAP-NEXT:     00000000 <$a.0>:
+@ AMAP-NEXT:     00000000 <$a>:
 @ AAAA-NEXT:     00000000 <aaaa>:
 @ BBBB-NEXT:     00000000 <bbbb>:
 @ AABB-NEXT:            0: e0800080      add     r0, r0, r0, lsl #1
 @ AABB-NEXT:            4: e12fff1e      bx      lr
 @ BOTH-EMPTY:    
-@ TMAP-NEXT:     00000008 <$t.1>:
+@ TMAP-NEXT:     00000008 <$t>:
 @ CCCC-NEXT:     00000008 <cccc>:
 @ DDDD-NEXT:     00000008 <dddd>:
 @ CCDD-NEXT:            8: eb00 0080     add.w   r0, r0, r0, lsl #2

>From 2df1cec9db73697a04e733de39ab9ac63c7b2d9b Mon Sep 17 00:00:00 2001
From: Mark de Wever <koraq at xs4all.nl>
Date: Tue, 23 Jul 2024 18:59:23 +0200
Subject: [PATCH 23/91] [libc++][doc] Update the release notes for LLVM 19.
 (#99061)

This is a preparation for the upcoming LLVM 19 release.
---
 libcxx/docs/ReleaseNotes/19.rst    | 47 ++++++++++++++++++++++++------
 libcxx/docs/ReleaseNotes/20.rst    | 14 ++++++++-
 libcxx/docs/Status/SpecialMath.rst |  2 +-
 3 files changed, 52 insertions(+), 11 deletions(-)

diff --git a/libcxx/docs/ReleaseNotes/19.rst b/libcxx/docs/ReleaseNotes/19.rst
index 439f552db59a8..c2c2bfbed4ac3 100644
--- a/libcxx/docs/ReleaseNotes/19.rst
+++ b/libcxx/docs/ReleaseNotes/19.rst
@@ -35,6 +35,20 @@ see the `releases page <https://llvm.org/releases/>`_.
 What's New in Libc++ 19.0.0?
 ==============================
 
+The main focus of the libc++ team has been to implement new C++20, C++23,
+and C++26 features.
+
+Experimental support for the time zone database has progressed.
+
+Work on the ranges support has progressed. See
+:ref:`ranges-status` for the current status.
+
+Work on the experimental C++17 Parallel STL has progressed. See
+:ref:`pstl-status` for the current status.
+
+Work on the C++17 mathematical special functions has started. See
+:ref:`special-math-status` for the current status.
+
 Implemented Papers
 ------------------
 
@@ -59,14 +73,21 @@ Implemented Papers
 - P0019R8 - ``std::atomic_ref``
 - P2389R2 - Alias template ``dims`` for the ``extents`` of ``mdspan``
 - P1223R5 - ``ranges::find_last()``, ``ranges::find_last_if()``, and ``ranges::find_last_if_not()``
+- P2602R2 - Poison Pills are Too Toxic
+- P1981R0 - Rename ``leap`` to ``leap_second``
+- P1982R0 - Rename ``link`` to ``time_zone_link``
+
 
 Improvements and New Features
 -----------------------------
 
 - The performance of growing ``std::vector`` has been improved for trivially relocatable types.
-- A lot of types are considered trivially relocatable now, including ``vector`` and ``string``.
-- The performance of ``ranges::fill`` and ``ranges::fill_n`` has been improved for ``vector<bool>::iterator``\s,
+
+- A lot of types are considered trivially relocatable now, including ``std::vector`` and ``std::string``.
+
+- The performance of ``std::ranges::fill`` and ``std::ranges::fill_n`` has been improved for ``std::vector<bool>::iterator``\s,
   resulting in a performance increase of up to 1400x.
+
 - The ``std::mismatch`` algorithm has been optimized for integral types, which can lead up to 40x performance
   improvements.
 
@@ -74,7 +95,7 @@ Improvements and New Features
   up to 100x.
 
 - The ``std::set_intersection`` and ``std::ranges::set_intersection`` algorithms have been optimized to fast-forward over
-  contiguous ranges of non-matching values, reducing the number of comparisons from linear to 
+  contiguous ranges of non-matching values, reducing the number of comparisons from linear to
   logarithmic growth with the number of elements in best-case scenarios.
 
 - The ``_LIBCPP_ENABLE_CXX26_REMOVED_STRSTREAM`` macro has been added to make the declarations in ``<strstream>`` available.
@@ -101,15 +122,18 @@ Improvements and New Features
 
   Note: bounded iterators currently are not supported for ``vector<bool>``.
 
+- In C++23 and C++26 the number of transitive includes in several headers has been reduced, improving the compilation speed.
+
+
 Deprecations and Removals
 -------------------------
 
-- The C++20 synchronization library (``<barrier>``, ``<latch>``, ``atomic::wait``, etc.) has been deprecated
+- The C++20 synchronization library (``<barrier>``, ``<latch>``, ``std::atomic::wait``, etc.) has been deprecated
   in language modes prior to C++20. If you are using these features prior to C++20, please update to ``-std=c++20``.
   In LLVM 20, the C++20 synchronization library will be removed entirely in language modes prior to C++20.
 
 - ``_LIBCPP_DISABLE_NODISCARD_EXT`` has been removed. ``[[nodiscard]]`` applications are now unconditional.
-  This decision is based on LEWGs discussion on `P3122 <https://wg21.link/P3122>` and `P3162 <https://wg21.link/P3162>`
+  This decision is based on LEWG's discussion on `P3122 <https://wg21.link/P3122>`_ and `P3162 <https://wg21.link/P3162>`_
   to not use ``[[nodiscard]]`` in the standard.
 
 - The ``LIBCXX_ENABLE_ASSERTIONS`` CMake variable that was used to enable the safe mode has been deprecated and setting
@@ -151,10 +175,11 @@ Deprecations and Removals
 - libc++ no longer supports ``std::allocator<const T>`` and containers of ``const``-qualified element type, such
   as ``std::vector<const T>`` and ``std::list<const T>``. This used to be supported as an undocumented extension.
   If you were using ``std::vector<const T>``, replace it with ``std::vector<T>`` instead. The
-  ``_LIBCPP_ENABLE_REMOVED_ALLOCATOR_CONST`` macro can be defined to temporarily re-enable this extension as
-  folks transition their code. This macro will be honored for one released and ignored starting in LLVM 20.
+  ``_LIBCPP_ENABLE_REMOVED_ALLOCATOR_CONST`` macro can be defined
+  to temporarily re-enable this extension to make it easier to update user code.
+  This macro will be honored for one release and ignored starting in LLVM 20.
   To assist with the clean-up process, consider running your code through Clang Tidy, with
-  `std-allocator-const <https://clang.llvm.org/extra/clang-tidy/checks/portability/std-allocator-const.html>`
+  `std-allocator-const <https://clang.llvm.org/extra/clang-tidy/checks/portability/std-allocator-const.html>`_
   enabled.
 
 - When configuring libc++ with localization or threads disabled, the library no longer emits an error when
@@ -187,6 +212,9 @@ LLVM 20
   ``_LIBCPP_ENABLE_REMOVED_WEEKDAY_RELATIONAL_OPERATORS`` macro that was used to re-enable this extension will be
   ignored in LLVM 20.
 
+- The ``_LIBCPP_ENABLE_REMOVED_ALLOCATOR_CONST`` macro will no longer have an effect.
+
+
 LLVM 21
 ~~~~~~~
 
@@ -197,6 +225,7 @@ LLVM 21
 
   If you are using C++03 in your project, you should consider moving to a newer version of the Standard to get the most out of libc++.
 
+
 ABI Affecting Changes
 ---------------------
 
@@ -211,7 +240,7 @@ Build System Changes
 - The ``LIBCXX_EXECUTOR`` and ``LIBCXXABI_EXECUTOR`` CMake variables have been removed. Please
   set ``LIBCXX_TEST_PARAMS`` to ``executor=<...>`` instead.
 
-- The Cmake variable ``LIBCXX_ENABLE_CLANG_TIDY`` has been removed. The build system has been changed
+- The CMake variable ``LIBCXX_ENABLE_CLANG_TIDY`` has been removed. The build system has been changed
   to automatically detect the presence of ``clang-tidy`` and the required ``Clang`` libraries.
 
 - The CMake options ``LIBCXX_INSTALL_MODULES`` now defaults to ``ON``.
diff --git a/libcxx/docs/ReleaseNotes/20.rst b/libcxx/docs/ReleaseNotes/20.rst
index fb677b1667ddc..f959c8829277e 100644
--- a/libcxx/docs/ReleaseNotes/20.rst
+++ b/libcxx/docs/ReleaseNotes/20.rst
@@ -59,16 +59,28 @@ Deprecations and Removals
   ``_LIBCPP_ENABLE_REMOVED_WEEKDAY_RELATIONAL_OPERATORS`` macro that was used to re-enable this extension will be
   ignored in LLVM 20.
 
+- TODO: The ``_LIBCPP_ENABLE_REMOVED_ALLOCATOR_CONST`` macro will no longer have an effect.
 
 Upcoming Deprecations and Removals
 ----------------------------------
 
-LLVM 21
+LLVM 20
 ~~~~~~~
 
 - TODO
 
 
+LLVM 21
+~~~~~~~
+
+- The status of the C++03 implementation will be frozen after the LLVM 21 release. This means that starting in LLVM 22, non-critical bug fixes may not be back-ported
+  to C++03, including LWG issues. C++03 is a legacy platform, where most projects are no longer actively maintained. To
+  reduce the amount of fixes required to keep such legacy projects compiling with up-to-date toolchains, libc++ will aim to freeze the status of the headers in C++03 mode to avoid unintended breaking changes.
+  See https://discourse.llvm.org/t/rfc-freezing-c-03-headers-in-libc for more details.
+
+  If you are using C++03 in your project, you should consider moving to a newer version of the Standard to get the most out of libc++.
+
+
 ABI Affecting Changes
 ---------------------
 
diff --git a/libcxx/docs/Status/SpecialMath.rst b/libcxx/docs/Status/SpecialMath.rst
index fcc9f03e3ae64..46e5c97cdaab2 100644
--- a/libcxx/docs/Status/SpecialMath.rst
+++ b/libcxx/docs/Status/SpecialMath.rst
@@ -1,4 +1,4 @@
-.. special-math-status:
+.. _special-math-status:
 
 ======================================================
 libc++ Mathematical Special Functions Status (P0226R1)

From 3120547296c558634261ec944d7846be56eba306 Mon Sep 17 00:00:00 2001
From: Ian Anderson <iana at apple.com>
Date: Tue, 23 Jul 2024 13:02:59 -0700
Subject: [PATCH 24/91] [clang][headers] Including stddef.h always redefines
 NULL (#99727)

stddef.h always includes __stddef_null.h. This is fine in modules
because it's not possible to re-include the pcm, and it's necessary to
export the _Builtin_stddef.null submodule. However, without modules it
causes NULL to always get redefined which disrupts some C++ code. Rework
the inclusion of __stddef_null.h so that when not building with modules
it's only included if __need_NULL is set by the includer, or it's the
first time stddef.h is being included.

(cherry picked from commit 92a9d4831d5e40c286247c30fcd794563adbef6e)
---
 clang/lib/Headers/stdarg.h         |  4 +-
 clang/lib/Headers/stddef.h         | 21 ++++++++-
 clang/test/Headers/stddefneeds.cpp | 15 ++++--
 clang/test/Modules/stddef.cpp      | 73 ++++++++++++++++++++++++++++++
 4 files changed, 105 insertions(+), 8 deletions(-)
 create mode 100644 clang/test/Modules/stddef.cpp

diff --git a/clang/lib/Headers/stdarg.h b/clang/lib/Headers/stdarg.h
index 8292ab907becf..6203d7a600a23 100644
--- a/clang/lib/Headers/stdarg.h
+++ b/clang/lib/Headers/stdarg.h
@@ -20,19 +20,18 @@
  * modules.
  */
 #if defined(__MVS__) && __has_include_next(<stdarg.h>)
-#include <__stdarg_header_macro.h>
 #undef __need___va_list
 #undef __need_va_list
 #undef __need_va_arg
 #undef __need___va_copy
 #undef __need_va_copy
+#include <__stdarg_header_macro.h>
 #include_next <stdarg.h>
 
 #else
 #if !defined(__need___va_list) && !defined(__need_va_list) &&                  \
     !defined(__need_va_arg) && !defined(__need___va_copy) &&                   \
     !defined(__need_va_copy)
-#include <__stdarg_header_macro.h>
 #define __need___va_list
 #define __need_va_list
 #define __need_va_arg
@@ -45,6 +44,7 @@
     !defined(__STRICT_ANSI__)
 #define __need_va_copy
 #endif
+#include <__stdarg_header_macro.h>
 #endif
 
 #ifdef __need___va_list
diff --git a/clang/lib/Headers/stddef.h b/clang/lib/Headers/stddef.h
index 8985c526e8fc5..99b275aebf5aa 100644
--- a/clang/lib/Headers/stddef.h
+++ b/clang/lib/Headers/stddef.h
@@ -20,7 +20,6 @@
  * modules.
  */
 #if defined(__MVS__) && __has_include_next(<stddef.h>)
-#include <__stddef_header_macro.h>
 #undef __need_ptrdiff_t
 #undef __need_size_t
 #undef __need_rsize_t
@@ -31,6 +30,7 @@
 #undef __need_max_align_t
 #undef __need_offsetof
 #undef __need_wint_t
+#include <__stddef_header_macro.h>
 #include_next <stddef.h>
 
 #else
@@ -40,7 +40,6 @@
     !defined(__need_NULL) && !defined(__need_nullptr_t) &&                     \
     !defined(__need_unreachable) && !defined(__need_max_align_t) &&            \
     !defined(__need_offsetof) && !defined(__need_wint_t)
-#include <__stddef_header_macro.h>
 #define __need_ptrdiff_t
 #define __need_size_t
 /* ISO9899:2011 7.20 (C11 Annex K): Define rsize_t if __STDC_WANT_LIB_EXT1__ is
@@ -49,7 +48,24 @@
 #define __need_rsize_t
 #endif
 #define __need_wchar_t
+#if !defined(__STDDEF_H) || __has_feature(modules)
+/*
+ * __stddef_null.h is special when building without modules: if __need_NULL is
+ * set, then it will unconditionally redefine NULL. To avoid stepping on client
+ * definitions of NULL, __need_NULL should only be set the first time this
+ * header is included, that is when __STDDEF_H is not defined. However, when
+ * building with modules, this header is a textual header and needs to
+ * unconditionally include __stdef_null.h to support multiple submodules
+ * exporting _Builtin_stddef.null. Take module SM with submodules A and B, whose
+ * headers both include stddef.h When SM.A builds, __STDDEF_H will be defined.
+ * When SM.B builds, the definition from SM.A will leak when building without
+ * local submodule visibility. stddef.h wouldn't include __stddef_null.h, and
+ * SM.B wouldn't import _Builtin_stddef.null, and SM.B's `export *` wouldn't
+ * export NULL as expected. When building with modules, always include
+ * __stddef_null.h so that everything works as expected.
+ */
 #define __need_NULL
+#endif
 #if (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L) ||              \
     defined(__cplusplus)
 #define __need_nullptr_t
@@ -65,6 +81,7 @@
 /* wint_t is provided by <wchar.h> and not <stddef.h>. It's here
  * for compatibility, but must be explicitly requested. Therefore
  * __need_wint_t is intentionally not defined here. */
+#include <__stddef_header_macro.h>
 #endif
 
 #if defined(__need_ptrdiff_t)
diff --git a/clang/test/Headers/stddefneeds.cpp b/clang/test/Headers/stddefneeds.cpp
index 0763bbdee13ae..0282e8afa600d 100644
--- a/clang/test/Headers/stddefneeds.cpp
+++ b/clang/test/Headers/stddefneeds.cpp
@@ -56,14 +56,21 @@ max_align_t m5;
 #undef NULL
 #define NULL 0
 
-// glibc (and other) headers then define __need_NULL and rely on stddef.h
-// to redefine NULL to the correct value again.
-#define __need_NULL
+// Including stddef.h again shouldn't redefine NULL
 #include <stddef.h>
 
 // gtk headers then use __attribute__((sentinel)), which doesn't work if NULL
 // is 0.
-void f(const char* c, ...) __attribute__((sentinel));
+void f(const char* c, ...) __attribute__((sentinel)); // expected-note{{function has been explicitly marked sentinel here}}
 void g() {
+  f("", NULL); // expected-warning{{missing sentinel in function call}}
+}
+
+// glibc (and other) headers then define __need_NULL and rely on stddef.h
+// to redefine NULL to the correct value again.
+#define __need_NULL
+#include <stddef.h>
+
+void h() {
   f("", NULL);  // Shouldn't warn.
 }
diff --git a/clang/test/Modules/stddef.cpp b/clang/test/Modules/stddef.cpp
new file mode 100644
index 0000000000000..c53bfa3485194
--- /dev/null
+++ b/clang/test/Modules/stddef.cpp
@@ -0,0 +1,73 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: %clang_cc1 -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/no-lsv -I%t %t/stddef.cpp -verify
+// RUN: %clang_cc1 -fmodules -fimplicit-module-maps -fmodules-local-submodule-visibility -fmodules-cache-path=%t/lsv -I%t %t/stddef.cpp -verify
+
+//--- stddef.cpp
+#include <b.h>
+
+void *pointer = NULL;
+size_t size = 0;
+
+// When building with modules, a pcm is never re-imported, so re-including
+// stddef.h will not re-import _Builtin_stddef.null to restore the definition of
+// NULL, even though stddef.h will unconditionally include __stddef_null.h when
+// building with modules.
+#undef NULL
+#include <stddef.h>
+
+void *anotherPointer = NULL; // expected-error{{use of undeclared identifier 'NULL'}}
+
+// stddef.h needs to be a `textual` header to support clients doing things like
+// this.
+//
+// #define __need_NULL
+// #include <stddef.h>
+//
+// As a textual header designed to be included multiple times, it can't directly
+// declare anything, or those declarations would go into every module that
+// included it. e.g. if stddef.h contained all of its declarations, and modules
+// A and B included stddef.h, they would both have the declaration for size_t.
+// That breaks Swift, which uses the module name as part of the type name, i.e.
+// A.size_t and B.size_t are treated as completely different types in Swift and
+// cannot be interchanged. To fix that, stddef.h (and stdarg.h) are split out
+// into a separate file per __need macro that can be normal headers in explicit
+// submodules. That runs into yet another wrinkle though. When modules build,
+// declarations from previous submodules leak into subsequent ones when not
+// using local submodule visibility. Consider if stddef.h did the normal thing.
+//
+// #ifndef __STDDEF_H
+// #define __STDDEF_H
+// // include all of the sub-headers
+// #endif
+//
+// When SM builds without local submodule visibility, it will precompile a.h
+// first. When it gets to b.h, the __STDDEF_H declaration from precompiling a.h
+// will leak, and so when b.h includes stddef.h, it won't include any of its
+// sub-headers, and SM.B will thus not import _Builtin_stddef or make any of its
+// submodules visible. Precompiling b.h will be fine since it sees all of the
+// declarations from a.h including stddef.h, but clients that only include b.h
+// will not see any of the stddef.h types. stddef.h thus has to make sure to
+// always include the necessary sub-headers, even if they've been included
+// already. They all have their own header guards to allow this.
+// __stddef_null.h is extra special, so this test makes sure to cover NULL plus
+// one of the normal stddef.h types.
+
+//--- module.modulemap
+module SM {
+  module A {
+    header "a.h"
+    export *
+  }
+
+  module B {
+    header "b.h"
+    export *
+  }
+}
+
+//--- a.h
+#include <stddef.h>
+
+//--- b.h
+#include <stddef.h>

From c5f34876cd00c01bf5ecd38a2bfc5377867e4b04 Mon Sep 17 00:00:00 2001
From: Wesley Wiser <wwiser at gmail.com>
Date: Tue, 23 Jul 2024 11:43:30 -0500
Subject: [PATCH 25/91] [LLVM] [MC] Update frame layout & CFI generation to
 handle frames larger than 2gb (#99263)

Rebase of #84114. I've only included the core changes to frame layout
calculation & CFI generation which sidesteps the regressions found after
merging #84114. Since these changes are a necessary precursor to the
overall fix and are themselves slightly beneficial as CFI is now
generated correctly, I think it is reasonable to merge this first step.

---

For very large stack frames, the offset from the stack pointer to a
local can be more than 2^31 which overflows various `int` offsets in the
frame lowering code.

This patch updates the frame lowering code to calculate the offsets as
64-bit values and fixes CFI to use the corrected sizes.

After this patch, additional work is needed to fix offset truncations in
each target's codegen.
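
As a rough illustration of the overflow described above, the sketch below
checks what happens to offsets and sizes beyond 2^31 when they are kept in
32-bit fields. The helper names `fitsInInt32` and `low32` are hypothetical and
do not exist in LLVM, and the last assertion assumes a 32-bit `unsigned int`,
as on the usual targets.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helpers for illustration only; none of these names exist in LLVM.

// True if a 64-bit frame offset can be stored in a 32-bit signed field
// without changing its value.
constexpr bool fitsInInt32(int64_t Offset) {
  return Offset >= std::numeric_limits<int32_t>::min() &&
         Offset <= std::numeric_limits<int32_t>::max();
}

// What a 32-bit unsigned field retains of a 64-bit size. Unsigned narrowing
// is well defined: the value is reduced modulo 2^32.
constexpr uint32_t low32(uint64_t Size) { return static_cast<uint32_t>(Size); }

// A local placed 3 GiB away from the stack pointer.
constexpr int64_t BigOffset = INT64_C(3) << 30;

static_assert(fitsInInt32(INT64_C(1) << 20), "a 1 MiB offset fits in 32 bits");
static_assert(!fitsInInt32(BigOffset), "a 3 GiB offset needs 64 bits");

// A 5 GiB frame size loses its upper bits in a 32-bit field: 5 GiB mod 4 GiB
// leaves 1 GiB.
static_assert(low32(UINT64_C(5) << 30) == (UINT32_C(1) << 30),
              "upper bits are dropped");

// An all-ones 64-bit sentinel has to be built from a 64-bit zero; ~0u is only
// 32 bits wide and zero-extends when widened (assumes 32-bit unsigned int).
static_assert(uint64_t{~0u} != ~UINT64_C(0), "~0u is not all ones in 64 bits");
```

The last assertion is one plausible reading of why the `MaxCallFrameSize`
sentinel in the diff below moves from `~0u` to `~UINT64_C(0)` once the field
becomes `uint64_t`.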

(cherry picked from commit ca076f7a63f6a80e2e38315ec462be354b196b8d)
---
 llvm/include/llvm/CodeGen/MachineFrameInfo.h  | 14 +++---
 .../llvm/CodeGen/TargetFrameLowering.h        |  4 +-
 llvm/include/llvm/MC/MCAsmBackend.h           |  2 +-
 llvm/include/llvm/MC/MCDwarf.h                | 44 +++++++++----------
 llvm/lib/CodeGen/CFIInstrInserter.cpp         | 10 ++---
 llvm/lib/CodeGen/MachineFrameInfo.cpp         |  2 +-
 llvm/lib/CodeGen/PrologEpilogInserter.cpp     |  4 +-
 llvm/lib/MC/MCDwarf.cpp                       |  6 +--
 .../MCTargetDesc/AArch64AsmBackend.cpp        |  8 ++--
 llvm/lib/Target/ARM/ARMFrameLowering.cpp      |  4 +-
 .../Target/ARM/MCTargetDesc/ARMAsmBackend.cpp |  2 +-
 .../ARM/MCTargetDesc/ARMAsmBackendDarwin.h    |  2 +-
 .../Target/Hexagon/HexagonFrameLowering.cpp   |  4 +-
 .../lib/Target/MSP430/MSP430FrameLowering.cpp |  2 +-
 .../Target/X86/MCTargetDesc/X86AsmBackend.cpp | 12 ++---
 llvm/lib/Target/X86/X86FrameLowering.cpp      |  4 +-
 llvm/test/CodeGen/PowerPC/huge-frame-size.ll  |  2 +-
 llvm/test/CodeGen/RISCV/pr88365.ll            |  2 +-
 llvm/test/CodeGen/X86/huge-stack.ll           |  2 +-
 19 files changed, 65 insertions(+), 65 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/MachineFrameInfo.h b/llvm/include/llvm/CodeGen/MachineFrameInfo.h
index 466fed7fb3a29..213b7ec6b3fbf 100644
--- a/llvm/include/llvm/CodeGen/MachineFrameInfo.h
+++ b/llvm/include/llvm/CodeGen/MachineFrameInfo.h
@@ -251,7 +251,7 @@ class MachineFrameInfo {
   /// targets, this value is only used when generating debug info (via
   /// TargetRegisterInfo::getFrameIndexReference); when generating code, the
   /// corresponding adjustments are performed directly.
-  int OffsetAdjustment = 0;
+  int64_t OffsetAdjustment = 0;
 
   /// The prolog/epilog code inserter may process objects that require greater
   /// alignment than the default alignment the target provides.
@@ -280,7 +280,7 @@ class MachineFrameInfo {
   /// setup/destroy pseudo instructions (as defined in the TargetFrameInfo
   /// class).  This information is important for frame pointer elimination.
   /// It is only valid during and after prolog/epilog code insertion.
-  unsigned MaxCallFrameSize = ~0u;
+  uint64_t MaxCallFrameSize = ~UINT64_C(0);
 
   /// The number of bytes of callee saved registers that the target wants to
   /// report for the current function in the CodeView S_FRAMEPROC record.
@@ -593,10 +593,10 @@ class MachineFrameInfo {
   uint64_t estimateStackSize(const MachineFunction &MF) const;
 
   /// Return the correction for frame offsets.
-  int getOffsetAdjustment() const { return OffsetAdjustment; }
+  int64_t getOffsetAdjustment() const { return OffsetAdjustment; }
 
   /// Set the correction for frame offsets.
-  void setOffsetAdjustment(int Adj) { OffsetAdjustment = Adj; }
+  void setOffsetAdjustment(int64_t Adj) { OffsetAdjustment = Adj; }
 
   /// Return the alignment in bytes that this function must be aligned to,
   /// which is greater than the default stack alignment provided by the target.
@@ -663,7 +663,7 @@ class MachineFrameInfo {
   /// CallFrameSetup/Destroy pseudo instructions are used by the target, and
   /// then only during or after prolog/epilog code insertion.
   ///
-  unsigned getMaxCallFrameSize() const {
+  uint64_t getMaxCallFrameSize() const {
     // TODO: Enable this assert when targets are fixed.
     //assert(isMaxCallFrameSizeComputed() && "MaxCallFrameSize not computed yet");
     if (!isMaxCallFrameSizeComputed())
@@ -671,9 +671,9 @@ class MachineFrameInfo {
     return MaxCallFrameSize;
   }
   bool isMaxCallFrameSizeComputed() const {
-    return MaxCallFrameSize != ~0u;
+    return MaxCallFrameSize != ~UINT64_C(0);
   }
-  void setMaxCallFrameSize(unsigned S) { MaxCallFrameSize = S; }
+  void setMaxCallFrameSize(uint64_t S) { MaxCallFrameSize = S; }
 
   /// Returns how many bytes of callee-saved registers the target pushed in the
   /// prologue. Only used for debug info.
diff --git a/llvm/include/llvm/CodeGen/TargetFrameLowering.h b/llvm/include/llvm/CodeGen/TargetFrameLowering.h
index 0b9cacecc7cbe..72978b2f746d7 100644
--- a/llvm/include/llvm/CodeGen/TargetFrameLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetFrameLowering.h
@@ -51,7 +51,7 @@ class TargetFrameLowering {
   // Maps a callee saved register to a stack slot with a fixed offset.
   struct SpillSlot {
     unsigned Reg;
-    int Offset; // Offset relative to stack pointer on function entry.
+    int64_t Offset; // Offset relative to stack pointer on function entry.
   };
 
   struct DwarfFrameBase {
@@ -66,7 +66,7 @@ class TargetFrameLowering {
       // Used with FrameBaseKind::Register.
       unsigned Reg;
       // Used with FrameBaseKind::CFA.
-      int Offset;
+      int64_t Offset;
       struct WasmFrameBase WasmLoc;
     } Location;
   };
diff --git a/llvm/include/llvm/MC/MCAsmBackend.h b/llvm/include/llvm/MC/MCAsmBackend.h
index 736f44686689b..d1d1814dd8b52 100644
--- a/llvm/include/llvm/MC/MCAsmBackend.h
+++ b/llvm/include/llvm/MC/MCAsmBackend.h
@@ -225,7 +225,7 @@ class MCAsmBackend {
   virtual void handleAssemblerFlag(MCAssemblerFlag Flag) {}
 
   /// Generate the compact unwind encoding for the CFI instructions.
-  virtual uint32_t generateCompactUnwindEncoding(const MCDwarfFrameInfo *FI,
+  virtual uint64_t generateCompactUnwindEncoding(const MCDwarfFrameInfo *FI,
                                                  const MCContext *Ctxt) const {
     return 0;
   }
diff --git a/llvm/include/llvm/MC/MCDwarf.h b/llvm/include/llvm/MC/MCDwarf.h
index d0e45ab59a92e..7dba67efa22fa 100644
--- a/llvm/include/llvm/MC/MCDwarf.h
+++ b/llvm/include/llvm/MC/MCDwarf.h
@@ -509,11 +509,11 @@ class MCCFIInstruction {
   union {
     struct {
       unsigned Register;
-      int Offset;
+      int64_t Offset;
     } RI;
     struct {
       unsigned Register;
-      int Offset;
+      int64_t Offset;
       unsigned AddressSpace;
     } RIA;
     struct {
@@ -527,7 +527,7 @@ class MCCFIInstruction {
   std::vector<char> Values;
   std::string Comment;
 
-  MCCFIInstruction(OpType Op, MCSymbol *L, unsigned R, int O, SMLoc Loc,
+  MCCFIInstruction(OpType Op, MCSymbol *L, unsigned R, int64_t O, SMLoc Loc,
                    StringRef V = "", StringRef Comment = "")
       : Label(L), Operation(Op), Loc(Loc), Values(V.begin(), V.end()),
         Comment(Comment) {
@@ -539,7 +539,7 @@ class MCCFIInstruction {
     assert(Op == OpRegister);
     U.RR = {R1, R2};
   }
-  MCCFIInstruction(OpType Op, MCSymbol *L, unsigned R, int O, unsigned AS,
+  MCCFIInstruction(OpType Op, MCSymbol *L, unsigned R, int64_t O, unsigned AS,
                    SMLoc Loc)
       : Label(L), Operation(Op), Loc(Loc) {
     assert(Op == OpLLVMDefAspaceCfa);
@@ -555,8 +555,8 @@ class MCCFIInstruction {
 public:
   /// .cfi_def_cfa defines a rule for computing CFA as: take address from
   /// Register and add Offset to it.
-  static MCCFIInstruction cfiDefCfa(MCSymbol *L, unsigned Register, int Offset,
-                                    SMLoc Loc = {}) {
+  static MCCFIInstruction cfiDefCfa(MCSymbol *L, unsigned Register,
+                                    int64_t Offset, SMLoc Loc = {}) {
     return MCCFIInstruction(OpDefCfa, L, Register, Offset, Loc);
   }
 
@@ -564,13 +564,13 @@ class MCCFIInstruction {
   /// on Register will be used instead of the old one. Offset remains the same.
   static MCCFIInstruction createDefCfaRegister(MCSymbol *L, unsigned Register,
                                                SMLoc Loc = {}) {
-    return MCCFIInstruction(OpDefCfaRegister, L, Register, 0, Loc);
+    return MCCFIInstruction(OpDefCfaRegister, L, Register, INT64_C(0), Loc);
   }
 
   /// .cfi_def_cfa_offset modifies a rule for computing CFA. Register
   /// remains the same, but offset is new. Note that it is the absolute offset
   /// that will be added to a defined register to the compute CFA address.
-  static MCCFIInstruction cfiDefCfaOffset(MCSymbol *L, int Offset,
+  static MCCFIInstruction cfiDefCfaOffset(MCSymbol *L, int64_t Offset,
                                           SMLoc Loc = {}) {
     return MCCFIInstruction(OpDefCfaOffset, L, 0, Offset, Loc);
   }
@@ -578,7 +578,7 @@ class MCCFIInstruction {
   /// .cfi_adjust_cfa_offset Same as .cfi_def_cfa_offset, but
   /// Offset is a relative value that is added/subtracted from the previous
   /// offset.
-  static MCCFIInstruction createAdjustCfaOffset(MCSymbol *L, int Adjustment,
+  static MCCFIInstruction createAdjustCfaOffset(MCSymbol *L, int64_t Adjustment,
                                                 SMLoc Loc = {}) {
     return MCCFIInstruction(OpAdjustCfaOffset, L, 0, Adjustment, Loc);
   }
@@ -588,7 +588,7 @@ class MCCFIInstruction {
   /// be the result of evaluating the DWARF operation expression
   /// `DW_OP_constu AS; DW_OP_aspace_bregx R, B` as a location description.
   static MCCFIInstruction createLLVMDefAspaceCfa(MCSymbol *L, unsigned Register,
-                                                 int Offset,
+                                                 int64_t Offset,
                                                  unsigned AddressSpace,
                                                  SMLoc Loc) {
     return MCCFIInstruction(OpLLVMDefAspaceCfa, L, Register, Offset,
@@ -598,7 +598,7 @@ class MCCFIInstruction {
   /// .cfi_offset Previous value of Register is saved at offset Offset
   /// from CFA.
   static MCCFIInstruction createOffset(MCSymbol *L, unsigned Register,
-                                       int Offset, SMLoc Loc = {}) {
+                                       int64_t Offset, SMLoc Loc = {}) {
     return MCCFIInstruction(OpOffset, L, Register, Offset, Loc);
   }
 
@@ -606,7 +606,7 @@ class MCCFIInstruction {
   /// Offset from the current CFA register. This is transformed to .cfi_offset
   /// using the known displacement of the CFA register from the CFA.
   static MCCFIInstruction createRelOffset(MCSymbol *L, unsigned Register,
-                                          int Offset, SMLoc Loc = {}) {
+                                          int64_t Offset, SMLoc Loc = {}) {
     return MCCFIInstruction(OpRelOffset, L, Register, Offset, Loc);
   }
 
@@ -619,12 +619,12 @@ class MCCFIInstruction {
 
   /// .cfi_window_save SPARC register window is saved.
   static MCCFIInstruction createWindowSave(MCSymbol *L, SMLoc Loc = {}) {
-    return MCCFIInstruction(OpWindowSave, L, 0, 0, Loc);
+    return MCCFIInstruction(OpWindowSave, L, 0, INT64_C(0), Loc);
   }
 
   /// .cfi_negate_ra_state AArch64 negate RA state.
   static MCCFIInstruction createNegateRAState(MCSymbol *L, SMLoc Loc = {}) {
-    return MCCFIInstruction(OpNegateRAState, L, 0, 0, Loc);
+    return MCCFIInstruction(OpNegateRAState, L, 0, INT64_C(0), Loc);
   }
 
   /// .cfi_restore says that the rule for Register is now the same as it
@@ -632,31 +632,31 @@ class MCCFIInstruction {
   /// by .cfi_startproc were executed.
   static MCCFIInstruction createRestore(MCSymbol *L, unsigned Register,
                                         SMLoc Loc = {}) {
-    return MCCFIInstruction(OpRestore, L, Register, 0, Loc);
+    return MCCFIInstruction(OpRestore, L, Register, INT64_C(0), Loc);
   }
 
   /// .cfi_undefined From now on the previous value of Register can't be
   /// restored anymore.
   static MCCFIInstruction createUndefined(MCSymbol *L, unsigned Register,
                                           SMLoc Loc = {}) {
-    return MCCFIInstruction(OpUndefined, L, Register, 0, Loc);
+    return MCCFIInstruction(OpUndefined, L, Register, INT64_C(0), Loc);
   }
 
   /// .cfi_same_value Current value of Register is the same as in the
   /// previous frame. I.e., no restoration is needed.
   static MCCFIInstruction createSameValue(MCSymbol *L, unsigned Register,
                                           SMLoc Loc = {}) {
-    return MCCFIInstruction(OpSameValue, L, Register, 0, Loc);
+    return MCCFIInstruction(OpSameValue, L, Register, INT64_C(0), Loc);
   }
 
   /// .cfi_remember_state Save all current rules for all registers.
   static MCCFIInstruction createRememberState(MCSymbol *L, SMLoc Loc = {}) {
-    return MCCFIInstruction(OpRememberState, L, 0, 0, Loc);
+    return MCCFIInstruction(OpRememberState, L, 0, INT64_C(0), Loc);
   }
 
   /// .cfi_restore_state Restore the previously saved state.
   static MCCFIInstruction createRestoreState(MCSymbol *L, SMLoc Loc = {}) {
-    return MCCFIInstruction(OpRestoreState, L, 0, 0, Loc);
+    return MCCFIInstruction(OpRestoreState, L, 0, INT64_C(0), Loc);
   }
 
   /// .cfi_escape Allows the user to add arbitrary bytes to the unwind
@@ -667,7 +667,7 @@ class MCCFIInstruction {
   }
 
   /// A special wrapper for .cfi_escape that indicates GNU_ARGS_SIZE
-  static MCCFIInstruction createGnuArgsSize(MCSymbol *L, int Size,
+  static MCCFIInstruction createGnuArgsSize(MCSymbol *L, int64_t Size,
                                             SMLoc Loc = {}) {
     return MCCFIInstruction(OpGnuArgsSize, L, 0, Size, Loc);
   }
@@ -702,7 +702,7 @@ class MCCFIInstruction {
     return U.RIA.AddressSpace;
   }
 
-  int getOffset() const {
+  int64_t getOffset() const {
     if (Operation == OpLLVMDefAspaceCfa)
       return U.RIA.Offset;
     assert(Operation == OpDefCfa || Operation == OpOffset ||
@@ -736,7 +736,7 @@ struct MCDwarfFrameInfo {
   unsigned CurrentCfaRegister = 0;
   unsigned PersonalityEncoding = 0;
   unsigned LsdaEncoding = 0;
-  uint32_t CompactUnwindEncoding = 0;
+  uint64_t CompactUnwindEncoding = 0;
   bool IsSignalFrame = false;
   bool IsSimple = false;
   unsigned RAReg = static_cast<unsigned>(INT_MAX);
diff --git a/llvm/lib/CodeGen/CFIInstrInserter.cpp b/llvm/lib/CodeGen/CFIInstrInserter.cpp
index 1ff01ad34b30e..06de92515c044 100644
--- a/llvm/lib/CodeGen/CFIInstrInserter.cpp
+++ b/llvm/lib/CodeGen/CFIInstrInserter.cpp
@@ -68,9 +68,9 @@ class CFIInstrInserter : public MachineFunctionPass {
   struct MBBCFAInfo {
     MachineBasicBlock *MBB;
     /// Value of cfa offset valid at basic block entry.
-    int IncomingCFAOffset = -1;
+    int64_t IncomingCFAOffset = -1;
     /// Value of cfa offset valid at basic block exit.
-    int OutgoingCFAOffset = -1;
+    int64_t OutgoingCFAOffset = -1;
     /// Value of cfa register valid at basic block entry.
     unsigned IncomingCFARegister = 0;
     /// Value of cfa register valid at basic block exit.
@@ -120,7 +120,7 @@ class CFIInstrInserter : public MachineFunctionPass {
   /// Return the cfa offset value that should be set at the beginning of a MBB
   /// if needed. The negated value is needed when creating CFI instructions that
   /// set absolute offset.
-  int getCorrectCFAOffset(MachineBasicBlock *MBB) {
+  int64_t getCorrectCFAOffset(MachineBasicBlock *MBB) {
     return MBBVector[MBB->getNumber()].IncomingCFAOffset;
   }
 
@@ -175,7 +175,7 @@ void CFIInstrInserter::calculateCFAInfo(MachineFunction &MF) {
 
 void CFIInstrInserter::calculateOutgoingCFAInfo(MBBCFAInfo &MBBInfo) {
   // Outgoing cfa offset set by the block.
-  int SetOffset = MBBInfo.IncomingCFAOffset;
+  int64_t SetOffset = MBBInfo.IncomingCFAOffset;
   // Outgoing cfa register set by the block.
   unsigned SetRegister = MBBInfo.IncomingCFARegister;
   MachineFunction *MF = MBBInfo.MBB->getParent();
@@ -188,7 +188,7 @@ void CFIInstrInserter::calculateOutgoingCFAInfo(MBBCFAInfo &MBBInfo) {
   for (MachineInstr &MI : *MBBInfo.MBB) {
     if (MI.isCFIInstruction()) {
       std::optional<unsigned> CSRReg;
-      std::optional<int> CSROffset;
+      std::optional<int64_t> CSROffset;
       unsigned CFIIndex = MI.getOperand(0).getCFIIndex();
       const MCCFIInstruction &CFI = Instrs[CFIIndex];
       switch (CFI.getOperation()) {
diff --git a/llvm/lib/CodeGen/MachineFrameInfo.cpp b/llvm/lib/CodeGen/MachineFrameInfo.cpp
index 853de4c88caeb..e4b993850f73d 100644
--- a/llvm/lib/CodeGen/MachineFrameInfo.cpp
+++ b/llvm/lib/CodeGen/MachineFrameInfo.cpp
@@ -197,7 +197,7 @@ void MachineFrameInfo::computeMaxCallFrameSize(
     for (MachineInstr &MI : MBB) {
       unsigned Opcode = MI.getOpcode();
       if (Opcode == FrameSetupOpcode || Opcode == FrameDestroyOpcode) {
-        unsigned Size = TII.getFrameSize(MI);
+        uint64_t Size = TII.getFrameSize(MI);
         MaxCallFrameSize = std::max(MaxCallFrameSize, Size);
         if (FrameSDOps != nullptr)
           FrameSDOps->push_back(&MI);
diff --git a/llvm/lib/CodeGen/PrologEpilogInserter.cpp b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
index 3db5e17615fd4..cd5d877e53d82 100644
--- a/llvm/lib/CodeGen/PrologEpilogInserter.cpp
+++ b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
@@ -366,8 +366,8 @@ void PEI::calculateCallFrameInfo(MachineFunction &MF) {
     return;
 
   // (Re-)Compute the MaxCallFrameSize.
-  [[maybe_unused]] uint32_t MaxCFSIn =
-      MFI.isMaxCallFrameSizeComputed() ? MFI.getMaxCallFrameSize() : UINT32_MAX;
+  [[maybe_unused]] uint64_t MaxCFSIn =
+      MFI.isMaxCallFrameSizeComputed() ? MFI.getMaxCallFrameSize() : UINT64_MAX;
   std::vector<MachineBasicBlock::iterator> FrameSDOps;
   MFI.computeMaxCallFrameSize(MF, &FrameSDOps);
   assert(MFI.getMaxCallFrameSize() <= MaxCFSIn &&
diff --git a/llvm/lib/MC/MCDwarf.cpp b/llvm/lib/MC/MCDwarf.cpp
index efafd555c5c5c..1297dc3828b58 100644
--- a/llvm/lib/MC/MCDwarf.cpp
+++ b/llvm/lib/MC/MCDwarf.cpp
@@ -1299,8 +1299,8 @@ static void EmitPersonality(MCStreamer &streamer, const MCSymbol &symbol,
 namespace {
 
 class FrameEmitterImpl {
-  int CFAOffset = 0;
-  int InitialCFAOffset = 0;
+  int64_t CFAOffset = 0;
+  int64_t InitialCFAOffset = 0;
   bool IsEH;
   MCObjectStreamer &Streamer;
 
@@ -1414,7 +1414,7 @@ void FrameEmitterImpl::emitCFIInstruction(const MCCFIInstruction &Instr) {
     if (!IsEH)
       Reg = MRI->getDwarfRegNumFromDwarfEHRegNum(Reg);
 
-    int Offset = Instr.getOffset();
+    int64_t Offset = Instr.getOffset();
     if (IsRelative)
       Offset -= CFAOffset;
     Offset = Offset / dataAlignmentFactor;
diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp
index be470c71ae8b6..be34a649e1c4b 100644
--- a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp
+++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp
@@ -599,7 +599,7 @@ class DarwinAArch64AsmBackend : public AArch64AsmBackend {
   }
 
   /// Generate the compact unwind encoding from the CFI directives.
-  uint32_t generateCompactUnwindEncoding(const MCDwarfFrameInfo *FI,
+  uint64_t generateCompactUnwindEncoding(const MCDwarfFrameInfo *FI,
                                          const MCContext *Ctxt) const override {
     ArrayRef<MCCFIInstruction> Instrs = FI->Instructions;
     if (Instrs.empty())
@@ -609,10 +609,10 @@ class DarwinAArch64AsmBackend : public AArch64AsmBackend {
       return CU::UNWIND_ARM64_MODE_DWARF;
 
     bool HasFP = false;
-    unsigned StackSize = 0;
+    uint64_t StackSize = 0;
 
-    uint32_t CompactUnwindEncoding = 0;
-    int CurOffset = 0;
+    uint64_t CompactUnwindEncoding = 0;
+    int64_t CurOffset = 0;
     for (size_t i = 0, e = Instrs.size(); i != e; ++i) {
       const MCCFIInstruction &Inst = Instrs[i];
 
diff --git a/llvm/lib/Target/ARM/ARMFrameLowering.cpp b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
index 62d01b9f7e90b..40354f9955989 100644
--- a/llvm/lib/Target/ARM/ARMFrameLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
@@ -1166,7 +1166,7 @@ void ARMFrameLowering::emitPrologue(MachineFunction &MF,
         if (STI.splitFramePushPop(MF)) {
           unsigned DwarfReg = MRI->getDwarfRegNum(
               Reg == ARM::R12 ? ARM::RA_AUTH_CODE : Reg, true);
-          unsigned Offset = MFI.getObjectOffset(FI);
+          int64_t Offset = MFI.getObjectOffset(FI);
           unsigned CFIIndex = MF.addFrameInst(
               MCCFIInstruction::createOffset(nullptr, DwarfReg, Offset));
           BuildMI(MBB, Pos, dl, TII.get(TargetOpcode::CFI_INSTRUCTION))
@@ -1188,7 +1188,7 @@ void ARMFrameLowering::emitPrologue(MachineFunction &MF,
       if ((Reg >= ARM::D0 && Reg <= ARM::D31) &&
           (Reg < ARM::D8 || Reg >= ARM::D8 + AFI->getNumAlignedDPRCS2Regs())) {
         unsigned DwarfReg = MRI->getDwarfRegNum(Reg, true);
-        unsigned Offset = MFI.getObjectOffset(FI);
+        int64_t Offset = MFI.getObjectOffset(FI);
         unsigned CFIIndex = MF.addFrameInst(
             MCCFIInstruction::createOffset(nullptr, DwarfReg, Offset));
         BuildMI(MBB, Pos, dl, TII.get(TargetOpcode::CFI_INSTRUCTION))
diff --git a/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp b/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp
index eb55a2b5e70b8..994b43f1abb49 100644
--- a/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp
+++ b/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp
@@ -1146,7 +1146,7 @@ enum CompactUnwindEncodings {
 /// instructions. If the CFI instructions describe a frame that cannot be
 /// encoded in compact unwind, the method returns UNWIND_ARM_MODE_DWARF which
 /// tells the runtime to fallback and unwind using dwarf.
-uint32_t ARMAsmBackendDarwin::generateCompactUnwindEncoding(
+uint64_t ARMAsmBackendDarwin::generateCompactUnwindEncoding(
     const MCDwarfFrameInfo *FI, const MCContext *Ctxt) const {
   DEBUG_WITH_TYPE("compact-unwind", llvm::dbgs() << "generateCU()\n");
   // Only armv7k uses CFI based unwinding.
diff --git a/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackendDarwin.h b/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackendDarwin.h
index ac0c9b101cae1..9c958003ca756 100644
--- a/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackendDarwin.h
+++ b/llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackendDarwin.h
@@ -34,7 +34,7 @@ class ARMAsmBackendDarwin : public ARMAsmBackend {
         /*Is64Bit=*/false, cantFail(MachO::getCPUType(TT)), Subtype);
   }
 
-  uint32_t generateCompactUnwindEncoding(const MCDwarfFrameInfo *FI,
+  uint64_t generateCompactUnwindEncoding(const MCDwarfFrameInfo *FI,
                                          const MCContext *Ctxt) const override;
 };
 } // end namespace llvm
diff --git a/llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp b/llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp
index 6ca18528591af..05357de40e3a9 100644
--- a/llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp
+++ b/llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp
@@ -1659,7 +1659,7 @@ bool HexagonFrameLowering::assignCalleeSavedSpillSlots(MachineFunction &MF,
   using SpillSlot = TargetFrameLowering::SpillSlot;
 
   unsigned NumFixed;
-  int MinOffset = 0;  // CS offsets are negative.
+  int64_t MinOffset = 0; // CS offsets are negative.
   const SpillSlot *FixedSlots = getCalleeSavedSpillSlots(NumFixed);
   for (const SpillSlot *S = FixedSlots; S != FixedSlots+NumFixed; ++S) {
     if (!SRegs[S->Reg])
@@ -1678,7 +1678,7 @@ bool HexagonFrameLowering::assignCalleeSavedSpillSlots(MachineFunction &MF,
     Register R = x;
     const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(R);
     unsigned Size = TRI->getSpillSize(*RC);
-    int Off = MinOffset - Size;
+    int64_t Off = MinOffset - Size;
     Align Alignment = std::min(TRI->getSpillAlign(*RC), getStackAlign());
     Off &= -Alignment.value();
     int FI = MFI.CreateFixedSpillStackObject(Size, Off);
diff --git a/llvm/lib/Target/MSP430/MSP430FrameLowering.cpp b/llvm/lib/Target/MSP430/MSP430FrameLowering.cpp
index f4d703ebeeab2..d0dc6dd146efd 100644
--- a/llvm/lib/Target/MSP430/MSP430FrameLowering.cpp
+++ b/llvm/lib/Target/MSP430/MSP430FrameLowering.cpp
@@ -293,7 +293,7 @@ void MSP430FrameLowering::emitEpilogue(MachineFunction &MF,
 
   if (!hasFP(MF)) {
     MBBI = FirstCSPop;
-    int64_t Offset = -CSSize - 2;
+    int64_t Offset = -(int64_t)CSSize - 2;
     // Mark callee-saved pop instruction.
     // Define the current CFA rule to use the provided offset.
     while (MBBI != MBB.end()) {
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
index a3ef11b2cab45..fcc61d0a5e2f6 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
@@ -1312,7 +1312,7 @@ class DarwinX86AsmBackend : public X86AsmBackend {
 
   /// Implementation of algorithm to generate the compact unwind encoding
   /// for the CFI instructions.
-  uint32_t generateCompactUnwindEncoding(const MCDwarfFrameInfo *FI,
+  uint64_t generateCompactUnwindEncoding(const MCDwarfFrameInfo *FI,
                                          const MCContext *Ctxt) const override {
     ArrayRef<MCCFIInstruction> Instrs = FI->Instructions;
     if (Instrs.empty()) return 0;
@@ -1327,13 +1327,13 @@ class DarwinX86AsmBackend : public X86AsmBackend {
     bool HasFP = false;
 
     // Encode that we are using EBP/RBP as the frame pointer.
-    uint32_t CompactUnwindEncoding = 0;
+    uint64_t CompactUnwindEncoding = 0;
 
     unsigned SubtractInstrIdx = Is64Bit ? 3 : 2;
     unsigned InstrOffset = 0;
     unsigned StackAdjust = 0;
-    unsigned StackSize = 0;
-    int MinAbsOffset = std::numeric_limits<int>::max();
+    uint64_t StackSize = 0;
+    int64_t MinAbsOffset = std::numeric_limits<int64_t>::max();
 
     for (const MCCFIInstruction &Inst : Instrs) {
       switch (Inst.getOperation()) {
@@ -1360,7 +1360,7 @@ class DarwinX86AsmBackend : public X86AsmBackend {
         memset(SavedRegs, 0, sizeof(SavedRegs));
         StackAdjust = 0;
         SavedRegIdx = 0;
-        MinAbsOffset = std::numeric_limits<int>::max();
+        MinAbsOffset = std::numeric_limits<int64_t>::max();
         InstrOffset += MoveInstrSize;
         break;
       }
@@ -1403,7 +1403,7 @@ class DarwinX86AsmBackend : public X86AsmBackend {
         unsigned Reg = *MRI.getLLVMRegNum(Inst.getRegister(), true);
         SavedRegs[SavedRegIdx++] = Reg;
         StackAdjust += OffsetSize;
-        MinAbsOffset = std::min(MinAbsOffset, abs(Inst.getOffset()));
+        MinAbsOffset = std::min(MinAbsOffset, std::abs(Inst.getOffset()));
         InstrOffset += PushInstrSize(Reg);
         break;
       }
diff --git a/llvm/lib/Target/X86/X86FrameLowering.cpp b/llvm/lib/Target/X86/X86FrameLowering.cpp
index 0ff50d8ef678e..bdc9a0d29670a 100644
--- a/llvm/lib/Target/X86/X86FrameLowering.cpp
+++ b/llvm/lib/Target/X86/X86FrameLowering.cpp
@@ -473,7 +473,7 @@ void X86FrameLowering::emitCalleeSavedFrameMovesFullCFA(
                                : FramePtr;
   unsigned DwarfReg = MRI->getDwarfRegNum(MachineFramePtr, true);
   // Offset = space for return address + size of the frame pointer itself.
-  unsigned Offset = (Is64Bit ? 8 : 4) + (Uses64BitFramePtr ? 8 : 4);
+  int64_t Offset = (Is64Bit ? 8 : 4) + (Uses64BitFramePtr ? 8 : 4);
   BuildCFI(MBB, MBBI, DebugLoc{},
            MCCFIInstruction::createOffset(nullptr, DwarfReg, -Offset));
   emitCalleeSavedFrameMoves(MBB, MBBI, DebugLoc{}, true);
@@ -2553,7 +2553,7 @@ void X86FrameLowering::emitEpilogue(MachineFunction &MF,
 
   if (!HasFP && NeedsDwarfCFI) {
     MBBI = FirstCSPop;
-    int64_t Offset = -CSSize - SlotSize;
+    int64_t Offset = -(int64_t)CSSize - SlotSize;
     // Mark callee-saved pop instruction.
     // Define the current CFA rule to use the provided offset.
     while (MBBI != MBB.end()) {
diff --git a/llvm/test/CodeGen/PowerPC/huge-frame-size.ll b/llvm/test/CodeGen/PowerPC/huge-frame-size.ll
index f1039df6f549a..78bdac021ac8a 100644
--- a/llvm/test/CodeGen/PowerPC/huge-frame-size.ll
+++ b/llvm/test/CodeGen/PowerPC/huge-frame-size.ll
@@ -18,7 +18,7 @@ define void @foo(i8 %x) {
 ; CHECK-LE-NEXT:    oris 0, 0, 65535
 ; CHECK-LE-NEXT:    ori 0, 0, 65504
 ; CHECK-LE-NEXT:    stdux 1, 1, 0
-; CHECK-LE-NEXT:    .cfi_def_cfa_offset 32
+; CHECK-LE-NEXT:    .cfi_def_cfa_offset 4294967328
 ; CHECK-LE-NEXT:    li 4, 1
 ; CHECK-LE-NEXT:    addi 5, 1, 32
 ; CHECK-LE-NEXT:    stb 3, 32(1)
diff --git a/llvm/test/CodeGen/RISCV/pr88365.ll b/llvm/test/CodeGen/RISCV/pr88365.ll
index 73010fdf40447..4e4dead98ee69 100644
--- a/llvm/test/CodeGen/RISCV/pr88365.ll
+++ b/llvm/test/CodeGen/RISCV/pr88365.ll
@@ -10,7 +10,7 @@ define void @foo() {
 ; CHECK-NEXT:    .cfi_offset ra, -4
 ; CHECK-NEXT:    li a0, -2048
 ; CHECK-NEXT:    sub sp, sp, a0
-; CHECK-NEXT:    .cfi_def_cfa_offset -16
+; CHECK-NEXT:    .cfi_def_cfa_offset 4294967280
 ; CHECK-NEXT:    addi a0, sp, 4
 ; CHECK-NEXT:    call use
 ; CHECK-NEXT:    li a0, -2048
diff --git a/llvm/test/CodeGen/X86/huge-stack.ll b/llvm/test/CodeGen/X86/huge-stack.ll
index a7ceb4a4ee6fe..920033ba1182c 100644
--- a/llvm/test/CodeGen/X86/huge-stack.ll
+++ b/llvm/test/CodeGen/X86/huge-stack.ll
@@ -7,7 +7,7 @@ define void @foo() unnamed_addr #0 {
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    movabsq $8589934462, %rax # imm = 0x1FFFFFF7E
 ; CHECK-NEXT:    subq %rax, %rsp
-; CHECK-NEXT:    .cfi_def_cfa_offset -122
+; CHECK-NEXT:    .cfi_def_cfa_offset 8589934470
 ; CHECK-NEXT:    movb $42, -129(%rsp)
 ; CHECK-NEXT:    movb $43, -128(%rsp)
 ; CHECK-NEXT:    movabsq $8589934462, %rax # imm = 0x1FFFFFF7E
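
The updated CHECK lines above are consistent with a 64-bit offset that used to be squeezed through a 32-bit variable: each old value equals the low 32 bits of the new one. The sketch below reproduces that arithmetic, and also the unsigned-negation wrap that the `-(int64_t)CSSize` casts avoid. The helper names are hypothetical and not part of LLVM; the sketch assumes a 32-bit `unsigned` and two's-complement narrowing (guaranteed only from C++20 on, but what mainstream compilers do in practice).

```cpp
#include <cstdint>

// Hypothetical helper: the value a 64-bit CFA offset takes when it is
// stored in a 32-bit `int`, as the pre-patch code did.
std::int32_t truncateTo32(std::int64_t Offset) {
  return static_cast<std::int32_t>(Offset);
}

// Hypothetical helper mirroring the old `-CSSize - SlotSize` expression.
// Both operands are unsigned, so the negation wraps around before the
// result is widened to 64 bits.
std::int64_t offsetWithUnsignedNegate(unsigned CSSize, unsigned SlotSize) {
  return -CSSize - SlotSize;
}

// Hypothetical helper mirroring the patched expression: widening to a
// signed 64-bit type first makes the negation behave as intended.
std::int64_t offsetWithSignedNegate(unsigned CSSize, unsigned SlotSize) {
  return -(std::int64_t)CSSize - SlotSize;
}

// truncateTo32(8589934470) == -122   (X86 huge-stack.ll)
// truncateTo32(4294967328) == 32     (PowerPC huge-frame-size.ll)
// truncateTo32(4294967280) == -16    (RISCV pr88365.ll)
```

With `CSSize == 8` and `SlotSize == 8`, the unsigned form yields 4294967280 while the signed form yields -16.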

>From 07e9f01f8935c0006bdaf209acdce29cda7685d3 Mon Sep 17 00:00:00 2001
From: cor3ntin <corentinjabot at gmail.com>
Date: Wed, 24 Jul 2024 17:28:44 +0200
Subject: [PATCH 26/91] [Clang] Fix an assertion failure introduced by #93430
 (#100313)

PR #93430 introduced an assertion that did not make sense and caused a
regression. The fix is simply to remove the assertion.
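
A minimal sketch of the kind of code that reaches the removed assertion, modelled on the test added by this patch: `fref` is a variable of reference-to-function type, so the name inside the parenthesized address-of expression refers to a variable declaration rather than a function declaration. The wrapper `callThroughRef` is hypothetical and only there to make the call observable.

```cpp
// Same declarations as the ones added to cxx2b-deducing-this.cpp.
int bar(void) { return 55; }
int (&fref)(void) = bar;

// Hypothetical wrapper: takes the address of the function that `fref`
// is bound to and calls it through the resulting pointer.
int callThroughRef() {
  return (&fref)();
}
```

This is valid C++ and simply calls `bar`, which is why an assertion that expects a function declaration at this point is too strict.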

No changelog entry: the intent is to backport this fix to Clang 19.

(cherry picked from commit dd82a84e0eeafb017c7220c4a9fbd0a8a407f8a9)
---
 clang/lib/Sema/SemaExpr.cpp                | 1 -
 clang/test/SemaCXX/cxx2b-deducing-this.cpp | 6 ++++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 9207bf7a41349..206194930f3b4 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -5730,7 +5730,6 @@ static bool isParenthetizedAndQualifiedAddressOfExpr(Expr *Fn) {
   if (!UO || UO->getOpcode() != clang::UO_AddrOf)
     return false;
   if (auto *DRE = dyn_cast<DeclRefExpr>(UO->getSubExpr()->IgnoreParens())) {
-    assert(isa<FunctionDecl>(DRE->getDecl()) && "expected a function");
     return DRE->hasQualifier();
   }
   if (auto *OVL = dyn_cast<OverloadExpr>(UO->getSubExpr()->IgnoreParens()))
diff --git a/clang/test/SemaCXX/cxx2b-deducing-this.cpp b/clang/test/SemaCXX/cxx2b-deducing-this.cpp
index 5cbc1f735383b..4811b6052254c 100644
--- a/clang/test/SemaCXX/cxx2b-deducing-this.cpp
+++ b/clang/test/SemaCXX/cxx2b-deducing-this.cpp
@@ -895,6 +895,10 @@ void g() {
 }
 
 namespace P2797 {
+
+int bar(void) { return 55; }
+int (&fref)(void) = bar;
+
 struct C {
   void c(this const C&);    // #first
   void c() &;               // #second
@@ -915,6 +919,8 @@ struct C {
     (&C::c)(C{});
     (&C::c)(*this);     // expected-error {{call to non-static member function without an object argument}}
     (&C::c)();
+
+    (&fref)();
   }
 };
 }

>From 12a11dc676fd36d790b705b918597877fb34772a Mon Sep 17 00:00:00 2001
From: cor3ntin <corentinjabot at gmail.com>
Date: Wed, 24 Jul 2024 17:27:58 +0200
Subject: [PATCH 27/91] [Clang][NFC] Simplify initialization of
 `OverloadCandidate` objects. (#100318)

Initialize some fields of OverloadCandidate in its constructor. The goal
here is to try to fix a read of an uninitialized variable (which I was
not able to reproduce):
https://github.com/llvm/llvm-project/pull/93430#issuecomment-2187544278

We should certainly try to improve the construction of
`OverloadCandidate` further, as it can be quite brittle.
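
The pattern can be sketched as follows, using a hypothetical, heavily simplified stand-in for `OverloadCandidate` (the flag names follow the patch; the bit-field layout and the factory function are assumptions made for illustration). Defaulting the flags in the constructor means each call site only sets the flags that differ from the default, so a forgotten assignment can no longer leave a flag uninitialized.

```cpp
// Hypothetical, simplified stand-in for clang's OverloadCandidate.
struct Candidate {
  bool IsSurrogate : 1;
  bool IgnoreObjectArgument : 1;
  bool TookAddressOfOverload : 1;

  // Every flag gets a defined value at construction time.
  Candidate()
      : IsSurrogate(false), IgnoreObjectArgument(false),
        TookAddressOfOverload(false) {}
};

// Hypothetical call site: only the non-default flag is assigned,
// the other two keep the value set by the constructor.
Candidate makeSurrogateCandidate() {
  Candidate C;
  C.IsSurrogate = true;
  return C;
}
```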

(cherry picked from commit 7d787df5b932b73aae6532d1e981152f103f9244)
---
 clang/include/clang/Sema/Overload.h |  4 +++-
 clang/lib/Sema/SemaOverload.cpp     | 20 +-------------------
 2 files changed, 4 insertions(+), 20 deletions(-)

diff --git a/clang/include/clang/Sema/Overload.h b/clang/include/clang/Sema/Overload.h
index 9d8b797af6663..26ffe057c74a2 100644
--- a/clang/include/clang/Sema/Overload.h
+++ b/clang/include/clang/Sema/Overload.h
@@ -998,7 +998,9 @@ class Sema;
   private:
     friend class OverloadCandidateSet;
     OverloadCandidate()
-        : IsSurrogate(false), IsADLCandidate(CallExpr::NotADL), RewriteKind(CRK_None) {}
+        : IsSurrogate(false), IgnoreObjectArgument(false),
+          TookAddressOfOverload(false), IsADLCandidate(CallExpr::NotADL),
+          RewriteKind(CRK_None) {}
   };
 
   /// OverloadCandidateSet - A set of overload candidates, used in C++
diff --git a/clang/lib/Sema/SemaOverload.cpp b/clang/lib/Sema/SemaOverload.cpp
index a8d250fbabfed..554a2df14bea6 100644
--- a/clang/lib/Sema/SemaOverload.cpp
+++ b/clang/lib/Sema/SemaOverload.cpp
@@ -6857,10 +6857,7 @@ void Sema::AddOverloadCandidate(
   Candidate.Viable = true;
   Candidate.RewriteKind =
       CandidateSet.getRewriteInfo().getRewriteKind(Function, PO);
-  Candidate.IsSurrogate = false;
   Candidate.IsADLCandidate = IsADLCandidate;
-  Candidate.IgnoreObjectArgument = false;
-  Candidate.TookAddressOfOverload = false;
   Candidate.ExplicitCallArguments = Args.size();
 
   // Explicit functions are not actually candidates at all if we're not
@@ -7422,8 +7419,6 @@ Sema::AddMethodCandidate(CXXMethodDecl *Method, DeclAccessPair FoundDecl,
   Candidate.Function = Method;
   Candidate.RewriteKind =
       CandidateSet.getRewriteInfo().getRewriteKind(Method, PO);
-  Candidate.IsSurrogate = false;
-  Candidate.IgnoreObjectArgument = false;
   Candidate.TookAddressOfOverload =
       CandidateSet.getKind() == OverloadCandidateSet::CSK_AddressOfOverloadSet;
   Candidate.ExplicitCallArguments = Args.size();
@@ -7617,7 +7612,6 @@ void Sema::AddMethodTemplateCandidate(
     Candidate.IgnoreObjectArgument =
         cast<CXXMethodDecl>(Candidate.Function)->isStatic() ||
         ObjectType.isNull();
-    Candidate.TookAddressOfOverload = false;
     Candidate.ExplicitCallArguments = Args.size();
     if (Result == TemplateDeductionResult::NonDependentConversionFailure)
       Candidate.FailureKind = ovl_fail_bad_conversion;
@@ -7705,7 +7699,6 @@ void Sema::AddTemplateOverloadCandidate(
     Candidate.IgnoreObjectArgument =
         isa<CXXMethodDecl>(Candidate.Function) &&
         !isa<CXXConstructorDecl>(Candidate.Function);
-    Candidate.TookAddressOfOverload = false;
     Candidate.ExplicitCallArguments = Args.size();
     if (Result == TemplateDeductionResult::NonDependentConversionFailure)
       Candidate.FailureKind = ovl_fail_bad_conversion;
@@ -7886,9 +7879,6 @@ void Sema::AddConversionCandidate(
   OverloadCandidate &Candidate = CandidateSet.addCandidate(1);
   Candidate.FoundDecl = FoundDecl;
   Candidate.Function = Conversion;
-  Candidate.IsSurrogate = false;
-  Candidate.IgnoreObjectArgument = false;
-  Candidate.TookAddressOfOverload = false;
   Candidate.FinalConversion.setAsIdentityConversion();
   Candidate.FinalConversion.setFromType(ConvType);
   Candidate.FinalConversion.setAllToTypes(ToType);
@@ -8084,9 +8074,6 @@ void Sema::AddTemplateConversionCandidate(
     Candidate.Function = FunctionTemplate->getTemplatedDecl();
     Candidate.Viable = false;
     Candidate.FailureKind = ovl_fail_bad_deduction;
-    Candidate.IsSurrogate = false;
-    Candidate.IgnoreObjectArgument = false;
-    Candidate.TookAddressOfOverload = false;
     Candidate.ExplicitCallArguments = 1;
     Candidate.DeductionFailure = MakeDeductionFailureInfo(Context, Result,
                                                           Info);
@@ -8119,10 +8106,8 @@ void Sema::AddSurrogateCandidate(CXXConversionDecl *Conversion,
   Candidate.FoundDecl = FoundDecl;
   Candidate.Function = nullptr;
   Candidate.Surrogate = Conversion;
-  Candidate.Viable = true;
   Candidate.IsSurrogate = true;
-  Candidate.IgnoreObjectArgument = false;
-  Candidate.TookAddressOfOverload = false;
+  Candidate.Viable = true;
   Candidate.ExplicitCallArguments = Args.size();
 
   // Determine the implicit conversion sequence for the implicit
@@ -8328,9 +8313,6 @@ void Sema::AddBuiltinCandidate(QualType *ParamTys, ArrayRef<Expr *> Args,
   OverloadCandidate &Candidate = CandidateSet.addCandidate(Args.size());
   Candidate.FoundDecl = DeclAccessPair::make(nullptr, AS_none);
   Candidate.Function = nullptr;
-  Candidate.IsSurrogate = false;
-  Candidate.IgnoreObjectArgument = false;
-  Candidate.TookAddressOfOverload = false;
   std::copy(ParamTys, ParamTys + Args.size(), Candidate.BuiltinParamTypes);
 
   // Determine the implicit conversion sequences for each of the

>From 5c03c4fc269039dad8db96c298aaacf819b32125 Mon Sep 17 00:00:00 2001
From: Louis Dionne <ldionne.2 at gmail.com>
Date: Wed, 24 Jul 2024 11:03:52 -0500
Subject: [PATCH 28/91] [libc++] Improve behavior when using relative path for
 LIBCXX_ASSERTION_HANDLER_FILE (#100157)

Fixes #80696

(cherry picked from commit 046a17717d9c5b5385ecd914621b48bdd91524d0)
---
 libcxx/CMakeLists.txt          | 8 ++++++--
 libcxx/docs/BuildingLibcxx.rst | 3 ++-
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/libcxx/CMakeLists.txt b/libcxx/CMakeLists.txt
index 332816b15260a..674082c7d1787 100644
--- a/libcxx/CMakeLists.txt
+++ b/libcxx/CMakeLists.txt
@@ -71,12 +71,16 @@ if (NOT "${LIBCXX_HARDENING_MODE}" IN_LIST LIBCXX_SUPPORTED_HARDENING_MODES)
     "Unsupported hardening mode: '${LIBCXX_HARDENING_MODE}'. Supported values are ${LIBCXX_SUPPORTED_HARDENING_MODES}.")
 endif()
 set(LIBCXX_ASSERTION_HANDLER_FILE
-  "${CMAKE_CURRENT_SOURCE_DIR}/vendor/llvm/default_assertion_handler.in"
+  "vendor/llvm/default_assertion_handler.in"
   CACHE STRING
   "Specify the path to a header that contains a custom implementation of the
    assertion handler that gets invoked when a hardening assertion fails. If
    provided, this header will be included by the library, replacing the
-   default assertion handler.")
+   default assertion handler. If this is specified as a relative path, it
+   is assumed to be relative to '<monorepo>/libcxx'.")
+if (NOT IS_ABSOLUTE "${LIBCXX_ASSERTION_HANDLER_FILE}")
+  set(LIBCXX_ASSERTION_HANDLER_FILE "${CMAKE_CURRENT_SOURCE_DIR}/${LIBCXX_ASSERTION_HANDLER_FILE}")
+endif()
 option(LIBCXX_ENABLE_RANDOM_DEVICE
   "Whether to include support for std::random_device in the library. Disabling
    this can be useful when building the library for platforms that don't have
diff --git a/libcxx/docs/BuildingLibcxx.rst b/libcxx/docs/BuildingLibcxx.rst
index 66bb19bb5b2cd..5c224689e0f9f 100644
--- a/libcxx/docs/BuildingLibcxx.rst
+++ b/libcxx/docs/BuildingLibcxx.rst
@@ -406,7 +406,8 @@ libc++ Feature Options
   Specify the path to a header that contains a custom implementation of the
   assertion handler that gets invoked when a hardening assertion fails. If
   provided, this header will be included by the library, replacing the
-  default assertion handler.
+  default assertion handler. If this is specified as a relative path, it
+  is assumed to be relative to ``<monorepo>/libcxx``.
 
 
 libc++ ABI Feature Options

>From 05446fb31a97eb768c494ad7175809faf7d2000a Mon Sep 17 00:00:00 2001
From: Mark de Wever <koraq at xs4all.nl>
Date: Wed, 24 Jul 2024 19:42:48 +0200
Subject: [PATCH 29/91] [libc++][spaceship] Implements X::iterator container
 requirements. (#99343)

This implements the container iterator requirements for array, deque,
vector, and `vector<bool>`.

Implements:
- LWG3352 strong_equality isn't a thing

Implements parts of:
- P1614R2 The Mothership has Landed

Fixes: https://github.com/llvm/llvm-project/issues/62486
---
 libcxx/docs/Status/Cxx20Issues.csv            |  2 +-
 libcxx/docs/Status/SpaceshipProjects.csv      |  2 +-
 libcxx/include/__bit_reference                | 14 +++
 libcxx/include/__iterator/bounded_iter.h      | 24 +++++
 libcxx/include/__iterator/wrap_iter.h         | 23 +++++
 libcxx/include/deque                          | 29 +++++-
 .../bounded_iter/comparison.pass.cpp          | 14 ++-
 .../sequences/array/iterators.pass.cpp        | 18 ++++
 .../sequences/deque/iterators.pass.cpp        | 29 ++++++
 .../sequences/vector.bool/iterators.pass.cpp  | 37 ++++++++
 .../sequences/vector/iterators.pass.cpp       | 49 +++++++++-
 .../span.iterators/iterator.pass.cpp          | 92 +++++++++++++++++++
 .../string.view.iterators/iterators.pass.cpp  | 85 +++++++++++++++++
 libcxx/test/support/test_iterators.h          |  2 +
 14 files changed, 413 insertions(+), 7 deletions(-)
 create mode 100644 libcxx/test/std/containers/views/views.span/span.iterators/iterator.pass.cpp
 create mode 100644 libcxx/test/std/strings/string.view/string.view.iterators/iterators.pass.cpp

diff --git a/libcxx/docs/Status/Cxx20Issues.csv b/libcxx/docs/Status/Cxx20Issues.csv
index 1a40a4472a405..8a431c922a2d9 100644
--- a/libcxx/docs/Status/Cxx20Issues.csv
+++ b/libcxx/docs/Status/Cxx20Issues.csv
@@ -264,7 +264,7 @@
 "`3349 <https://wg21.link/LWG3349>`__","Missing ``__cpp_lib_constexpr_complex``\  for P0415R1","Prague","|Complete|","16.0"
 "`3350 <https://wg21.link/LWG3350>`__","Simplify return type of ``lexicographical_compare_three_way``\ ","Prague","|Complete|","17.0","|spaceship|"
 "`3351 <https://wg21.link/LWG3351>`__","``ranges::enable_safe_range``\  should not be constrained","Prague","|Complete|","15.0","|ranges|"
-"`3352 <https://wg21.link/LWG3352>`__","``strong_equality``\  isn't a thing","Prague","|Nothing To Do|","","|spaceship|"
+"`3352 <https://wg21.link/LWG3352>`__","``strong_equality``\  isn't a thing","Prague","|Complete|","19.0","|spaceship|"
 "`3354 <https://wg21.link/LWG3354>`__","``has_strong_structural_equality``\  has a meaningless definition","Prague","|Nothing To Do|","","|spaceship|"
 "`3355 <https://wg21.link/LWG3355>`__","The memory algorithms should support move-only input iterators introduced by P1207","Prague","|Complete|","15.0","|ranges|"
 "`3356 <https://wg21.link/LWG3356>`__","``__cpp_lib_nothrow_convertible``\  should be ``__cpp_lib_is_nothrow_convertible``\ ","Prague","|Complete|","12.0"
diff --git a/libcxx/docs/Status/SpaceshipProjects.csv b/libcxx/docs/Status/SpaceshipProjects.csv
index e1cf2044cfd78..4dc43cdbbd08f 100644
--- a/libcxx/docs/Status/SpaceshipProjects.csv
+++ b/libcxx/docs/Status/SpaceshipProjects.csv
@@ -83,7 +83,7 @@ Section,Description,Dependencies,Assignee,Complete
 "| `[string.view.synop] <https://wg21.link/string.view.synop>`_
 | `[string.view.comparison] <https://wg21.link/string.view.comparison>`_",| `basic_string_view <https://reviews.llvm.org/D130295>`_,None,Mark de Wever,|Complete|
 - `5.7 Clause 22: Containers library <https://wg21.link/p1614r2#clause-22-containers-library>`_,,,,
-| `[container.requirements.general] <https://wg21.link/container.requirements.general>`_,|,None,Unassigned,|Not Started|
+| `[container.requirements.general] <https://wg21.link/container.requirements.general>`_,|,None,Mark de Wever,|Complete|
 | `[array.syn] <https://wg21.link/array.syn>`_ (`general <https://wg21.link/container.opt.reqmts>`_),| `array <https://reviews.llvm.org/D132265>`_,[expos.only.func],"| Adrian Vogelsgesang
 | Hristo Hristov",|Complete|
 | `[deque.syn] <https://wg21.link/deque.syn>`_ (`general <https://wg21.link/container.opt.reqmts>`_),| `deque <https://reviews.llvm.org/D144821>`_,[expos.only.func],Hristo Hristov,|Complete|
diff --git a/libcxx/include/__bit_reference b/libcxx/include/__bit_reference
index 606069d98be72..22637d4397412 100644
--- a/libcxx/include/__bit_reference
+++ b/libcxx/include/__bit_reference
@@ -16,6 +16,7 @@
 #include <__bit/countr.h>
 #include <__bit/invert_if.h>
 #include <__bit/popcount.h>
+#include <__compare/ordering.h>
 #include <__config>
 #include <__fwd/bit_reference.h>
 #include <__iterator/iterator_traits.h>
@@ -913,6 +914,7 @@ public:
     return __x.__seg_ == __y.__seg_ && __x.__ctz_ == __y.__ctz_;
   }
 
+#if _LIBCPP_STD_VER <= 17
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 friend bool
   operator!=(const __bit_iterator& __x, const __bit_iterator& __y) {
     return !(__x == __y);
@@ -937,6 +939,18 @@ public:
   operator>=(const __bit_iterator& __x, const __bit_iterator& __y) {
     return !(__x < __y);
   }
+#else  // _LIBCPP_STD_VER <= 17
+  _LIBCPP_HIDE_FROM_ABI constexpr friend strong_ordering
+  operator<=>(const __bit_iterator& __x, const __bit_iterator& __y) {
+    if (__x.__seg_ < __y.__seg_)
+      return strong_ordering::less;
+
+    if (__x.__seg_ == __y.__seg_)
+      return __x.__ctz_ <=> __y.__ctz_;
+
+    return strong_ordering::greater;
+  }
+#endif // _LIBCPP_STD_VER <= 17
 
 private:
   _LIBCPP_HIDE_FROM_ABI
diff --git a/libcxx/include/__iterator/bounded_iter.h b/libcxx/include/__iterator/bounded_iter.h
index ce0823b8c97e4..8a81c9ffbfc3f 100644
--- a/libcxx/include/__iterator/bounded_iter.h
+++ b/libcxx/include/__iterator/bounded_iter.h
@@ -11,6 +11,8 @@
 #define _LIBCPP___ITERATOR_BOUNDED_ITER_H
 
 #include <__assert>
+#include <__compare/ordering.h>
+#include <__compare/three_way_comparable.h>
 #include <__config>
 #include <__iterator/iterator_traits.h>
 #include <__memory/pointer_traits.h>
@@ -201,10 +203,15 @@ struct __bounded_iter {
   operator==(__bounded_iter const& __x, __bounded_iter const& __y) _NOEXCEPT {
     return __x.__current_ == __y.__current_;
   }
+
+#if _LIBCPP_STD_VER <= 17
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR friend bool
   operator!=(__bounded_iter const& __x, __bounded_iter const& __y) _NOEXCEPT {
     return __x.__current_ != __y.__current_;
   }
+#endif
+
+  // TODO(mordante) disable these overloads in the LLVM 20 release.
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR friend bool
   operator<(__bounded_iter const& __x, __bounded_iter const& __y) _NOEXCEPT {
     return __x.__current_ < __y.__current_;
@@ -222,6 +229,23 @@ struct __bounded_iter {
     return __x.__current_ >= __y.__current_;
   }
 
+#if _LIBCPP_STD_VER >= 20
+  _LIBCPP_HIDE_FROM_ABI constexpr friend strong_ordering
+  operator<=>(__bounded_iter const& __x, __bounded_iter const& __y) noexcept {
+    if constexpr (three_way_comparable<_Iterator, strong_ordering>) {
+      return __x.__current_ <=> __y.__current_;
+    } else {
+      if (__x.__current_ < __y.__current_)
+        return strong_ordering::less;
+
+      if (__x.__current_ == __y.__current_)
+        return strong_ordering::equal;
+
+      return strong_ordering::greater;
+    }
+  }
+#endif // _LIBCPP_STD_VER >= 20
+
 private:
   template <class>
   friend struct pointer_traits;
diff --git a/libcxx/include/__iterator/wrap_iter.h b/libcxx/include/__iterator/wrap_iter.h
index 252d13b26c9e2..56183c0ee794d 100644
--- a/libcxx/include/__iterator/wrap_iter.h
+++ b/libcxx/include/__iterator/wrap_iter.h
@@ -10,6 +10,8 @@
 #ifndef _LIBCPP___ITERATOR_WRAP_ITER_H
 #define _LIBCPP___ITERATOR_WRAP_ITER_H
 
+#include <__compare/ordering.h>
+#include <__compare/three_way_comparable.h>
 #include <__config>
 #include <__iterator/iterator_traits.h>
 #include <__memory/addressof.h>
@@ -131,6 +133,7 @@ operator<(const __wrap_iter<_Iter1>& __x, const __wrap_iter<_Iter2>& __y) _NOEXC
   return __x.base() < __y.base();
 }
 
+#if _LIBCPP_STD_VER <= 17
 template <class _Iter1>
 _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR bool
 operator!=(const __wrap_iter<_Iter1>& __x, const __wrap_iter<_Iter1>& __y) _NOEXCEPT {
@@ -142,7 +145,9 @@ _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR bool
 operator!=(const __wrap_iter<_Iter1>& __x, const __wrap_iter<_Iter2>& __y) _NOEXCEPT {
   return !(__x == __y);
 }
+#endif
 
+// TODO(mordante) disable these overloads in the LLVM 20 release.
 template <class _Iter1>
 _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR bool
 operator>(const __wrap_iter<_Iter1>& __x, const __wrap_iter<_Iter1>& __y) _NOEXCEPT {
@@ -179,6 +184,24 @@ operator<=(const __wrap_iter<_Iter1>& __x, const __wrap_iter<_Iter2>& __y) _NOEX
   return !(__y < __x);
 }
 
+#if _LIBCPP_STD_VER >= 20
+template <class _Iter1, class _Iter2>
+_LIBCPP_HIDE_FROM_ABI constexpr strong_ordering
+operator<=>(const __wrap_iter<_Iter1>& __x, const __wrap_iter<_Iter2>& __y) noexcept {
+  if constexpr (three_way_comparable_with<_Iter1, _Iter2, strong_ordering>) {
+    return __x.base() <=> __y.base();
+  } else {
+    if (__x.base() < __y.base())
+      return strong_ordering::less;
+
+    if (__x.base() == __y.base())
+      return strong_ordering::equal;
+
+    return strong_ordering::greater;
+  }
+}
+#endif // _LIBCPP_STD_VER >= 20
+
 template <class _Iter1, class _Iter2>
 _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14
 #ifndef _LIBCPP_CXX03_LANG
diff --git a/libcxx/include/deque b/libcxx/include/deque
index 4fc994a6e229b..e73135a8647b9 100644
--- a/libcxx/include/deque
+++ b/libcxx/include/deque
@@ -376,10 +376,13 @@ public:
     return __x.__ptr_ == __y.__ptr_;
   }
 
+#if _LIBCPP_STD_VER <= 17
   _LIBCPP_HIDE_FROM_ABI friend bool operator!=(const __deque_iterator& __x, const __deque_iterator& __y) {
     return !(__x == __y);
   }
+#endif
 
+  // TODO(mordante) disable these overloads in the LLVM 20 release.
   _LIBCPP_HIDE_FROM_ABI friend bool operator<(const __deque_iterator& __x, const __deque_iterator& __y) {
     return __x.__m_iter_ < __y.__m_iter_ || (__x.__m_iter_ == __y.__m_iter_ && __x.__ptr_ < __y.__ptr_);
   }
@@ -396,6 +399,29 @@ public:
     return !(__x < __y);
   }
 
+#if _LIBCPP_STD_VER >= 20
+  _LIBCPP_HIDE_FROM_ABI friend strong_ordering operator<=>(const __deque_iterator& __x, const __deque_iterator& __y) {
+    if (__x.__m_iter_ < __y.__m_iter_)
+      return strong_ordering::less;
+
+    if (__x.__m_iter_ == __y.__m_iter_) {
+      if constexpr (three_way_comparable<pointer, strong_ordering>) {
+        return __x.__ptr_ <=> __y.__ptr_;
+      } else {
+        if (__x.__ptr_ < __y.__ptr_)
+          return strong_ordering::less;
+
+        if (__x.__ptr_ == __y.__ptr_)
+          return strong_ordering::equal;
+
+        return strong_ordering::greater;
+      }
+    }
+
+    return strong_ordering::greater;
+  }
+#endif // _LIBCPP_STD_VER >= 20
+
 private:
   _LIBCPP_HIDE_FROM_ABI explicit __deque_iterator(__map_iterator __m, pointer __p) _NOEXCEPT
       : __m_iter_(__m),
@@ -2530,8 +2556,7 @@ inline _LIBCPP_HIDE_FROM_ABI bool operator<=(const deque<_Tp, _Allocator>& __x,
 template <class _Tp, class _Allocator>
 _LIBCPP_HIDE_FROM_ABI __synth_three_way_result<_Tp>
 operator<=>(const deque<_Tp, _Allocator>& __x, const deque<_Tp, _Allocator>& __y) {
-  return std::lexicographical_compare_three_way(
-      __x.begin(), __x.end(), __y.begin(), __y.end(), std::__synth_three_way);
+  return std::lexicographical_compare_three_way(__x.begin(), __x.end(), __y.begin(), __y.end(), std::__synth_three_way);
 }
 
 #endif // _LIBCPP_STD_VER <= 17
diff --git a/libcxx/test/libcxx/iterators/bounded_iter/comparison.pass.cpp b/libcxx/test/libcxx/iterators/bounded_iter/comparison.pass.cpp
index 9c5df5da55b9c..cef2157469c8f 100644
--- a/libcxx/test/libcxx/iterators/bounded_iter/comparison.pass.cpp
+++ b/libcxx/test/libcxx/iterators/bounded_iter/comparison.pass.cpp
@@ -11,6 +11,7 @@
 //
 // Comparison operators
 
+#include <concepts>
 #include <__iterator/bounded_iter.h>
 
 #include "test_iterators.h"
@@ -59,6 +60,12 @@ TEST_CONSTEXPR_CXX14 bool tests() {
     assert(iter1 >= iter1);
   }
 
+#if TEST_STD_VER >= 20
+  // P1614
+  std::same_as<std::strong_ordering> decltype(auto) r1 = iter1 <=> iter2;
+  assert(r1 == std::strong_ordering::less);
+#endif
+
   return true;
 }
 
@@ -69,8 +76,11 @@ int main(int, char**) {
 #endif
 
 #if TEST_STD_VER > 17
-  tests<contiguous_iterator<int*> >();
-  static_assert(tests<contiguous_iterator<int*> >(), "");
+  tests<contiguous_iterator<int*>>();
+  static_assert(tests<contiguous_iterator<int*>>());
+
+  tests<three_way_contiguous_iterator<int*>>();
+  static_assert(tests<three_way_contiguous_iterator<int*>>());
 #endif
 
   return 0;
diff --git a/libcxx/test/std/containers/sequences/array/iterators.pass.cpp b/libcxx/test/std/containers/sequences/array/iterators.pass.cpp
index 106bc45c70998..710994c68295e 100644
--- a/libcxx/test/std/containers/sequences/array/iterators.pass.cpp
+++ b/libcxx/test/std/containers/sequences/array/iterators.pass.cpp
@@ -148,6 +148,15 @@ TEST_CONSTEXPR_CXX17 bool tests()
             assert(std::rbegin(c)  != std::rend(c));
             assert(std::cbegin(c)  != std::cend(c));
             assert(std::crbegin(c) != std::crend(c));
+
+#  if TEST_STD_VER >= 20
+            // P1614 + LWG3352
+            std::same_as<std::strong_ordering> decltype(auto) r1 = ii1 <=> ii2;
+            assert(r1 == std::strong_ordering::equal);
+
+            std::same_as<std::strong_ordering> decltype(auto) r2 = cii <=> ii2;
+            assert(r2 == std::strong_ordering::equal);
+#  endif
         }
         {
             typedef std::array<int, 0> C;
@@ -189,6 +198,15 @@ TEST_CONSTEXPR_CXX17 bool tests()
             assert(std::rbegin(c)  == std::rend(c));
             assert(std::cbegin(c)  == std::cend(c));
             assert(std::crbegin(c) == std::crend(c));
+
+#  if TEST_STD_VER >= 20
+            // P1614 + LWG3352
+            std::same_as<std::strong_ordering> decltype(auto) r1 = ii1 <=> ii2;
+            assert(r1 == std::strong_ordering::equal);
+
+            std::same_as<std::strong_ordering> decltype(auto) r2 = cii <=> ii2;
+            assert(r2 == std::strong_ordering::equal);
+#  endif
         }
     }
 #endif
diff --git a/libcxx/test/std/containers/sequences/deque/iterators.pass.cpp b/libcxx/test/std/containers/sequences/deque/iterators.pass.cpp
index 1f06ffde41ac2..484a2961fdb0c 100644
--- a/libcxx/test/std/containers/sequences/deque/iterators.pass.cpp
+++ b/libcxx/test/std/containers/sequences/deque/iterators.pass.cpp
@@ -41,7 +41,27 @@ int main(int, char**)
     i = c.begin();
     C::const_iterator j;
     j = c.cbegin();
+
     assert(i == j);
+    assert(!(i != j));
+
+    assert(!(i < j));
+    assert((i <= j));
+
+    assert(!(i > j));
+    assert((i >= j));
+
+#  if TEST_STD_VER >= 20
+    // P1614 + LWG3352
+    // When the allocator does not have operator<=> then the iterator uses a
+    // fallback to provide operator<=>.
+    // Make sure to test with an allocator that does not have operator<=>.
+    static_assert(!std::three_way_comparable<min_allocator<int>, std::strong_ordering>);
+    static_assert(std::three_way_comparable<typename C::iterator, std::strong_ordering>);
+
+    std::same_as<std::strong_ordering> decltype(auto) r1 = i <=> j;
+    assert(r1 == std::strong_ordering::equal);
+#  endif
     }
 #endif
 #if TEST_STD_VER > 11
@@ -74,6 +94,15 @@ int main(int, char**)
 //         assert ( cii != c.begin());
 //         assert ( cii != c.cend());
 //         assert ( ii1 != c.end());
+
+#  if TEST_STD_VER >= 20
+        // P1614 + LWG3352
+        std::same_as<std::strong_ordering> decltype(auto) r1 = ii1 <=> ii2;
+        assert(r1 == std::strong_ordering::equal);
+
+        std::same_as<std::strong_ordering> decltype(auto) r2 = cii <=> ii2;
+        assert(r2 == std::strong_ordering::equal);
+#  endif // TEST_STD_VER >= 20
     }
 #endif
 
diff --git a/libcxx/test/std/containers/sequences/vector.bool/iterators.pass.cpp b/libcxx/test/std/containers/sequences/vector.bool/iterators.pass.cpp
index 9aaaac7a5557f..1e4877e8d2443 100644
--- a/libcxx/test/std/containers/sequences/vector.bool/iterators.pass.cpp
+++ b/libcxx/test/std/containers/sequences/vector.bool/iterators.pass.cpp
@@ -77,7 +77,21 @@ TEST_CONSTEXPR_CXX20 bool tests()
         C::iterator i = c.begin();
         C::iterator j = c.end();
         assert(std::distance(i, j) == 0);
+
         assert(i == j);
+        assert(!(i != j));
+
+        assert(!(i < j));
+        assert((i <= j));
+
+        assert(!(i > j));
+        assert((i >= j));
+
+#  if TEST_STD_VER >= 20
+        // P1614 + LWG3352
+        std::same_as<std::strong_ordering> decltype(auto) r = i <=> j;
+        assert(r == std::strong_ordering::equal);
+#  endif
     }
     {
         typedef bool T;
@@ -86,7 +100,21 @@ TEST_CONSTEXPR_CXX20 bool tests()
         C::const_iterator i = c.begin();
         C::const_iterator j = c.end();
         assert(std::distance(i, j) == 0);
+
         assert(i == j);
+        assert(!(i != j));
+
+        assert(!(i < j));
+        assert((i <= j));
+
+        assert(!(i > j));
+        assert((i >= j));
+
+#  if TEST_STD_VER >= 20
+        // P1614 + LWG3352
+        std::same_as<std::strong_ordering> decltype(auto) r = i <=> j;
+        assert(r == std::strong_ordering::equal);
+#  endif
     }
     {
         typedef bool T;
@@ -131,6 +159,15 @@ TEST_CONSTEXPR_CXX20 bool tests()
         assert ( (cii >= ii1 ));
         assert (cii - ii1 == 0);
         assert (ii1 - cii == 0);
+
+#  if TEST_STD_VER >= 20
+        // P1614 + LWG3352
+        std::same_as<std::strong_ordering> decltype(auto) r1 = ii1 <=> ii2;
+        assert(r1 == std::strong_ordering::equal);
+
+        std::same_as<std::strong_ordering> decltype(auto) r2 = cii <=> ii2;
+        assert(r2 == std::strong_ordering::equal);
+#  endif // TEST_STD_VER >= 20
     }
 #endif
 
diff --git a/libcxx/test/std/containers/sequences/vector/iterators.pass.cpp b/libcxx/test/std/containers/sequences/vector/iterators.pass.cpp
index 70e0e35767e09..0aa7ad0d42ed7 100644
--- a/libcxx/test/std/containers/sequences/vector/iterators.pass.cpp
+++ b/libcxx/test/std/containers/sequences/vector/iterators.pass.cpp
@@ -87,7 +87,27 @@ TEST_CONSTEXPR_CXX20 bool tests()
         C::iterator i = c.begin();
         C::iterator j = c.end();
         assert(std::distance(i, j) == 0);
+
         assert(i == j);
+        assert(!(i != j));
+
+        assert(!(i < j));
+        assert((i <= j));
+
+        assert(!(i > j));
+        assert((i >= j));
+
+#  if TEST_STD_VER >= 20
+        // P1614 + LWG3352
+        // When the allocator does not have operator<=> then the iterator uses a
+        // fallback to provide operator<=>.
+        // Make sure to test with an allocator that does not have operator<=>.
+        static_assert(!std::three_way_comparable<min_allocator<int>, std::strong_ordering>);
+        static_assert(std::three_way_comparable<typename C::iterator, std::strong_ordering>);
+
+        std::same_as<std::strong_ordering> decltype(auto) r1 = i <=> j;
+        assert(r1 == std::strong_ordering::equal);
+#  endif
     }
     {
         typedef int T;
@@ -96,7 +116,26 @@ TEST_CONSTEXPR_CXX20 bool tests()
         C::const_iterator i = c.begin();
         C::const_iterator j = c.end();
         assert(std::distance(i, j) == 0);
+
         assert(i == j);
+        assert(!(i != j));
+
+        assert(!(i < j));
+        assert((i <= j));
+
+        assert(!(i > j));
+        assert((i >= j));
+
+#  if TEST_STD_VER >= 20
+        // When the allocator does not have operator<=> then the iterator uses a
+        // fallback to provide operator<=>.
+        // Make sure to test with an allocator that does not have operator<=>.
+        static_assert(!std::three_way_comparable<min_allocator<int>, std::strong_ordering>);
+        static_assert(std::three_way_comparable<typename C::iterator, std::strong_ordering>);
+
+        std::same_as<std::strong_ordering> decltype(auto) r1 = i <=> j;
+        assert(r1 == std::strong_ordering::equal);
+#  endif
     }
     {
         typedef int T;
@@ -164,8 +203,16 @@ TEST_CONSTEXPR_CXX20 bool tests()
         assert ( (cii >= ii1 ));
         assert (cii - ii1 == 0);
         assert (ii1 - cii == 0);
+#  if TEST_STD_VER >= 20
+        // P1614 + LWG3352
+        std::same_as<std::strong_ordering> decltype(auto) r1 = ii1 <=> ii2;
+        assert(r1 == std::strong_ordering::equal);
+
+        std::same_as<std::strong_ordering> decltype(auto) r2 = cii <=> ii2;
+        assert(r2 == std::strong_ordering::equal);
+#  endif // TEST_STD_VER >= 20
     }
-#endif
+#endif // TEST_STD_VER > 11
 
     return true;
 }
diff --git a/libcxx/test/std/containers/views/views.span/span.iterators/iterator.pass.cpp b/libcxx/test/std/containers/views/views.span/span.iterators/iterator.pass.cpp
new file mode 100644
index 0000000000000..13a7628e6043d
--- /dev/null
+++ b/libcxx/test/std/containers/views/views.span/span.iterators/iterator.pass.cpp
@@ -0,0 +1,92 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// UNSUPPORTED: c++03, c++11, c++14, c++17
+
+// <span>
+
+// class iterator
+
+#include <cassert>
+#include <concepts>
+#include <iterator>
+#include <span>
+#include <string>
+#include <version> // __cpp_lib_ranges_as_const is not defined in span.
+
+#include "test_macros.h"
+
+template <class T>
+constexpr void test_type() {
+  using C = std::span<T>;
+  typename C::iterator ii1{}, ii2{};
+  typename C::iterator ii4 = ii1;
+  // TODO Test against C++23 after implementing
+  //  P2278R4 cbegin should always return a constant iterator
+  // This means adjusting the #ifdef to guard against C++23.
+#ifdef __cpp_lib_ranges_as_const
+  typename C::const_iterator cii{};
+#endif
+  assert(ii1 == ii2);
+  assert(ii1 == ii4);
+#ifdef __cpp_lib_ranges_as_const
+  assert(ii1 == cii);
+#endif
+
+  assert(!(ii1 != ii2));
+#ifdef __cpp_lib_ranges_as_const
+  assert(!(ii1 != cii));
+#endif
+
+  T v;
+  C c{&v, 1};
+  assert(c.begin() == std::begin(c));
+  assert(c.rbegin() == std::rbegin(c));
+#ifdef __cpp_lib_ranges_as_const
+  assert(c.cbegin() == std::cbegin(c));
+  assert(c.crbegin() == std::crbegin(c));
+#endif
+
+  assert(c.end() == std::end(c));
+  assert(c.rend() == std::rend(c));
+#ifdef __cpp_lib_ranges_as_const
+  assert(c.cend() == std::cend(c));
+  assert(c.crend() == std::crend(c));
+#endif
+
+  assert(std::begin(c) != std::end(c));
+  assert(std::rbegin(c) != std::rend(c));
+#ifdef __cpp_lib_ranges_as_const
+  assert(std::cbegin(c) != std::cend(c));
+  assert(std::crbegin(c) != std::crend(c));
+#endif
+
+  // P1614 + LWG3352
+  std::same_as<std::strong_ordering> decltype(auto) r1 = ii1 <=> ii2;
+  assert(r1 == std::strong_ordering::equal);
+
+#ifdef __cpp_lib_ranges_as_const
+  std::same_as<std::strong_ordering> decltype(auto) r2 = cii <=> ii2;
+  assert(r2 == std::strong_ordering::equal);
+#endif
+}
+
+constexpr bool test() {
+  test_type<char>();
+  test_type<int>();
+  test_type<std::string>();
+
+  return true;
+}
+
+int main(int, char**) {
+  test();
+  static_assert(test(), "");
+
+  return 0;
+}
diff --git a/libcxx/test/std/strings/string.view/string.view.iterators/iterators.pass.cpp b/libcxx/test/std/strings/string.view/string.view.iterators/iterators.pass.cpp
new file mode 100644
index 0000000000000..75d492bf7b3c6
--- /dev/null
+++ b/libcxx/test/std/strings/string.view/string.view.iterators/iterators.pass.cpp
@@ -0,0 +1,85 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// UNSUPPORTED: !stdlib=libc++ && (c++03 || c++11 || c++14)
+
+// <string_view>
+
+// class iterator
+
+#include <cassert>
+#include <concepts>
+#include <iterator>
+#include <string_view>
+
+#include "test_macros.h"
+#include "make_string.h"
+
+template <class CharT>
+TEST_CONSTEXPR_CXX14 void test_type() {
+  using C                  = std::basic_string_view<CharT>;
+  typename C::iterator ii1 = typename C::iterator(), ii2 = typename C::iterator();
+  typename C::iterator ii4       = ii1;
+  typename C::const_iterator cii = typename C::const_iterator();
+  assert(ii1 == ii2);
+  assert(ii1 == ii4);
+  assert(ii1 == cii);
+
+  assert(!(ii1 != ii2));
+  assert(!(ii1 != cii));
+
+#if TEST_STD_VER >= 17
+  C c = MAKE_STRING_VIEW(CharT, "abc");
+  assert(c.begin() == std::begin(c));
+  assert(c.rbegin() == std::rbegin(c));
+  assert(c.cbegin() == std::cbegin(c));
+  assert(c.crbegin() == std::crbegin(c));
+
+  assert(c.end() == std::end(c));
+  assert(c.rend() == std::rend(c));
+  assert(c.cend() == std::cend(c));
+  assert(c.crend() == std::crend(c));
+
+  assert(std::begin(c) != std::end(c));
+  assert(std::rbegin(c) != std::rend(c));
+  assert(std::cbegin(c) != std::cend(c));
+  assert(std::crbegin(c) != std::crend(c));
+#endif
+
+#if TEST_STD_VER >= 20
+  // P1614 + LWG3352
+  std::same_as<std::strong_ordering> decltype(auto) r1 = ii1 <=> ii2;
+  assert(r1 == std::strong_ordering::equal);
+
+  std::same_as<std::strong_ordering> decltype(auto) r2 = ii1 <=> ii2;
+  assert(r2 == std::strong_ordering::equal);
+#endif
+}
+
+TEST_CONSTEXPR_CXX14 bool test() {
+  test_type<char>();
+#ifndef TEST_HAS_NO_WIDE_CHARACTERS
+  test_type<wchar_t>();
+#endif
+#ifndef TEST_HAS_NO_CHAR8_T
+  test_type<char8_t>();
+#endif
+  test_type<char16_t>();
+  test_type<char32_t>();
+
+  return true;
+}
+
+int main(int, char**) {
+  test();
+#if TEST_STD_VER >= 14
+  static_assert(test(), "");
+#endif
+
+  return 0;
+}
diff --git a/libcxx/test/support/test_iterators.h b/libcxx/test/support/test_iterators.h
index 31564a3977317..95d1b7df0007c 100644
--- a/libcxx/test/support/test_iterators.h
+++ b/libcxx/test/support/test_iterators.h
@@ -389,6 +389,8 @@ class contiguous_iterator
     friend TEST_CONSTEXPR bool operator> (const contiguous_iterator& x, const contiguous_iterator& y) {return x.it_ >  y.it_;}
     friend TEST_CONSTEXPR bool operator>=(const contiguous_iterator& x, const contiguous_iterator& y) {return x.it_ >= y.it_;}
 
+    // Note no operator<=>, use three_way_contiguous_iterator for testing operator<=>
+
     friend TEST_CONSTEXPR It base(const contiguous_iterator& i) { return i.it_; }
 
     template <class T>

>From 577f886f2ef32dc19d7534a0d4873269cd4d1efe Mon Sep 17 00:00:00 2001
From: Eli Friedman <efriedma at quicinc.com>
Date: Wed, 24 Jul 2024 12:36:08 -0700
Subject: [PATCH 30/91] [ExprConstant] Handle shift overflow the same way as
 other kinds of overflow (#99579)

We have a mechanism to allow folding expressions that aren't ICEs as an
extension; use it more consistently.

This ends up causing bad effects on diagnostics in a few cases, but
that's not specific to shifts; it's a general issue with the way those
uses handle overflow diagnostics.

(cherry picked from commit 20eff684203287828d6722fc860b9d3621429542)
---
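
Note (not part of the commit): the shift checks that this patch routes
through noteUndefinedBehavior() can be modelled with a small standalone
sketch. The names checked_shl, ShiftError and ShiftResult are hypothetical
and exist only for illustration; they are not Clang APIs. The sketch assumes
a 32-bit signed operand and pre-C++20 rules, and follows the order of the
checks in handleIntIntBinOp: negative count, count too large, negative
operand, discarded bits.

```cpp
#include <cstdint>

// Hypothetical classification of the ways a signed left shift can be invalid.
enum class ShiftError { None, NegativeCount, CountTooLarge, NegativeOperand, DiscardsBits };

// Hypothetical result type: `value` is only meaningful when error == None.
struct ShiftResult {
  ShiftError error;
  std::int32_t value;
};

// Illustration only: mirrors the order of the checks in the patch, assuming a
// 32-bit signed left-hand operand and pre-C++20 semantics.
constexpr ShiftResult checked_shl(std::int32_t lhs, std::int32_t rhs) {
  constexpr int width = 32;
  if (rhs < 0)
    return {ShiftError::NegativeCount, 0};
  if (rhs >= width)
    return {ShiftError::CountTooLarge, 0};
  if (lhs < 0)
    return {ShiftError::NegativeOperand, 0};
  // The result must be representable in the corresponding unsigned type,
  // i.e. no set bit may be shifted out of the top.
  const std::uint32_t u = static_cast<std::uint32_t>(lhs);
  if (rhs > 0 && (u >> (width - rhs)) != 0)
    return {ShiftError::DiscardsBits, 0};
  return {ShiftError::None, static_cast<std::int32_t>(u << rhs)};
}
```

In this model, 1 << 32 is classified as CountTooLarge, 1 << -1 as
NegativeCount and -1 << 1 as NegativeOperand, which are the kinds of cases
the new class.cpp and enum.cpp tests exercise.
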
 clang/docs/ReleaseNotes.rst              |  2 ++
 clang/lib/AST/ExprConstant.cpp           | 26 ++++++++++++++++--------
 clang/lib/AST/Interp/Interp.h            | 22 ++++++++++++--------
 clang/lib/Sema/SemaExpr.cpp              |  3 ++-
 clang/test/CXX/basic/basic.types/p10.cpp |  2 +-
 clang/test/Sema/constant-builtins-2.c    | 12 ++++-------
 clang/test/SemaCXX/class.cpp             |  9 +++++++-
 clang/test/SemaCXX/enum.cpp              | 24 +++++++++++++++-------
 8 files changed, 65 insertions(+), 35 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 5b6ee9830b507..549da6812740f 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -765,6 +765,8 @@ Improvements to Clang's diagnostics
 
      UsingWithAttr<int> objUsingWA; // warning: 'UsingWithAttr' is deprecated
 
+- Clang now diagnoses undefined behavior in constant expressions more consistently. This includes invalid shifts, and signed overflow in arithmetic.
+
 Improvements to Clang's time-trace
 ----------------------------------
 
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 03a606102a77e..5e57b5e8bc8f1 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -2839,6 +2839,8 @@ static bool handleIntIntBinOp(EvalInfo &Info, const BinaryOperator *E,
       // During constant-folding, a negative shift is an opposite shift. Such
       // a shift is not a constant expression.
       Info.CCEDiag(E, diag::note_constexpr_negative_shift) << RHS;
+      if (!Info.noteUndefinedBehavior())
+        return false;
       RHS = -RHS;
       goto shift_right;
     }
@@ -2849,19 +2851,23 @@ static bool handleIntIntBinOp(EvalInfo &Info, const BinaryOperator *E,
     if (SA != RHS) {
       Info.CCEDiag(E, diag::note_constexpr_large_shift)
         << RHS << E->getType() << LHS.getBitWidth();
+      if (!Info.noteUndefinedBehavior())
+        return false;
     } else if (LHS.isSigned() && !Info.getLangOpts().CPlusPlus20) {
       // C++11 [expr.shift]p2: A signed left shift must have a non-negative
       // operand, and must not overflow the corresponding unsigned type.
       // C++2a [expr.shift]p2: E1 << E2 is the unique value congruent to
       // E1 x 2^E2 module 2^N.
-      if (LHS.isNegative())
+      if (LHS.isNegative()) {
         Info.CCEDiag(E, diag::note_constexpr_lshift_of_negative) << LHS;
-      else if (LHS.countl_zero() < SA)
+        if (!Info.noteUndefinedBehavior())
+          return false;
+      } else if (LHS.countl_zero() < SA) {
         Info.CCEDiag(E, diag::note_constexpr_lshift_discards);
+        if (!Info.noteUndefinedBehavior())
+          return false;
+      }
     }
-    if (Info.EvalStatus.Diag && !Info.EvalStatus.Diag->empty() &&
-        Info.getLangOpts().CPlusPlus11)
-      return false;
     Result = LHS << SA;
     return true;
   }
@@ -2875,6 +2881,8 @@ static bool handleIntIntBinOp(EvalInfo &Info, const BinaryOperator *E,
       // During constant-folding, a negative shift is an opposite shift. Such a
       // shift is not a constant expression.
       Info.CCEDiag(E, diag::note_constexpr_negative_shift) << RHS;
+      if (!Info.noteUndefinedBehavior())
+        return false;
       RHS = -RHS;
       goto shift_left;
     }
@@ -2882,13 +2890,13 @@ static bool handleIntIntBinOp(EvalInfo &Info, const BinaryOperator *E,
     // C++11 [expr.shift]p1: Shift width must be less than the bit width of the
     // shifted type.
     unsigned SA = (unsigned) RHS.getLimitedValue(LHS.getBitWidth()-1);
-    if (SA != RHS)
+    if (SA != RHS) {
       Info.CCEDiag(E, diag::note_constexpr_large_shift)
         << RHS << E->getType() << LHS.getBitWidth();
+      if (!Info.noteUndefinedBehavior())
+        return false;
+    }
 
-    if (Info.EvalStatus.Diag && !Info.EvalStatus.Diag->empty() &&
-        Info.getLangOpts().CPlusPlus11)
-      return false;
     Result = LHS >> SA;
     return true;
   }
diff --git a/clang/lib/AST/Interp/Interp.h b/clang/lib/AST/Interp/Interp.h
index 8e96f78d90568..253a433e7340a 100644
--- a/clang/lib/AST/Interp/Interp.h
+++ b/clang/lib/AST/Interp/Interp.h
@@ -153,7 +153,8 @@ bool CheckShift(InterpState &S, CodePtr OpPC, const LT &LHS, const RT &RHS,
   if (RHS.isNegative()) {
     const SourceInfo &Loc = S.Current->getSource(OpPC);
     S.CCEDiag(Loc, diag::note_constexpr_negative_shift) << RHS.toAPSInt();
-    return false;
+    if (!S.noteUndefinedBehavior())
+      return false;
   }
 
   // C++11 [expr.shift]p1: Shift width must be less than the bit width of
@@ -163,17 +164,24 @@ bool CheckShift(InterpState &S, CodePtr OpPC, const LT &LHS, const RT &RHS,
     const APSInt Val = RHS.toAPSInt();
     QualType Ty = E->getType();
     S.CCEDiag(E, diag::note_constexpr_large_shift) << Val << Ty << Bits;
-    return !(S.getEvalStatus().Diag && !S.getEvalStatus().Diag->empty() && S.getLangOpts().CPlusPlus11);
+    if (!S.noteUndefinedBehavior())
+      return false;
   }
 
   if (LHS.isSigned() && !S.getLangOpts().CPlusPlus20) {
     const Expr *E = S.Current->getExpr(OpPC);
     // C++11 [expr.shift]p2: A signed left shift must have a non-negative
     // operand, and must not overflow the corresponding unsigned type.
-    if (LHS.isNegative())
+    if (LHS.isNegative()) {
       S.CCEDiag(E, diag::note_constexpr_lshift_of_negative) << LHS.toAPSInt();
-    else if (LHS.toUnsigned().countLeadingZeros() < static_cast<unsigned>(RHS))
+      if (!S.noteUndefinedBehavior())
+        return false;
+    } else if (LHS.toUnsigned().countLeadingZeros() <
+               static_cast<unsigned>(RHS)) {
       S.CCEDiag(E, diag::note_constexpr_lshift_discards);
+      if (!S.noteUndefinedBehavior())
+        return false;
+    }
   }
 
   // C++2a [expr.shift]p2: [P0907R4]:
@@ -2269,8 +2277,7 @@ inline bool DoShift(InterpState &S, CodePtr OpPC, LT &LHS, RT &RHS) {
     // shift is not a constant expression.
     const SourceInfo &Loc = S.Current->getSource(OpPC);
     S.CCEDiag(Loc, diag::note_constexpr_negative_shift) << RHS.toAPSInt();
-    if (S.getLangOpts().CPlusPlus11 && S.getEvalStatus().Diag &&
-        !S.getEvalStatus().Diag->empty())
+    if (!S.noteUndefinedBehavior())
       return false;
     RHS = -RHS;
     return DoShift < LT, RT,
@@ -2286,8 +2293,7 @@ inline bool DoShift(InterpState &S, CodePtr OpPC, LT &LHS, RT &RHS) {
       // E1 x 2^E2 module 2^N.
       const SourceInfo &Loc = S.Current->getSource(OpPC);
       S.CCEDiag(Loc, diag::note_constexpr_lshift_of_negative) << LHS.toAPSInt();
-      if (S.getLangOpts().CPlusPlus11 && S.getEvalStatus().Diag &&
-          !S.getEvalStatus().Diag->empty())
+      if (!S.noteUndefinedBehavior())
         return false;
     }
   }
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 206194930f3b4..74c0e01705905 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -17045,7 +17045,8 @@ Sema::VerifyIntegerConstantExpression(Expr *E, llvm::APSInt *Result,
   // not a constant expression as a side-effect.
   bool Folded =
       E->EvaluateAsRValue(EvalResult, Context, /*isConstantContext*/ true) &&
-      EvalResult.Val.isInt() && !EvalResult.HasSideEffects;
+      EvalResult.Val.isInt() && !EvalResult.HasSideEffects &&
+      (!getLangOpts().CPlusPlus || !EvalResult.HasUndefinedBehavior);
 
   if (!isa<ConstantExpr>(E))
     E = ConstantExpr::Create(Context, E, EvalResult.Val);
diff --git a/clang/test/CXX/basic/basic.types/p10.cpp b/clang/test/CXX/basic/basic.types/p10.cpp
index a543f248e5371..92d6da0035ea5 100644
--- a/clang/test/CXX/basic/basic.types/p10.cpp
+++ b/clang/test/CXX/basic/basic.types/p10.cpp
@@ -142,7 +142,7 @@ constexpr int arb(int n) { // expected-note {{declared here}}
                expected-note {{function parameter 'n' with unknown value cannot be used in a constant expression}}
 }
 constexpr long Overflow[(1 << 30) << 2]{}; // expected-warning {{requires 34 bits to represent}} \
-                                              expected-warning {{variable length array folded to constant array as an extension}} \
+                                              expected-error {{variable length array declaration not allowed at file scope}} \
                                               expected-warning {{variable length arrays in C++ are a Clang extension}} \
                                               expected-note {{signed left shift discards bits}}
 
diff --git a/clang/test/Sema/constant-builtins-2.c b/clang/test/Sema/constant-builtins-2.c
index 00767267cd6c2..37b63cf4f6b32 100644
--- a/clang/test/Sema/constant-builtins-2.c
+++ b/clang/test/Sema/constant-builtins-2.c
@@ -265,10 +265,8 @@ char clz52[__builtin_clzg((unsigned __int128)0x1) == BITSIZE(__int128) - 1 ? 1 :
 char clz53[__builtin_clzg((unsigned __int128)0x1, 42) == BITSIZE(__int128) - 1 ? 1 : -1];
 char clz54[__builtin_clzg((unsigned __int128)0xf) == BITSIZE(__int128) - 4 ? 1 : -1];
 char clz55[__builtin_clzg((unsigned __int128)0xf, 42) == BITSIZE(__int128) - 4 ? 1 : -1];
-char clz56[__builtin_clzg((unsigned __int128)(1 << (BITSIZE(__int128) - 1))) == 0 ? 1 : -1]; // expected-warning {{variable length array folded to constant array as an extension}}
-                                                                                             // expected-note@-1 {{shift count 127 >= width of type 'int' (32 bits)}}
-char clz57[__builtin_clzg((unsigned __int128)(1 << (BITSIZE(__int128) - 1)), 42) == 0 ? 1 : -1]; // expected-warning {{variable length array folded to constant array as an extension}}
-                                                                                                 // expected-note@-1 {{shift count 127 >= width of type 'int' (32 bits)}}
+char clz56[__builtin_clzg((unsigned __int128)(1 << (BITSIZE(__int128) - 1))) == 0 ? 1 : -1]; // expected-error {{variable length array declaration not allowed at file scope}}
+char clz57[__builtin_clzg((unsigned __int128)(1 << (BITSIZE(__int128) - 1)), 42) == 0 ? 1 : -1]; // expected-error {{variable length array declaration not allowed at file scope}}
 #endif
 int clz58 = __builtin_clzg((unsigned _BitInt(128))0); // expected-error {{not a compile-time constant}}
 char clz59[__builtin_clzg((unsigned _BitInt(128))0, 42) == 42 ? 1 : -1];
@@ -276,10 +274,8 @@ char clz60[__builtin_clzg((unsigned _BitInt(128))0x1) == BITSIZE(_BitInt(128)) -
 char clz61[__builtin_clzg((unsigned _BitInt(128))0x1, 42) == BITSIZE(_BitInt(128)) - 1 ? 1 : -1];
 char clz62[__builtin_clzg((unsigned _BitInt(128))0xf) == BITSIZE(_BitInt(128)) - 4 ? 1 : -1];
 char clz63[__builtin_clzg((unsigned _BitInt(128))0xf, 42) == BITSIZE(_BitInt(128)) - 4 ? 1 : -1];
-char clz64[__builtin_clzg((unsigned _BitInt(128))(1 << (BITSIZE(_BitInt(128)) - 1))) == 0 ? 1 : -1]; // expected-warning {{variable length array folded to constant array as an extension}}
-                                                                                                     // expected-note@-1 {{shift count 127 >= width of type 'int' (32 bits)}}
-char clz65[__builtin_clzg((unsigned _BitInt(128))(1 << (BITSIZE(_BitInt(128)) - 1)), 42) == 0 ? 1 : -1]; // expected-warning {{variable length array folded to constant array as an extension}}
-                                                                                                         // expected-note@-1 {{shift count 127 >= width of type 'int' (32 bits)}}
+char clz64[__builtin_clzg((unsigned _BitInt(128))(1 << (BITSIZE(_BitInt(128)) - 1))) == 0 ? 1 : -1]; // expected-error {{variable length array declaration not allowed at file scope}}
+char clz65[__builtin_clzg((unsigned _BitInt(128))(1 << (BITSIZE(_BitInt(128)) - 1)), 42) == 0 ? 1 : -1]; // expected-error {{variable length array declaration not allowed at file scope}}
 
 char ctz1[__builtin_ctz(1) == 0 ? 1 : -1];
 char ctz2[__builtin_ctz(8) == 3 ? 1 : -1];
diff --git a/clang/test/SemaCXX/class.cpp b/clang/test/SemaCXX/class.cpp
index f874b7be2b70e..2f59544e7f36c 100644
--- a/clang/test/SemaCXX/class.cpp
+++ b/clang/test/SemaCXX/class.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -fsyntax-only -verify -Wc++11-compat %s 
+// RUN: %clang_cc1 -fsyntax-only -verify=expected,cxx11 -Wc++11-compat %s
 // RUN: %clang_cc1 -fsyntax-only -verify -Wc++11-compat %s -std=c++98
 class C {
 public:
@@ -55,6 +55,13 @@ class C {
   // expected-error@-2 {{static const volatile data member must be initialized out of line}}
 #endif
   static const E evi = 0;
+  static const int overflow = 1000000*1000000; // cxx11-error {{in-class initializer for static data member is not a constant expression}}
+                                               // expected-warning@-1 {{overflow in expression}}
+  static const int overflow_shift = 1<<32; // cxx11-error {{in-class initializer for static data member is not a constant expression}}
+  static const int overflow_shift2 = 1>>32; // cxx11-error {{in-class initializer for static data member is not a constant expression}}
+  static const int overflow_shift3 = 1<<-1; // cxx11-error {{in-class initializer for static data member is not a constant expression}}
+  static const int overflow_shift4 = 1<<-1; // cxx11-error {{in-class initializer for static data member is not a constant expression}}
+  static const int overflow_shift5 = -1<<1; // cxx11-error {{in-class initializer for static data member is not a constant expression}}
 
   void m() {
     sx = 0;
diff --git a/clang/test/SemaCXX/enum.cpp b/clang/test/SemaCXX/enum.cpp
index 739d35ec4a06b..9c398cc8da886 100644
--- a/clang/test/SemaCXX/enum.cpp
+++ b/clang/test/SemaCXX/enum.cpp
@@ -103,14 +103,14 @@ void PR8089() {
 // This is accepted as a GNU extension. In C++98, there was no provision for
 // expressions with UB to be non-constant.
 enum { overflow = 123456 * 234567 };
-#if __cplusplus >= 201103L
-// expected-warning@-2 {{expression is not an integral constant expression; folding it to a constant is a GNU extension}}
-// expected-note@-3 {{value 28958703552 is outside the range of representable values of type 'int'}}
-#else
-// expected-error@-5 {{expression is not an integral constant expression}}
-// expected-note@-6 {{value 28958703552 is outside the range of representable values of type 'int'}}
-// expected-warning@-7 {{overflow in expression; result is -1'106'067'520 with type 'int'}}
+// expected-error@-1 {{expression is not an integral constant expression}}
+// expected-note@-2 {{value 28958703552 is outside the range of representable values of type 'int'}}
+#if __cplusplus < 201103L
+// expected-warning@-4 {{overflow in expression; result is -1'106'067'520 with type 'int'}}
 #endif
+enum { overflow_shift = 1 << 32 };
+// expected-error@-1 {{expression is not an integral constant expression}}
+// expected-note@-2 {{shift count 32 >= width of type 'int' (32 bits)}}
 
 // FIXME: This is not consistent with the above case.
 enum NoFold : int { overflow2 = 123456 * 234567 };
@@ -123,6 +123,16 @@ enum NoFold : int { overflow2 = 123456 * 234567 };
 // expected-error@-7 {{expression is not an integral constant expression}}
 // expected-note@-8 {{value 28958703552 is outside the range of representable values of type 'int'}}
 #endif
+enum : int { overflow2_shift = 1 << 32 };
+#if __cplusplus >= 201103L
+// expected-error@-2 {{enumerator value is not a constant expression}}
+// expected-note@-3 {{shift count 32 >= width of type 'int' (32 bits)}}
+#else
+// expected-error@-5 {{expression is not an integral constant expression}}
+// expected-note@-6 {{shift count 32 >= width of type 'int' (32 bits)}}
+// expected-warning@-7 {{enumeration types with a fixed underlying type are a C++11 extension}}
+#endif
+
 
 // PR28903
 struct PR28903 {

>From 01bd0394c9d5809be2d125ae1dfd3faef8bf0942 Mon Sep 17 00:00:00 2001
From: Carlos Seo <carlos.seo at linaro.org>
Date: Wed, 24 Jul 2024 11:18:08 -0300
Subject: [PATCH 31/91] [AArch64] Implement INIT/ADJUST_TRAMPOLINE (#70267)

Add support for llvm.init.trampoline and llvm.adjust.trampoline
intrinsics for AArch64.

Fixes https://github.com/llvm/llvm-project/issues/65573
Fixes https://github.com/llvm/llvm-project/issues/76927
Fixes https://github.com/llvm/llvm-project/issues/83555
Updates https://github.com/llvm/llvm-project/pull/66157

(cherry picked from commit c4b66bf4d065d3bbc2e2fac8512a6df8e013c704)
---
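
Note (not part of the commit): the trampoline_setup.c change in this patch
loads the 64-bit realFunc address into x17 with one mov and three movk
instructions, 16 bits at a time. A minimal sketch of that encoding follows.
The helper encode_load_x17 is hypothetical and exists only for illustration;
compiler-rt writes the instruction words directly into the stack buffer. The
opcode constants and the 0x11 register field are taken from the patch.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: builds the four instruction words that load `addr`
// into x17, one 16-bit chunk per instruction.
inline std::array<std::uint32_t, 4> encode_load_x17(std::uint64_t addr) {
  // Base encodings used by the patch: a move of the low 16 bits, followed by
  // movk with shifts of 16, 32 and 48 bits.
  const std::uint32_t opcodes[4] = {0xd2800000u, 0xf2a00000u, 0xf2c00000u, 0xf2e00000u};
  std::array<std::uint32_t, 4> words{};
  for (int i = 0; i < 4; ++i) {
    // Select the i-th 16-bit chunk of the address.
    const std::uint32_t imm16 = static_cast<std::uint32_t>((addr >> (16 * i)) & 0xffffu);
    // The immediate sits at bits 5..20; the low five bits name the
    // destination register (0x11 == 17, i.e. x17).
    words[i] = opcodes[i] | (imm16 << 5) | 0x11u;
  }
  return words;
}
```

For example, for the address 0x1122334455667788 the first word carries the
chunk 0x7788 and the last word carries 0x1122.
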
 compiler-rt/lib/builtins/README.txt           |  5 ++
 compiler-rt/lib/builtins/trampoline_setup.c   | 42 ++++++++++++++
 .../builtins/Unit/trampoline_setup_test.c     |  2 +-
 .../Target/AArch64/AArch64ISelLowering.cpp    | 58 +++++++++++++++++++
 llvm/lib/Target/AArch64/AArch64ISelLowering.h |  2 +
 llvm/test/CodeGen/AArch64/trampoline.ll       | 19 ++++++
 6 files changed, 127 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/AArch64/trampoline.ll

diff --git a/compiler-rt/lib/builtins/README.txt b/compiler-rt/lib/builtins/README.txt
index 2d213d95f333a..19f26c92a0f94 100644
--- a/compiler-rt/lib/builtins/README.txt
+++ b/compiler-rt/lib/builtins/README.txt
@@ -272,6 +272,11 @@ switch32
 switch8
 switchu8
 
+// This function generates a custom trampoline function with the specific
+// realFunc and localsPtr values.
+void __trampoline_setup(uint32_t* trampOnStack, int trampSizeAllocated,
+                        const void* realFunc, void* localsPtr);
+
 // There is no C interface to the *_vfp_d8_d15_regs functions.  There are
 // called in the prolog and epilog of Thumb1 functions.  When the C++ ABI use
 // SJLJ for exceptions, each function with a catch clause or destructors needs
diff --git a/compiler-rt/lib/builtins/trampoline_setup.c b/compiler-rt/lib/builtins/trampoline_setup.c
index 844eb27944142..830e25e4c0303 100644
--- a/compiler-rt/lib/builtins/trampoline_setup.c
+++ b/compiler-rt/lib/builtins/trampoline_setup.c
@@ -41,3 +41,45 @@ COMPILER_RT_ABI void __trampoline_setup(uint32_t *trampOnStack,
   __clear_cache(trampOnStack, &trampOnStack[10]);
 }
 #endif // __powerpc__ && !defined(__powerpc64__)
+
+// The AArch64 compiler generates calls to __trampoline_setup() when creating
+// trampoline functions on the stack for use with nested functions.
+// This function creates a custom 36-byte trampoline function on the stack
+// which loads x18 with a pointer to the outer function's locals
+// and then jumps to the target nested function.
+// Note: x18 is a reserved platform register on Windows and macOS.
+
+#if defined(__aarch64__) && defined(__ELF__)
+COMPILER_RT_ABI void __trampoline_setup(uint32_t *trampOnStack,
+                                        int trampSizeAllocated,
+                                        const void *realFunc, void *localsPtr) {
+  // This should never happen, but if compiler did not allocate
+  // enough space on stack for the trampoline, abort.
+  if (trampSizeAllocated < 36)
+    compilerrt_abort();
+
+  // create trampoline
+  // Load realFunc into x17. mov/movk 16 bits at a time.
+  trampOnStack[0] =
+      0xd2800000u | ((((uint64_t)realFunc >> 0) & 0xffffu) << 5) | 0x11;
+  trampOnStack[1] =
+      0xf2a00000u | ((((uint64_t)realFunc >> 16) & 0xffffu) << 5) | 0x11;
+  trampOnStack[2] =
+      0xf2c00000u | ((((uint64_t)realFunc >> 32) & 0xffffu) << 5) | 0x11;
+  trampOnStack[3] =
+      0xf2e00000u | ((((uint64_t)realFunc >> 48) & 0xffffu) << 5) | 0x11;
+  // Load localsPtr into x18
+  trampOnStack[4] =
+      0xd2800000u | ((((uint64_t)localsPtr >> 0) & 0xffffu) << 5) | 0x12;
+  trampOnStack[5] =
+      0xf2a00000u | ((((uint64_t)localsPtr >> 16) & 0xffffu) << 5) | 0x12;
+  trampOnStack[6] =
+      0xf2c00000u | ((((uint64_t)localsPtr >> 32) & 0xffffu) << 5) | 0x12;
+  trampOnStack[7] =
+      0xf2e00000u | ((((uint64_t)localsPtr >> 48) & 0xffffu) << 5) | 0x12;
+  trampOnStack[8] = 0xd61f0220; // br x17
+
+  // Clear instruction cache.
+  __clear_cache(trampOnStack, &trampOnStack[9]);
+}
+#endif // defined(__aarch64__) && defined(__ELF__)
diff --git a/compiler-rt/test/builtins/Unit/trampoline_setup_test.c b/compiler-rt/test/builtins/Unit/trampoline_setup_test.c
index da115fe764271..d51d35acaa02f 100644
--- a/compiler-rt/test/builtins/Unit/trampoline_setup_test.c
+++ b/compiler-rt/test/builtins/Unit/trampoline_setup_test.c
@@ -7,7 +7,7 @@
 
 /*
  * Tests nested functions
- * The ppc compiler generates a call to __trampoline_setup
+ * The ppc and aarch64 compilers generate a call to __trampoline_setup
  * The i386 and x86_64 compilers generate a call to ___enable_execute_stack
  */
 
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 87e7750768d2d..6d413a09407a9 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -1080,6 +1080,10 @@ AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
   // Try to create BICs for vector ANDs.
   setTargetDAGCombine(ISD::AND);
 
+  // llvm.init.trampoline and llvm.adjust.trampoline
+  setOperationAction(ISD::INIT_TRAMPOLINE, MVT::Other, Custom);
+  setOperationAction(ISD::ADJUST_TRAMPOLINE, MVT::Other, Custom);
+
   // Vector add and sub nodes may conceal a high-half opportunity.
   // Also, try to fold ADD into CSINC/CSINV..
   setTargetDAGCombine({ISD::ADD, ISD::ABS, ISD::SUB, ISD::XOR, ISD::SINT_TO_FP,
@@ -6688,6 +6692,56 @@ static SDValue LowerFLDEXP(SDValue Op, SelectionDAG &DAG) {
   return Final;
 }
 
+SDValue AArch64TargetLowering::LowerADJUST_TRAMPOLINE(SDValue Op,
+                                                      SelectionDAG &DAG) const {
+  // Note: x18 cannot be used for the Nest parameter on Windows and macOS.
+  if (Subtarget->isTargetDarwin() || Subtarget->isTargetWindows())
+    report_fatal_error(
+        "ADJUST_TRAMPOLINE operation is only supported on Linux.");
+
+  return Op.getOperand(0);
+}
+
+SDValue AArch64TargetLowering::LowerINIT_TRAMPOLINE(SDValue Op,
+                                                    SelectionDAG &DAG) const {
+
+  // Note: x18 cannot be used for the Nest parameter on Windows and macOS.
+  if (Subtarget->isTargetDarwin() || Subtarget->isTargetWindows())
+    report_fatal_error("INIT_TRAMPOLINE operation is only supported on Linux.");
+
+  SDValue Chain = Op.getOperand(0);
+  SDValue Trmp = Op.getOperand(1); // trampoline
+  SDValue FPtr = Op.getOperand(2); // nested function
+  SDValue Nest = Op.getOperand(3); // 'nest' parameter value
+  SDLoc dl(Op);
+
+  EVT PtrVT = getPointerTy(DAG.getDataLayout());
+  Type *IntPtrTy = DAG.getDataLayout().getIntPtrType(*DAG.getContext());
+
+  TargetLowering::ArgListTy Args;
+  TargetLowering::ArgListEntry Entry;
+
+  Entry.Ty = IntPtrTy;
+  Entry.Node = Trmp;
+  Args.push_back(Entry);
+  Entry.Node = DAG.getConstant(20, dl, MVT::i64);
+  Args.push_back(Entry);
+
+  Entry.Node = FPtr;
+  Args.push_back(Entry);
+  Entry.Node = Nest;
+  Args.push_back(Entry);
+
+  // Lower to a call to __trampoline_setup(Trmp, TrampSize, FPtr, ctx_reg)
+  TargetLowering::CallLoweringInfo CLI(DAG);
+  CLI.setDebugLoc(dl).setChain(Chain).setLibCallee(
+      CallingConv::C, Type::getVoidTy(*DAG.getContext()),
+      DAG.getExternalSymbol("__trampoline_setup", PtrVT), std::move(Args));
+
+  std::pair<SDValue, SDValue> CallResult = LowerCallTo(CLI);
+  return CallResult.second;
+}
+
 SDValue AArch64TargetLowering::LowerOperation(SDValue Op,
                                               SelectionDAG &DAG) const {
   LLVM_DEBUG(dbgs() << "Custom lowering: ");
@@ -6705,6 +6759,10 @@ SDValue AArch64TargetLowering::LowerOperation(SDValue Op,
     return LowerGlobalTLSAddress(Op, DAG);
   case ISD::PtrAuthGlobalAddress:
     return LowerPtrAuthGlobalAddress(Op, DAG);
+  case ISD::ADJUST_TRAMPOLINE:
+    return LowerADJUST_TRAMPOLINE(Op, DAG);
+  case ISD::INIT_TRAMPOLINE:
+    return LowerINIT_TRAMPOLINE(Op, DAG);
   case ISD::SETCC:
   case ISD::STRICT_FSETCC:
   case ISD::STRICT_FSETCCS:
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index ef45e4f01ecd3..81e15185f985d 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -1143,6 +1143,8 @@ class AArch64TargetLowering : public TargetLowering {
   SDValue LowerSELECT_CC(ISD::CondCode CC, SDValue LHS, SDValue RHS,
                          SDValue TVal, SDValue FVal, const SDLoc &dl,
                          SelectionDAG &DAG) const;
+  SDValue LowerINIT_TRAMPOLINE(SDValue Op, SelectionDAG &DAG) const;
+  SDValue LowerADJUST_TRAMPOLINE(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerJumpTable(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerBR_JT(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerBRIND(SDValue Op, SelectionDAG &DAG) const;
diff --git a/llvm/test/CodeGen/AArch64/trampoline.ll b/llvm/test/CodeGen/AArch64/trampoline.ll
new file mode 100644
index 0000000000000..293e538a7459d
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/trampoline.ll
@@ -0,0 +1,19 @@
+; RUN: llc -mtriple=aarch64-- < %s | FileCheck %s
+
+declare void @llvm.init.trampoline(ptr, ptr, ptr);
+declare ptr @llvm.adjust.trampoline(ptr);
+
+define i64 @f(ptr nest %c, i64 %x, i64 %y) {
+  %sum = add i64 %x, %y
+  ret i64 %sum
+}
+
+define i64 @main() {
+  %val = alloca i64
+  %nval = bitcast ptr %val to ptr
+  %tramp = alloca [36 x i8], align 8
+  ; CHECK:	bl	__trampoline_setup
+  call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %nval)
+  %fp = call ptr @llvm.adjust.trampoline(ptr %tramp)
+  ret i64 0
+}

>From 98b2bc5b08802ab0ee79b28e10ed3ea531588d67 Mon Sep 17 00:00:00 2001
From: Carlos Seo <carlos.seo at linaro.org>
Date: Wed, 24 Jul 2024 18:14:05 -0300
Subject: [PATCH 32/91] [Flang][Docs] Update information about AArch64
 trampolines (#100391)

Commits c4b66bf and 7647174 add support for AArch64 trampolines. Updated
documentation to reflect the changes.

(cherry picked from commit c6e69b041a7e6d18463f6cf684b10fd46a62c496)
---
 flang/docs/InternalProcedureTrampolines.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/flang/docs/InternalProcedureTrampolines.md b/flang/docs/InternalProcedureTrampolines.md
index ef02f1d737c82..41f6155332a47 100644
--- a/flang/docs/InternalProcedureTrampolines.md
+++ b/flang/docs/InternalProcedureTrampolines.md
@@ -239,7 +239,7 @@ automatically deallocated at the end of `host()` invocation.
 Unfortunately, this requires the program stack to be writeable and executable
 at the same time, which might be a security concern.
 
-> NOTE: LLVM's AArch64 backend supports `nest` attribute, but it does not seem to support trampoline intrinsics.
+> NOTE: LLVM's AArch64 backend supports `nest` attribute, but it requires the compiler-rt runtime selected via the `-rtlib=compiler-rt` flag.
 
 ## Alternative implementation(s)
 

>From 14fa8cd47eddf6f5837759872d99836b50eb55be Mon Sep 17 00:00:00 2001
From: Daniil Kovalev <dkovalev at accesssoftek.com>
Date: Thu, 25 Jul 2024 02:13:30 +0300
Subject: [PATCH 33/91] [PAC][clang] Enable `-fptrauth-indirect-gotos` as part
 of pauthtest ABI (#100480)

(cherry picked from commit 3f6eb13abf643afec17a73448ede380606531226)
---
 clang/lib/Driver/ToolChains/Clang.cpp | 4 ++++
 clang/test/Driver/aarch64-ptrauth.c   | 6 +++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 78936fd634f33..5de29f1eca614 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -1516,6 +1516,10 @@ static void handlePAuthABI(const ArgList &DriverArgs, ArgStringList &CC1Args) {
           options::OPT_fno_ptrauth_vtable_pointer_type_discrimination))
     CC1Args.push_back("-fptrauth-vtable-pointer-type-discrimination");
 
+  if (!DriverArgs.hasArg(options::OPT_fptrauth_indirect_gotos,
+                         options::OPT_fno_ptrauth_indirect_gotos))
+    CC1Args.push_back("-fptrauth-indirect-gotos");
+
   if (!DriverArgs.hasArg(options::OPT_fptrauth_init_fini,
                          options::OPT_fno_ptrauth_init_fini))
     CC1Args.push_back("-fptrauth-init-fini");
diff --git a/clang/test/Driver/aarch64-ptrauth.c b/clang/test/Driver/aarch64-ptrauth.c
index d13930e8f4b37..eeb9500792d75 100644
--- a/clang/test/Driver/aarch64-ptrauth.c
+++ b/clang/test/Driver/aarch64-ptrauth.c
@@ -19,16 +19,16 @@
 // RUN: %clang -### -c --target=aarch64-linux-pauthtest %s 2>&1 | FileCheck %s --check-prefix=PAUTHABI1
 // PAUTHABI1:      "-cc1"{{.*}} "-triple" "aarch64-unknown-linux-pauthtest"
 // PAUTHABI1-SAME: "-target-abi" "pauthtest"
-// PAUTHABI1-SAME: "-fptrauth-intrinsics" "-fptrauth-calls" "-fptrauth-returns" "-fptrauth-auth-traps" "-fptrauth-vtable-pointer-address-discrimination" "-fptrauth-vtable-pointer-type-discrimination" "-fptrauth-init-fini"
+// PAUTHABI1-SAME: "-fptrauth-intrinsics" "-fptrauth-calls" "-fptrauth-returns" "-fptrauth-auth-traps" "-fptrauth-vtable-pointer-address-discrimination" "-fptrauth-vtable-pointer-type-discrimination" "-fptrauth-indirect-gotos" "-fptrauth-init-fini"
 
 // RUN: %clang -### -c --target=aarch64 -mabi=pauthtest -fno-ptrauth-intrinsics \
 // RUN:   -fno-ptrauth-calls -fno-ptrauth-returns -fno-ptrauth-auth-traps \
 // RUN:   -fno-ptrauth-vtable-pointer-address-discrimination -fno-ptrauth-vtable-pointer-type-discrimination \
-// RUN:   -fno-ptrauth-init-fini %s 2>&1 | FileCheck %s --check-prefix=PAUTHABI2
+// RUN:   -fno-ptrauth-indirect-gotos -fno-ptrauth-init-fini %s 2>&1 | FileCheck %s --check-prefix=PAUTHABI2
 // RUN: %clang -### -c --target=aarch64-pauthtest -fno-ptrauth-intrinsics \
 // RUN:   -fno-ptrauth-calls -fno-ptrauth-returns -fno-ptrauth-auth-traps \
 // RUN:   -fno-ptrauth-vtable-pointer-address-discrimination -fno-ptrauth-vtable-pointer-type-discrimination \
-// RUN:   -fno-ptrauth-init-fini %s 2>&1 | FileCheck %s --check-prefix=PAUTHABI2
+// RUN:   -fno-ptrauth-indirect-gotos -fno-ptrauth-init-fini %s 2>&1 | FileCheck %s --check-prefix=PAUTHABI2
 // PAUTHABI2:     "-cc1"
 // PAUTHABI2-NOT: "-fptrauth-
 

>From bdeb078aa24734ce6d2e573701f63e6111d7bab4 Mon Sep 17 00:00:00 2001
From: Joseph Huber <huberjn at outlook.com>
Date: Wed, 24 Jul 2024 21:06:19 -0500
Subject: [PATCH 34/91] [libc] Only add '-fno-builtin-*' on the entrypoints
 that use them (#100481)

Summary:
The GPU build needs to be able to inline stuff in LTO. Builtin
transformations cause problems on the functions that the optimizer does
heavy libcall recognition on. Previously we moved to using
`-fno-builtin-*` to allow us to only disable the problematic ones.
However, this still didn't allow inlining because each function had the
attribute that told the inliner not to inline a nobuiltin function
into a non-nobuiltin function.

This patch fixes that by only applying these attributes to the
entrypoints that define them. That is enough to prevent recursive calls
within the definitions themselves.
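
To make the recursion hazard concrete, here is a minimal sketch of the kind of loop a libc entrypoint might contain. The name `sketch_memset` is hypothetical and is not the libc source. An optimizer with builtin recognition enabled may replace a byte-store loop like this with a call to `memset`; if the function being compiled is `memset` itself, that call would recurse, which is why only that entrypoint needs `-fno-builtin-memset`.

```c
#include <assert.h>
#include <stddef.h>

/* Byte-wise fill, written the way a simple memset entrypoint might be.
 * The name is illustrative only. */
void *sketch_memset(void *dst, int value, size_t count) {
  unsigned char *p = (unsigned char *)dst;
  assert(dst != NULL || count == 0); /* nothing to write through NULL */
  for (size_t i = 0; i < count; ++i)
    p[i] = (unsigned char)value; /* candidate for loop-idiom recognition */
  return dst;
}
```

Callers of such an entrypoint carry no `nobuiltin` attribute under this scheme, so the inliner has no attribute mismatch to object to.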

(cherry picked from commit 8e43acbfedf53ded43ec693ddaaf518cb7416c1c)
---
 libc/cmake/modules/LLVMLibCCompileOptionRules.cmake | 11 +----------
 libc/cmake/modules/LLVMLibCObjectRules.cmake        | 11 +++++++++++
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/libc/cmake/modules/LLVMLibCCompileOptionRules.cmake b/libc/cmake/modules/LLVMLibCCompileOptionRules.cmake
index 97d1c7262d24d..7a1c45a814eb6 100644
--- a/libc/cmake/modules/LLVMLibCCompileOptionRules.cmake
+++ b/libc/cmake/modules/LLVMLibCCompileOptionRules.cmake
@@ -104,16 +104,7 @@ function(_get_common_compile_options output_var flags)
       list(APPEND compile_options "-ffixed-point")
     endif()
 
-    # Builtin recognition causes issues when trying to implement the builtin
-    # functions themselves. The GPU backends do not use libcalls so we disable
-    # the known problematic ones. This allows inlining during LTO linking.
-    if(LIBC_TARGET_OS_IS_GPU)
-      set(libc_builtins bcmp strlen memmem bzero memcmp memcpy memmem memmove
-                        memset strcmp strstr)
-      foreach(builtin ${libc_builtins})
-        list(APPEND compile_options "-fno-builtin-${builtin}")
-      endforeach()
-    else()
+    if(NOT LIBC_TARGET_OS_IS_GPU)
       list(APPEND compile_options "-fno-builtin")
     endif()
 
diff --git a/libc/cmake/modules/LLVMLibCObjectRules.cmake b/libc/cmake/modules/LLVMLibCObjectRules.cmake
index 2d3db38ecd8a3..68b5ed1ed51c0 100644
--- a/libc/cmake/modules/LLVMLibCObjectRules.cmake
+++ b/libc/cmake/modules/LLVMLibCObjectRules.cmake
@@ -279,6 +279,17 @@ function(create_entrypoint_object fq_target_name)
   add_dependencies(${fq_target_name} ${full_deps_list})
   target_link_libraries(${fq_target_name} ${full_deps_list})
 
+  # Builtin recognition causes issues when trying to implement the builtin
+  # functions themselves. The GPU backends do not use libcalls so we disable the
+  # known problematic ones on the entrypoints that implement them.
+  if(LIBC_TARGET_OS_IS_GPU)
+    set(libc_builtins bcmp strlen memmem bzero memcmp memcpy memmem memmove
+                      memset strcmp strstr)
+    if(${ADD_ENTRYPOINT_OBJ_NAME} IN_LIST libc_builtins)
+      target_compile_options(${fq_target_name} PRIVATE -fno-builtin-${ADD_ENTRYPOINT_OBJ_NAME})
+    endif()
+  endif()
+
   set_target_properties(
     ${fq_target_name}
     PROPERTIES

>From 86e7adaa1b77089c7d8e39f13b8365a7fa92dde6 Mon Sep 17 00:00:00 2001
From: Kiran Chandramohan <kiran.chandramohan at arm.com>
Date: Wed, 24 Jul 2024 16:28:24 +0100
Subject: [PATCH 35/91] [Flang][Driver] Enable config file options (#100343)

Config files provide a facility to invoke the compiler with a predefined
set of options. This patch only enables these options in the flang
driver; the underlying functionality was already there.

(cherry picked from commit 8a77961280536b680c404a49002a00b988ca45fc)
---
 clang/include/clang/Driver/Options.td         | 10 +--
 flang/test/Driver/Inputs/config-1.cfg         |  1 +
 flang/test/Driver/Inputs/config-2.cfg         |  1 +
 flang/test/Driver/Inputs/config-2a.cfg        |  1 +
 flang/test/Driver/Inputs/config-6.cfg         |  1 +
 flang/test/Driver/Inputs/config/config-4.cfg  |  1 +
 flang/test/Driver/Inputs/config2/config-4.cfg |  1 +
 flang/test/Driver/config-file.f90             | 63 +++++++++++++++++++
 8 files changed, 74 insertions(+), 5 deletions(-)
 create mode 100644 flang/test/Driver/Inputs/config-1.cfg
 create mode 100644 flang/test/Driver/Inputs/config-2.cfg
 create mode 100644 flang/test/Driver/Inputs/config-2a.cfg
 create mode 100644 flang/test/Driver/Inputs/config-6.cfg
 create mode 100644 flang/test/Driver/Inputs/config/config-4.cfg
 create mode 100644 flang/test/Driver/Inputs/config2/config-4.cfg
 create mode 100644 flang/test/Driver/config-file.f90

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 69269cf7537b0..359a698ea87dd 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1165,19 +1165,19 @@ def client__name : JoinedOrSeparate<["-"], "client_name">;
 def combine : Flag<["-", "--"], "combine">, Flags<[NoXarchOption, Unsupported]>;
 def compatibility__version : JoinedOrSeparate<["-"], "compatibility_version">;
 def config : Joined<["--"], "config=">, Flags<[NoXarchOption]>,
-  Visibility<[ClangOption, CLOption, DXCOption]>, MetaVarName<"<file>">,
+  Visibility<[ClangOption, CLOption, DXCOption, FlangOption]>, MetaVarName<"<file>">,
   HelpText<"Specify configuration file">;
-def : Separate<["--"], "config">, Alias<config>;
+def : Separate<["--"], "config">, Visibility<[ClangOption, CLOption, DXCOption, FlangOption]>, Alias<config>;
 def no_default_config : Flag<["--"], "no-default-config">,
-  Flags<[NoXarchOption]>, Visibility<[ClangOption, CLOption, DXCOption]>,
+  Flags<[NoXarchOption]>, Visibility<[ClangOption, CLOption, DXCOption, FlangOption]>,
   HelpText<"Disable loading default configuration files">;
 def config_system_dir_EQ : Joined<["--"], "config-system-dir=">,
   Flags<[NoXarchOption, HelpHidden]>,
-  Visibility<[ClangOption, CLOption, DXCOption]>,
+  Visibility<[ClangOption, CLOption, DXCOption, FlangOption]>,
   HelpText<"System directory for configuration files">;
 def config_user_dir_EQ : Joined<["--"], "config-user-dir=">,
   Flags<[NoXarchOption, HelpHidden]>,
-  Visibility<[ClangOption, CLOption, DXCOption]>,
+  Visibility<[ClangOption, CLOption, DXCOption, FlangOption]>,
   HelpText<"User directory for configuration files">;
 def coverage : Flag<["-", "--"], "coverage">, Group<Link_Group>,
   Visibility<[ClangOption, CLOption]>;
diff --git a/flang/test/Driver/Inputs/config-1.cfg b/flang/test/Driver/Inputs/config-1.cfg
new file mode 100644
index 0000000000000..824e128a42b63
--- /dev/null
+++ b/flang/test/Driver/Inputs/config-1.cfg
@@ -0,0 +1 @@
+-flto
diff --git a/flang/test/Driver/Inputs/config-2.cfg b/flang/test/Driver/Inputs/config-2.cfg
new file mode 100644
index 0000000000000..4e8d01b668e83
--- /dev/null
+++ b/flang/test/Driver/Inputs/config-2.cfg
@@ -0,0 +1 @@
+-fno-signed-zeros
diff --git a/flang/test/Driver/Inputs/config-2a.cfg b/flang/test/Driver/Inputs/config-2a.cfg
new file mode 100644
index 0000000000000..cd2916c98afe2
--- /dev/null
+++ b/flang/test/Driver/Inputs/config-2a.cfg
@@ -0,0 +1 @@
+-fopenmp
diff --git a/flang/test/Driver/Inputs/config-6.cfg b/flang/test/Driver/Inputs/config-6.cfg
new file mode 100644
index 0000000000000..81e9830f63be4
--- /dev/null
+++ b/flang/test/Driver/Inputs/config-6.cfg
@@ -0,0 +1 @@
+-fstack-arrays
diff --git a/flang/test/Driver/Inputs/config/config-4.cfg b/flang/test/Driver/Inputs/config/config-4.cfg
new file mode 100644
index 0000000000000..d15a7108d4e21
--- /dev/null
+++ b/flang/test/Driver/Inputs/config/config-4.cfg
@@ -0,0 +1 @@
+-O3
diff --git a/flang/test/Driver/Inputs/config2/config-4.cfg b/flang/test/Driver/Inputs/config2/config-4.cfg
new file mode 100644
index 0000000000000..9d1c3e38c8680
--- /dev/null
+++ b/flang/test/Driver/Inputs/config2/config-4.cfg
@@ -0,0 +1 @@
+-ffp-contract=fast
diff --git a/flang/test/Driver/config-file.f90 b/flang/test/Driver/config-file.f90
new file mode 100644
index 0000000000000..70316dd971f36
--- /dev/null
+++ b/flang/test/Driver/config-file.f90
@@ -0,0 +1,63 @@
+!--- Config file (full path) in output of -###
+!
+! RUN: %flang --config-system-dir=%S/Inputs/config --config-user-dir=%S/Inputs/config2 -o /dev/null -v 2>&1 | FileCheck %s -check-prefix CHECK-DIRS
+! CHECK-DIRS: System configuration file directory: {{.*}}/Inputs/config
+! CHECK-DIRS: User configuration file directory: {{.*}}/Inputs/config2
+!
+!--- Config file (full path) in output of -###
+!
+! RUN: %flang --config %S/Inputs/config-1.cfg -S %s -### 2>&1 | FileCheck %s -check-prefix CHECK-HHH
+! RUN: %flang --config=%S/Inputs/config-1.cfg -S %s -### 2>&1 | FileCheck %s -check-prefix CHECK-HHH
+! CHECK-HHH: Configuration file: {{.*}}Inputs{{.}}config-1.cfg
+! CHECK-HHH: -flto
+!
+!
+!--- Config file (full path) in output of -v
+!
+! RUN: %flang --config %S/Inputs/config-1.cfg -S %s -o /dev/null -v 2>&1 | FileCheck %s -check-prefix CHECK-V
+! CHECK-V: Configuration file: {{.*}}Inputs{{.}}config-1.cfg
+! CHECK-V: -flto
+!
+!--- Config file in output of -###
+!
+! RUN: %flang --config-system-dir=%S/Inputs --config-user-dir= --config config-1.cfg -S %s -### 2>&1 | FileCheck %s -check-prefix CHECK-HHH2
+! CHECK-HHH2: Configuration file: {{.*}}Inputs{{.}}config-1.cfg
+! CHECK-HHH2: -flto
+!
+!--- Config file in output of -v
+!
+! RUN: %flang --config-system-dir=%S/Inputs --config-user-dir= --config config-1.cfg -S %s -o /dev/null -v 2>&1 | FileCheck %s -check-prefix CHECK-V2
+! CHECK-V2: Configuration file: {{.*}}Inputs{{.}}config-1.cfg
+! CHECK-V2: -flto
+!
+!--- Nested config files
+!
+! RUN: %flang --config-system-dir=%S/Inputs --config-user-dir= --config config-2.cfg -S %s -### 2>&1 | FileCheck %s -check-prefix CHECK-NESTED
+! CHECK-NESTED: Configuration file: {{.*}}Inputs{{.}}config-2.cfg
+! CHECK-NESTED: -fno-signed-zeros
+!
+! RUN: %flang --config-system-dir=%S/Inputs --config-user-dir=%S/Inputs/config --config config-6.cfg -S %s -### 2>&1 | FileCheck %s -check-prefix CHECK-NESTED2
+! CHECK-NESTED2: Configuration file: {{.*}}Inputs{{.}}config-6.cfg
+! CHECK-NESTED2: -fstack-arrays
+!
+!
+! RUN: %flang --config %S/Inputs/config-2a.cfg -S %s -### 2>&1 | FileCheck %s -check-prefix CHECK-NESTEDa
+! CHECK-NESTEDa: Configuration file: {{.*}}Inputs{{.}}config-2a.cfg
+! CHECK-NESTEDa: -fopenmp
+!
+! RUN: %flang --config-system-dir=%S/Inputs --config-user-dir= --config config-2a.cfg -S %s -### 2>&1 | FileCheck %s -check-prefix CHECK-NESTED2a
+! CHECK-NESTED2a: Configuration file: {{.*}}Inputs{{.}}config-2a.cfg
+! CHECK-NESTED2a: -fopenmp
+!
+!--- User directory is searched first.
+!
+! RUN: %flang --config-system-dir=%S/Inputs/config --config-user-dir=%S/Inputs/config2 --config config-4.cfg -S %s -o /dev/null -v 2>&1 | FileCheck %s -check-prefix CHECK-PRECEDENCE
+! CHECK-PRECEDENCE: Configuration file: {{.*}}Inputs{{.}}config2{{.}}config-4.cfg
+! CHECK-PRECEDENCE: -ffp-contract=fast
+!
+!--- Multiple configuration files can be specified.
+! RUN: %flang --config-system-dir=%S/Inputs/config --config-user-dir= --config config-4.cfg --config %S/Inputs/config2/config-4.cfg -S %s -o /dev/null -v 2>&1 | FileCheck %s -check-prefix CHECK-TWO-CONFIGS
+! CHECK-TWO-CONFIGS: Configuration file: {{.*}}Inputs{{.}}config{{.}}config-4.cfg
+! CHECK-TWO-CONFIGS-NEXT: Configuration file: {{.*}}Inputs{{.}}config2{{.}}config-4.cfg
+! CHECK-TWO-CONFIGS: -ffp-contract=fast
+! CHECK-TWO-CONFIGS: -O3

>From 3e4abe985a584f3067448b5169c05509d3b571d2 Mon Sep 17 00:00:00 2001
From: Kerry McLaughlin <kerry.mclaughlin at arm.com>
Date: Wed, 24 Jul 2024 14:30:25 +0100
Subject: [PATCH 36/91] [AArch64][SME] Rewrite __arm_get_current_vg to preserve
 required registers (#100143)

The documentation for the __arm_get_current_vg support routine specifies
that the following registers are call-preserved:
 - X1-X15, X19-X29 and SP
 - Z0-Z31
 - P0-P15

This patch rewrites the implementation of this routine in compiler-rt,
as the current version does not guarantee that these registers will be
preserved.
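
The decision logic of the rewritten routine can be modelled in C, as a sketch rather than a replacement for the assembly. The function name `model_current_vg` and its parameters are hypothetical stand-ins: `cpu_features` for the word loaded from `__aarch64_cpu_features`, `has_sme` for `__aarch64_has_sme_and_tpidr2_el0`, `sme_state_x0` for the first value returned by `__arm_sme_state`, and `vg_if_enabled` for what `cntd` would produce. Bit 30 is the SVE bit tested by `tbnz w17, #30` in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the control flow only; the real routine must be assembly so
 * that it can guarantee which registers are preserved. */
uint64_t model_current_vg(uint64_t cpu_features, bool has_sme,
                          uint64_t sme_state_x0, uint64_t vg_if_enabled) {
  /* cntd counts 64-bit granules and vector lengths are multiples of
   * 128 bits, so a valid value is always even. */
  assert(vg_if_enabled % 2 == 0);

  bool has_sve = (cpu_features >> 30) & 1;
  if (!has_sve && !has_sme)
    return 0; /* neither SVE nor SME: no vector granule count */

  bool streaming = sme_state_x0 & 1; /* PSTATE.SM in bit 0 */
  if (has_sve || streaming)
    return vg_if_enabled; /* corresponds to executing cntd */

  return 0; /* SME present but not in streaming mode, and no SVE */
}
```

This mirrors the conditions in the removed C implementation.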

(cherry picked from commit 6da6772bf0a33131aa8540c9d4f60d5db75c32b5)
---
 compiler-rt/lib/builtins/aarch64/sme-abi-vg.c | 28 ------------
 compiler-rt/lib/builtins/aarch64/sme-abi.S    | 44 +++++++++++++++++++
 2 files changed, 44 insertions(+), 28 deletions(-)

diff --git a/compiler-rt/lib/builtins/aarch64/sme-abi-vg.c b/compiler-rt/lib/builtins/aarch64/sme-abi-vg.c
index 062cf80fc6848..20061012e16c6 100644
--- a/compiler-rt/lib/builtins/aarch64/sme-abi-vg.c
+++ b/compiler-rt/lib/builtins/aarch64/sme-abi-vg.c
@@ -10,15 +10,6 @@ struct FEATURES {
 
 extern struct FEATURES __aarch64_cpu_features;
 
-struct SME_STATE {
-  long PSTATE;
-  long TPIDR2_EL0;
-};
-
-extern struct SME_STATE __arm_sme_state(void) __arm_streaming_compatible;
-
-extern bool __aarch64_has_sme_and_tpidr2_el0;
-
 #if __GNUC__ >= 9
 #pragma GCC diagnostic ignored "-Wprio-ctor-dtor"
 #endif
@@ -28,22 +19,3 @@ __attribute__((constructor(90))) static void get_aarch64_cpu_features(void) {
 
   __init_cpu_features();
 }
-
-__attribute__((target("sve"))) long
-__arm_get_current_vg(void) __arm_streaming_compatible {
-  struct SME_STATE State = __arm_sme_state();
-  unsigned long long features =
-      __atomic_load_n(&__aarch64_cpu_features.features, __ATOMIC_RELAXED);
-  bool HasSVE = features & (1ULL << FEAT_SVE);
-
-  if (!HasSVE && !__aarch64_has_sme_and_tpidr2_el0)
-    return 0;
-
-  if (HasSVE || (State.PSTATE & 1)) {
-    long vl;
-    __asm__ __volatile__("cntd %0" : "=r"(vl));
-    return vl;
-  }
-
-  return 0;
-}
diff --git a/compiler-rt/lib/builtins/aarch64/sme-abi.S b/compiler-rt/lib/builtins/aarch64/sme-abi.S
index 4c0ff66931db7..cd8153f60670f 100644
--- a/compiler-rt/lib/builtins/aarch64/sme-abi.S
+++ b/compiler-rt/lib/builtins/aarch64/sme-abi.S
@@ -12,11 +12,15 @@
 #if !defined(__APPLE__)
 #define TPIDR2_SYMBOL SYMBOL_NAME(__aarch64_has_sme_and_tpidr2_el0)
 #define TPIDR2_SYMBOL_OFFSET :lo12:SYMBOL_NAME(__aarch64_has_sme_and_tpidr2_el0)
+#define CPU_FEATS_SYMBOL SYMBOL_NAME(__aarch64_cpu_features)
+#define CPU_FEATS_SYMBOL_OFFSET :lo12:SYMBOL_NAME(__aarch64_cpu_features)
 #else
 // MachO requires @page/@pageoff directives because the global is defined
 // in a different file. Otherwise this file may fail to build.
 #define TPIDR2_SYMBOL SYMBOL_NAME(__aarch64_has_sme_and_tpidr2_el0)@page
 #define TPIDR2_SYMBOL_OFFSET SYMBOL_NAME(__aarch64_has_sme_and_tpidr2_el0)@pageoff
+#define CPU_FEATS_SYMBOL SYMBOL_NAME(__aarch64_cpu_features)@page
+#define CPU_FEATS_SYMBOL_OFFSET SYMBOL_NAME(__aarch64_cpu_features)@pageoff
 #endif
 
 .arch armv9-a+sme
@@ -180,6 +184,46 @@ DEFINE_COMPILERRT_OUTLINE_FUNCTION_UNMANGLED(__arm_za_disable)
   ret
 END_COMPILERRT_OUTLINE_FUNCTION(__arm_za_disable)
 
+DEFINE_COMPILERRT_OUTLINE_FUNCTION_UNMANGLED(__arm_get_current_vg)
+  .variant_pcs __arm_get_current_vg
+  BTI_C
+
+  stp     x29, x30, [sp, #-16]!
+  .cfi_def_cfa_offset 16
+  mov     x29, sp
+  .cfi_def_cfa w29, 16
+  .cfi_offset w30, -8
+  .cfi_offset w29, -16
+  adrp    x17, CPU_FEATS_SYMBOL
+  ldr     w17, [x17, CPU_FEATS_SYMBOL_OFFSET]
+  tbnz    w17, #30, 0f
+  adrp    x16, TPIDR2_SYMBOL
+  ldrb    w16, [x16, TPIDR2_SYMBOL_OFFSET]
+  cbz     w16, 1f
+0:
+  mov     x18, x1
+  bl      __arm_sme_state
+  mov     x1, x18
+  and     x17, x17, #0x40000000
+  bfxil   x17, x0, #0, #1
+  cbz     x17, 1f
+  cntd    x0
+  .cfi_def_cfa wsp, 16
+  ldp     x29, x30, [sp], #16
+  .cfi_def_cfa_offset 0
+  .cfi_restore w30
+  .cfi_restore w29
+  ret
+1:
+  mov     x0, xzr
+  .cfi_def_cfa wsp, 16
+  ldp     x29, x30, [sp], #16
+  .cfi_def_cfa_offset 0
+  .cfi_restore w30
+  .cfi_restore w29
+  ret
+END_COMPILERRT_OUTLINE_FUNCTION(__arm_get_current_vg)
+
 NO_EXEC_STACK_DIRECTIVE
 
 // GNU property note for BTI and PAC

>From d767c52c26d7f9a143b23934917645efe0763364 Mon Sep 17 00:00:00 2001
From: Vlad Serebrennikov <serebrennikov.vladislav at gmail.com>
Date: Thu, 25 Jul 2024 20:15:14 +0400
Subject: [PATCH 37/91] [clang] Remove `__is_layout_compatible` from revertible
 type traits list (#100572)

`__is_layout_compatible` was added in Clang 19 (#81506), and at that
time it wasn't entirely clear whether it should be a revertible type
trait or not. We decided to follow the example of other type traits.
Since then #95969 happened, and now we know that we don't want new
revertible type traits.

This patch removes `__is_layout_compatible` from the revertible type
traits list, and adds a comment explaining what revertible type traits
are and that new type traits should not be added there.

The intention is to also cherry-pick this to 19 branch.

(cherry picked from commit 3295d377f37a60597321f502d164b5d6b1948e28)
---
 clang/lib/Parse/ParseExpr.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/clang/lib/Parse/ParseExpr.cpp b/clang/lib/Parse/ParseExpr.cpp
index 0a017ae79de75..e82b565272831 100644
--- a/clang/lib/Parse/ParseExpr.cpp
+++ b/clang/lib/Parse/ParseExpr.cpp
@@ -763,6 +763,9 @@ class CastExpressionIdValidator final : public CorrectionCandidateCallback {
 bool Parser::isRevertibleTypeTrait(const IdentifierInfo *II,
                                    tok::TokenKind *Kind) {
   if (RevertibleTypeTraits.empty()) {
+// Revertible type trait is a feature for backwards compatibility with older
+// standard libraries that declare their own structs with the same name as
+// the builtins listed below. New builtins should NOT be added to this list.
 #define RTT_JOIN(X, Y) X##Y
 #define REVERTIBLE_TYPE_TRAIT(Name)                                            \
   RevertibleTypeTraits[PP.getIdentifierInfo(#Name)] = RTT_JOIN(tok::kw_, Name)
@@ -790,7 +793,6 @@ bool Parser::isRevertibleTypeTrait(const IdentifierInfo *II,
     REVERTIBLE_TYPE_TRAIT(__is_fundamental);
     REVERTIBLE_TYPE_TRAIT(__is_integral);
     REVERTIBLE_TYPE_TRAIT(__is_interface_class);
-    REVERTIBLE_TYPE_TRAIT(__is_layout_compatible);
     REVERTIBLE_TYPE_TRAIT(__is_literal);
     REVERTIBLE_TYPE_TRAIT(__is_lvalue_expr);
     REVERTIBLE_TYPE_TRAIT(__is_lvalue_reference);

>From ec17a7a75911547b4567bb16fca72042abd105ff Mon Sep 17 00:00:00 2001
From: Louis Dionne <ldionne.2 at gmail.com>
Date: Thu, 25 Jul 2024 12:16:48 -0500
Subject: [PATCH 38/91] [libc++] Add missing xlocale.h include on Apple and
 FreeBSD (#99689)

The `<locale>` header uses `strtoll_l` and friends which are defined in
`<xlocale.h>` on these platforms. While this works via transitive
includes when modules are disabled, this doesn't work anymore if the
platforms are modularized properly.

(cherry picked from commit a55df237375e98cfc2520d5eb1a23b302ef02ba0)
---
 libcxx/include/locale | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/libcxx/include/locale b/libcxx/include/locale
index dbec23a2c936d..573910a85bef5 100644
--- a/libcxx/include/locale
+++ b/libcxx/include/locale
@@ -232,6 +232,10 @@ template <class charT> class messages_byname;
 #    include <__locale_dir/locale_base_api/bsd_locale_fallbacks.h>
 #  endif
 
+#  if defined(__APPLE__) || defined(__FreeBSD__)
+#    include <xlocale.h>
+#  endif
+
 #  if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
 #    pragma GCC system_header
 #  endif

>From 511b8b094dd89f826cc95b52a68804e68d854a10 Mon Sep 17 00:00:00 2001
From: Anton Korobeynikov <anton at korobeynikov.info>
Date: Thu, 25 Jul 2024 11:57:46 -0700
Subject: [PATCH 39/91] Normalize ptrauth handling in sanitizer runtime
 (#100483)

1. Include `ptrauth.h` if the `ptrauth_intrinsics` language feature is specified (per the ptrauth spec, this is what enables `ptrauth.h` usage and functions like `ptrauth_strip`).
2. For the PAC-RET fallback, implement two changes:
    1. Switch to a macro, so we can ignore the key argument.
    2. Ensure the unsigned value is erased from LR, so the possibility of gadget reuse is reduced.

Fixes #100467

(cherry picked from commit cc4f98979b079b517edd8a71f56a8975f436e63d)
---
 .../lib/sanitizer_common/sanitizer_ptrauth.h  | 46 ++++++++++---------
 1 file changed, 24 insertions(+), 22 deletions(-)

diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_ptrauth.h b/compiler-rt/lib/sanitizer_common/sanitizer_ptrauth.h
index 5200354694851..b5215c0d49c06 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_ptrauth.h
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_ptrauth.h
@@ -9,31 +9,33 @@
 #ifndef SANITIZER_PTRAUTH_H
 #define SANITIZER_PTRAUTH_H
 
-#if __has_feature(ptrauth_calls)
-#include <ptrauth.h>
+#if __has_feature(ptrauth_intrinsics)
+#  include <ptrauth.h>
 #elif defined(__ARM_FEATURE_PAC_DEFAULT) && !defined(__APPLE__)
-inline unsigned long ptrauth_strip(void* __value, unsigned int __key) {
-  // On the stack the link register is protected with Pointer
-  // Authentication Code when compiled with -mbranch-protection.
-  // Let's stripping the PAC unconditionally because xpaclri is in
-  // the NOP space so will do nothing when it is not enabled or not available.
-  unsigned long ret;
-  asm volatile(
-      "mov x30, %1\n\t"
-      "hint #7\n\t"  // xpaclri
-      "mov %0, x30\n\t"
-      : "=r"(ret)
-      : "r"(__value)
-      : "x30");
-  return ret;
-}
-#define ptrauth_auth_data(__value, __old_key, __old_data) __value
-#define ptrauth_string_discriminator(__string) ((int)0)
+// On the stack the link register is protected with Pointer
+// Authentication Code when compiled with -mbranch-protection.
+// Let's stripping the PAC unconditionally because xpaclri is in
+// the NOP space so will do nothing when it is not enabled or not available.
+#  define ptrauth_strip(__value, __key) \
+    ({                                  \
+      unsigned long ret;                \
+      asm volatile(                     \
+          "mov x30, %1\n\t"             \
+          "hint #7\n\t"                 \
+          "mov %0, x30\n\t"             \
+          "mov x30, xzr\n\t"            \
+          : "=r"(ret)                   \
+          : "r"(__value)                \
+          : "x30");                     \
+      ret;                              \
+    })
+#  define ptrauth_auth_data(__value, __old_key, __old_data) __value
+#  define ptrauth_string_discriminator(__string) ((int)0)
 #else
 // Copied from <ptrauth.h>
-#define ptrauth_strip(__value, __key) __value
-#define ptrauth_auth_data(__value, __old_key, __old_data) __value
-#define ptrauth_string_discriminator(__string) ((int)0)
+#  define ptrauth_strip(__value, __key) __value
+#  define ptrauth_auth_data(__value, __old_key, __old_data) __value
+#  define ptrauth_string_discriminator(__string) ((int)0)
 #endif
 
 #define STRIP_PAC_PC(pc) ((uptr)ptrauth_strip(pc, 0))

>From debf818b93b5fe5fbe3a8e2434f90d0a830b865b Mon Sep 17 00:00:00 2001
From: Tom Eccles <tom.eccles at arm.com>
Date: Thu, 25 Jul 2024 16:53:27 +0100
Subject: [PATCH 40/91] [flang][OpenMP] Initialize privatised derived type
 variables (#100417)

Fixes #91928

(cherry picked from commit 98e733eaf2af1a5c1d9392e279d21182ffdf560d)
---
 flang/include/flang/Lower/ConvertVariable.h   |  8 ++++
 flang/lib/Lower/ConvertVariable.cpp           | 23 ++++-----
 .../lib/Lower/OpenMP/DataSharingProcessor.cpp |  6 +++
 .../Lower/OpenMP/private-derived-type.f90     | 47 +++++++++++++++++++
 4 files changed, 73 insertions(+), 11 deletions(-)
 create mode 100644 flang/test/Lower/OpenMP/private-derived-type.f90

diff --git a/flang/include/flang/Lower/ConvertVariable.h b/flang/include/flang/Lower/ConvertVariable.h
index 515f4695951b4..de394a39e112e 100644
--- a/flang/include/flang/Lower/ConvertVariable.h
+++ b/flang/include/flang/Lower/ConvertVariable.h
@@ -62,6 +62,14 @@ using AggregateStoreMap = llvm::DenseMap<AggregateStoreKey, mlir::Value>;
 void instantiateVariable(AbstractConverter &, const pft::Variable &var,
                          SymMap &symMap, AggregateStoreMap &storeMap);
 
+/// Does this variable have a default initialization?
+bool hasDefaultInitialization(const Fortran::semantics::Symbol &sym);
+
+/// Call default initialization runtime routine to initialize \p var.
+void defaultInitializeAtRuntime(Fortran::lower::AbstractConverter &converter,
+                                const Fortran::semantics::Symbol &sym,
+                                Fortran::lower::SymMap &symMap);
+
 /// Create a fir::GlobalOp given a module variable definition. This is intended
 /// to be used when lowering a module definition, not when lowering variables
 /// used from a module. For used variables instantiateVariable must directly be
diff --git a/flang/lib/Lower/ConvertVariable.cpp b/flang/lib/Lower/ConvertVariable.cpp
index 47ad48fb322cc..4fcfa0b126e04 100644
--- a/flang/lib/Lower/ConvertVariable.cpp
+++ b/flang/lib/Lower/ConvertVariable.cpp
@@ -72,7 +72,8 @@ static mlir::Value genScalarValue(Fortran::lower::AbstractConverter &converter,
 }
 
 /// Does this variable have a default initialization?
-static bool hasDefaultInitialization(const Fortran::semantics::Symbol &sym) {
+bool Fortran::lower::hasDefaultInitialization(
+    const Fortran::semantics::Symbol &sym) {
   if (sym.has<Fortran::semantics::ObjectEntityDetails>() && sym.size())
     if (!Fortran::semantics::IsAllocatableOrPointer(sym))
       if (const Fortran::semantics::DeclTypeSpec *declTypeSpec = sym.GetType())
@@ -353,7 +354,7 @@ static mlir::Value genComponentDefaultInit(
       // global constructor since this has no runtime cost.
       componentValue = fir::factory::createUnallocatedBox(
           builder, loc, componentTy, std::nullopt);
-    } else if (hasDefaultInitialization(component)) {
+    } else if (Fortran::lower::hasDefaultInitialization(component)) {
       // Component type has default initialization.
       componentValue = genDefaultInitializerValue(converter, loc, component,
                                                   componentTy, stmtCtx);
@@ -556,7 +557,7 @@ static fir::GlobalOp defineGlobal(Fortran::lower::AbstractConverter &converter,
                 builder.createConvert(loc, symTy, fir::getBase(initVal));
             builder.create<fir::HasValueOp>(loc, castTo);
           });
-    } else if (hasDefaultInitialization(sym)) {
+    } else if (Fortran::lower::hasDefaultInitialization(sym)) {
       Fortran::lower::createGlobalInitialization(
           builder, global, [&](fir::FirOpBuilder &builder) {
             Fortran::lower::StatementContext stmtCtx(
@@ -752,17 +753,15 @@ mustBeDefaultInitializedAtRuntime(const Fortran::lower::pft::Variable &var) {
     return true;
   // Local variables (including function results), and intent(out) dummies must
   // be default initialized at runtime if their type has default initialization.
-  return hasDefaultInitialization(sym);
+  return Fortran::lower::hasDefaultInitialization(sym);
 }
 
 /// Call default initialization runtime routine to initialize \p var.
-static void
-defaultInitializeAtRuntime(Fortran::lower::AbstractConverter &converter,
-                           const Fortran::lower::pft::Variable &var,
-                           Fortran::lower::SymMap &symMap) {
+void Fortran::lower::defaultInitializeAtRuntime(
+    Fortran::lower::AbstractConverter &converter,
+    const Fortran::semantics::Symbol &sym, Fortran::lower::SymMap &symMap) {
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
   mlir::Location loc = converter.getCurrentLocation();
-  const Fortran::semantics::Symbol &sym = var.getSymbol();
   fir::ExtendedValue exv = converter.getSymbolExtendedValue(sym, &symMap);
   if (Fortran::semantics::IsOptional(sym)) {
     // 15.5.2.12 point 3, absent optional dummies are not initialized.
@@ -927,7 +926,8 @@ static void instantiateLocal(Fortran::lower::AbstractConverter &converter,
   if (needDummyIntentoutFinalization(var))
     finalizeAtRuntime(converter, var, symMap);
   if (mustBeDefaultInitializedAtRuntime(var))
-    defaultInitializeAtRuntime(converter, var, symMap);
+    Fortran::lower::defaultInitializeAtRuntime(converter, var.getSymbol(),
+                                               symMap);
   if (Fortran::semantics::NeedCUDAAlloc(var.getSymbol())) {
     auto *builder = &converter.getFirOpBuilder();
     mlir::Location loc = converter.getCurrentLocation();
@@ -1168,7 +1168,8 @@ static void instantiateAlias(Fortran::lower::AbstractConverter &converter,
   // do not try optimizing this to single default initializations of
   // the equivalenced storages. Keep lowering simple.
   if (mustBeDefaultInitializedAtRuntime(var))
-    defaultInitializeAtRuntime(converter, var, symMap);
+    Fortran::lower::defaultInitializeAtRuntime(converter, var.getSymbol(),
+                                               symMap);
 }
 
 //===--------------------------------------------------------------===//
diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
index 7e76a81e0df92..a340b62eb7b66 100644
--- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
@@ -13,6 +13,7 @@
 #include "DataSharingProcessor.h"
 
 #include "Utils.h"
+#include "flang/Lower/ConvertVariable.h"
 #include "flang/Lower/PFTBuilder.h"
 #include "flang/Lower/SymbolMap.h"
 #include "flang/Optimizer/Builder/HLFIRTools.h"
@@ -117,6 +118,11 @@ void DataSharingProcessor::cloneSymbol(const semantics::Symbol *sym) {
   bool success = converter.createHostAssociateVarClone(*sym);
   (void)success;
   assert(success && "Privatization failed due to existing binding");
+
+  bool isFirstPrivate = sym->test(semantics::Symbol::Flag::OmpFirstPrivate);
+  if (!isFirstPrivate &&
+      Fortran::lower::hasDefaultInitialization(sym->GetUltimate()))
+    Fortran::lower::defaultInitializeAtRuntime(converter, *sym, *symTable);
 }
 
 void DataSharingProcessor::copyFirstPrivateSymbol(
diff --git a/flang/test/Lower/OpenMP/private-derived-type.f90 b/flang/test/Lower/OpenMP/private-derived-type.f90
new file mode 100644
index 0000000000000..230484f20c11d
--- /dev/null
+++ b/flang/test/Lower/OpenMP/private-derived-type.f90
@@ -0,0 +1,47 @@
+! RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s | FileCheck %s
+! RUN: bbc -emit-hlfir -fopenmp -o - %s | FileCheck %s
+
+subroutine s4
+  type y3
+    integer,allocatable::x
+  end type y3
+  type(y3)::v
+  !$omp parallel
+  !$omp do private(v)
+  do i=1,10
+    v%x=1
+  end do
+  !$omp end do
+  !$omp end parallel
+end subroutine s4
+
+
+! CHECK-LABEL:   func.func @_QPs4() {
+!                  Example of how the lowering for regular derived type variables:
+! CHECK:           %[[VAL_8:.*]] = fir.alloca !fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}> {bindc_name = "v", uniq_name = "_QFs4Ev"}
+! CHECK:           %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_8]] {uniq_name = "_QFs4Ev"} : (!fir.ref<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>) -> (!fir.ref<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>, !fir.ref<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>)
+! CHECK:           %[[VAL_10:.*]] = fir.embox %[[VAL_9]]#1 : (!fir.ref<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>) -> !fir.box<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>
+! CHECK:           %[[VAL_11:.*]] = fir.address_of
+! CHECK:           %[[VAL_12:.*]] = arith.constant 4 : i32
+! CHECK:           %[[VAL_13:.*]] = fir.convert %[[VAL_10]] : (!fir.box<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>) -> !fir.box<none>
+! CHECK:           %[[VAL_14:.*]] = fir.convert %[[VAL_11]] : (!fir.ref<!fir.char<1,{{.*}}>>) -> !fir.ref<i8>
+! CHECK:           %[[VAL_15:.*]] = fir.call @_FortranAInitialize(%[[VAL_13]], %[[VAL_14]], %[[VAL_12]]) fastmath<contract> : (!fir.box<none>, !fir.ref<i8>, i32) -> none
+! CHECK:           omp.parallel {
+! CHECK:             %[[VAL_23:.*]] = fir.alloca !fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}> {bindc_name = "v", pinned, uniq_name = "_QFs4Ev"}
+! CHECK:             %[[VAL_24:.*]]:2 = hlfir.declare %[[VAL_23]] {uniq_name = "_QFs4Ev"} : (!fir.ref<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>) -> (!fir.ref<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>, !fir.ref<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>)
+! CHECK:             %[[VAL_25:.*]] = fir.embox %[[VAL_24]]#1 : (!fir.ref<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>) -> !fir.box<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>
+! CHECK:             %[[VAL_26:.*]] = fir.address_of
+! CHECK:             %[[VAL_27:.*]] = arith.constant 4 : i32
+! CHECK:             %[[VAL_28:.*]] = fir.convert %[[VAL_25]] : (!fir.box<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>) -> !fir.box<none>
+! CHECK:             %[[VAL_29:.*]] = fir.convert %[[VAL_26]] : (!fir.ref<!fir.char<1,{{.*}}>>) -> !fir.ref<i8>
+!                    Check we do call FortranAInitialize on the derived type
+! CHECK:             %[[VAL_30:.*]] = fir.call @_FortranAInitialize(%[[VAL_28]], %[[VAL_29]], %[[VAL_27]]) fastmath<contract> : (!fir.box<none>, !fir.ref<i8>, i32) -> none
+! CHECK:             omp.wsloop {
+! CHECK:             omp.terminator
+! CHECK:           }
+! CHECK:           %[[VAL_39:.*]] = fir.embox %[[VAL_9]]#1 : (!fir.ref<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>) -> !fir.box<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>
+! CHECK:           %[[VAL_40:.*]] = fir.convert %[[VAL_39]] : (!fir.box<!fir.type<_QFs4Ty3{x:!fir.box<!fir.heap<i32>>}>>) -> !fir.box<none>
+!                  Check the derived type is destroyed
+! CHECK:           %[[VAL_41:.*]] = fir.call @_FortranADestroy(%[[VAL_40]]) fastmath<contract> : (!fir.box<none>) -> none
+! CHECK:           return
+! CHECK:         }

>From 50b9db3a2df2a69c490b50fcd2021c882ae34b85 Mon Sep 17 00:00:00 2001
From: Alan Zhao <ayzhao at google.com>
Date: Thu, 25 Jul 2024 17:38:44 -0700
Subject: [PATCH 41/91] [compiler-rt][ubsan][nfc-ish] Fix a type conversion bug
 (#100665)

If the inline asm version of `ptrauth_strip` is used instead of the
builtin, the inline asm implementation currently returns an unsigned
long, causing an incompatible pointer conversion issue. The spec for
`ptrauth_strip` is that the result has the same type as the original
value, so we add a cast to the result of the inline asm.
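
A rough, portable illustration of the type mismatch described above. This
is only a sketch: both `demo_strip_*` macros are hypothetical stand-ins
that return the value unchanged (no PAC instruction is issued), and they
use a lambda instead of the GNU statement expression found in
sanitizer_ptrauth.h. `std::uintptr_t` stands in for the `unsigned long`
of the old fallback.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the old shape: the result is always an integer,
// whatever the argument type was.
#define demo_strip_integer(value) \
  ([&]() { std::uintptr_t ret = (std::uintptr_t)(value); return ret; }())

// Hypothetical stand-in for the new shape: the temporary is declared with the
// argument's own type, so the result keeps that type.
#define demo_strip_typed(value) \
  ([&]() { decltype(value) ret = (value); return ret; }())

inline bool demo_typed_strip_round_trips() {
  int x = 7;
  int *p = &x;

  auto as_integer = demo_strip_integer(p);
  auto as_pointer = demo_strip_typed(p);

  // The integer-returning form cannot be assigned back to `int *` without a
  // cast; the typed form can.
  static_assert(std::is_same<decltype(as_integer), std::uintptr_t>::value,
                "old shape yields an integer");
  static_assert(std::is_same<decltype(as_pointer), int *>::value,
                "new shape yields the argument's type");

  int *q = demo_strip_typed(p); // no cast needed
  return q == p && *q == 7;
}
```

With the typed form, callers that assign the stripped value back to a
pointer variable compile without an explicit conversion.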

(cherry picked from commit 25f9415713f9f57760a5322876906dc11385ef8e)
---
 compiler-rt/lib/sanitizer_common/sanitizer_ptrauth.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_ptrauth.h b/compiler-rt/lib/sanitizer_common/sanitizer_ptrauth.h
index b5215c0d49c06..265a9925a15a0 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_ptrauth.h
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_ptrauth.h
@@ -18,7 +18,7 @@
 // the NOP space so will do nothing when it is not enabled or not available.
 #  define ptrauth_strip(__value, __key) \
     ({                                  \
-      unsigned long ret;                \
+      __typeof(__value) ret;            \
       asm volatile(                     \
           "mov x30, %1\n\t"             \
           "hint #7\n\t"                 \

>From 61b0a2fdf2c1d7432f54cfd0cd2e75556bdc0d33 Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Thu, 25 Jul 2024 12:25:19 +0200
Subject: [PATCH 42/91] [BasicAA] Fix handling of indirect assumption based
 results (#100130)

If a result is potentially based on a not yet proven assumption,
BasicAA will remember it inside AssumptionBasedResults and remove
the cache entry if an assumption higher up is later disproved.
However, we currently miss the case where another cache entry ends
up depending on such an AssumptionBased result.

Fix this by introducing an additional AssumptionBased state for
cache entries. If such a result is used, we'll still increment
AAQI.NumAssumptionUses, which means that the using entry will
also become AssumptionBased and be cleared if the assumption is
disproved.

At the end of the root query, convert remaining AssumptionBased
results into definitive results.
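
The three cache-entry states described above can be sketched roughly as
follows. This is a simplified model, not the BasicAA code: every `Demo*`
and `demo*` name is hypothetical, cache keys are plain integers instead
of location pairs, and the purge that happens when an assumption is
disproved is left out.

```cpp
#include <cassert>
#include <map>
#include <vector>

struct DemoCacheEntry {
  static constexpr int Definitive = -2;      // uses no unproven assumption
  static constexpr int AssumptionBased = -1; // may rest on an assumption
  int NumAssumptionUses = 0;                 // >= 0: entry is an assumption
  bool isDefinitive() const { return NumAssumptionUses == Definitive; }
  bool isAssumption() const { return NumAssumptionUses >= 0; }
};

struct DemoQueryInfo {
  std::map<int, DemoCacheEntry> Cache;
  std::vector<int> AssumptionBasedResults;
  int NumAssumptionUses = 0;
};

// Cache hit: any non-definitive entry counts as an assumption use for the
// query, but only a real assumption gets its own counter bumped.
inline void demoUseCachedEntry(DemoQueryInfo &Q, int Key) {
  DemoCacheEntry &E = Q.Cache.at(Key);
  if (!E.isDefinitive()) {
    ++Q.NumAssumptionUses;
    if (E.isAssumption())
      ++E.NumAssumptionUses;
  }
}

// Store a freshly computed result: mark it AssumptionBased if computing it
// used any assumption (directly or through another AssumptionBased entry).
inline void demoFinishEntry(DemoQueryInfo &Q, int Key, int UsesBefore,
                            bool IsMayAlias) {
  DemoCacheEntry &E = Q.Cache[Key];
  if (UsesBefore != Q.NumAssumptionUses && !IsMayAlias) {
    Q.AssumptionBasedResults.push_back(Key);
    E.NumAssumptionUses = DemoCacheEntry::AssumptionBased;
  } else {
    E.NumAssumptionUses = DemoCacheEntry::Definitive;
  }
}

// End of the root query: whatever is still AssumptionBased is now proven.
inline void demoFinishRootQuery(DemoQueryInfo &Q) {
  for (int Key : Q.AssumptionBasedResults) {
    auto It = Q.Cache.find(Key);
    if (It != Q.Cache.end())
      It->second.NumAssumptionUses = DemoCacheEntry::Definitive;
  }
  Q.AssumptionBasedResults.clear();
  Q.NumAssumptionUses = 0;
}

inline bool demoIndirectUseIsTracked() {
  DemoQueryInfo Q;
  Q.Cache[1] = DemoCacheEntry();        // entry 1: an open assumption
  int Before = Q.NumAssumptionUses;
  demoUseCachedEntry(Q, 1);             // entry 2 uses the assumption directly
  demoFinishEntry(Q, 2, Before, false);
  Before = Q.NumAssumptionUses;
  demoUseCachedEntry(Q, 2);             // entry 3 uses it only via entry 2
  demoFinishEntry(Q, 3, Before, false);
  bool Tracked =
      !Q.Cache.at(3).isDefinitive() && !Q.Cache.at(3).isAssumption();
  demoFinishRootQuery(Q);
  return Tracked && Q.Cache.at(2).isDefinitive() &&
         Q.Cache.at(3).isDefinitive();
}
```

In this model, entry 3 never touches the assumption itself, yet it still
lands in the AssumptionBased state because the hit on entry 2 bumps the
query-wide counter.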

Fixes https://github.com/llvm/llvm-project/issues/98978.

(cherry picked from commit 91073380ac5a0dceebdd09f360a1dc194d7ee93f)
---
 llvm/include/llvm/Analysis/AliasAnalysis.h    |  17 ++-
 llvm/lib/Analysis/BasicAliasAnalysis.cpp      |  28 ++++-
 .../Transforms/SLPVectorizer/X86/pr98978.ll   | 106 ++++++++++++++++++
 3 files changed, 144 insertions(+), 7 deletions(-)
 create mode 100644 llvm/test/Transforms/SLPVectorizer/X86/pr98978.ll

diff --git a/llvm/include/llvm/Analysis/AliasAnalysis.h b/llvm/include/llvm/Analysis/AliasAnalysis.h
index 812b5a9f72a3a..4140387a1f341 100644
--- a/llvm/include/llvm/Analysis/AliasAnalysis.h
+++ b/llvm/include/llvm/Analysis/AliasAnalysis.h
@@ -244,12 +244,23 @@ class AAQueryInfo {
 public:
   using LocPair = std::pair<AACacheLoc, AACacheLoc>;
   struct CacheEntry {
+    /// Cache entry is neither an assumption nor does it use a (non-definitive)
+    /// assumption.
+    static constexpr int Definitive = -2;
+    /// Cache entry is not an assumption itself, but may be using an assumption
+    /// from higher up the stack.
+    static constexpr int AssumptionBased = -1;
+
     AliasResult Result;
-    /// Number of times a NoAlias assumption has been used.
-    /// 0 for assumptions that have not been used, -1 for definitive results.
+    /// Number of times a NoAlias assumption has been used, 0 for assumptions
+    /// that have not been used. Can also take one of the Definitive or
+    /// AssumptionBased values documented above.
     int NumAssumptionUses;
+
     /// Whether this is a definitive (non-assumption) result.
-    bool isDefinitive() const { return NumAssumptionUses < 0; }
+    bool isDefinitive() const { return NumAssumptionUses == Definitive; }
+    /// Whether this is an assumption that has not been proven yet.
+    bool isAssumption() const { return NumAssumptionUses >= 0; }
   };
 
   // Alias analysis result aggregration using which this query is performed.
diff --git a/llvm/lib/Analysis/BasicAliasAnalysis.cpp b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
index 161a3034e4829..e474899fb548e 100644
--- a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
@@ -1692,9 +1692,12 @@ AliasResult BasicAAResult::aliasCheck(const Value *V1, LocationSize V1Size,
   if (!Pair.second) {
     auto &Entry = Pair.first->second;
     if (!Entry.isDefinitive()) {
-      // Remember that we used an assumption.
-      ++Entry.NumAssumptionUses;
+      // Remember that we used an assumption. This may either be a direct use
+      // of an assumption, or a use of an entry that may itself be based on an
+      // assumption.
       ++AAQI.NumAssumptionUses;
+      if (Entry.isAssumption())
+        ++Entry.NumAssumptionUses;
     }
     // Cache contains sorted {V1,V2} pairs but we should return original order.
     auto Result = Entry.Result;
@@ -1722,7 +1725,6 @@ AliasResult BasicAAResult::aliasCheck(const Value *V1, LocationSize V1Size,
   Entry.Result = Result;
   // Cache contains sorted {V1,V2} pairs.
   Entry.Result.swap(Swapped);
-  Entry.NumAssumptionUses = -1;
 
   // If the assumption has been disproven, remove any results that may have
   // been based on this assumption. Do this after the Entry updates above to
@@ -1734,8 +1736,26 @@ AliasResult BasicAAResult::aliasCheck(const Value *V1, LocationSize V1Size,
   // The result may still be based on assumptions higher up in the chain.
   // Remember it, so it can be purged from the cache later.
   if (OrigNumAssumptionUses != AAQI.NumAssumptionUses &&
-      Result != AliasResult::MayAlias)
+      Result != AliasResult::MayAlias) {
     AAQI.AssumptionBasedResults.push_back(Locs);
+    Entry.NumAssumptionUses = AAQueryInfo::CacheEntry::AssumptionBased;
+  } else {
+    Entry.NumAssumptionUses = AAQueryInfo::CacheEntry::Definitive;
+  }
+
+  // Depth is incremented before this function is called, so Depth==1 indicates
+  // a root query.
+  if (AAQI.Depth == 1) {
+    // Any remaining assumption based results must be based on proven
+    // assumptions, so convert them to definitive results.
+    for (const auto &Loc : AAQI.AssumptionBasedResults) {
+      auto It = AAQI.AliasCache.find(Loc);
+      if (It != AAQI.AliasCache.end())
+        It->second.NumAssumptionUses = AAQueryInfo::CacheEntry::Definitive;
+    }
+    AAQI.AssumptionBasedResults.clear();
+    AAQI.NumAssumptionUses = 0;
+  }
   return Result;
 }
 
diff --git a/llvm/test/Transforms/SLPVectorizer/X86/pr98978.ll b/llvm/test/Transforms/SLPVectorizer/X86/pr98978.ll
new file mode 100644
index 0000000000000..429bf13b2b87a
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/pr98978.ll
@@ -0,0 +1,106 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=slp-vectorizer < %s | FileCheck %s
+
+target triple = "x86_64-redhat-linux-gnu"
+
+; The load+store sequence inside bb10 should not get vectorized. Previously,
+; we incorrectly determined that the pointers do not alias, because a cache
+; entry based indirectly on a disproven NoAlias assumption was not cleared
+; from the BatchAA cache.
+define void @test(ptr %p1, i64 %arg1, i64 %arg2) {
+; CHECK-LABEL: define void @test(
+; CHECK-SAME: ptr [[P1:%.*]], i64 [[ARG1:%.*]], i64 [[ARG2:%.*]]) {
+; CHECK-NEXT:  [[_PREHEADER48_PREHEADER_1:.*]]:
+; CHECK-NEXT:    br label %[[_LOOPEXIT49_1:.*]]
+; CHECK:       [[_LOOPEXIT49_1]]:
+; CHECK-NEXT:    [[I:%.*]] = phi ptr [ [[I21:%.*]], %[[BB20:.*]] ], [ [[P1]], %[[_PREHEADER48_PREHEADER_1]] ]
+; CHECK-NEXT:    br i1 false, label %[[BB22:.*]], label %[[DOTPREHEADER48_PREHEADER_1:.*]]
+; CHECK:       [[DEAD:.*]]:
+; CHECK-NEXT:    br label %[[DOTPREHEADER48_PREHEADER_1]]
+; CHECK:       [[_PREHEADER48_PREHEADER_2:.*:]]
+; CHECK-NEXT:    [[I5:%.*]] = phi ptr [ [[I]], %[[DEAD]] ], [ [[I]], %[[_LOOPEXIT49_1]] ]
+; CHECK-NEXT:    br label %[[DOTLOOPEXIT49_1:.*]]
+; CHECK:       [[DEAD1:.*]]:
+; CHECK-NEXT:    br i1 false, label %[[DOTLOOPEXIT49_1]], label %[[BB20]]
+; CHECK:       [[_LOOPEXIT49_2:.*:]]
+; CHECK-NEXT:    [[I6:%.*]] = phi ptr [ [[I5]], %[[DEAD1]] ], [ [[I5]], %[[DOTPREHEADER48_PREHEADER_1]] ]
+; CHECK-NEXT:    [[I7:%.*]] = getelementptr i8, ptr [[I6]], i64 [[ARG1]]
+; CHECK-NEXT:    br label %[[BB10:.*]]
+; CHECK:       [[DEAD2:.*]]:
+; CHECK-NEXT:    br label %[[BB10]]
+; CHECK:       [[BB10]]:
+; CHECK-NEXT:    [[I11:%.*]] = phi ptr [ [[I7]], %[[DOTLOOPEXIT49_1]] ], [ null, %[[DEAD2]] ]
+; CHECK-NEXT:    [[I16:%.*]] = getelementptr i8, ptr [[I11]], i64 8
+; CHECK-NEXT:    [[I17:%.*]] = load i64, ptr [[I16]], align 1
+; CHECK-NEXT:    store i64 [[I17]], ptr [[I6]], align 1
+; CHECK-NEXT:    [[I18:%.*]] = getelementptr i8, ptr [[I6]], i64 8
+; CHECK-NEXT:    [[I19:%.*]] = load i64, ptr [[I11]], align 1
+; CHECK-NEXT:    store i64 [[I19]], ptr [[I18]], align 1
+; CHECK-NEXT:    br label %[[BB20]]
+; CHECK:       [[BB20]]:
+; CHECK-NEXT:    [[I21]] = phi ptr [ [[I5]], %[[DEAD1]] ], [ [[I6]], %[[BB10]] ]
+; CHECK-NEXT:    br label %[[_LOOPEXIT49_1]]
+; CHECK:       [[BB22]]:
+; CHECK-NEXT:    [[I23:%.*]] = getelementptr i8, ptr [[I]], i64 [[ARG2]]
+; CHECK-NEXT:    [[I25:%.*]] = getelementptr i8, ptr [[I23]], i64 8
+; CHECK-NEXT:    br label %[[BB26:.*]]
+; CHECK:       [[BB26]]:
+; CHECK-NEXT:    [[I27:%.*]] = phi ptr [ null, %[[BB26]] ], [ [[I25]], %[[BB22]] ]
+; CHECK-NEXT:    store i64 0, ptr [[I27]], align 1
+; CHECK-NEXT:    [[I28:%.*]] = getelementptr i8, ptr [[I27]], i64 8
+; CHECK-NEXT:    [[I29:%.*]] = load i64, ptr [[I23]], align 1
+; CHECK-NEXT:    store i64 0, ptr [[I28]], align 1
+; CHECK-NEXT:    br label %[[BB26]]
+;
+entry:
+  br label %loop1
+
+loop1:                                            ; preds = %bb20, %entry
+  %i = phi ptr [ %i21, %bb20 ], [ %p1, %entry ]
+  br i1 false, label %bb22, label %.preheader48.preheader.1
+
+dead:                                             ; No predecessors!
+  br label %.preheader48.preheader.1
+
+.preheader48.preheader.1:                         ; preds = %dead, %loop1
+  %i5 = phi ptr [ %i, %dead ], [ %i, %loop1 ]
+  br label %.loopexit49.1
+
+dead1:                                    ; No predecessors!
+  br i1 false, label %.loopexit49.1, label %bb20
+
+.loopexit49.1:                                    ; preds = %dead1, %.preheader48.preheader.1
+  %i6 = phi ptr [ %i5, %dead1 ], [ %i5, %.preheader48.preheader.1 ]
+  %i7 = getelementptr i8, ptr %i6, i64 %arg1
+  br label %bb10
+
+dead2:                                            ; No predecessors!
+  br label %bb10
+
+bb10:                                             ; preds = %dead2, %.loopexit49.1
+  %i11 = phi ptr [ %i7, %.loopexit49.1 ], [ null, %dead2 ]
+  %i16 = getelementptr i8, ptr %i11, i64 8
+  %i17 = load i64, ptr %i16, align 1
+  store i64 %i17, ptr %i6, align 1
+  %i18 = getelementptr i8, ptr %i6, i64 8
+  %i19 = load i64, ptr %i11, align 1
+  store i64 %i19, ptr %i18, align 1
+  br label %bb20
+
+bb20:                                             ; preds = %bb10, %dead1
+  %i21 = phi ptr [ %i5, %dead1 ], [ %i6, %bb10 ]
+  br label %loop1
+
+bb22:                                             ; preds = %loop1
+  %i23 = getelementptr i8, ptr %i, i64 %arg2
+  %i25 = getelementptr i8, ptr %i23, i64 8
+  br label %bb26
+
+bb26:                                             ; preds = %bb26, %bb22
+  %i27 = phi ptr [ null, %bb26 ], [ %i25, %bb22 ]
+  store i64 0, ptr %i27, align 1
+  %i28 = getelementptr i8, ptr %i27, i64 8
+  %i29 = load i64, ptr %i23, align 1
+  store i64 0, ptr %i28, align 1
+  br label %bb26
+}

>From 58f851dfb66dcd0af89d0e2da483a358c3643114 Mon Sep 17 00:00:00 2001
From: Daniil Kovalev <dkovalev at accesssoftek.com>
Date: Thu, 25 Jul 2024 22:21:03 +0300
Subject: [PATCH 43/91] [PAC] Sign LR with B key for non-leaf functions with
 ptrauth-returns attr (#100552)

For the pauthtest ABI, there are a number of ptrauth-* options, including
ptrauth-returns. Use the "ptrauth-returns" function attribute to indicate
the need for LR signing with the B key in non-leaf functions, instead of
"sign-return-address" and "sign-return-address-key", which were
originally designed for pac-ret.
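
The intended precedence can be sketched as below. This is an
illustration under assumptions, not the AArch64 backend code:
`DemoAttrs`, `DemoSigningDecision` and `demoDecide` are hypothetical
names, function attributes are modelled as a string map, and the legacy
branch is simplified (for example, it omits the Windows default and
assumes "b_key" is the attribute value that selects the B key).

```cpp
#include <cassert>
#include <map>
#include <string>

using DemoAttrs = std::map<std::string, std::string>;

struct DemoSigningDecision {
  bool SignReturnAddress; // sign LR in functions that spill it
  bool SignAll;           // also sign leaf functions
  bool UseBKey;           // sign with the B key rather than the A key
};

inline DemoSigningDecision demoDecide(const DemoAttrs &Attrs) {
  // "ptrauth-returns" takes precedence: non-leaf signing with the B key.
  if (Attrs.count("ptrauth-returns"))
    return {true, false, true};

  // Simplified pac-ret path driven by the older attributes.
  DemoSigningDecision D = {false, false, false};
  auto Scope = Attrs.find("sign-return-address");
  if (Scope != Attrs.end()) {
    D.SignAll = Scope->second == "all";
    D.SignReturnAddress = D.SignAll || Scope->second == "non-leaf";
  }
  auto Key = Attrs.find("sign-return-address-key");
  D.UseBKey = Key != Attrs.end() && Key->second == "b_key";
  return D;
}
```

A function carrying only "ptrauth-returns" therefore gets the same
decision regardless of whether the pac-ret attributes are present.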

Co-authored-by: Ahmed Bougacha <ahmed at bougacha.org>
Co-authored-by: Anatoly Trosinenko <atrosinenko at accesssoftek.com>
(cherry picked from commit 56fd2472d887392855ad85c53df5782a2c3f8ddb)
---
 llvm/lib/Target/AArch64/AArch64InstrInfo.cpp  |   7 +-
 .../AArch64/AArch64MachineFunctionInfo.cpp    |   4 +
 .../lib/Target/AArch64/AArch64PointerAuth.cpp |   3 +-
 llvm/lib/Target/AArch64/AArch64Subtarget.cpp  |   9 +-
 llvm/lib/Target/AArch64/AArch64Subtarget.h    |   3 +-
 llvm/test/CodeGen/AArch64/ptrauth-ret-trap.ll |  98 ++++++++
 llvm/test/CodeGen/AArch64/ptrauth-ret.ll      | 225 ++++++++++++++++++
 7 files changed, 344 insertions(+), 5 deletions(-)
 create mode 100644 llvm/test/CodeGen/AArch64/ptrauth-ret-trap.ll
 create mode 100644 llvm/test/CodeGen/AArch64/ptrauth-ret.ll

diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 1b301a4a05fc5..377bcd5868fb6 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -8339,7 +8339,8 @@ AArch64InstrInfo::getOutliningCandidateInfo(
     NumBytesToCreateFrame += 8;
 
     // PAuth is enabled - set extra tail call cost, if any.
-    auto LRCheckMethod = Subtarget.getAuthenticatedLRCheckMethod();
+    auto LRCheckMethod = Subtarget.getAuthenticatedLRCheckMethod(
+        *RepeatedSequenceLocs[0].getMF());
     NumBytesToCheckLRInTCEpilogue =
         AArch64PAuth::getCheckerSizeInBytes(LRCheckMethod);
     // Checking the authenticated LR value may significantly impact
@@ -8700,6 +8701,10 @@ void AArch64InstrInfo::mergeOutliningCandidateAttributes(
   // behaviour of one of them
   const auto &CFn = Candidates.front().getMF()->getFunction();
 
+  if (CFn.hasFnAttribute("ptrauth-returns"))
+    F.addFnAttr(CFn.getFnAttribute("ptrauth-returns"));
+  if (CFn.hasFnAttribute("ptrauth-auth-traps"))
+    F.addFnAttr(CFn.getFnAttribute("ptrauth-auth-traps"));
   // Since all candidates belong to the same module, just copy the
   // function-level attributes of an arbitrary function.
   if (CFn.hasFnAttribute("sign-return-address"))
diff --git a/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.cpp b/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.cpp
index 201e8047b3686..e96c5a953ff2b 100644
--- a/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.cpp
@@ -38,6 +38,8 @@ void AArch64FunctionInfo::initializeBaseYamlFields(
 }
 
 static std::pair<bool, bool> GetSignReturnAddress(const Function &F) {
+  if (F.hasFnAttribute("ptrauth-returns"))
+    return {true, false}; // non-leaf
   // The function should be signed in the following situations:
   // - sign-return-address=all
   // - sign-return-address=non-leaf and the functions spills the LR
@@ -56,6 +58,8 @@ static std::pair<bool, bool> GetSignReturnAddress(const Function &F) {
 }
 
 static bool ShouldSignWithBKey(const Function &F, const AArch64Subtarget &STI) {
+  if (F.hasFnAttribute("ptrauth-returns"))
+    return true;
   if (!F.hasFnAttribute("sign-return-address-key")) {
     if (STI.getTargetTriple().isOSWindows())
       return true;
diff --git a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
index 465e689d4a7a5..92ab4b5c3d251 100644
--- a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
+++ b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
@@ -341,7 +341,8 @@ bool AArch64PointerAuth::checkAuthenticatedLR(
   AArch64PACKey::ID KeyId =
       MFnI->shouldSignWithBKey() ? AArch64PACKey::IB : AArch64PACKey::IA;
 
-  AuthCheckMethod Method = Subtarget->getAuthenticatedLRCheckMethod();
+  AuthCheckMethod Method =
+      Subtarget->getAuthenticatedLRCheckMethod(*TI->getMF());
 
   if (Method == AuthCheckMethod::None)
     return false;
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
index 32a355fe38f1c..642006e706c13 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
@@ -565,8 +565,13 @@ bool AArch64Subtarget::useAA() const { return UseAA; }
 // exception on its own. Later, if the callee spills the signed LR value and
 // neither FEAT_PAuth2 nor FEAT_EPAC are implemented, the valid PAC replaces
 // the higher bits of LR thus hiding the authentication failure.
-AArch64PAuth::AuthCheckMethod
-AArch64Subtarget::getAuthenticatedLRCheckMethod() const {
+AArch64PAuth::AuthCheckMethod AArch64Subtarget::getAuthenticatedLRCheckMethod(
+    const MachineFunction &MF) const {
+  // TODO: Check subtarget for the scheme. Present variant is a default for
+  // pauthtest ABI.
+  if (MF.getFunction().hasFnAttribute("ptrauth-returns") &&
+      MF.getFunction().hasFnAttribute("ptrauth-auth-traps"))
+    return AArch64PAuth::AuthCheckMethod::HighBitsNoTBI;
   if (AuthenticatedLRCheckMethod.getNumOccurrences())
     return AuthenticatedLRCheckMethod;
 
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.h b/llvm/lib/Target/AArch64/AArch64Subtarget.h
index e585aad2f7a68..0f3a637f98fbe 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.h
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.h
@@ -413,7 +413,8 @@ class AArch64Subtarget final : public AArch64GenSubtargetInfo {
   }
 
   /// Choose a method of checking LR before performing a tail call.
-  AArch64PAuth::AuthCheckMethod getAuthenticatedLRCheckMethod() const;
+  AArch64PAuth::AuthCheckMethod
+  getAuthenticatedLRCheckMethod(const MachineFunction &MF) const;
 
   /// Compute the integer discriminator for a given BlockAddress constant, if
   /// blockaddress signing is enabled, or std::nullopt otherwise.
diff --git a/llvm/test/CodeGen/AArch64/ptrauth-ret-trap.ll b/llvm/test/CodeGen/AArch64/ptrauth-ret-trap.ll
new file mode 100644
index 0000000000000..42a3050eda112
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/ptrauth-ret-trap.ll
@@ -0,0 +1,98 @@
+; RUN: llc -mtriple aarch64-linux-gnu -mattr=+pauth -asm-verbose=false -disable-post-ra -o - %s | FileCheck %s
+
+; CHECK-LABEL:  test_tailcall:
+; CHECK-NEXT:   pacibsp
+; CHECK-NEXT:   str x30, [sp, #-16]!
+; CHECK-NEXT:   bl bar
+; CHECK-NEXT:   ldr x30, [sp], #16
+; CHECK-NEXT:   autibsp
+; CHECK-NEXT:   eor x16, x30, x30, lsl #1
+; CHECK-NEXT:   tbnz x16, #62, [[BAD:.L.*]]
+; CHECK-NEXT:   b bar
+; CHECK-NEXT:   [[BAD]]:
+; CHECK-NEXT:   brk #0xc471
+define i32 @test_tailcall() #0 {
+  call i32 @bar()
+  %c = tail call i32 @bar()
+  ret i32 %c
+}
+
+; CHECK-LABEL: test_tailcall_noframe:
+; CHECK-NEXT:  b bar
+define i32 @test_tailcall_noframe() #0 {
+  %c = tail call i32 @bar()
+  ret i32 %c
+}
+
+; CHECK-LABEL: test_tailcall_indirect:
+; CHECK:         autibsp
+; CHECK:         eor     x16, x30, x30, lsl #1
+; CHECK:         tbnz    x16, #62, [[BAD:.L.*]]
+; CHECK:         br      x0
+; CHECK: [[BAD]]:
+; CHECK:         brk     #0xc471
+define void @test_tailcall_indirect(ptr %fptr) #0 {
+  call i32 @test_tailcall()
+  tail call void %fptr()
+  ret void
+}
+
+; CHECK-LABEL: test_tailcall_indirect_in_x9:
+; CHECK:         autibsp
+; CHECK:         eor     x16, x30, x30, lsl #1
+; CHECK:         tbnz    x16, #62, [[BAD:.L.*]]
+; CHECK:         br      x9
+; CHECK: [[BAD]]:
+; CHECK:         brk     #0xc471
+define void @test_tailcall_indirect_in_x9(ptr sret(i64) %ret, [8 x i64] %in, ptr %fptr) #0 {
+  %ptr = alloca i8, i32 16
+  call i32 @test_tailcall()
+  tail call void %fptr(ptr sret(i64) %ret, [8 x i64] %in)
+  ret void
+}
+
+; CHECK-LABEL: test_auth_tailcall_indirect:
+; CHECK:         autibsp
+; CHECK:         eor     x16, x30, x30, lsl #1
+; CHECK:         tbnz    x16, #62, [[BAD:.L.*]]
+; CHECK:         mov x16, #42
+; CHECK:         braa      x0, x16
+; CHECK: [[BAD]]:
+; CHECK:         brk     #0xc471
+define void @test_auth_tailcall_indirect(ptr %fptr) #0 {
+  call i32 @test_tailcall()
+  tail call void %fptr() [ "ptrauth"(i32 0, i64 42) ]
+  ret void
+}
+
+; CHECK-LABEL: test_auth_tailcall_indirect_in_x9:
+; CHECK:         autibsp
+; CHECK:         eor     x16, x30, x30, lsl #1
+; CHECK:         tbnz    x16, #62, [[BAD:.L.*]]
+; CHECK:         brabz      x9
+; CHECK: [[BAD]]:
+; CHECK:         brk     #0xc471
+define void @test_auth_tailcall_indirect_in_x9(ptr sret(i64) %ret, [8 x i64] %in, ptr %fptr) #0 {
+  %ptr = alloca i8, i32 16
+  call i32 @test_tailcall()
+  tail call void %fptr(ptr sret(i64) %ret, [8 x i64] %in) [ "ptrauth"(i32 1, i64 0) ]
+  ret void
+}
+
+; CHECK-LABEL: test_auth_tailcall_indirect_bti:
+; CHECK:         autibsp
+; CHECK:         eor     x17, x30, x30, lsl #1
+; CHECK:         tbnz    x17, #62, [[BAD:.L.*]]
+; CHECK:         brabz      x16
+; CHECK: [[BAD]]:
+; CHECK:         brk     #0xc471
+define void @test_auth_tailcall_indirect_bti(ptr sret(i64) %ret, [8 x i64] %in, ptr %fptr) #0 "branch-target-enforcement"="true" {
+  %ptr = alloca i8, i32 16
+  call i32 @test_tailcall()
+  tail call void %fptr(ptr sret(i64) %ret, [8 x i64] %in) [ "ptrauth"(i32 1, i64 0) ]
+  ret void
+}
+
+declare i32 @bar()
+
+attributes #0 = { nounwind "ptrauth-returns" "ptrauth-auth-traps" }
diff --git a/llvm/test/CodeGen/AArch64/ptrauth-ret.ll b/llvm/test/CodeGen/AArch64/ptrauth-ret.ll
new file mode 100644
index 0000000000000..61f5f6d9d23b7
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/ptrauth-ret.ll
@@ -0,0 +1,225 @@
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -verify-machineinstrs -disable-post-ra \
+; RUN:   -global-isel=0 -o - %s | FileCheck %s
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -verify-machineinstrs -disable-post-ra \
+; RUN:   -global-isel=1 -global-isel-abort=1 -o - %s | FileCheck %s
+
+define i32 @test() #0 {
+; CHECK-LABEL: test:
+; CHECK:       %bb.0:
+; CHECK-NEXT:    str x19, [sp, #-16]!
+; CHECK-NEXT:    mov w0, wzr
+; CHECK-NEXT:    //APP
+; CHECK-NEXT:    //NO_APP
+; CHECK-NEXT:    ldr x19, [sp], #16
+; CHECK-NEXT:    ret
+  call void asm sideeffect "", "~{x19}"()
+  ret i32 0
+}
+
+define i32 @test_alloca() #0 {
+; CHECK-LABEL: test_alloca:
+; CHECK:       %bb.0:
+; CHECK-NEXT:    sub sp, sp, #32
+; CHECK-NEXT:    mov x8, sp
+; CHECK-NEXT:    mov w0, wzr
+; CHECK-NEXT:    //APP
+; CHECK-NEXT:    //NO_APP
+; CHECK-NEXT:    add sp, sp, #32
+; CHECK-NEXT:    ret
+  %p = alloca i8, i32 32
+  call void asm sideeffect "", "r"(ptr %p)
+  ret i32 0
+}
+
+define i32 @test_realign_alloca() #0 {
+; CHECK-LABEL: test_realign_alloca:
+; CHECK:       %bb.0:
+; CHECK-NEXT:    pacibsp
+; CHECK-NEXT:    stp x29, x30, [sp, #-16]!
+; CHECK-NEXT:    mov x29, sp
+; CHECK-NEXT:    sub x9, sp, #112
+; CHECK-NEXT:    and sp, x9, #0xffffffffffffff80
+; CHECK-NEXT:    mov x8, sp
+; CHECK-NEXT:    mov w0, wzr
+; CHECK-NEXT:    //APP
+; CHECK-NEXT:    //NO_APP
+; CHECK-NEXT:    mov sp, x29
+; CHECK-NEXT:    ldp x29, x30, [sp], #16
+; CHECK-NEXT:    retab
+  %p = alloca i8, i32 32, align 128
+  call void asm sideeffect "", "r"(ptr %p)
+  ret i32 0
+}
+
+define i32 @test_big_alloca() #0 {
+; CHECK-LABEL: test_big_alloca:
+; CHECK:       %bb.0:
+; CHECK-NEXT:    str x29, [sp, #-16]!
+; CHECK-NEXT:    sub sp, sp, #1024
+; CHECK-NEXT:    mov x8, sp
+; CHECK-NEXT:    mov w0, wzr
+; CHECK-NEXT:    //APP
+; CHECK-NEXT:    //NO_APP
+; CHECK-NEXT:    add sp, sp, #1024
+; CHECK-NEXT:    ldr x29, [sp], #16
+; CHECK-NEXT:    ret
+  %p = alloca i8, i32 1024
+  call void asm sideeffect "", "r"(ptr %p)
+  ret i32 0
+}
+
+define i32 @test_var_alloca(i32 %s) #0 {
+  %p = alloca i8, i32 %s
+  call void asm sideeffect "", "r"(ptr %p)
+  ret i32 0
+}
+
+define i32 @test_noframe_saved(ptr %p) #0 {
+; CHECK-LABEL: test_noframe_saved:
+; CHECK:       %bb.0:
+
+
+; CHECK-NEXT:  str     x29, [sp, #-96]!
+; CHECK-NEXT:  stp     x28, x27, [sp, #16]
+; CHECK-NEXT:  stp     x26, x25, [sp, #32]
+; CHECK-NEXT:  stp     x24, x23, [sp, #48]
+; CHECK-NEXT:  stp     x22, x21, [sp, #64]
+; CHECK-NEXT:  stp     x20, x19, [sp, #80]
+; CHECK-NEXT:  ldr     w29, [x0]
+; CHECK-NEXT:  //APP
+; CHECK-NEXT:  //NO_APP
+; CHECK-NEXT:  mov     w0, w29
+; CHECK-NEXT:  ldp     x20, x19, [sp, #80]
+; CHECK-NEXT:  ldp     x22, x21, [sp, #64]
+; CHECK-NEXT:  ldp     x24, x23, [sp, #48]
+; CHECK-NEXT:  ldp     x26, x25, [sp, #32]
+; CHECK-NEXT:  ldp     x28, x27, [sp, #16]
+; CHECK-NEXT:  ldr     x29, [sp], #96
+; CHECK-NEXT:  ret
+  %v = load i32, ptr %p
+  call void asm sideeffect "", "~{x0},~{x1},~{x2},~{x3},~{x4},~{x5},~{x6},~{x7},~{x8},~{x9},~{x10},~{x11},~{x12},~{x13},~{x14},~{x15},~{x16},~{x17},~{x18},~{x19},~{x20},~{x21},~{x22},~{x23},~{x24},~{x25},~{x26},~{x27},~{x28}"()
+  ret i32 %v
+}
+
+define void @test_noframe() #0 {
+; CHECK-LABEL: test_noframe:
+; CHECK:       %bb.0:
+; CHECK-NEXT:    ret
+  ret void
+}
+
+; FIXME: Inefficient lowering of @llvm.returnaddress
+define ptr @test_returnaddress_0() #0 {
+; CHECK-LABEL: test_returnaddress_0:
+; CHECK:       %bb.0:
+; CHECK-NEXT:    pacibsp
+; CHECK-NEXT:    str x30, [sp, #-16]!
+; CHECK-NEXT:    xpaci x30
+; CHECK-NEXT:    mov x0, x30
+; CHECK-NEXT:    ldr x30, [sp], #16
+; CHECK-NEXT:    retab
+  %r = call ptr @llvm.returnaddress(i32 0)
+  ret ptr %r
+}
+
+define ptr @test_returnaddress_1() #0 {
+; CHECK-LABEL: test_returnaddress_1:
+; CHECK:       %bb.0:
+; CHECK-NEXT:    pacibsp
+; CHECK-NEXT:    stp x29, x30, [sp, #-16]!
+; CHECK-NEXT:    mov x29, sp
+; CHECK-NEXT:    ldr x8, [x29]
+; CHECK-NEXT:    ldr x0, [x8, #8]
+; CHECK-NEXT:    xpaci x0
+; CHECK-NEXT:    ldp x29, x30, [sp], #16
+; CHECK-NEXT:    retab
+  %r = call ptr @llvm.returnaddress(i32 1)
+  ret ptr %r
+}
+
+define void @test_noframe_alloca() #0 {
+; CHECK-LABEL: test_noframe_alloca:
+; CHECK:       %bb.0:
+; CHECK-NEXT:    sub sp, sp, #16
+; CHECK-NEXT:    add x8, sp, #12
+; CHECK-NEXT:    //APP
+; CHECK-NEXT:    //NO_APP
+; CHECK-NEXT:    add sp, sp, #16
+; CHECK-NEXT:    ret
+  %p = alloca i8, i32 1
+  call void asm sideeffect "", "r"(ptr %p)
+  ret void
+}
+
+define void @test_call() #0 {
+; CHECK-LABEL: test_call:
+; CHECK:       %bb.0:
+; CHECK-NEXT:    pacibsp
+; CHECK-NEXT:    str x30, [sp, #-16]!
+; CHECK-NEXT:    bl bar
+; CHECK-NEXT:    ldr x30, [sp], #16
+; CHECK-NEXT:    retab
+  call i32 @bar()
+  ret void
+}
+
+define void @test_call_alloca() #0 {
+; CHECK-LABEL: test_call_alloca:
+; CHECK:       %bb.0:
+; CHECK-NEXT:    pacibsp
+; CHECK-NEXT:    str x30, [sp, #-16]
+; CHECK-NEXT:    bl bar
+; CHECK-NEXT:    ldr x30, [sp], #16
+; CHECK-NEXT:    retab
+  alloca i8
+  call i32 @bar()
+  ret void
+}
+
+define void @test_call_shrinkwrapping(i1 %c) #0 {
+; CHECK-LABEL: test_call_shrinkwrapping:
+; CHECK:       %bb.0:
+; CHECK-NEXT:    tbz w0, #0, .LBB12_2
+; CHECK-NEXT:  %bb.1:
+; CHECK-NEXT:    pacibsp
+; CHECK-NEXT:    str x30, [sp, #-16]!
+; CHECK-NEXT:    bl bar
+; CHECK-NEXT:    ldr x30, [sp], #16
+; CHECK-NEXT:    autibsp
+; CHECK-NEXT:  LBB12_2:
+; CHECK-NEXT:    ret
+  br i1 %c, label %tbb, label %fbb
+tbb:
+  call i32 @bar()
+  br label %fbb
+fbb:
+  ret void
+}
+
+define i32 @test_tailcall() #0 {
+; CHECK-LABEL: test_tailcall:
+; CHECK:       %bb.0:
+; CHECK-NEXT:    pacibsp
+; CHECK-NEXT:    str x30, [sp, #-16]!
+; CHECK-NEXT:    bl bar
+; CHECK-NEXT:    ldr x30, [sp], #16
+; CHECK-NEXT:    autibsp
+; CHECK-NEXT:    b bar
+  call i32 @bar()
+  %c = tail call i32 @bar()
+  ret i32 %c
+}
+
+define i32 @test_tailcall_noframe() #0 {
+; CHECK-LABEL: test_tailcall_noframe:
+; CHECK:       %bb.0:
+; CHECK-NEXT:    b bar
+  %c = tail call i32 @bar()
+  ret i32 %c
+}
+
+declare i32 @bar()
+
+declare ptr @llvm.returnaddress(i32)
+
+attributes #0 = { nounwind "ptrauth-returns" }

>From 2a986c55d135d1da7268c73ede08789497b7f992 Mon Sep 17 00:00:00 2001
From: Abid Qadeer <haqadeer at amd.com>
Date: Thu, 25 Jul 2024 13:52:50 +0100
Subject: [PATCH 44/91] [flang][debug] Set scope of internal functions
 correctly. (#99531)

Summary:
Functions internal to a subroutine should have their scope set to the
parent function. This allows a user to evaluate local variables of the
parent function when control is stopped in the child.
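
As a rough illustration only, the parent-first scope resolution can be
modelled in a few lines of Python. This is a toy sketch, not the flang
or MLIR API: `Func`, `attach_scope`, and the string scopes are
hypothetical names, and module scopes are ignored.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Func:
    """Hypothetical stand-in for a function op in the symbol table."""
    name: str
    host: Optional[str] = None   # enclosing (host) procedure, if any
    scope: Optional[str] = None  # filled in once debug info is attached


def attach_scope(name: str, table: Dict[str, Func],
                 file_scope: str = "file") -> str:
    """Attach a scope to `name`, processing its host first if needed."""
    func = table[name]
    if func.scope is not None:
        # Debug info already set: nothing to do.
        return func.scope
    if func.host is not None and func.host in table:
        # Make sure the parent is processed before the child refers to it.
        attach_scope(func.host, table, file_scope)
        func.scope = f"subprogram:{func.host}"
    else:
        func.scope = file_scope
    return func.scope


table = {
    "child1": Func("child1", host="mod_sub"),
    "mod_sub": Func("mod_sub"),
}
attach_scope("child1", table)
```

The child is visited first here on purpose: the recursive call is what
lets the result be independent of the order in which functions are
walked.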

Fixes #96314

Differential Revision: https://phabricator.intern.facebook.com/D60250527

(cherry picked from commit 626022bfd18f335ef62a461992a05dfed4e6d715)
---
 .../lib/Optimizer/Transforms/AddDebugInfo.cpp | 181 ++++++++++--------
 flang/test/Transforms/debug-96314.fir         |  26 +++
 2 files changed, 132 insertions(+), 75 deletions(-)
 create mode 100644 flang/test/Transforms/debug-96314.fir

diff --git a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp
index 685a8645fa2fc..8751a3b2c322f 100644
--- a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp
+++ b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp
@@ -66,6 +66,9 @@ class AddDebugInfoPass : public fir::impl::AddDebugInfoBase<AddDebugInfoPass> {
   void handleGlobalOp(fir::GlobalOp glocalOp, mlir::LLVM::DIFileAttr fileAttr,
                       mlir::LLVM::DIScopeAttr scope,
                       mlir::SymbolTable *symbolTable);
+  void handleFuncOp(mlir::func::FuncOp funcOp, mlir::LLVM::DIFileAttr fileAttr,
+                    mlir::LLVM::DICompileUnitAttr cuAttr,
+                    mlir::SymbolTable *symbolTable);
 };
 
 static uint32_t getLineFromLoc(mlir::Location loc) {
@@ -204,11 +207,112 @@ void AddDebugInfoPass::handleGlobalOp(fir::GlobalOp globalOp,
   globalOp->setLoc(builder.getFusedLoc({globalOp->getLoc()}, gvAttr));
 }
 
+void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp,
+                                    mlir::LLVM::DIFileAttr fileAttr,
+                                    mlir::LLVM::DICompileUnitAttr cuAttr,
+                                    mlir::SymbolTable *symbolTable) {
+  mlir::Location l = funcOp->getLoc();
+  // If fused location has already been created then nothing to do
+  // Otherwise, create a fused location.
+  if (debugInfoIsAlreadySet(l))
+    return;
+
+  mlir::ModuleOp module = getOperation();
+  mlir::MLIRContext *context = &getContext();
+  mlir::OpBuilder builder(context);
+  llvm::StringRef fileName(fileAttr.getName());
+  llvm::StringRef filePath(fileAttr.getDirectory());
+  unsigned int CC = (funcOp.getName() == fir::NameUniquer::doProgramEntry())
+                        ? llvm::dwarf::getCallingConvention("DW_CC_program")
+                        : llvm::dwarf::getCallingConvention("DW_CC_normal");
+
+  if (auto funcLoc = mlir::dyn_cast<mlir::FileLineColLoc>(l)) {
+    fileName = llvm::sys::path::filename(funcLoc.getFilename().getValue());
+    filePath = llvm::sys::path::parent_path(funcLoc.getFilename().getValue());
+  }
+
+  mlir::StringAttr fullName = mlir::StringAttr::get(context, funcOp.getName());
+  mlir::Attribute attr = funcOp->getAttr(fir::getInternalFuncNameAttrName());
+  mlir::StringAttr funcName =
+      (attr) ? mlir::cast<mlir::StringAttr>(attr)
+             : mlir::StringAttr::get(context, funcOp.getName());
+
+  auto result = fir::NameUniquer::deconstruct(funcName);
+  funcName = mlir::StringAttr::get(context, result.second.name);
+
+  llvm::SmallVector<mlir::LLVM::DITypeAttr> types;
+  fir::DebugTypeGenerator typeGen(module);
+  for (auto resTy : funcOp.getResultTypes()) {
+    auto tyAttr = typeGen.convertType(resTy, fileAttr, cuAttr, funcOp.getLoc());
+    types.push_back(tyAttr);
+  }
+  for (auto inTy : funcOp.getArgumentTypes()) {
+    auto tyAttr = typeGen.convertType(fir::unwrapRefType(inTy), fileAttr,
+                                      cuAttr, funcOp.getLoc());
+    types.push_back(tyAttr);
+  }
+
+  mlir::LLVM::DISubroutineTypeAttr subTypeAttr =
+      mlir::LLVM::DISubroutineTypeAttr::get(context, CC, types);
+  mlir::LLVM::DIFileAttr funcFileAttr =
+      mlir::LLVM::DIFileAttr::get(context, fileName, filePath);
+
+  // Only definitions need a distinct identifier and a compilation unit.
+  mlir::DistinctAttr id;
+  mlir::LLVM::DIScopeAttr Scope = fileAttr;
+  mlir::LLVM::DICompileUnitAttr compilationUnit;
+  mlir::LLVM::DISubprogramFlags subprogramFlags =
+      mlir::LLVM::DISubprogramFlags{};
+  if (isOptimized)
+    subprogramFlags = mlir::LLVM::DISubprogramFlags::Optimized;
+  if (!funcOp.isExternal()) {
+    id = mlir::DistinctAttr::create(mlir::UnitAttr::get(context));
+    compilationUnit = cuAttr;
+    subprogramFlags =
+        subprogramFlags | mlir::LLVM::DISubprogramFlags::Definition;
+  }
+  unsigned line = getLineFromLoc(l);
+  if (fir::isInternalProcedure(funcOp)) {
+    // For contained functions, the scope is the parent subroutine.
+    mlir::SymbolRefAttr sym = mlir::cast<mlir::SymbolRefAttr>(
+        funcOp->getAttr(fir::getHostSymbolAttrName()));
+    if (sym) {
+      if (auto func =
+              symbolTable->lookup<mlir::func::FuncOp>(sym.getLeafReference())) {
+        // Make sure that parent is processed.
+        handleFuncOp(func, fileAttr, cuAttr, symbolTable);
+        if (auto fusedLoc =
+                mlir::dyn_cast_if_present<mlir::FusedLoc>(func.getLoc())) {
+          if (auto spAttr =
+                  mlir::dyn_cast_if_present<mlir::LLVM::DISubprogramAttr>(
+                      fusedLoc.getMetadata()))
+            Scope = spAttr;
+        }
+      }
+    }
+  } else if (!result.second.modules.empty()) {
+    Scope = getOrCreateModuleAttr(result.second.modules[0], fileAttr, cuAttr,
+                                  line - 1, false);
+  }
+
+  auto spAttr = mlir::LLVM::DISubprogramAttr::get(
+      context, id, compilationUnit, Scope, funcName, fullName, funcFileAttr,
+      line, line, subprogramFlags, subTypeAttr);
+  funcOp->setLoc(builder.getFusedLoc({funcOp->getLoc()}, spAttr));
+
+  // Don't process variables if user asked for line tables only.
+  if (debugLevel == mlir::LLVM::DIEmissionKind::LineTablesOnly)
+    return;
+
+  funcOp.walk([&](fir::cg::XDeclareOp declOp) {
+    handleDeclareOp(declOp, fileAttr, spAttr, typeGen, symbolTable);
+  });
+}
+
 void AddDebugInfoPass::runOnOperation() {
   mlir::ModuleOp module = getOperation();
   mlir::MLIRContext *context = &getContext();
   mlir::SymbolTable symbolTable(module);
-  mlir::OpBuilder builder(context);
   llvm::StringRef fileName;
   std::string filePath;
   // We need 2 type of file paths here.
@@ -245,80 +349,7 @@ void AddDebugInfoPass::runOnOperation() {
       isOptimized, debugLevel);
 
   module.walk([&](mlir::func::FuncOp funcOp) {
-    mlir::Location l = funcOp->getLoc();
-    // If fused location has already been created then nothing to do
-    // Otherwise, create a fused location.
-    if (debugInfoIsAlreadySet(l))
-      return;
-
-    unsigned int CC = (funcOp.getName() == fir::NameUniquer::doProgramEntry())
-                          ? llvm::dwarf::getCallingConvention("DW_CC_program")
-                          : llvm::dwarf::getCallingConvention("DW_CC_normal");
-
-    if (auto funcLoc = mlir::dyn_cast<mlir::FileLineColLoc>(l)) {
-      fileName = llvm::sys::path::filename(funcLoc.getFilename().getValue());
-      filePath = llvm::sys::path::parent_path(funcLoc.getFilename().getValue());
-    }
-
-    mlir::StringAttr fullName =
-        mlir::StringAttr::get(context, funcOp.getName());
-    mlir::Attribute attr = funcOp->getAttr(fir::getInternalFuncNameAttrName());
-    mlir::StringAttr funcName =
-        (attr) ? mlir::cast<mlir::StringAttr>(attr)
-               : mlir::StringAttr::get(context, funcOp.getName());
-
-    auto result = fir::NameUniquer::deconstruct(funcName);
-    funcName = mlir::StringAttr::get(context, result.second.name);
-
-    llvm::SmallVector<mlir::LLVM::DITypeAttr> types;
-    fir::DebugTypeGenerator typeGen(module);
-    for (auto resTy : funcOp.getResultTypes()) {
-      auto tyAttr =
-          typeGen.convertType(resTy, fileAttr, cuAttr, funcOp.getLoc());
-      types.push_back(tyAttr);
-    }
-    for (auto inTy : funcOp.getArgumentTypes()) {
-      auto tyAttr = typeGen.convertType(fir::unwrapRefType(inTy), fileAttr,
-                                        cuAttr, funcOp.getLoc());
-      types.push_back(tyAttr);
-    }
-
-    mlir::LLVM::DISubroutineTypeAttr subTypeAttr =
-        mlir::LLVM::DISubroutineTypeAttr::get(context, CC, types);
-    mlir::LLVM::DIFileAttr funcFileAttr =
-        mlir::LLVM::DIFileAttr::get(context, fileName, filePath);
-
-    // Only definitions need a distinct identifier and a compilation unit.
-    mlir::DistinctAttr id;
-    mlir::LLVM::DIScopeAttr Scope = fileAttr;
-    mlir::LLVM::DICompileUnitAttr compilationUnit;
-    mlir::LLVM::DISubprogramFlags subprogramFlags =
-        mlir::LLVM::DISubprogramFlags{};
-    if (isOptimized)
-      subprogramFlags = mlir::LLVM::DISubprogramFlags::Optimized;
-    if (!funcOp.isExternal()) {
-      id = mlir::DistinctAttr::create(mlir::UnitAttr::get(context));
-      compilationUnit = cuAttr;
-      subprogramFlags =
-          subprogramFlags | mlir::LLVM::DISubprogramFlags::Definition;
-    }
-    unsigned line = getLineFromLoc(l);
-    if (!result.second.modules.empty())
-      Scope = getOrCreateModuleAttr(result.second.modules[0], fileAttr, cuAttr,
-                                    line - 1, false);
-
-    auto spAttr = mlir::LLVM::DISubprogramAttr::get(
-        context, id, compilationUnit, Scope, funcName, fullName, funcFileAttr,
-        line, line, subprogramFlags, subTypeAttr);
-    funcOp->setLoc(builder.getFusedLoc({funcOp->getLoc()}, spAttr));
-
-    // Don't process variables if user asked for line tables only.
-    if (debugLevel == mlir::LLVM::DIEmissionKind::LineTablesOnly)
-      return;
-
-    funcOp.walk([&](fir::cg::XDeclareOp declOp) {
-      handleDeclareOp(declOp, fileAttr, spAttr, typeGen, &symbolTable);
-    });
+    handleFuncOp(funcOp, fileAttr, cuAttr, &symbolTable);
   });
   // Process any global which was not processed through DeclareOp.
   if (debugLevel == mlir::LLVM::DIEmissionKind::Full) {
diff --git a/flang/test/Transforms/debug-96314.fir b/flang/test/Transforms/debug-96314.fir
new file mode 100644
index 0000000000000..e2d0f24a1105c
--- /dev/null
+++ b/flang/test/Transforms/debug-96314.fir
@@ -0,0 +1,26 @@
+// RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s -o - | FileCheck %s
+
+module attributes {dlti.dl_spec = #dlti.dl_spec<>} {
+  func.func @_QMhelperPmod_sub(%arg0: !fir.ref<i32> {fir.bindc_name = "a"} ) {
+    return
+  } loc(#loc1)
+  func.func private @_QMhelperFmod_subPchild1(%arg0: !fir.ref<i32> {fir.bindc_name = "b"} ) attributes {fir.host_symbol = @_QMhelperPmod_sub, llvm.linkage = #llvm.linkage<internal>} {
+    return
+  } loc(#loc2)
+  func.func @global_sub_(%arg0: !fir.ref<i32> {fir.bindc_name = "n"} ) attributes {fir.internal_name = "_QPglobal_sub"} {
+    return
+  } loc(#loc3)
+  func.func private @_QFglobal_subPchild2(%arg0: !fir.ref<i32> {fir.bindc_name = "c"}) attributes {fir.host_symbol = @global_sub_, llvm.linkage = #llvm.linkage<internal>} {
+    return
+  } loc(#loc4)
+}
+
+#loc1 = loc("test.f90":5:1)
+#loc2 = loc("test.f90":15:1)
+#loc3 = loc("test.f90":25:1)
+#loc4 = loc("test.f90":35:1)
+
+// CHECK-DAG: #[[SP1:.*]] = #llvm.di_subprogram<{{.*}}name = "mod_sub"{{.*}}>
+// CHECK-DAG: #llvm.di_subprogram<{{.*}}scope = #[[SP1]], name = "child1"{{.*}}>
+// CHECK-DAG: #[[SP2:.*]] = #llvm.di_subprogram<{{.*}}linkageName = "global_sub_"{{.*}}>
+// CHECK-DAG: #llvm.di_subprogram<{{.*}}scope = #[[SP2]], name = "child2"{{.*}}>

>From 8bbb8ba5e7178a5f61361448d7bc345d3f29b997 Mon Sep 17 00:00:00 2001
From: Tobias Hieta <tobias at hieta.se>
Date: Tue, 23 Jul 2024 16:52:51 +0200
Subject: [PATCH 45/91] [Utils] Updates to bump-version.py (#100089)

* Add support for --git flag to bump version for a git suffix
* Update location of the new file where the version is stored
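
A minimal standalone sketch of the flag handling described above,
assuming only the `--rc` and `--git` options shown in the diff below.
The helper name `version_suffix` and the empty-string fallback are
hypothetical and are not part of bump-version.py.

```python
import argparse


def version_suffix(argv):
    """Return the version suffix implied by --rc / --git (sketch only)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("version")
    parser.add_argument("--rc", default=None, type=int)
    parser.add_argument("--git", action="store_true")
    args = parser.parse_args(argv)

    # The two flags describe different kinds of builds, so reject both.
    if args.rc and args.git:
        raise RuntimeError("Can't specify --git and --rc at the same time!")

    if args.rc:
        return f"-rc{args.rc}"
    if args.git:
        return "git"
    # Assumption for this sketch: no flag means no suffix.
    return ""
```

For example, `version_suffix(["19.1.0", "--rc", "1"])` yields `-rc1`,
which is the form written to LLVM_VERSION_SUFFIX later in this series.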
---
 llvm/utils/release/bump-version.py | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/llvm/utils/release/bump-version.py b/llvm/utils/release/bump-version.py
index abff67ae926ac..b1799cba9363e 100755
--- a/llvm/utils/release/bump-version.py
+++ b/llvm/utils/release/bump-version.py
@@ -12,6 +12,9 @@
 
 
 class Processor:
+    def __init__(self, args):
+        self.args = args
+
     def process_line(self, line: str) -> str:
         raise NotImplementedError()
 
@@ -23,6 +26,13 @@ def process_file(self, fpath: Path, version: packaging.version.Version) -> None:
             version.micro,
             version.pre,
         )
+
+        if self.args.rc:
+            self.suffix = f"-rc{self.args.rc}"
+
+        if self.args.git:
+            self.suffix = "git"
+
         data = fpath.read_text()
         new_data = []
 
@@ -64,7 +74,7 @@ def process_line(self, line: str) -> str:
             if self.suffix:
                 nline = re.sub(
                     r"set\(LLVM_VERSION_SUFFIX(.*)\)",
-                    f"set(LLVM_VERSION_SUFFIX -{self.suffix[0]}{self.suffix[1]})",
+                    f"set(LLVM_VERSION_SUFFIX {self.suffix})",
                     line,
                 )
             else:
@@ -144,6 +154,7 @@ def process_line(self, line: str) -> str:
     )
     parser.add_argument("version", help="Version to bump to, e.g. 15.0.1", default=None)
     parser.add_argument("--rc", default=None, type=int, help="RC version")
+    parser.add_argument("--git", action="store_true", help="Git version")
     parser.add_argument(
         "-s",
         "--source-root",
@@ -153,9 +164,10 @@ def process_line(self, line: str) -> str:
 
     args = parser.parse_args()
 
+    if args.rc and args.git:
+        raise RuntimeError("Can't specify --git and --rc at the same time!")
+
     verstr = args.version
-    if args.rc:
-        verstr += f"-rc{args.rc}"
 
     # parse the version string with distutils.
     # note that -rc will end up as version.pre here
@@ -170,20 +182,20 @@ def process_line(self, line: str) -> str:
 
     files_to_update = (
         # Main CMakeLists.
-        (source_root / "llvm" / "CMakeLists.txt", CMakeProcessor()),
+        (source_root / "cmake" / "Modules" / "LLVMVersion.cmake", CMakeProcessor(args)),
         # Lit configuration
         (
             "llvm/utils/lit/lit/__init__.py",
-            LitProcessor(),
+            LitProcessor(args),
         ),
         # GN build system
         (
             "llvm/utils/gn/secondary/llvm/version.gni",
-            GNIProcessor(),
+            GNIProcessor(args),
         ),
         (
             "libcxx/include/__config",
-            LibCXXProcessor(),
+            LibCXXProcessor(args),
         ),
     )
 

>From 07c2140ed9a17a4c3485ad8dba7d8d0c78593ab0 Mon Sep 17 00:00:00 2001
From: Aiden Grossman <aidengrossman at google.com>
Date: Tue, 23 Jul 2024 12:54:18 -0700
Subject: [PATCH 46/91] [MLGO][Infra] Add mlgo-utils to bump-version script
 (#100186)

This patch adds support in the bump-version script for bumping the
version of the mlgo-utils package. This should bring the process for
that package in line with the rest of the project and avoid having to
update its version manually.
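
For illustration, a line-based rewrite of a `__versioninfo__` tuple
could look like the sketch below. The function `bump_versioninfo` is a
hypothetical stand-in; the actual LitProcessor is not shown in this
patch, so its exact behaviour is an assumption here.

```python
import re


def bump_versioninfo(line, major, minor, patch):
    """Rewrite the tuple on a __versioninfo__ line; pass others through."""
    if "__versioninfo__" not in line:
        return line
    # Replace the whole parenthesised tuple with the new version triple.
    return re.sub(r"\((.*)\)", f"({major}, {minor}, {patch})", line)
```

Applied to `__versioninfo__ = (19, 0, 0)` with 19, 1, 0 this gives the
line seen in the later "Set version to 19.1.0-rc1" patch.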
---
 llvm/utils/release/bump-version.py | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/llvm/utils/release/bump-version.py b/llvm/utils/release/bump-version.py
index b1799cba9363e..5db62e88fec1d 100755
--- a/llvm/utils/release/bump-version.py
+++ b/llvm/utils/release/bump-version.py
@@ -188,6 +188,11 @@ def process_line(self, line: str) -> str:
             "llvm/utils/lit/lit/__init__.py",
             LitProcessor(args),
         ),
+        # mlgo-utils configuration
+        (
+            "llvm/utils/mlgo-utils/mlgo/__init__.py",
+            LitProcessor(args),
+        ),
         # GN build system
         (
             "llvm/utils/gn/secondary/llvm/version.gni",

>From a4902a36d5c27c3a5199cd1ba91eba17910fdd68 Mon Sep 17 00:00:00 2001
From: Tobias Hieta <tobias at hieta.se>
Date: Fri, 26 Jul 2024 14:00:03 +0200
Subject: [PATCH 47/91] Set version to 19.1.0-rc1

---
 cmake/Modules/LLVMVersion.cmake        | 2 +-
 llvm/utils/mlgo-utils/mlgo/__init__.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/cmake/Modules/LLVMVersion.cmake b/cmake/Modules/LLVMVersion.cmake
index aea9b880180ab..1ad844cbea838 100644
--- a/cmake/Modules/LLVMVersion.cmake
+++ b/cmake/Modules/LLVMVersion.cmake
@@ -10,6 +10,6 @@ if(NOT DEFINED LLVM_VERSION_PATCH)
   set(LLVM_VERSION_PATCH 0)
 endif()
 if(NOT DEFINED LLVM_VERSION_SUFFIX)
-  set(LLVM_VERSION_SUFFIX git)
+  set(LLVM_VERSION_SUFFIX -rc1)
 endif()
 
diff --git a/llvm/utils/mlgo-utils/mlgo/__init__.py b/llvm/utils/mlgo-utils/mlgo/__init__.py
index c5b208cfba360..09e95f3d62058 100644
--- a/llvm/utils/mlgo-utils/mlgo/__init__.py
+++ b/llvm/utils/mlgo-utils/mlgo/__init__.py
@@ -4,7 +4,7 @@
 
 from datetime import timezone, datetime
 
-__versioninfo__ = (19, 0, 0)
+__versioninfo__ = (19, 1, 0)
 __version__ = (
     ".".join(str(v) for v in __versioninfo__)
     + "dev"

>From fb26667a0e53f27c55bec9e6e9dc97f05905d423 Mon Sep 17 00:00:00 2001
From: Owen Pan <owenpiano at gmail.com>
Date: Wed, 24 Jul 2024 19:22:18 -0700
Subject: [PATCH 48/91] Revert "[clang-format] Fix a bug in annotating `*` in
 `#define`s (#99433)"

This reverts commit ce1a87437cc143889665c41046107e84cdf6246e.

Closes #100304.

(cherry picked from commit 7e7a9069d4240d2ae619cb50eba09f948c537ce3)
---
 clang/lib/Format/TokenAnnotator.cpp           | 19 +++++-------------
 clang/unittests/Format/TokenAnnotatorTest.cpp | 20 -------------------
 2 files changed, 5 insertions(+), 34 deletions(-)

diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp
index 21924a8fe17d1..5c11f3cb1a874 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -372,6 +372,10 @@ class AnnotatingParser {
                OpeningParen.Previous->is(tok::kw__Generic)) {
       Contexts.back().ContextType = Context::C11GenericSelection;
       Contexts.back().IsExpression = true;
+    } else if (Line.InPPDirective &&
+               (!OpeningParen.Previous ||
+                OpeningParen.Previous->isNot(tok::identifier))) {
+      Contexts.back().IsExpression = true;
     } else if (Contexts[Contexts.size() - 2].CaretFound) {
       // This is the parameter list of an ObjC block.
       Contexts.back().IsExpression = false;
@@ -384,20 +388,7 @@ class AnnotatingParser {
                OpeningParen.Previous->MatchingParen->isOneOf(
                    TT_ObjCBlockLParen, TT_FunctionTypeLParen)) {
       Contexts.back().IsExpression = false;
-    } else if (Line.InPPDirective) {
-      auto IsExpr = [&OpeningParen] {
-        const auto *Tok = OpeningParen.Previous;
-        if (!Tok || Tok->isNot(tok::identifier))
-          return true;
-        Tok = Tok->Previous;
-        while (Tok && Tok->endsSequence(tok::coloncolon, tok::identifier)) {
-          assert(Tok->Previous);
-          Tok = Tok->Previous->Previous;
-        }
-        return !Tok || !Tok->Tok.getIdentifierInfo();
-      };
-      Contexts.back().IsExpression = IsExpr();
-    } else if (!Line.MustBeDeclaration) {
+    } else if (!Line.MustBeDeclaration && !Line.InPPDirective) {
       bool IsForOrCatch =
           OpeningParen.Previous &&
           OpeningParen.Previous->isOneOf(tok::kw_for, tok::kw_catch);
diff --git a/clang/unittests/Format/TokenAnnotatorTest.cpp b/clang/unittests/Format/TokenAnnotatorTest.cpp
index b01ca322505b1..51810ad047a26 100644
--- a/clang/unittests/Format/TokenAnnotatorTest.cpp
+++ b/clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -75,26 +75,6 @@ TEST_F(TokenAnnotatorTest, UnderstandsUsesOfStarAndAmp) {
   EXPECT_TOKEN(Tokens[10], tok::r_paren, TT_TypeDeclarationParen);
   EXPECT_TOKEN(Tokens[11], tok::star, TT_PointerOrReference);
 
-  Tokens = annotate("#define FOO bar(a * b)");
-  ASSERT_EQ(Tokens.size(), 10u) << Tokens;
-  EXPECT_TOKEN(Tokens[6], tok::star, TT_BinaryOperator);
-
-  Tokens = annotate("#define FOO foo.bar(a & b)");
-  ASSERT_EQ(Tokens.size(), 12u) << Tokens;
-  EXPECT_TOKEN(Tokens[8], tok::amp, TT_BinaryOperator);
-
-  Tokens = annotate("#define FOO foo::bar(a && b)");
-  ASSERT_EQ(Tokens.size(), 12u) << Tokens;
-  EXPECT_TOKEN(Tokens[8], tok::ampamp, TT_BinaryOperator);
-
-  Tokens = annotate("#define FOO foo bar(a *b)");
-  ASSERT_EQ(Tokens.size(), 11u) << Tokens;
-  EXPECT_TOKEN(Tokens[7], tok::star, TT_PointerOrReference);
-
-  Tokens = annotate("#define FOO void foo::bar(a &b)");
-  ASSERT_EQ(Tokens.size(), 13u) << Tokens;
-  EXPECT_TOKEN(Tokens[9], tok::amp, TT_PointerOrReference);
-
   Tokens = annotate("void f() {\n"
                     "  while (p < a && *p == 'a')\n"
                     "    p++;\n"

>From 102ecd39c421b3f1938b856d160c149a027eedac Mon Sep 17 00:00:00 2001
From: wanglei <wanglei at loongson.cn>
Date: Fri, 26 Jul 2024 14:36:54 +0800
Subject: [PATCH 49/91] [LoongArch][MC] Support %[ld_/gd_/desc_]pcrel_20

Reviewed By: SixWeining, MaskRay

Pull Request: https://github.com/llvm/llvm-project/pull/100104

(cherry picked from commit e27358c8ed7abac200546e808ea30a86aa9aa580)
---
 .../AsmParser/LoongArchAsmParser.cpp          | 24 +++++++++++++++++++
 .../Target/LoongArch/LoongArchInstrInfo.td    |  6 ++++-
 .../MCTargetDesc/LoongArchFixupKinds.h        |  8 +++++++
 .../MCTargetDesc/LoongArchMCCodeEmitter.cpp   | 12 ++++++++++
 .../MCTargetDesc/LoongArchMCExpr.cpp          | 15 ++++++++++++
 .../LoongArch/MCTargetDesc/LoongArchMCExpr.h  |  4 ++++
 .../test/MC/LoongArch/Basic/Integer/invalid.s |  6 +++--
 .../MC/LoongArch/Relocations/relocations.s    | 20 ++++++++++++++++
 8 files changed, 92 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/LoongArch/AsmParser/LoongArchAsmParser.cpp b/llvm/lib/Target/LoongArch/AsmParser/LoongArchAsmParser.cpp
index 208bd3db8f14e..f52e188f87792 100644
--- a/llvm/lib/Target/LoongArch/AsmParser/LoongArchAsmParser.cpp
+++ b/llvm/lib/Target/LoongArch/AsmParser/LoongArchAsmParser.cpp
@@ -450,6 +450,24 @@ class LoongArchOperand : public MCParsedAsmOperand {
                      IsValidKind;
   }
 
+  bool isSImm20pcaddi() const {
+    if (!isImm())
+      return false;
+
+    int64_t Imm;
+    LoongArchMCExpr::VariantKind VK = LoongArchMCExpr::VK_LoongArch_None;
+    bool IsConstantImm = evaluateConstantImm(getImm(), Imm, VK);
+    bool IsValidKind = VK == LoongArchMCExpr::VK_LoongArch_None ||
+                       VK == LoongArchMCExpr::VK_LoongArch_PCREL20_S2 ||
+                       VK == LoongArchMCExpr::VK_LoongArch_TLS_LD_PCREL20_S2 ||
+                       VK == LoongArchMCExpr::VK_LoongArch_TLS_GD_PCREL20_S2 ||
+                       VK == LoongArchMCExpr::VK_LoongArch_TLS_DESC_PCREL20_S2;
+    return IsConstantImm
+               ? isInt<20>(Imm) && IsValidKind
+               : LoongArchAsmParser::classifySymbolRef(getImm(), VK) &&
+                     IsValidKind;
+  }
+
   bool isSImm21lsl2() const {
     if (!isImm())
       return false;
@@ -1676,6 +1694,12 @@ bool LoongArchAsmParser::MatchAndEmitInstruction(SMLoc IDLoc, unsigned &Opcode,
         /*Upper=*/(1 << 19) - 1,
         "operand must be a symbol with modifier (e.g. %call36) or an integer "
         "in the range");
+  case Match_InvalidSImm20pcaddi:
+    return generateImmOutOfRangeError(
+        Operands, ErrorInfo, /*Lower=*/-(1 << 19),
+        /*Upper=*/(1 << 19) - 1,
+        "operand must be a symbol with modifier (e.g. %pcrel_20) or an integer "
+        "in the range");
   case Match_InvalidSImm21lsl2:
     return generateImmOutOfRangeError(
         Operands, ErrorInfo, /*Lower=*/-(1 << 22), /*Upper=*/(1 << 22) - 4,
diff --git a/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td b/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
index ec0d071453c3f..ef647a4277873 100644
--- a/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
+++ b/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
@@ -397,6 +397,10 @@ def simm20_pcaddu18i : SImm20Operand {
   let ParserMatchClass = SImmAsmOperand<20, "pcaddu18i">;
 }
 
+def simm20_pcaddi : SImm20Operand {
+  let ParserMatchClass = SImmAsmOperand<20, "pcaddi">;
+}
+
 def simm21_lsl2 : Operand<OtherVT> {
   let ParserMatchClass = SImmAsmOperand<21, "lsl2">;
   let EncoderMethod = "getImmOpValueAsr<2>";
@@ -754,7 +758,7 @@ def SLT  : ALU_3R<0x00120000>;
 def SLTU : ALU_3R<0x00128000>;
 def SLTI  : ALU_2RI12<0x02000000, simm12>;
 def SLTUI : ALU_2RI12<0x02400000, simm12>;
-def PCADDI    : ALU_1RI20<0x18000000, simm20>;
+def PCADDI    : ALU_1RI20<0x18000000, simm20_pcaddi>;
 def PCADDU12I : ALU_1RI20<0x1c000000, simm20>;
 def PCALAU12I : ALU_1RI20<0x1a000000, simm20_pcalau12i>;
 def AND  : ALU_3R<0x00148000>;
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchFixupKinds.h b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchFixupKinds.h
index 29ed14f6e4d62..370f5b0189b51 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchFixupKinds.h
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchFixupKinds.h
@@ -111,6 +111,8 @@ enum Fixups {
   fixup_loongarch_relax = FirstLiteralRelocationKind + ELF::R_LARCH_RELAX,
   // Generate an R_LARCH_ALIGN which indicates the linker may fixup align here.
   fixup_loongarch_align = FirstLiteralRelocationKind + ELF::R_LARCH_ALIGN,
+  // 20-bit fixup corresponding to %pcrel_20(foo) for instruction pcaddi.
+  fixup_loongarch_pcrel20_s2,
   // 36-bit fixup corresponding to %call36(foo) for a pair instructions:
   // pcaddu18i+jirl.
   fixup_loongarch_call36 = FirstLiteralRelocationKind + ELF::R_LARCH_CALL36,
@@ -142,6 +144,12 @@ enum Fixups {
   fixup_loongarch_tls_le_add_r,
   // 12-bit fixup corresponding to %le_lo12_r(foo) for instruction addi.w/d.
   fixup_loongarch_tls_le_lo12_r,
+  // 20-bit fixup corresponding to %ld_pcrel_20(foo) for instruction pcaddi.
+  fixup_loongarch_tls_ld_pcrel20_s2,
+  // 20-bit fixup corresponding to %gd_pcrel_20(foo) for instruction pcaddi.
+  fixup_loongarch_tls_gd_pcrel20_s2,
+  // 20-bit fixup corresponding to %desc_pcrel_20(foo) for instruction pcaddi.
+  fixup_loongarch_tls_desc_pcrel20_s2,
 };
 } // end namespace LoongArch
 } // end namespace llvm
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCCodeEmitter.cpp b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCCodeEmitter.cpp
index efbfce35dbfca..4f7f93fa4783e 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCCodeEmitter.cpp
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCCodeEmitter.cpp
@@ -287,6 +287,18 @@ LoongArchMCCodeEmitter::getExprOpValue(const MCInst &MI, const MCOperand &MO,
     case LoongArchMCExpr::VK_LoongArch_TLS_LE_LO12_R:
       FixupKind = LoongArch::fixup_loongarch_tls_le_lo12_r;
       break;
+    case LoongArchMCExpr::VK_LoongArch_PCREL20_S2:
+      FixupKind = LoongArch::fixup_loongarch_pcrel20_s2;
+      break;
+    case LoongArchMCExpr::VK_LoongArch_TLS_LD_PCREL20_S2:
+      FixupKind = LoongArch::fixup_loongarch_tls_ld_pcrel20_s2;
+      break;
+    case LoongArchMCExpr::VK_LoongArch_TLS_GD_PCREL20_S2:
+      FixupKind = LoongArch::fixup_loongarch_tls_gd_pcrel20_s2;
+      break;
+    case LoongArchMCExpr::VK_LoongArch_TLS_DESC_PCREL20_S2:
+      FixupKind = LoongArch::fixup_loongarch_tls_desc_pcrel20_s2;
+      break;
     }
   } else if (Kind == MCExpr::SymbolRef &&
              cast<MCSymbolRefExpr>(Expr)->getKind() ==
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.cpp b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.cpp
index 98b9b1ab6d3f4..53d46cca913e5 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.cpp
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.cpp
@@ -166,6 +166,14 @@ StringRef LoongArchMCExpr::getVariantKindName(VariantKind Kind) {
     return "le_add_r";
   case VK_LoongArch_TLS_LE_LO12_R:
     return "le_lo12_r";
+  case VK_LoongArch_PCREL20_S2:
+    return "pcrel_20";
+  case VK_LoongArch_TLS_LD_PCREL20_S2:
+    return "ld_pcrel_20";
+  case VK_LoongArch_TLS_GD_PCREL20_S2:
+    return "gd_pcrel_20";
+  case VK_LoongArch_TLS_DESC_PCREL20_S2:
+    return "desc_pcrel_20";
   }
 }
 
@@ -222,6 +230,10 @@ LoongArchMCExpr::getVariantKindForName(StringRef name) {
       .Case("le_hi20_r", VK_LoongArch_TLS_LE_HI20_R)
       .Case("le_add_r", VK_LoongArch_TLS_LE_ADD_R)
       .Case("le_lo12_r", VK_LoongArch_TLS_LE_LO12_R)
+      .Case("pcrel_20", VK_LoongArch_PCREL20_S2)
+      .Case("ld_pcrel_20", VK_LoongArch_TLS_LD_PCREL20_S2)
+      .Case("gd_pcrel_20", VK_LoongArch_TLS_GD_PCREL20_S2)
+      .Case("desc_pcrel_20", VK_LoongArch_TLS_DESC_PCREL20_S2)
       .Default(VK_LoongArch_Invalid);
 }
 
@@ -264,6 +276,9 @@ void LoongArchMCExpr::fixELFSymbolsInTLSFixups(MCAssembler &Asm) const {
   case VK_LoongArch_TLS_GD_HI20:
   case VK_LoongArch_TLS_DESC_PC_HI20:
   case VK_LoongArch_TLS_DESC_HI20:
+  case VK_LoongArch_TLS_LD_PCREL20_S2:
+  case VK_LoongArch_TLS_GD_PCREL20_S2:
+  case VK_LoongArch_TLS_DESC_PCREL20_S2:
     break;
   }
   fixELFSymbolsInTLSFixupsImpl(getSubExpr(), Asm);
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.h b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.h
index 1546d894d56ab..91215b863ccc2 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.h
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCExpr.h
@@ -75,6 +75,10 @@ class LoongArchMCExpr : public MCTargetExpr {
     VK_LoongArch_TLS_LE_HI20_R,
     VK_LoongArch_TLS_LE_ADD_R,
     VK_LoongArch_TLS_LE_LO12_R,
+    VK_LoongArch_PCREL20_S2,
+    VK_LoongArch_TLS_LD_PCREL20_S2,
+    VK_LoongArch_TLS_GD_PCREL20_S2,
+    VK_LoongArch_TLS_DESC_PCREL20_S2,
     VK_LoongArch_Invalid // Must be the last item.
   };
 
diff --git a/llvm/test/MC/LoongArch/Basic/Integer/invalid.s b/llvm/test/MC/LoongArch/Basic/Integer/invalid.s
index 958d5cab6f2f3..08a131d4d43f9 100644
--- a/llvm/test/MC/LoongArch/Basic/Integer/invalid.s
+++ b/llvm/test/MC/LoongArch/Basic/Integer/invalid.s
@@ -99,11 +99,13 @@ jirl $a0, $a0, 0x20000
 # CHECK: :[[#@LINE-1]]:16: error: operand must be a symbol with modifier (e.g. %b16) or an integer in the range [-131072, 131068]
 
 ## simm20
-pcaddi $a0, -0x80001
-# CHECK: :[[#@LINE-1]]:13: error: immediate must be an integer in the range [-524288, 524287]
 pcaddu12i $a0, 0x80000
 # CHECK: :[[#@LINE-1]]:16: error: immediate must be an integer in the range [-524288, 524287]
 
+## simm20_pcaddi
+pcaddi $a0, -0x80001
+# CHECK: :[[#@LINE-1]]:13: error: operand must be a symbol with modifier (e.g. %pcrel_20) or an integer in the range [-524288, 524287]
+
 ## simm20_lu12iw
 lu12i.w $a0, -0x80001
 # CHECK: :[[#@LINE-1]]:14: error: operand must be a symbol with modifier (e.g. %abs_hi20) or an integer in the range [-524288, 524287]
diff --git a/llvm/test/MC/LoongArch/Relocations/relocations.s b/llvm/test/MC/LoongArch/Relocations/relocations.s
index e83b67199e656..091dce200b7de 100644
--- a/llvm/test/MC/LoongArch/Relocations/relocations.s
+++ b/llvm/test/MC/LoongArch/Relocations/relocations.s
@@ -288,3 +288,23 @@ addi.d $t1, $a2, %le_lo12_r(foo)
 # RELOC: R_LARCH_TLS_LE_LO12_R foo 0x0
 # INSTR: addi.d $t1, $a2, %le_lo12_r(foo)
 # FIXUP: fixup A - offset: 0, value: %le_lo12_r(foo), kind: FK_NONE
+
+pcaddi $t1, %pcrel_20(foo)
+# RELOC: R_LARCH_PCREL20_S2 foo 0x0
+# INSTR: pcaddi $t1, %pcrel_20(foo)
+# FIXUP: fixup A - offset: 0, value: %pcrel_20(foo), kind: FK_NONE
+
+pcaddi $t1, %ld_pcrel_20(foo)
+# RELOC: R_LARCH_TLS_LD_PCREL20_S2 foo 0x0
+# INSTR: pcaddi $t1, %ld_pcrel_20(foo)
+# FIXUP: fixup A - offset: 0, value: %ld_pcrel_20(foo), kind: FK_NONE
+
+pcaddi $t1, %gd_pcrel_20(foo)
+# RELOC: R_LARCH_TLS_GD_PCREL20_S2 foo 0x0
+# INSTR: pcaddi $t1, %gd_pcrel_20(foo)
+# FIXUP: fixup A - offset: 0, value: %gd_pcrel_20(foo), kind: FK_NONE
+
+pcaddi $t1, %desc_pcrel_20(foo)
+# RELOC: R_LARCH_TLS_DESC_PCREL20_S2 foo 0x0
+# INSTR: pcaddi $t1, %desc_pcrel_20(foo)
+# FIXUP: fixup A - offset: 0, value: %desc_pcrel_20(foo), kind: FK_NONE

From 47fafad155a8d8394720dbc689558d304662e205 Mon Sep 17 00:00:00 2001
From: Louis Dionne <ldionne.2 at gmail.com>
Date: Fri, 26 Jul 2024 13:10:06 -0500
Subject: [PATCH 50/91] [libc++] Fix bug in atomic_ref's calculation of
 lock_free-ness (#99570)

The builtin __atomic_always_lock_free takes into account the type of the
pointer provided as the second argument. Because we were passing void*,
rather than T*, the calculation failed. This meant that
atomic_ref<T>::is_always_lock_free was only true for char and bool.

This bug exists elsewhere in the atomic library (when using GCC, we fail
to pass a pointer at all, and we fail to correctly align the atomic like
_Atomic would).

This change also starts to sort out the testing difficulties that allowed
the bug to exist. The tests now use the
__GCC_ATOMIC_(CHAR|SHORT|INT|LONG|LLONG|POINTER)_LOCK_FREE predefined
macros to establish an expected value for `is_always_lock_free` and
`is_lock_free` for the respective types, as well as types with matching
sizes and compatible alignment values.

Using these compiler predefines, we can validate that certain types,
like char and int, are always lock-free, as they are on every platform
in the wild.

Note that this patch was actually authored by Eric Fiselier but I picked
up the patch and GitHub won't let me set Eric as the primary author.

Co-authored-by: Eric Fiselier <eric at efcs.ca>
(cherry picked from commit cc1dfb37aa84d1524243b83fadb8ff0f821e03e9)
---
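Illustration only, not part of the commit. This is a minimal sketch of the
alignment-carrier idea that the atomic_ref.h hunk in this patch introduces,
assuming GCC or Clang and C++17 or later. The names AlignedByte,
AlignedInstance and AlwaysLockFree are hypothetical stand-ins, not libc++
identifiers.

```cpp
#include <cstddef>

// Hypothetical carrier type: one byte whose alignment is forced to
// Alignment, so a pointer to it tells the builtin which alignment to assume.
template <std::size_t Alignment>
struct AlignedByte {
  alignas(Alignment) char data;
};

// A constexpr instance whose address is usable in a constant expression
// (static constexpr data members are implicitly inline since C++17).
template <std::size_t Alignment>
struct AlignedInstance {
  static constexpr AlignedByte<Alignment> instance{};
};

// Asks the compiler whether objects of Size bytes with the given alignment
// are always lock-free. A typed pointer is passed so that the pointee type's
// alignment is visible to the builtin.
template <std::size_t Size, std::size_t Alignment>
struct AlwaysLockFree {
  static constexpr bool value =
      __atomic_always_lock_free(Size, &AlignedInstance<Alignment>::instance);
};
```

As a usage example, AlwaysLockFree<sizeof(int), alignof(int)>::value can be
compared against (__GCC_ATOMIC_INT_LOCK_FREE == 2), which mirrors how the new
tests derive their expected values from compiler predefines.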
 libcxx/include/__atomic/atomic_ref.h          |  15 +-
 .../atomics.lockfree/is_always_lock_free.cpp  | 165 ++++++++++++++++++
 .../isalwayslockfree.pass.cpp                 | 120 -------------
 .../atomics.ref/is_always_lock_free.pass.cpp  |  34 +++-
 libcxx/test/support/atomic_helpers.h          | 103 +++++++++++
 5 files changed, 312 insertions(+), 125 deletions(-)
 create mode 100644 libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.cpp
 delete mode 100644 libcxx/test/std/atomics/atomics.lockfree/isalwayslockfree.pass.cpp

diff --git a/libcxx/include/__atomic/atomic_ref.h b/libcxx/include/__atomic/atomic_ref.h
index 156f1961151c1..2849b82e1a3dd 100644
--- a/libcxx/include/__atomic/atomic_ref.h
+++ b/libcxx/include/__atomic/atomic_ref.h
@@ -42,6 +42,19 @@ _LIBCPP_BEGIN_NAMESPACE_STD
 
 #if _LIBCPP_STD_VER >= 20
 
+// These types are required to make __atomic_always_lock_free work across GCC and Clang.
+// The purpose of this trick is to make sure that we provide an object with the correct alignment
+// to __atomic_always_lock_free, since that answer depends on the alignment.
+template <size_t _Alignment>
+struct __alignment_checker_type {
+  alignas(_Alignment) char __data;
+};
+
+template <size_t _Alignment>
+struct __get_aligner_instance {
+  static constexpr __alignment_checker_type<_Alignment> __instance{};
+};
+
 template <class _Tp>
 struct __atomic_ref_base {
 protected:
@@ -105,7 +118,7 @@ struct __atomic_ref_base {
   // that the pointer is going to be aligned properly at runtime because that is a (checked) precondition
   // of atomic_ref's constructor.
   static constexpr bool is_always_lock_free =
-      __atomic_always_lock_free(sizeof(_Tp), reinterpret_cast<void*>(-required_alignment));
+      __atomic_always_lock_free(sizeof(_Tp), &__get_aligner_instance<required_alignment>::__instance);
 
   _LIBCPP_HIDE_FROM_ABI bool is_lock_free() const noexcept { return __atomic_is_lock_free(sizeof(_Tp), __ptr_); }
 
diff --git a/libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.cpp b/libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.cpp
new file mode 100644
index 0000000000000..2dc7f5c765419
--- /dev/null
+++ b/libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.cpp
@@ -0,0 +1,165 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// UNSUPPORTED: c++03, c++11, c++14
+
+// <atomic>
+//
+// template <class T>
+// class atomic;
+//
+// static constexpr bool is_always_lock_free;
+
+#include <atomic>
+#include <cassert>
+#include <cstddef>
+
+#include "test_macros.h"
+#include "atomic_helpers.h"
+
+template <typename T>
+void check_always_lock_free(std::atomic<T> const& a) {
+  using InfoT = LockFreeStatusInfo<T>;
+
+  constexpr std::same_as<const bool> decltype(auto) is_always_lock_free = std::atomic<T>::is_always_lock_free;
+
+  // If we know the status of T for sure, validate the exact result of the function.
+  if constexpr (InfoT::status_known) {
+    constexpr LockFreeStatus known_status = InfoT::value;
+    if constexpr (known_status == LockFreeStatus::always) {
+      static_assert(is_always_lock_free, "is_always_lock_free is inconsistent with known lock-free status");
+      assert(a.is_lock_free() && "is_lock_free() is inconsistent with known lock-free status");
+    } else if constexpr (known_status == LockFreeStatus::never) {
+      static_assert(!is_always_lock_free, "is_always_lock_free is inconsistent with known lock-free status");
+      assert(!a.is_lock_free() && "is_lock_free() is inconsistent with known lock-free status");
+    } else {
+      assert(a.is_lock_free() || !a.is_lock_free()); // This is kinda dumb, but we might as well call the function once.
+    }
+  }
+
+  // In all cases, also sanity-check it based on the implication always-lock-free => lock-free.
+  if (is_always_lock_free) {
+    std::same_as<bool> decltype(auto) is_lock_free = a.is_lock_free();
+    assert(is_lock_free);
+  }
+  ASSERT_NOEXCEPT(a.is_lock_free());
+}
+
+#define CHECK_ALWAYS_LOCK_FREE(T)                                                                                      \
+  do {                                                                                                                 \
+    typedef T type;                                                                                                    \
+    type obj{};                                                                                                        \
+    std::atomic<type> a(obj);                                                                                          \
+    check_always_lock_free(a);                                                                                         \
+  } while (0)
+
+void test() {
+  char c = 'x';
+  check_always_lock_free(std::atomic<char>(c));
+
+  int i = 0;
+  check_always_lock_free(std::atomic<int>(i));
+
+  float f = 0.f;
+  check_always_lock_free(std::atomic<float>(f));
+
+  int* p = &i;
+  check_always_lock_free(std::atomic<int*>(p));
+
+  CHECK_ALWAYS_LOCK_FREE(bool);
+  CHECK_ALWAYS_LOCK_FREE(char);
+  CHECK_ALWAYS_LOCK_FREE(signed char);
+  CHECK_ALWAYS_LOCK_FREE(unsigned char);
+#if TEST_STD_VER > 17 && defined(__cpp_char8_t)
+  CHECK_ALWAYS_LOCK_FREE(char8_t);
+#endif
+  CHECK_ALWAYS_LOCK_FREE(char16_t);
+  CHECK_ALWAYS_LOCK_FREE(char32_t);
+  CHECK_ALWAYS_LOCK_FREE(wchar_t);
+  CHECK_ALWAYS_LOCK_FREE(short);
+  CHECK_ALWAYS_LOCK_FREE(unsigned short);
+  CHECK_ALWAYS_LOCK_FREE(int);
+  CHECK_ALWAYS_LOCK_FREE(unsigned int);
+  CHECK_ALWAYS_LOCK_FREE(long);
+  CHECK_ALWAYS_LOCK_FREE(unsigned long);
+  CHECK_ALWAYS_LOCK_FREE(long long);
+  CHECK_ALWAYS_LOCK_FREE(unsigned long long);
+  CHECK_ALWAYS_LOCK_FREE(std::nullptr_t);
+  CHECK_ALWAYS_LOCK_FREE(void*);
+  CHECK_ALWAYS_LOCK_FREE(float);
+  CHECK_ALWAYS_LOCK_FREE(double);
+  CHECK_ALWAYS_LOCK_FREE(long double);
+#if __has_attribute(vector_size) && defined(_LIBCPP_VERSION)
+  CHECK_ALWAYS_LOCK_FREE(int __attribute__((vector_size(1 * sizeof(int)))));
+  CHECK_ALWAYS_LOCK_FREE(int __attribute__((vector_size(2 * sizeof(int)))));
+  CHECK_ALWAYS_LOCK_FREE(int __attribute__((vector_size(4 * sizeof(int)))));
+  CHECK_ALWAYS_LOCK_FREE(int __attribute__((vector_size(16 * sizeof(int)))));
+  CHECK_ALWAYS_LOCK_FREE(int __attribute__((vector_size(32 * sizeof(int)))));
+  CHECK_ALWAYS_LOCK_FREE(float __attribute__((vector_size(1 * sizeof(float)))));
+  CHECK_ALWAYS_LOCK_FREE(float __attribute__((vector_size(2 * sizeof(float)))));
+  CHECK_ALWAYS_LOCK_FREE(float __attribute__((vector_size(4 * sizeof(float)))));
+  CHECK_ALWAYS_LOCK_FREE(float __attribute__((vector_size(16 * sizeof(float)))));
+  CHECK_ALWAYS_LOCK_FREE(float __attribute__((vector_size(32 * sizeof(float)))));
+  CHECK_ALWAYS_LOCK_FREE(double __attribute__((vector_size(1 * sizeof(double)))));
+  CHECK_ALWAYS_LOCK_FREE(double __attribute__((vector_size(2 * sizeof(double)))));
+  CHECK_ALWAYS_LOCK_FREE(double __attribute__((vector_size(4 * sizeof(double)))));
+  CHECK_ALWAYS_LOCK_FREE(double __attribute__((vector_size(16 * sizeof(double)))));
+  CHECK_ALWAYS_LOCK_FREE(double __attribute__((vector_size(32 * sizeof(double)))));
+#endif // __has_attribute(vector_size) && defined(_LIBCPP_VERSION)
+  CHECK_ALWAYS_LOCK_FREE(struct Empty{});
+  CHECK_ALWAYS_LOCK_FREE(struct OneInt { int i; });
+  CHECK_ALWAYS_LOCK_FREE(struct IntArr2 { int i[2]; });
+  CHECK_ALWAYS_LOCK_FREE(struct FloatArr3 { float i[3]; });
+  CHECK_ALWAYS_LOCK_FREE(struct LLIArr2 { long long int i[2]; });
+  CHECK_ALWAYS_LOCK_FREE(struct LLIArr4 { long long int i[4]; });
+  CHECK_ALWAYS_LOCK_FREE(struct LLIArr8 { long long int i[8]; });
+  CHECK_ALWAYS_LOCK_FREE(struct LLIArr16 { long long int i[16]; });
+  CHECK_ALWAYS_LOCK_FREE(struct Padding {
+    char c; /* padding */
+    long long int i;
+  });
+  CHECK_ALWAYS_LOCK_FREE(union IntFloat {
+    int i;
+    float f;
+  });
+  CHECK_ALWAYS_LOCK_FREE(enum class CharEnumClass : char{foo});
+
+  // C macro and static constexpr must be consistent.
+  enum class CharEnumClass : char { foo };
+  static_assert(std::atomic<bool>::is_always_lock_free == (2 == ATOMIC_BOOL_LOCK_FREE), "");
+  static_assert(std::atomic<char>::is_always_lock_free == (2 == ATOMIC_CHAR_LOCK_FREE), "");
+  static_assert(std::atomic<CharEnumClass>::is_always_lock_free == (2 == ATOMIC_CHAR_LOCK_FREE), "");
+  static_assert(std::atomic<signed char>::is_always_lock_free == (2 == ATOMIC_CHAR_LOCK_FREE), "");
+  static_assert(std::atomic<unsigned char>::is_always_lock_free == (2 == ATOMIC_CHAR_LOCK_FREE), "");
+#if TEST_STD_VER > 17 && defined(__cpp_char8_t)
+  static_assert(std::atomic<char8_t>::is_always_lock_free == (2 == ATOMIC_CHAR8_T_LOCK_FREE), "");
+#endif
+  static_assert(std::atomic<char16_t>::is_always_lock_free == (2 == ATOMIC_CHAR16_T_LOCK_FREE), "");
+  static_assert(std::atomic<char32_t>::is_always_lock_free == (2 == ATOMIC_CHAR32_T_LOCK_FREE), "");
+  static_assert(std::atomic<wchar_t>::is_always_lock_free == (2 == ATOMIC_WCHAR_T_LOCK_FREE), "");
+  static_assert(std::atomic<short>::is_always_lock_free == (2 == ATOMIC_SHORT_LOCK_FREE), "");
+  static_assert(std::atomic<unsigned short>::is_always_lock_free == (2 == ATOMIC_SHORT_LOCK_FREE), "");
+  static_assert(std::atomic<int>::is_always_lock_free == (2 == ATOMIC_INT_LOCK_FREE), "");
+  static_assert(std::atomic<unsigned int>::is_always_lock_free == (2 == ATOMIC_INT_LOCK_FREE), "");
+  static_assert(std::atomic<long>::is_always_lock_free == (2 == ATOMIC_LONG_LOCK_FREE), "");
+  static_assert(std::atomic<unsigned long>::is_always_lock_free == (2 == ATOMIC_LONG_LOCK_FREE), "");
+  static_assert(std::atomic<long long>::is_always_lock_free == (2 == ATOMIC_LLONG_LOCK_FREE), "");
+  static_assert(std::atomic<unsigned long long>::is_always_lock_free == (2 == ATOMIC_LLONG_LOCK_FREE), "");
+  static_assert(std::atomic<void*>::is_always_lock_free == (2 == ATOMIC_POINTER_LOCK_FREE), "");
+  static_assert(std::atomic<std::nullptr_t>::is_always_lock_free == (2 == ATOMIC_POINTER_LOCK_FREE), "");
+
+#if TEST_STD_VER >= 20
+  static_assert(std::atomic_signed_lock_free::is_always_lock_free, "");
+  static_assert(std::atomic_unsigned_lock_free::is_always_lock_free, "");
+#endif
+}
+
+int main(int, char**) {
+  test();
+  return 0;
+}
diff --git a/libcxx/test/std/atomics/atomics.lockfree/isalwayslockfree.pass.cpp b/libcxx/test/std/atomics/atomics.lockfree/isalwayslockfree.pass.cpp
deleted file mode 100644
index 6d6e6477bc251..0000000000000
--- a/libcxx/test/std/atomics/atomics.lockfree/isalwayslockfree.pass.cpp
+++ /dev/null
@@ -1,120 +0,0 @@
-//===----------------------------------------------------------------------===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-//
-// UNSUPPORTED: c++03, c++11, c++14
-
-// <atomic>
-
-// static constexpr bool is_always_lock_free;
-
-#include <atomic>
-#include <cassert>
-#include <cstddef>
-
-#include "test_macros.h"
-
-template <typename T>
-void checkAlwaysLockFree() {
-  if (std::atomic<T>::is_always_lock_free) {
-    assert(std::atomic<T>().is_lock_free());
-  }
-}
-
-void run()
-{
-// structs and unions can't be defined in the template invocation.
-// Work around this with a typedef.
-#define CHECK_ALWAYS_LOCK_FREE(T)                                              \
-  do {                                                                         \
-    typedef T type;                                                            \
-    checkAlwaysLockFree<type>();                                               \
-  } while (0)
-
-    CHECK_ALWAYS_LOCK_FREE(bool);
-    CHECK_ALWAYS_LOCK_FREE(char);
-    CHECK_ALWAYS_LOCK_FREE(signed char);
-    CHECK_ALWAYS_LOCK_FREE(unsigned char);
-#if TEST_STD_VER > 17 && defined(__cpp_char8_t)
-    CHECK_ALWAYS_LOCK_FREE(char8_t);
-#endif
-    CHECK_ALWAYS_LOCK_FREE(char16_t);
-    CHECK_ALWAYS_LOCK_FREE(char32_t);
-    CHECK_ALWAYS_LOCK_FREE(wchar_t);
-    CHECK_ALWAYS_LOCK_FREE(short);
-    CHECK_ALWAYS_LOCK_FREE(unsigned short);
-    CHECK_ALWAYS_LOCK_FREE(int);
-    CHECK_ALWAYS_LOCK_FREE(unsigned int);
-    CHECK_ALWAYS_LOCK_FREE(long);
-    CHECK_ALWAYS_LOCK_FREE(unsigned long);
-    CHECK_ALWAYS_LOCK_FREE(long long);
-    CHECK_ALWAYS_LOCK_FREE(unsigned long long);
-    CHECK_ALWAYS_LOCK_FREE(std::nullptr_t);
-    CHECK_ALWAYS_LOCK_FREE(void*);
-    CHECK_ALWAYS_LOCK_FREE(float);
-    CHECK_ALWAYS_LOCK_FREE(double);
-    CHECK_ALWAYS_LOCK_FREE(long double);
-#if __has_attribute(vector_size) && defined(_LIBCPP_VERSION)
-    CHECK_ALWAYS_LOCK_FREE(int __attribute__((vector_size(1 * sizeof(int)))));
-    CHECK_ALWAYS_LOCK_FREE(int __attribute__((vector_size(2 * sizeof(int)))));
-    CHECK_ALWAYS_LOCK_FREE(int __attribute__((vector_size(4 * sizeof(int)))));
-    CHECK_ALWAYS_LOCK_FREE(int __attribute__((vector_size(16 * sizeof(int)))));
-    CHECK_ALWAYS_LOCK_FREE(int __attribute__((vector_size(32 * sizeof(int)))));
-    CHECK_ALWAYS_LOCK_FREE(float __attribute__((vector_size(1 * sizeof(float)))));
-    CHECK_ALWAYS_LOCK_FREE(float __attribute__((vector_size(2 * sizeof(float)))));
-    CHECK_ALWAYS_LOCK_FREE(float __attribute__((vector_size(4 * sizeof(float)))));
-    CHECK_ALWAYS_LOCK_FREE(float __attribute__((vector_size(16 * sizeof(float)))));
-    CHECK_ALWAYS_LOCK_FREE(float __attribute__((vector_size(32 * sizeof(float)))));
-    CHECK_ALWAYS_LOCK_FREE(double __attribute__((vector_size(1 * sizeof(double)))));
-    CHECK_ALWAYS_LOCK_FREE(double __attribute__((vector_size(2 * sizeof(double)))));
-    CHECK_ALWAYS_LOCK_FREE(double __attribute__((vector_size(4 * sizeof(double)))));
-    CHECK_ALWAYS_LOCK_FREE(double __attribute__((vector_size(16 * sizeof(double)))));
-    CHECK_ALWAYS_LOCK_FREE(double __attribute__((vector_size(32 * sizeof(double)))));
-#endif // __has_attribute(vector_size) && defined(_LIBCPP_VERSION)
-    CHECK_ALWAYS_LOCK_FREE(struct Empty {});
-    CHECK_ALWAYS_LOCK_FREE(struct OneInt { int i; });
-    CHECK_ALWAYS_LOCK_FREE(struct IntArr2 { int i[2]; });
-    CHECK_ALWAYS_LOCK_FREE(struct FloatArr3 { float i[3]; });
-    CHECK_ALWAYS_LOCK_FREE(struct LLIArr2 { long long int i[2]; });
-    CHECK_ALWAYS_LOCK_FREE(struct LLIArr4 { long long int i[4]; });
-    CHECK_ALWAYS_LOCK_FREE(struct LLIArr8 { long long int i[8]; });
-    CHECK_ALWAYS_LOCK_FREE(struct LLIArr16 { long long int i[16]; });
-    CHECK_ALWAYS_LOCK_FREE(struct Padding { char c; /* padding */ long long int i; });
-    CHECK_ALWAYS_LOCK_FREE(union IntFloat { int i; float f; });
-    CHECK_ALWAYS_LOCK_FREE(enum class CharEnumClass : char { foo });
-
-    // C macro and static constexpr must be consistent.
-    enum class CharEnumClass : char { foo };
-    static_assert(std::atomic<bool>::is_always_lock_free == (2 == ATOMIC_BOOL_LOCK_FREE), "");
-    static_assert(std::atomic<char>::is_always_lock_free == (2 == ATOMIC_CHAR_LOCK_FREE), "");
-    static_assert(std::atomic<CharEnumClass>::is_always_lock_free == (2 == ATOMIC_CHAR_LOCK_FREE), "");
-    static_assert(std::atomic<signed char>::is_always_lock_free == (2 == ATOMIC_CHAR_LOCK_FREE), "");
-    static_assert(std::atomic<unsigned char>::is_always_lock_free == (2 == ATOMIC_CHAR_LOCK_FREE), "");
-#if TEST_STD_VER > 17 && defined(__cpp_char8_t)
-    static_assert(std::atomic<char8_t>::is_always_lock_free == (2 == ATOMIC_CHAR8_T_LOCK_FREE), "");
-#endif
-    static_assert(std::atomic<char16_t>::is_always_lock_free == (2 == ATOMIC_CHAR16_T_LOCK_FREE), "");
-    static_assert(std::atomic<char32_t>::is_always_lock_free == (2 == ATOMIC_CHAR32_T_LOCK_FREE), "");
-    static_assert(std::atomic<wchar_t>::is_always_lock_free == (2 == ATOMIC_WCHAR_T_LOCK_FREE), "");
-    static_assert(std::atomic<short>::is_always_lock_free == (2 == ATOMIC_SHORT_LOCK_FREE), "");
-    static_assert(std::atomic<unsigned short>::is_always_lock_free == (2 == ATOMIC_SHORT_LOCK_FREE), "");
-    static_assert(std::atomic<int>::is_always_lock_free == (2 == ATOMIC_INT_LOCK_FREE), "");
-    static_assert(std::atomic<unsigned int>::is_always_lock_free == (2 == ATOMIC_INT_LOCK_FREE), "");
-    static_assert(std::atomic<long>::is_always_lock_free == (2 == ATOMIC_LONG_LOCK_FREE), "");
-    static_assert(std::atomic<unsigned long>::is_always_lock_free == (2 == ATOMIC_LONG_LOCK_FREE), "");
-    static_assert(std::atomic<long long>::is_always_lock_free == (2 == ATOMIC_LLONG_LOCK_FREE), "");
-    static_assert(std::atomic<unsigned long long>::is_always_lock_free == (2 == ATOMIC_LLONG_LOCK_FREE), "");
-    static_assert(std::atomic<void*>::is_always_lock_free == (2 == ATOMIC_POINTER_LOCK_FREE), "");
-    static_assert(std::atomic<std::nullptr_t>::is_always_lock_free == (2 == ATOMIC_POINTER_LOCK_FREE), "");
-
-#if TEST_STD_VER >= 20
-    static_assert(std::atomic_signed_lock_free::is_always_lock_free, "");
-    static_assert(std::atomic_unsigned_lock_free::is_always_lock_free, "");
-#endif
-}
-
-int main(int, char**) { run(); return 0; }
diff --git a/libcxx/test/std/atomics/atomics.ref/is_always_lock_free.pass.cpp b/libcxx/test/std/atomics/atomics.ref/is_always_lock_free.pass.cpp
index 94f65e3b4b669..acdbf63a24d85 100644
--- a/libcxx/test/std/atomics/atomics.ref/is_always_lock_free.pass.cpp
+++ b/libcxx/test/std/atomics/atomics.ref/is_always_lock_free.pass.cpp
@@ -9,7 +9,10 @@
 // UNSUPPORTED: c++03, c++11, c++14, c++17
 
 // <atomic>
-
+//
+// template <class T>
+// class atomic_ref;
+//
 // static constexpr bool is_always_lock_free;
 // bool is_lock_free() const noexcept;
 
@@ -18,10 +21,29 @@
 #include <concepts>
 
 #include "test_macros.h"
+#include "atomic_helpers.h"
 
 template <typename T>
-void check_always_lock_free(std::atomic_ref<T> const a) {
-  std::same_as<const bool> decltype(auto) is_always_lock_free = std::atomic_ref<T>::is_always_lock_free;
+void check_always_lock_free(std::atomic_ref<T> const& a) {
+  using InfoT = LockFreeStatusInfo<T>;
+
+  constexpr std::same_as<const bool> decltype(auto) is_always_lock_free = std::atomic_ref<T>::is_always_lock_free;
+
+  // If we know the status of T for sure, validate the exact result of the function.
+  if constexpr (InfoT::status_known) {
+    constexpr LockFreeStatus known_status = InfoT::value;
+    if constexpr (known_status == LockFreeStatus::always) {
+      static_assert(is_always_lock_free, "is_always_lock_free is inconsistent with known lock-free status");
+      assert(a.is_lock_free() && "is_lock_free() is inconsistent with known lock-free status");
+    } else if constexpr (known_status == LockFreeStatus::never) {
+      static_assert(!is_always_lock_free, "is_always_lock_free is inconsistent with known lock-free status");
+      assert(!a.is_lock_free() && "is_lock_free() is inconsistent with known lock-free status");
+    } else {
+      assert(a.is_lock_free() || !a.is_lock_free()); // This is kinda dumb, but we might as well call the function once.
+    }
+  }
+
+  // In all cases, also sanity-check it based on the implication always-lock-free => lock-free.
   if (is_always_lock_free) {
     std::same_as<bool> decltype(auto) is_lock_free = a.is_lock_free();
     assert(is_lock_free);
@@ -33,10 +55,14 @@ void check_always_lock_free(std::atomic_ref<T> const a) {
   do {                                                                                                                 \
     typedef T type;                                                                                                    \
     type obj{};                                                                                                        \
-    check_always_lock_free(std::atomic_ref<type>(obj));                                                                \
+    std::atomic_ref<type> a(obj);                                                                                      \
+    check_always_lock_free(a);                                                                                         \
   } while (0)
 
 void test() {
+  char c = 'x';
+  check_always_lock_free(std::atomic_ref<char>(c));
+
   int i = 0;
   check_always_lock_free(std::atomic_ref<int>(i));
 
diff --git a/libcxx/test/support/atomic_helpers.h b/libcxx/test/support/atomic_helpers.h
index 0266a0961067b..d2f2b751cb47d 100644
--- a/libcxx/test/support/atomic_helpers.h
+++ b/libcxx/test/support/atomic_helpers.h
@@ -11,9 +11,112 @@
 
 #include <cassert>
 #include <cstdint>
+#include <cstddef>
+#include <type_traits>
 
 #include "test_macros.h"
 
+#if defined(TEST_COMPILER_CLANG)
+#  define TEST_ATOMIC_CHAR_LOCK_FREE __CLANG_ATOMIC_CHAR_LOCK_FREE
+#  define TEST_ATOMIC_SHORT_LOCK_FREE __CLANG_ATOMIC_SHORT_LOCK_FREE
+#  define TEST_ATOMIC_INT_LOCK_FREE __CLANG_ATOMIC_INT_LOCK_FREE
+#  define TEST_ATOMIC_LONG_LOCK_FREE __CLANG_ATOMIC_LONG_LOCK_FREE
+#  define TEST_ATOMIC_LLONG_LOCK_FREE __CLANG_ATOMIC_LLONG_LOCK_FREE
+#  define TEST_ATOMIC_POINTER_LOCK_FREE __CLANG_ATOMIC_POINTER_LOCK_FREE
+#elif defined(TEST_COMPILER_GCC)
+#  define TEST_ATOMIC_CHAR_LOCK_FREE __GCC_ATOMIC_CHAR_LOCK_FREE
+#  define TEST_ATOMIC_SHORT_LOCK_FREE __GCC_ATOMIC_SHORT_LOCK_FREE
+#  define TEST_ATOMIC_INT_LOCK_FREE __GCC_ATOMIC_INT_LOCK_FREE
+#  define TEST_ATOMIC_LONG_LOCK_FREE __GCC_ATOMIC_LONG_LOCK_FREE
+#  define TEST_ATOMIC_LLONG_LOCK_FREE __GCC_ATOMIC_LLONG_LOCK_FREE
+#  define TEST_ATOMIC_POINTER_LOCK_FREE __GCC_ATOMIC_POINTER_LOCK_FREE
+#elif defined(TEST_COMPILER_MSVC)
+// This is lifted from STL/stl/inc/atomic on github for the purposes of
+// keeping the tests compiling for MSVC's STL. It's not a perfect solution
+// but at least the tests will keep running.
+//
+// Note MSVC's STL never produces a type that is sometimes lock free, but not always lock free.
+template <class T, size_t Size = sizeof(T)>
+constexpr int msvc_is_lock_free_macro_value() {
+  return (Size <= 8 && (Size & Size - 1) == 0) ? 2 : 0;
+}
+#  define TEST_ATOMIC_CHAR_LOCK_FREE ::msvc_is_lock_free_macro_value<char>()
+#  define TEST_ATOMIC_SHORT_LOCK_FREE ::msvc_is_lock_free_macro_value<short>()
+#  define TEST_ATOMIC_INT_LOCK_FREE ::msvc_is_lock_free_macro_value<int>()
+#  define TEST_ATOMIC_LONG_LOCK_FREE ::msvc_is_lock_free_macro_value<long>()
+#  define TEST_ATOMIC_LLONG_LOCK_FREE ::msvc_is_lock_free_macro_value<long long>()
+#  define TEST_ATOMIC_POINTER_LOCK_FREE ::msvc_is_lock_free_macro_value<void*>()
+#else
+#  error "Unknown compiler"
+#endif
+
+#ifdef TEST_COMPILER_CLANG
+#  pragma clang diagnostic push
+#  pragma clang diagnostic ignored "-Wc++11-extensions"
+#endif
+
+enum class LockFreeStatus : int { unknown = -1, never = 0, sometimes = 1, always = 2 };
+
+// We should really be checking whether the alignment of T is greater-than-or-equal-to the alignment required
+// for T to be atomic, but this is basically impossible to implement portably. Instead, we assume that any type
+// aligned to at least its size is going to be atomic if there exists atomic operations for that size at all,
+// which is true on most platforms. This technically reduces our test coverage in the sense that if a type has
+// an alignment requirement less than its size but could still be made lockfree, LockFreeStatusInfo will report
+// that we don't know whether it is lockfree or not.
+#define COMPARE_TYPES(T, FundamentalT) (sizeof(T) == sizeof(FundamentalT) && TEST_ALIGNOF(T) >= sizeof(T))
+
+template <class T>
+struct LockFreeStatusInfo {
+  static const LockFreeStatus value = LockFreeStatus(
+      COMPARE_TYPES(T, char)
+          ? TEST_ATOMIC_CHAR_LOCK_FREE
+          : (COMPARE_TYPES(T, short)
+                 ? TEST_ATOMIC_SHORT_LOCK_FREE
+                 : (COMPARE_TYPES(T, int)
+                        ? TEST_ATOMIC_INT_LOCK_FREE
+                        : (COMPARE_TYPES(T, long)
+                               ? TEST_ATOMIC_LONG_LOCK_FREE
+                               : (COMPARE_TYPES(T, long long)
+                                      ? TEST_ATOMIC_LLONG_LOCK_FREE
+                                      : (COMPARE_TYPES(T, void*) ? TEST_ATOMIC_POINTER_LOCK_FREE : -1))))));
+
+  static const bool status_known = LockFreeStatusInfo::value != LockFreeStatus::unknown;
+};
+
+#undef COMPARE_TYPES
+
+// This doesn't work in C++03 due to issues with scoped enumerations. Just disable the test.
+#if TEST_STD_VER >= 11
+static_assert(LockFreeStatusInfo<char>::status_known, "");
+static_assert(LockFreeStatusInfo<short>::status_known, "");
+static_assert(LockFreeStatusInfo<int>::status_known, "");
+static_assert(LockFreeStatusInfo<long>::status_known, "");
+static_assert(LockFreeStatusInfo<void*>::status_known, "");
+
+// long long is a bit funky: on some platforms, its alignment is 4 bytes but its size is
+// 8 bytes. In that case, atomics may or may not be lockfree based on their address.
+static_assert(alignof(long long) == sizeof(long long) ? LockFreeStatusInfo<long long>::status_known : true, "");
+
+// Those should always be lock free: hardcode some expected values to make sure our tests are actually
+// testing something meaningful.
+static_assert(LockFreeStatusInfo<char>::value == LockFreeStatus::always, "");
+static_assert(LockFreeStatusInfo<short>::value == LockFreeStatus::always, "");
+static_assert(LockFreeStatusInfo<int>::value == LockFreeStatus::always, "");
+#endif
+
+// These macros are somewhat surprising to use, since they take the values 0, 1, or 2.
+// To make the tests clearer, get rid of them in preference of LockFreeStatusInfo.
+#undef TEST_ATOMIC_CHAR_LOCK_FREE
+#undef TEST_ATOMIC_SHORT_LOCK_FREE
+#undef TEST_ATOMIC_INT_LOCK_FREE
+#undef TEST_ATOMIC_LONG_LOCK_FREE
+#undef TEST_ATOMIC_LLONG_LOCK_FREE
+#undef TEST_ATOMIC_POINTER_LOCK_FREE
+
+#ifdef TEST_COMPILER_CLANG
+#  pragma clang diagnostic pop
+#endif
+
 struct UserAtomicType {
   int i;
 

>From 00341691ac31795b2d6fd5849efc586dd57dd40e Mon Sep 17 00:00:00 2001
From: Craig Topper <craig.topper at sifive.com>
Date: Fri, 26 Jul 2024 17:11:01 -0700
Subject: [PATCH 51/91] [RISCV] Don't crash in RISCVMergeBaseOffset if
 INLINE_ASM uses address register in a non-memory constraint. (#100790)

If the register is used by a non-memory constraint we should disable the
fold. Otherwise, we may leave CommonOffset unassigned.
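
As a rough illustration of the rule above, here is a minimal sketch.
Operand, ConstraintGroup and mayFoldIntoInlineAsm are hypothetical,
simplified stand-ins invented for this note; they are not the
MachineOperand or InlineAsm::Flag API, and the real check lives in
foldIntoMemoryOps.

```cpp
#include <vector>

// Hypothetical, simplified stand-ins for the real LLVM types.
struct Operand {
  bool isReg;   // true if this operand is a register
  unsigned reg; // register number, meaningful only when isReg is true
};

struct ConstraintGroup {
  bool isMemKind;                // true for a memory constraint
  std::vector<Operand> operands; // operands covered by this constraint
};

// Returns false as soon as destReg is used under anything other than a
// two-operand memory constraint; in that case the fold must be abandoned.
bool mayFoldIntoInlineAsm(const std::vector<ConstraintGroup> &groups,
                          unsigned destReg) {
  for (const ConstraintGroup &g : groups) {
    // Memory constraints have two operands; this sketch leaves them alone.
    if (g.operands.size() == 2 && g.isMemKind)
      continue;
    // Any other constraint kind that mentions destReg blocks the fold.
    for (const Operand &op : g.operands)
      if (op.isReg && op.reg == destReg)
        return false;
  }
  return true;
}
```

In this model, a group that passes register 10 under a plain register
constraint makes the helper return false for destReg 10, while the same
register under a two-operand memory constraint is not rejected by this
check.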

Fixes #100779.

(cherry picked from commit c901b739b67476b00209b7ee706de94c0595d763)
---
 .../lib/Target/RISCV/RISCVMergeBaseOffset.cpp | 10 +++-
 .../RISCV/inline-asm-mem-constraint.ll        | 50 +++++++++++++++++++
 2 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/RISCV/RISCVMergeBaseOffset.cpp b/llvm/lib/Target/RISCV/RISCVMergeBaseOffset.cpp
index fecc83a821f42..b6ac3384e7d3e 100644
--- a/llvm/lib/Target/RISCV/RISCVMergeBaseOffset.cpp
+++ b/llvm/lib/Target/RISCV/RISCVMergeBaseOffset.cpp
@@ -429,8 +429,16 @@ bool RISCVMergeBaseOffsetOpt::foldIntoMemoryOps(MachineInstr &Hi,
         NumOps = Flags.getNumOperandRegisters();
 
         // Memory constraints have two operands.
-        if (NumOps != 2 || !Flags.isMemKind())
+        if (NumOps != 2 || !Flags.isMemKind()) {
+          // If the register is used by something other than a memory constraint,
+          // we should not fold.
+          for (unsigned J = 0; J < NumOps; ++J) {
+            const MachineOperand &MO = UseMI.getOperand(I + 1 + J);
+            if (MO.isReg() && MO.getReg() == DestReg)
+              return false;
+          }
           continue;
+        }
 
         // We can't do this for constraint A because AMO instructions don't have
         // an immediate offset field.
diff --git a/llvm/test/CodeGen/RISCV/inline-asm-mem-constraint.ll b/llvm/test/CodeGen/RISCV/inline-asm-mem-constraint.ll
index 52d0dabf18839..6666d92feaac2 100644
--- a/llvm/test/CodeGen/RISCV/inline-asm-mem-constraint.ll
+++ b/llvm/test/CodeGen/RISCV/inline-asm-mem-constraint.ll
@@ -2252,3 +2252,53 @@ label:
   call void asm "lw zero, $0", "*A"(ptr elementtype(i32) getelementptr (i8, ptr blockaddress(@constraint_A_with_local_3, %label), i32 2000))
   ret void
 }
+
+@_ZN5repro9MY_BUFFER17hb0f674501d5980a6E = external global <{ [16 x i8] }>
+
+; Address is not used by a memory constraint.
+define void @should_not_fold() {
+; RV32I-LABEL: should_not_fold:
+; RV32I:       # %bb.0: # %start
+; RV32I-NEXT:    .cfi_def_cfa_offset 0
+; RV32I-NEXT:    lui a0, %hi(_ZN5repro9MY_BUFFER17hb0f674501d5980a6E)
+; RV32I-NEXT:    addi a0, a0, %lo(_ZN5repro9MY_BUFFER17hb0f674501d5980a6E)
+; RV32I-NEXT:    #APP
+; RV32I-NEXT:    ecall
+; RV32I-NEXT:    #NO_APP
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: should_not_fold:
+; RV64I:       # %bb.0: # %start
+; RV64I-NEXT:    .cfi_def_cfa_offset 0
+; RV64I-NEXT:    lui a0, %hi(_ZN5repro9MY_BUFFER17hb0f674501d5980a6E)
+; RV64I-NEXT:    addi a0, a0, %lo(_ZN5repro9MY_BUFFER17hb0f674501d5980a6E)
+; RV64I-NEXT:    #APP
+; RV64I-NEXT:    ecall
+; RV64I-NEXT:    #NO_APP
+; RV64I-NEXT:    ret
+;
+; RV32I-MEDIUM-LABEL: should_not_fold:
+; RV32I-MEDIUM:       # %bb.0: # %start
+; RV32I-MEDIUM-NEXT:    .cfi_def_cfa_offset 0
+; RV32I-MEDIUM-NEXT:  .Lpcrel_hi39:
+; RV32I-MEDIUM-NEXT:    auipc a0, %pcrel_hi(_ZN5repro9MY_BUFFER17hb0f674501d5980a6E)
+; RV32I-MEDIUM-NEXT:    addi a0, a0, %pcrel_lo(.Lpcrel_hi39)
+; RV32I-MEDIUM-NEXT:    #APP
+; RV32I-MEDIUM-NEXT:    ecall
+; RV32I-MEDIUM-NEXT:    #NO_APP
+; RV32I-MEDIUM-NEXT:    ret
+;
+; RV64I-MEDIUM-LABEL: should_not_fold:
+; RV64I-MEDIUM:       # %bb.0: # %start
+; RV64I-MEDIUM-NEXT:    .cfi_def_cfa_offset 0
+; RV64I-MEDIUM-NEXT:  .Lpcrel_hi39:
+; RV64I-MEDIUM-NEXT:    auipc a0, %pcrel_hi(_ZN5repro9MY_BUFFER17hb0f674501d5980a6E)
+; RV64I-MEDIUM-NEXT:    addi a0, a0, %pcrel_lo(.Lpcrel_hi39)
+; RV64I-MEDIUM-NEXT:    #APP
+; RV64I-MEDIUM-NEXT:    ecall
+; RV64I-MEDIUM-NEXT:    #NO_APP
+; RV64I-MEDIUM-NEXT:    ret
+start:
+  %0 = tail call ptr asm sideeffect alignstack "ecall", "=&{x10},0,~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(ptr @_ZN5repro9MY_BUFFER17hb0f674501d5980a6E)
+  ret void
+}

>From f53633b1e26f2c3dfb229fccf94b9738edb50cd9 Mon Sep 17 00:00:00 2001
From: Louis Dionne <ldionne.2 at gmail.com>
Date: Tue, 23 Jul 2024 13:04:54 -0500
Subject: [PATCH 52/91] [libc++][libc++abi] Minor follow-up changes after
 ptrauth upstreaming (#87481)

This patch applies the comments provided on #84573. This is done as a
separate PR to avoid merge conflicts with downstreams that already had
ptrauth support.

(cherry picked from commit e64e745e8fb802ffb06259b1a5ba3db713a17087)
---
 libcxx/include/typeinfo                   |  9 ++++---
 libcxx/src/include/overridable_function.h |  6 ++---
 libcxxabi/src/private_typeinfo.cpp        | 33 +++++++++--------------
 3 files changed, 21 insertions(+), 27 deletions(-)

diff --git a/libcxx/include/typeinfo b/libcxx/include/typeinfo
index d1c0de3c1bfdd..2727cad02fa99 100644
--- a/libcxx/include/typeinfo
+++ b/libcxx/include/typeinfo
@@ -275,13 +275,14 @@ struct __type_info_implementations {
           __impl;
 };
 
-#    if defined(__arm64__) && __has_cpp_attribute(clang::ptrauth_vtable_pointer)
-#      if __has_feature(ptrauth_type_info_discriminated_vtable_pointer)
+#    if __has_cpp_attribute(_Clang::__ptrauth_vtable_pointer__)
+#      if __has_feature(ptrauth_type_info_vtable_pointer_discrimination)
 #        define _LIBCPP_TYPE_INFO_VTABLE_POINTER_AUTH                                                                  \
-          [[clang::ptrauth_vtable_pointer(process_independent, address_discrimination, type_discrimination)]]
+          [[_Clang::__ptrauth_vtable_pointer__(process_independent, address_discrimination, type_discrimination)]]
 #      else
 #        define _LIBCPP_TYPE_INFO_VTABLE_POINTER_AUTH                                                                  \
-          [[clang::ptrauth_vtable_pointer(process_independent, no_address_discrimination, no_extra_discrimination)]]
+          [[_Clang::__ptrauth_vtable_pointer__(                                                                        \
+              process_independent, no_address_discrimination, no_extra_discrimination)]]
 #      endif
 #    else
 #      define _LIBCPP_TYPE_INFO_VTABLE_POINTER_AUTH
diff --git a/libcxx/src/include/overridable_function.h b/libcxx/src/include/overridable_function.h
index e71e4f104b290..c7639f56eee26 100644
--- a/libcxx/src/include/overridable_function.h
+++ b/libcxx/src/include/overridable_function.h
@@ -13,7 +13,7 @@
 #include <__config>
 #include <cstdint>
 
-#if defined(__arm64e__) && __has_feature(ptrauth_calls)
+#if __has_feature(ptrauth_calls)
 #  include <ptrauth.h>
 #endif
 
@@ -83,13 +83,13 @@ _LIBCPP_HIDE_FROM_ABI bool __is_function_overridden(_Ret (*__fptr)(_Args...)) no
   uintptr_t __end   = reinterpret_cast<uintptr_t>(&__lcxx_override_end);
   uintptr_t __ptr   = reinterpret_cast<uintptr_t>(__fptr);
 
-#if defined(__arm64e__) && __has_feature(ptrauth_calls)
+#  if __has_feature(ptrauth_calls)
   // We must pass a void* to ptrauth_strip since it only accepts a pointer type. Also, in particular,
   // we must NOT pass a function pointer, otherwise we will strip the function pointer, and then attempt
   // to authenticate and re-sign it when casting it to a uintptr_t again, which will fail because we just
   // stripped the function pointer. See rdar://122927845.
   __ptr = reinterpret_cast<uintptr_t>(ptrauth_strip(reinterpret_cast<void*>(__ptr), ptrauth_key_function_pointer));
-#endif
+#  endif
 
   // Finally, the function was overridden if it falls outside of the section's bounds.
   return __ptr < __start || __ptr > __end;
diff --git a/libcxxabi/src/private_typeinfo.cpp b/libcxxabi/src/private_typeinfo.cpp
index 9e58501a55934..9dba91e1985e3 100644
--- a/libcxxabi/src/private_typeinfo.cpp
+++ b/libcxxabi/src/private_typeinfo.cpp
@@ -55,15 +55,12 @@
 #include <ptrauth.h>
 #endif
 
-
-template<typename T>
-static inline
-T *
-get_vtable(T *vtable) {
+template <typename T>
+static inline T* strip_vtable(T* vtable) {
 #if __has_feature(ptrauth_calls)
-    vtable = ptrauth_strip(vtable, ptrauth_key_cxx_vtable_pointer);
+  vtable = ptrauth_strip(vtable, ptrauth_key_cxx_vtable_pointer);
 #endif
-    return vtable;
+  return vtable;
 }
 
 static inline
@@ -117,11 +114,10 @@ void dyn_cast_get_derived_info(derived_object_info* info, const void* static_ptr
         reinterpret_cast<const uint8_t*>(vtable) + offset_to_ti_proxy;
     info->dynamic_type = *(reinterpret_cast<const __class_type_info* const*>(ptr_to_ti_proxy));
 #else
-    void **vtable = *static_cast<void ** const *>(static_ptr);
-    vtable = get_vtable(vtable);
-    info->offset_to_derived = reinterpret_cast<ptrdiff_t>(vtable[-2]);
-    info->dynamic_ptr = static_cast<const char*>(static_ptr) + info->offset_to_derived;
-    info->dynamic_type = static_cast<const __class_type_info*>(vtable[-1]);
+  void** vtable = strip_vtable(*static_cast<void** const*>(static_ptr));
+  info->offset_to_derived = reinterpret_cast<ptrdiff_t>(vtable[-2]);
+  info->dynamic_ptr = static_cast<const char*>(static_ptr) + info->offset_to_derived;
+  info->dynamic_type = static_cast<const __class_type_info*>(vtable[-1]);
 #endif
 }
 
@@ -576,8 +572,7 @@ __base_class_type_info::has_unambiguous_public_base(__dynamic_cast_info* info,
        find the layout.  */
     offset_to_base = __offset_flags >> __offset_shift;
     if (is_virtual) {
-      const char* vtable = *static_cast<const char* const*>(adjustedPtr);
-      vtable = get_vtable(vtable);
+      const char* vtable = strip_vtable(*static_cast<const char* const*>(adjustedPtr));
       offset_to_base = update_offset_to_base(vtable, offset_to_base);
     }
   } else if (!is_virtual) {
@@ -1517,9 +1512,8 @@ __base_class_type_info::search_above_dst(__dynamic_cast_info* info,
     ptrdiff_t offset_to_base = __offset_flags >> __offset_shift;
     if (__offset_flags & __virtual_mask)
     {
-        const char* vtable = *static_cast<const char*const*>(current_ptr);
-        vtable = get_vtable(vtable);
-        offset_to_base = update_offset_to_base(vtable, offset_to_base);
+      const char* vtable = strip_vtable(*static_cast<const char* const*>(current_ptr));
+      offset_to_base = update_offset_to_base(vtable, offset_to_base);
     }
     __base_type->search_above_dst(info, dst_ptr,
                                   static_cast<const char*>(current_ptr) + offset_to_base,
@@ -1538,9 +1532,8 @@ __base_class_type_info::search_below_dst(__dynamic_cast_info* info,
     ptrdiff_t offset_to_base = __offset_flags >> __offset_shift;
     if (__offset_flags & __virtual_mask)
     {
-        const char* vtable = *static_cast<const char*const*>(current_ptr);
-        vtable = get_vtable(vtable);
-        offset_to_base = update_offset_to_base(vtable, offset_to_base);
+      const char* vtable = strip_vtable(*static_cast<const char* const*>(current_ptr));
+      offset_to_base = update_offset_to_base(vtable, offset_to_base);
     }
     __base_type->search_below_dst(info,
                                   static_cast<const char*>(current_ptr) + offset_to_base,

>From e1d05010c3277ffb1f71fbb7bfad66704dcfc4cf Mon Sep 17 00:00:00 2001
From: Utkarsh Saxena <usx at google.com>
Date: Wed, 24 Jul 2024 15:58:52 +0200
Subject: [PATCH 53/91] Fix lifetimebound for field access (#100197)

Fixes: https://github.com/llvm/llvm-project/issues/81589

There is no way to switch this off without `-Wno-dangling`.
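
For readers unfamiliar with the pattern, here is a minimal sketch modelled
on the test added in this patch. Holder, LIFETIMEBOUND and readFieldSafely
are hypothetical names made up for this note. The macro only keeps the
snippet free of unknown-attribute warnings on non-Clang compilers.

```cpp
#if defined(__clang__)
#  define LIFETIMEBOUND [[clang::lifetimebound]]
#else
#  define LIFETIMEBOUND
#endif

struct Holder {
  struct Pair {
    int b;
  };
  Pair p;
  explicit Holder(int v) : p{v} {}
  // The returned pointer is only valid while *this is alive.
  Pair* get() LIFETIMEBOUND { return &p; }
};

// Dangling (left as a comment on purpose): the Holder temporary dies at the
// end of the full-expression, so a reference to its field would outlive it.
// The test in this patch expects a -Wdangling warning for this shape:
//   const int& bad = Holder{1}.get()->b;

// Safe alternatives: keep the owner alive, or copy the field by value.
int readFieldSafely() {
  Holder h{42};
  const int& ref = h.get()->b;   // h outlives ref
  int copy = Holder{7}.get()->b; // value copied before the temporary dies
  return ref + copy;
}
```

Calling readFieldSafely() adds the two field values; neither read touches
a destroyed object.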
---
 clang/docs/ReleaseNotes.rst               |  3 +++
 clang/lib/Sema/CheckExprLifetime.cpp      |  9 ++++++++
 clang/test/SemaCXX/attr-lifetimebound.cpp | 26 +++++++++++++++++++++++
 3 files changed, 38 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 549da6812740f..71d615553c613 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -767,6 +767,9 @@ Improvements to Clang's diagnostics
 
 - Clang now diagnoses undefined behavior in constant expressions more consistently. This includes invalid shifts, and signed overflow in arithmetic.
 
+- Clang now diagnoses dangling references to fields of temporary objects. Fixes #GH81589.
+
+
 Improvements to Clang's time-trace
 ----------------------------------
 
diff --git a/clang/lib/Sema/CheckExprLifetime.cpp b/clang/lib/Sema/CheckExprLifetime.cpp
index 5c8ef564f30aa..112cf3d081822 100644
--- a/clang/lib/Sema/CheckExprLifetime.cpp
+++ b/clang/lib/Sema/CheckExprLifetime.cpp
@@ -7,6 +7,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "CheckExprLifetime.h"
+#include "clang/AST/Decl.h"
 #include "clang/AST/Expr.h"
 #include "clang/Basic/DiagnosticSema.h"
 #include "clang/Sema/Initialization.h"
@@ -548,6 +549,14 @@ static void visitLocalsRetainedByReferenceBinding(IndirectLocalPath &Path,
                                        EnableLifetimeWarnings);
   }
 
+  if (auto *M = dyn_cast<MemberExpr>(Init)) {
+    // A non-reference field has the same lifetime as its base object.
+    if (auto *F = dyn_cast<FieldDecl>(M->getMemberDecl());
+        F && !F->getType()->isReferenceType())
+      visitLocalsRetainedByInitializer(Path, M->getBase(), Visit, true,
+                                       EnableLifetimeWarnings);
+  }
+
   if (isa<CallExpr>(Init)) {
     if (EnableLifetimeWarnings)
       handleGslAnnotatedTypes(Path, Init, Visit);
diff --git a/clang/test/SemaCXX/attr-lifetimebound.cpp b/clang/test/SemaCXX/attr-lifetimebound.cpp
index 70bc545c07bd9..7db0a4d64d259 100644
--- a/clang/test/SemaCXX/attr-lifetimebound.cpp
+++ b/clang/test/SemaCXX/attr-lifetimebound.cpp
@@ -47,6 +47,31 @@ namespace usage_ok {
     q = A(); // expected-warning {{object backing the pointer q will be destroyed at the end of the full-expression}}
     r = A(1); // expected-warning {{object backing the pointer r will be destroyed at the end of the full-expression}}
   }
+
+  struct FieldCheck {
+    struct Set {
+      int a;
+    };
+    struct Pair {
+      const int& a;
+      int b;
+      Set c;
+      int * d;
+    };
+    Pair p;  
+    FieldCheck(const int& a): p(a){}
+    Pair& getR() [[clang::lifetimebound]] { return p; }
+    Pair* getP() [[clang::lifetimebound]] { return &p; }
+    Pair* getNoLB() { return &p; }
+  };
+  void test_field_access() {
+    int x = 0;
+    const int& a = FieldCheck{x}.getR().a;
+    const int& b = FieldCheck{x}.getP()->b;   // expected-warning {{temporary bound to local reference 'b' will be destroyed at the end of the full-expression}}
+    const int& c = FieldCheck{x}.getP()->c.a; // expected-warning {{temporary bound to local reference 'c' will be destroyed at the end of the full-expression}}
+    const int& d = FieldCheck{x}.getNoLB()->c.a;
+    const int* e = FieldCheck{x}.getR().d;
+  }
 }
 
 # 1 "<std>" 1 3
@@ -239,3 +264,4 @@ namespace move_forward_et_al_examples {
   S X;
   S *AddressOfOk = std::addressof(X);
 } // namespace move_forward_et_al_examples
+

>From 07ef07813483c6ffa721f795c475cdc3f2341723 Mon Sep 17 00:00:00 2001
From: Fangrui Song <i at maskray.me>
Date: Thu, 25 Jul 2024 16:45:09 -0700
Subject: [PATCH 54/91] [ELF] Remove obsoleted comment after #99567

(cherry picked from commit 026972af9c3cbd85b654b67a5b5c3b754a78a997)
---
 lld/ELF/ScriptLexer.cpp | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/lld/ELF/ScriptLexer.cpp b/lld/ELF/ScriptLexer.cpp
index c8c02ab0f3e09..847cd8423b49a 100644
--- a/lld/ELF/ScriptLexer.cpp
+++ b/lld/ELF/ScriptLexer.cpp
@@ -20,11 +20,6 @@
 // in various corner cases. We do not care much about efficiency because
 // the time spent in parsing linker scripts is usually negligible.
 //
-// Our grammar of the linker script is LL(2), meaning that it needs at
-// most two-token lookahead to parse. The only place we need two-token
-// lookahead is labels in version scripts, where we need to parse "local :"
-// as if "local:".
-//
 // Overall, this lexer works fine for most linker scripts. There might
 // be room for improving compatibility, but that's probably not at the
 // top of our todo list.

>From 28e2baaeed86fd330d1e0fcacefaf6221685be23 Mon Sep 17 00:00:00 2001
From: Fangrui Song <i at maskray.me>
Date: Thu, 25 Jul 2024 17:11:52 -0700
Subject: [PATCH 55/91] [ELF,test] Improve negative linker script tests

(cherry picked from commit 8644a2aa0f3540c69464f56b3d538e888b6cbdcb)
---
 lld/test/ELF/linkerscript/diag.test           | 49 +++++++++++++++++++
 lld/test/ELF/linkerscript/diag1.test          | 15 ------
 lld/test/ELF/linkerscript/diag2.test          | 13 -----
 lld/test/ELF/linkerscript/diag3.test          | 13 -----
 lld/test/ELF/linkerscript/diag4.test          | 14 ------
 lld/test/ELF/linkerscript/diag5.test          | 14 ------
 lld/test/ELF/linkerscript/diag6.test          |  7 ---
 .../invalid.test}                             |  0
 lld/test/ELF/linkerscript/unquoted.test       | 26 ++++++++++
 9 files changed, 75 insertions(+), 76 deletions(-)
 create mode 100644 lld/test/ELF/linkerscript/diag.test
 delete mode 100644 lld/test/ELF/linkerscript/diag1.test
 delete mode 100644 lld/test/ELF/linkerscript/diag2.test
 delete mode 100644 lld/test/ELF/linkerscript/diag3.test
 delete mode 100644 lld/test/ELF/linkerscript/diag4.test
 delete mode 100644 lld/test/ELF/linkerscript/diag5.test
 delete mode 100644 lld/test/ELF/linkerscript/diag6.test
 rename lld/test/ELF/{invalid-linkerscript.test => linkerscript/invalid.test} (100%)
 create mode 100644 lld/test/ELF/linkerscript/unquoted.test

diff --git a/lld/test/ELF/linkerscript/diag.test b/lld/test/ELF/linkerscript/diag.test
new file mode 100644
index 0000000000000..fbc24659a5311
--- /dev/null
+++ b/lld/test/ELF/linkerscript/diag.test
@@ -0,0 +1,49 @@
+# REQUIRES: x86
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: llvm-mc -filetype=obj -triple=x86_64 /dev/null -o 0.o
+
+#--- 1.lds
+SECTIONS {
+  .text + { *(.text) }
+  .keep : { *(.keep) } /*
+  comment line 1
+  comment line 2 */
+  .temp : { *(.temp) }
+}
+
+# RUN: not ld.lld -shared 0.o -T 1.lds 2>&1 | FileCheck %s --check-prefix=CHECK1 --match-full-lines --strict-whitespace
+#      CHECK1:{{.*}}:2: malformed number: +
+# CHECK1-NEXT:>>>   .text + { *(.text) }
+# CHECK1-NEXT:>>>         ^
+
+#--- 2.lds
+
+UNKNOWN_TAG {
+  .text : { *(.text) }
+  .keep : { *(.keep) }
+  .temp : { *(.temp) }
+}
+
+# RUN: not ld.lld -shared 0.o -T 2.lds 2>&1 | FileCheck %s --check-prefix=CHECK2 --match-full-lines --strict-whitespace
+#      CHECK2:{{.*}}:2: unknown directive: UNKNOWN_TAG
+# CHECK2-NEXT:>>> UNKNOWN_TAG {
+# CHECK2-NEXT:>>> ^
+
+#--- 3.lds
+SECTIONS {
+  .text : { *(.text) }
+  .keep : { *(.keep) }
+  boom ^temp : { *(.temp) }
+}
+#--- 3a.lds
+INCLUDE "3.lds"
+#--- 3b.lds
+foo = 3;
+INCLUDE "3a.lds"
+
+# RUN: not ld.lld -shared 0.o -T 3.lds 2>&1 | FileCheck %s --check-prefix=CHECK3 --match-full-lines --strict-whitespace
+# RUN: not ld.lld -shared 0.o -T 3a.lds 2>&1 | FileCheck %s --check-prefix=CHECK3 --match-full-lines --strict-whitespace
+# RUN: not ld.lld -shared 0.o -T 3b.lds 2>&1 | FileCheck %s --check-prefix=CHECK3 --match-full-lines --strict-whitespace
+#      CHECK3:{{.*}}3.lds:4: malformed number: ^
+# CHECK3-NEXT:>>>   boom ^temp : { *(.temp) }
+# CHECK3-NEXT:>>>        ^
diff --git a/lld/test/ELF/linkerscript/diag1.test b/lld/test/ELF/linkerscript/diag1.test
deleted file mode 100644
index 829bc5a1bffaf..0000000000000
--- a/lld/test/ELF/linkerscript/diag1.test
+++ /dev/null
@@ -1,15 +0,0 @@
-# REQUIRES: x86
-# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux /dev/null -o %t.o
-# RUN: not ld.lld -shared %t.o -o /dev/null --script %s 2>&1 | FileCheck -strict-whitespace %s
-
-SECTIONS {
-  .text + { *(.text) }
-  .keep : { *(.keep) } /*
-  comment line 1
-  comment line 2 */
-  .temp : { *(.temp) }
-}
-
-CHECK:      6: malformed number: +
-CHECK-NEXT: >>>   .text + { *(.text) }
-CHECK-NEXT: >>>         ^
diff --git a/lld/test/ELF/linkerscript/diag2.test b/lld/test/ELF/linkerscript/diag2.test
deleted file mode 100644
index aeb623dbb7f4b..0000000000000
--- a/lld/test/ELF/linkerscript/diag2.test
+++ /dev/null
@@ -1,13 +0,0 @@
-# REQUIRES: x86
-# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux /dev/null -o %t.o
-# RUN: not ld.lld -shared %t.o -o /dev/null --script %s 2>&1 | FileCheck -strict-whitespace %s
-
-UNKNOWN_TAG {
-  .text : { *(.text) }
-  .keep : { *(.keep) }
-  .temp : { *(.temp) }
-}
-
-CHECK:      5: unknown directive: UNKNOWN_TAG
-CHECK-NEXT: >>> UNKNOWN_TAG {
-CHECK-NEXT: >>> ^
diff --git a/lld/test/ELF/linkerscript/diag3.test b/lld/test/ELF/linkerscript/diag3.test
deleted file mode 100644
index 1df8d601db016..0000000000000
--- a/lld/test/ELF/linkerscript/diag3.test
+++ /dev/null
@@ -1,13 +0,0 @@
-# REQUIRES: x86
-# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux /dev/null -o %t.o
-# RUN: not ld.lld -shared %t.o -o /dev/null --script %s 2>&1 | FileCheck -strict-whitespace %s
-
-SECTIONS {
-  .text : { *(.text) }
-  .keep : { *(.keep) }
-  boom ^temp : { *(.temp) }
-}
-
-# CHECK:      8: malformed number: ^
-# CHECK-NEXT: >>>   boom ^temp : { *(.temp) }
-# CHECK-NEXT: >>>        ^
diff --git a/lld/test/ELF/linkerscript/diag4.test b/lld/test/ELF/linkerscript/diag4.test
deleted file mode 100644
index d93a69a95c61d..0000000000000
--- a/lld/test/ELF/linkerscript/diag4.test
+++ /dev/null
@@ -1,14 +0,0 @@
-# REQUIRES: x86
-# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux /dev/null -o %t.o
-# RUN: echo "INCLUDE \"%s\"" > %t.script
-# RUN: not ld.lld -shared %t.o -o /dev/null --script %t.script 2>&1 | FileCheck -strict-whitespace %s
-
-SECTIONS {
-  .text : { *(.text) }
-  .keep : { *(.keep) }
-  boom ^temp : { *(.temp) }
-}
-
-# CHECK:      9: malformed number: ^{{$}}
-# CHECK-NEXT: >>>   boom ^temp : { *(.temp) }
-# CHECK-NEXT: >>>        ^
diff --git a/lld/test/ELF/linkerscript/diag5.test b/lld/test/ELF/linkerscript/diag5.test
deleted file mode 100644
index 9a2304baa4413..0000000000000
--- a/lld/test/ELF/linkerscript/diag5.test
+++ /dev/null
@@ -1,14 +0,0 @@
-# REQUIRES: x86
-# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux /dev/null -o %t.o
-# RUN: echo "INCLUDE \"%s\"" > %t.script
-# RUN: not ld.lld -shared %t.o -o /dev/null --script %t.script 2>&1 | FileCheck -strict-whitespace %s
-
-SECTIONS {
-  .text : { *(.text) }
-  .keep : { *(.keep) }
-  boom ^temp : { *(.temp) }
-}
-
-# CHECK:      9: malformed number: ^
-# CHECK-NEXT: >>>   boom ^temp : { *(.temp) }
-# CHECK-NEXT: >>>        ^
diff --git a/lld/test/ELF/linkerscript/diag6.test b/lld/test/ELF/linkerscript/diag6.test
deleted file mode 100644
index 0ec0400040b54..0000000000000
--- a/lld/test/ELF/linkerscript/diag6.test
+++ /dev/null
@@ -1,7 +0,0 @@
-# REQUIRES: x86
-# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux /dev/null -o %t.o
-# RUN: not ld.lld -shared %t.o -o /dev/null --script %s 2>&1 | FileCheck %s
-
-SECTIONS /*
-
-CHECK: error: {{.*}}diag6.test:1: unclosed comment in a linker script
diff --git a/lld/test/ELF/invalid-linkerscript.test b/lld/test/ELF/linkerscript/invalid.test
similarity index 100%
rename from lld/test/ELF/invalid-linkerscript.test
rename to lld/test/ELF/linkerscript/invalid.test
diff --git a/lld/test/ELF/linkerscript/unquoted.test b/lld/test/ELF/linkerscript/unquoted.test
new file mode 100644
index 0000000000000..7dca75fe09ab1
--- /dev/null
+++ b/lld/test/ELF/linkerscript/unquoted.test
@@ -0,0 +1,26 @@
+# REQUIRES: x86
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: llvm-mc -filetype=obj -triple=x86_64 /dev/null -o 0.o
+
+#--- empty.lds
+#--- 1.lds
+
+SECTIONS /*
+#--- 1a.lds
+foo = 3;
+INCLUDE "empty.lds"
+INCLUDE "1.lds"
+
+# RUN: not ld.lld -shared 0.o -T 1.lds 2>&1 | FileCheck %s --check-prefix=CHECK1 --match-full-lines --strict-whitespace
+# RUN: not ld.lld -shared 0.o -T 1a.lds 2>&1 | FileCheck %s --check-prefix=CHECK1A --match-full-lines --strict-whitespace
+#      CHECK1:{{.*}}error: 1.lds:1: unclosed comment in a linker script
+#     CHECK1A:{{.*}}error: 1a.lds:3: unclosed comment in a linker script
+#CHECK1A-NEXT:>>> INCLUDE "1.lds"
+#CHECK1A-NEXT:>>>         ^
+
+#--- 2.lds
+INCLUDE "empty.lds"
+"
+# RUN: not ld.lld -shared 0.o -T 2.lds 2>&1 | FileCheck %s --check-prefix=CHECK2 --match-full-lines --strict-whitespace
+#      CHECK2:{{.*}}error: 2.lds:2: unclosed quote
+# CHECK2-NOT:{{.}}

>From 66c08d9095d8795ef5d8ff7bb9940d560a4e0eab Mon Sep 17 00:00:00 2001
From: Fangrui Song <i at maskray.me>
Date: Sat, 27 Jul 2024 10:55:17 -0700
Subject: [PATCH 56/91] [ELF] Add Relocs and invokeOnRelocs. NFC

Relocs is to simplify CREL support (#98115) while invokeOnRelocs
simplifies some relsOrRelas call sites that will use the CREL iterator.

(cherry picked from commit 6efc3774bd8c5fcb105cda73ec27c05ef850dc19)
---
 lld/ELF/ICF.cpp               | 18 +++++++++---------
 lld/ELF/InputSection.cpp      |  2 +-
 lld/ELF/InputSection.h        | 15 ++++++++++++---
 lld/ELF/Relocations.cpp       | 18 ++++++++----------
 lld/ELF/Relocations.h         | 10 ++++++++--
 lld/ELF/SyntheticSections.cpp | 12 ++++--------
 lld/ELF/SyntheticSections.h   |  5 +++--
 7 files changed, 45 insertions(+), 35 deletions(-)

diff --git a/lld/ELF/ICF.cpp b/lld/ELF/ICF.cpp
index bfc605c793a92..a6b52d78fa806 100644
--- a/lld/ELF/ICF.cpp
+++ b/lld/ELF/ICF.cpp
@@ -103,12 +103,12 @@ template <class ELFT> class ICF {
   void segregate(size_t begin, size_t end, uint32_t eqClassBase, bool constant);
 
   template <class RelTy>
-  bool constantEq(const InputSection *a, ArrayRef<RelTy> relsA,
-                  const InputSection *b, ArrayRef<RelTy> relsB);
+  bool constantEq(const InputSection *a, Relocs<RelTy> relsA,
+                  const InputSection *b, Relocs<RelTy> relsB);
 
   template <class RelTy>
-  bool variableEq(const InputSection *a, ArrayRef<RelTy> relsA,
-                  const InputSection *b, ArrayRef<RelTy> relsB);
+  bool variableEq(const InputSection *a, Relocs<RelTy> relsA,
+                  const InputSection *b, Relocs<RelTy> relsB);
 
   bool equalsConstant(const InputSection *a, const InputSection *b);
   bool equalsVariable(const InputSection *a, const InputSection *b);
@@ -235,8 +235,8 @@ void ICF<ELFT>::segregate(size_t begin, size_t end, uint32_t eqClassBase,
 // Compare two lists of relocations.
 template <class ELFT>
 template <class RelTy>
-bool ICF<ELFT>::constantEq(const InputSection *secA, ArrayRef<RelTy> ra,
-                           const InputSection *secB, ArrayRef<RelTy> rb) {
+bool ICF<ELFT>::constantEq(const InputSection *secA, Relocs<RelTy> ra,
+                           const InputSection *secB, Relocs<RelTy> rb) {
   if (ra.size() != rb.size())
     return false;
   auto rai = ra.begin(), rae = ra.end(), rbi = rb.begin();
@@ -333,8 +333,8 @@ bool ICF<ELFT>::equalsConstant(const InputSection *a, const InputSection *b) {
 // relocations point to the same section in terms of ICF.
 template <class ELFT>
 template <class RelTy>
-bool ICF<ELFT>::variableEq(const InputSection *secA, ArrayRef<RelTy> ra,
-                           const InputSection *secB, ArrayRef<RelTy> rb) {
+bool ICF<ELFT>::variableEq(const InputSection *secA, Relocs<RelTy> ra,
+                           const InputSection *secB, Relocs<RelTy> rb) {
   assert(ra.size() == rb.size());
 
   auto rai = ra.begin(), rae = ra.end(), rbi = rb.begin();
@@ -441,7 +441,7 @@ void ICF<ELFT>::forEachClass(llvm::function_ref<void(size_t, size_t)> fn) {
 // hash.
 template <class RelTy>
 static void combineRelocHashes(unsigned cnt, InputSection *isec,
-                               ArrayRef<RelTy> rels) {
+                               Relocs<RelTy> rels) {
   uint32_t hash = isec->eqClass[cnt % 2];
   for (RelTy rel : rels) {
     Symbol &s = isec->file->getRelocTargetSym(rel);
diff --git a/lld/ELF/InputSection.cpp b/lld/ELF/InputSection.cpp
index 12ab1f1eac808..306872f164f6f 100644
--- a/lld/ELF/InputSection.cpp
+++ b/lld/ELF/InputSection.cpp
@@ -911,7 +911,7 @@ uint64_t InputSectionBase::getRelocTargetVA(const InputFile *file, RelType type,
 // So, we handle relocations for non-alloc sections directly in this
 // function as a performance optimization.
 template <class ELFT, class RelTy>
-void InputSection::relocateNonAlloc(uint8_t *buf, ArrayRef<RelTy> rels) {
+void InputSection::relocateNonAlloc(uint8_t *buf, Relocs<RelTy> rels) {
   const unsigned bits = sizeof(typename ELFT::uint) * 8;
   const TargetInfo &target = *elf::target;
   const auto emachine = config->emachine;
diff --git a/lld/ELF/InputSection.h b/lld/ELF/InputSection.h
index ec12235f842a9..c89a545e1543f 100644
--- a/lld/ELF/InputSection.h
+++ b/lld/ELF/InputSection.h
@@ -37,11 +37,20 @@ LLVM_LIBRARY_VISIBILITY extern std::vector<Partition> partitions;
 
 // Returned by InputSectionBase::relsOrRelas. At least one member is empty.
 template <class ELFT> struct RelsOrRelas {
-  ArrayRef<typename ELFT::Rel> rels;
-  ArrayRef<typename ELFT::Rela> relas;
+  Relocs<typename ELFT::Rel> rels;
+  Relocs<typename ELFT::Rela> relas;
   bool areRelocsRel() const { return rels.size(); }
 };
 
+#define invokeOnRelocs(sec, f, ...)                                            \
+  {                                                                            \
+    const RelsOrRelas<ELFT> rs = (sec).template relsOrRelas<ELFT>();           \
+    if (rs.areRelocsRel())                                                     \
+      f(__VA_ARGS__, rs.rels);                                                 \
+    else                                                                       \
+      f(__VA_ARGS__, rs.relas);                                                \
+  }
+
 // This is the base class of all sections that lld handles. Some are sections in
 // input files, some are sections in the produced output file and some exist
 // just as a convenience for implementing special ways of combining some
@@ -407,7 +416,7 @@ class InputSection : public InputSectionBase {
   InputSectionBase *getRelocatedSection() const;
 
   template <class ELFT, class RelTy>
-  void relocateNonAlloc(uint8_t *buf, llvm::ArrayRef<RelTy> rels);
+  void relocateNonAlloc(uint8_t *buf, Relocs<RelTy> rels);
 
   // Points to the canonical section. If ICF folds two sections, repl pointer of
   // one section points to the other.
diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index 36857d72c647e..713bd15440251 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -475,8 +475,9 @@ class RelocationScanner {
                                 uint64_t relOff) const;
   void processAux(RelExpr expr, RelType type, uint64_t offset, Symbol &sym,
                   int64_t addend) const;
-  template <class ELFT, class RelTy> void scanOne(RelTy *&i);
-  template <class ELFT, class RelTy> void scan(ArrayRef<RelTy> rels);
+  template <class ELFT, class RelTy>
+  void scanOne(typename Relocs<RelTy>::const_iterator &i);
+  template <class ELFT, class RelTy> void scan(Relocs<RelTy> rels);
 };
 } // namespace
 
@@ -1433,7 +1434,8 @@ static unsigned handleTlsRelocation(RelType type, Symbol &sym,
   return 0;
 }
 
-template <class ELFT, class RelTy> void RelocationScanner::scanOne(RelTy *&i) {
+template <class ELFT, class RelTy>
+void RelocationScanner::scanOne(typename Relocs<RelTy>::const_iterator &i) {
   const RelTy &rel = *i;
   uint32_t symIndex = rel.getSymbol(config->isMips64EL);
   Symbol &sym = sec->getFile<ELFT>()->getSymbol(symIndex);
@@ -1574,7 +1576,7 @@ static void checkPPC64TLSRelax(InputSectionBase &sec, ArrayRef<RelTy> rels) {
 }
 
 template <class ELFT, class RelTy>
-void RelocationScanner::scan(ArrayRef<RelTy> rels) {
+void RelocationScanner::scan(Relocs<RelTy> rels) {
   // Not all relocations end up in Sec->Relocations, but a lot do.
   sec->relocations.reserve(rels.size());
 
@@ -1592,7 +1594,7 @@ void RelocationScanner::scan(ArrayRef<RelTy> rels) {
 
   end = static_cast<const void *>(rels.end());
   for (auto i = rels.begin(); i != end;)
-    scanOne<ELFT>(i);
+    scanOne<ELFT, RelTy>(i);
 
   // Sort relocations by offset for more efficient searching for
   // R_RISCV_PCREL_HI20 and R_PPC64_ADDR64.
@@ -2409,11 +2411,7 @@ template <class ELFT> void elf::checkNoCrossRefs() {
         if (!isd)
           continue;
         parallelForEach(isd->sections, [&](InputSection *sec) {
-          const RelsOrRelas<ELFT> rels = sec->template relsOrRelas<ELFT>();
-          if (rels.areRelocsRel())
-            scanCrossRefs<ELFT>(noxref, osec, sec, rels.rels);
-          else
-            scanCrossRefs<ELFT>(noxref, osec, sec, rels.relas);
+          invokeOnRelocs(*sec, scanCrossRefs<ELFT>, noxref, osec, sec);
         });
       }
     }
diff --git a/lld/ELF/Relocations.h b/lld/ELF/Relocations.h
index 1bee0dedf8587..77d8d52ca3d3f 100644
--- a/lld/ELF/Relocations.h
+++ b/lld/ELF/Relocations.h
@@ -205,6 +205,11 @@ class ThunkCreator {
   uint32_t pass = 0;
 };
 
+template <class RelTy> struct Relocs : ArrayRef<RelTy> {
+  Relocs() = default;
+  Relocs(ArrayRef<RelTy> a) : ArrayRef<RelTy>(a) {}
+};
+
 // Return a int64_t to make sure we get the sign extension out of the way as
 // early as possible.
 template <class ELFT>
@@ -217,14 +222,15 @@ static inline int64_t getAddend(const typename ELFT::Rela &rel) {
 }
 
 template <typename RelTy>
-ArrayRef<RelTy> sortRels(ArrayRef<RelTy> rels, SmallVector<RelTy, 0> &storage) {
+inline Relocs<RelTy> sortRels(Relocs<RelTy> rels,
+                              SmallVector<RelTy, 0> &storage) {
   auto cmp = [](const RelTy &a, const RelTy &b) {
     return a.r_offset < b.r_offset;
   };
   if (!llvm::is_sorted(rels, cmp)) {
     storage.assign(rels.begin(), rels.end());
     llvm::stable_sort(storage, cmp);
-    rels = storage;
+    rels = Relocs<RelTy>(storage);
   }
   return rels;
 }
diff --git a/lld/ELF/SyntheticSections.cpp b/lld/ELF/SyntheticSections.cpp
index 5d3f3df216b85..b40ff0bc3cb03 100644
--- a/lld/ELF/SyntheticSections.cpp
+++ b/lld/ELF/SyntheticSections.cpp
@@ -3203,10 +3203,10 @@ template <class ELFT> DebugNamesSection<ELFT>::DebugNamesSection() {
 template <class ELFT>
 template <class RelTy>
 void DebugNamesSection<ELFT>::getNameRelocs(
-    InputSection *sec, ArrayRef<RelTy> rels,
-    DenseMap<uint32_t, uint32_t> &relocs) {
+    const InputFile &file, DenseMap<uint32_t, uint32_t> &relocs,
+    Relocs<RelTy> rels) {
   for (const RelTy &rel : rels) {
-    Symbol &sym = sec->file->getRelocTargetSym(rel);
+    Symbol &sym = file.getRelocTargetSym(rel);
     relocs[rel.r_offset] = sym.getVA(getAddend<ELFT>(rel));
   }
 }
@@ -3216,11 +3216,7 @@ template <class ELFT> void DebugNamesSection<ELFT>::finalizeContents() {
   auto relocs = std::make_unique<DenseMap<uint32_t, uint32_t>[]>(numChunks);
   parallelFor(0, numChunks, [&](size_t i) {
     InputSection *sec = inputSections[i];
-    auto rels = sec->template relsOrRelas<ELFT>();
-    if (rels.areRelocsRel())
-      getNameRelocs(sec, rels.rels, relocs.get()[i]);
-    else
-      getNameRelocs(sec, rels.relas, relocs.get()[i]);
+    invokeOnRelocs(*sec, getNameRelocs, *sec->file, relocs.get()[i]);
 
     // Relocate CU offsets with .debug_info + X relocations.
     OutputChunk &chunk = chunks.get()[i];
diff --git a/lld/ELF/SyntheticSections.h b/lld/ELF/SyntheticSections.h
index eaa09ea7194fb..d4169e1e1acaf 100644
--- a/lld/ELF/SyntheticSections.h
+++ b/lld/ELF/SyntheticSections.h
@@ -916,8 +916,9 @@ class DebugNamesSection final : public DebugNamesBaseSection {
   void writeTo(uint8_t *buf) override;
 
   template <class RelTy>
-  void getNameRelocs(InputSection *sec, ArrayRef<RelTy> rels,
-                     llvm::DenseMap<uint32_t, uint32_t> &relocs);
+  void getNameRelocs(const InputFile &file,
+                     llvm::DenseMap<uint32_t, uint32_t> &relocs,
+                     Relocs<RelTy> rels);
 
 private:
   static void readOffsets(InputChunk &inputChunk, OutputChunk &chunk,

>From 5d9f4600865ca5d734a73f540136402462deac91 Mon Sep 17 00:00:00 2001
From: Fangrui Song <i at maskray.me>
Date: Sat, 27 Jul 2024 10:58:27 -0700
Subject: [PATCH 57/91] [ELF] Use invokeOnRelocs. NFC

(cherry picked from commit c7231e49099d56fdc5b2207142184a0bf2544ec1)
---
 lld/ELF/InputSection.cpp | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/lld/ELF/InputSection.cpp b/lld/ELF/InputSection.cpp
index 306872f164f6f..7857d857488c0 100644
--- a/lld/ELF/InputSection.cpp
+++ b/lld/ELF/InputSection.cpp
@@ -1073,11 +1073,7 @@ void InputSectionBase::relocate(uint8_t *buf, uint8_t *bufEnd) {
   auto *sec = cast<InputSection>(this);
   // For a relocatable link, also call relocateNonAlloc() to rewrite applicable
   // locations with tombstone values.
-  const RelsOrRelas<ELFT> rels = sec->template relsOrRelas<ELFT>();
-  if (rels.areRelocsRel())
-    sec->relocateNonAlloc<ELFT>(buf, rels.rels);
-  else
-    sec->relocateNonAlloc<ELFT>(buf, rels.relas);
+  invokeOnRelocs(*sec, sec->relocateNonAlloc<ELFT>, buf);
 }
 
 // For each function-defining prologue, find any calls to __morestack,

>From cbfbbd74788252b3c5488480fc9d6914f9cf0f38 Mon Sep 17 00:00:00 2001
From: Aiden Grossman <aidengrossman at google.com>
Date: Sat, 27 Jul 2024 10:53:23 -0700
Subject: [PATCH 58/91] [llvm-exegesis] Use correct rseq struct size (#100804)

Glibc v2.40 changes the definition of __rseq_size to the usable area of
the struct rather than the actual size of the struct to accommodate
users trying to figure out what features can be used. This change breaks
llvm-exegesis trying to disable rseq as the size registered in the
kernel is no longer equal to __rseq_size. This patch adds a check to see
if __rseq_size is less than 32 bytes and, if so, uses 32 as the size,
which is the actual size of the struct given alignment requirements.
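
The size adjustment amounts to a clamp, sketched below in isolation. The
names `RseqFullStructSize` and `rseqRegisteredSize` are hypothetical and do
not appear in the patch, which performs the same check inline on
`__rseq_size`.

```cpp
#include <cassert>

// Assumed full size of struct rseq as registered with the kernel.
constexpr unsigned RseqFullStructSize = 32;

// Hypothetical helper: map the size reported by glibc (which may be only the
// usable area, e.g. 20 bytes) to the size to pass when unregistering.
constexpr unsigned rseqRegisteredSize(unsigned ReportedSize) {
  return ReportedSize < RseqFullStructSize ? RseqFullStructSize
                                           : ReportedSize;
}
```

A reported size of 32 or more is passed through unchanged, so older glibc
versions that already report the full struct size behave as before.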

Fixes #100791.

(cherry picked from commit 1e8df9e85a1ff213e5868bd822877695f27504ad)
---
 llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
index ed53f8fabb175..adee869967d98 100644
--- a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
+++ b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
@@ -466,9 +466,20 @@ class SubProcessFunctionExecutorImpl
 // segfaults in the program. Unregister the rseq region so that we can safely
 // unmap it later
 #ifdef GLIBC_INITS_RSEQ
+    unsigned int RseqStructSize = __rseq_size;
+
+    // Glibc v2.40 (the change is also expected to be backported to v2.35)
+    // changes the definition of __rseq_size to be the usable area of the struct
+    // rather than the actual size of the struct. v2.35 uses only 20 bytes of
+    // the 32 byte struct. For now, it should be safe to assume that if the
+    // usable size is less than 32, the actual size of the struct will be 32
+    // bytes given alignment requirements.
+    if (__rseq_size < 32)
+      RseqStructSize = 32;
+
     long RseqDisableOutput =
         syscall(SYS_rseq, (intptr_t)__builtin_thread_pointer() + __rseq_offset,
-                __rseq_size, RSEQ_FLAG_UNREGISTER, RSEQ_SIG);
+                RseqStructSize, RSEQ_FLAG_UNREGISTER, RSEQ_SIG);
     if (RseqDisableOutput != 0)
       exit(ChildProcessExitCodeE::RSeqDisableFailed);
 #endif // GLIBC_INITS_RSEQ

>From d728d60cea0181ddd38f8710dcc1e13cd1540c56 Mon Sep 17 00:00:00 2001
From: wanglei <wanglei at loongson.cn>
Date: Fri, 26 Jul 2024 14:38:36 +0800
Subject: [PATCH 59/91] [lld][ELF][LoongArch] Support
 R_LARCH_TLS_{LD,GD,DESC}_PCREL_S2

Reviewed By: MaskRay, SixWeining

Pull Request: https://github.com/llvm/llvm-project/pull/100105
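
The new relocations are applied the same way as `R_LARCH_PCREL20_S2`: the
PC-relative distance must fit in a signed 22-bit value and be 4-byte aligned,
and it is stored shifted right by two bits in the `pcaddi` immediate. A
minimal sketch of that arithmetic follows; `pcaddiImm` is a hypothetical
helper, not an lld function, and it returns the immediate value rather than
patching an instruction word.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: compute the pcaddi immediate for a target address
// (for example a GOT slot) seen from the instruction at address pc.
// Returns std::nullopt where the linker would report an error.
std::optional<int32_t> pcaddiImm(uint64_t target, uint64_t pc) {
  int64_t delta = static_cast<int64_t>(target - pc);
  // Corresponds to checkAlignment(loc, val, 4, rel).
  if (delta % 4 != 0)
    return std::nullopt;
  // Corresponds to checkInt(loc, val, 22, rel): signed 22-bit range.
  if (delta < -(int64_t{1} << 21) || delta >= (int64_t{1} << 21))
    return std::nullopt;
  // delta is a multiple of 4, so the division is exact (same as val >> 2).
  return static_cast<int32_t>(delta / 4);
}
```

With the addresses used in the GD test, `pcaddiImm(0x20300, 0x10250)` yields
16428, matching the `pcaddi $a0, 16428` check line.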

(cherry picked from commit 0057a969a2a397c1ba57e06b65a8bb56af2ce987)
---
 lld/ELF/Arch/LoongArch.cpp                  |  10 ++
 lld/ELF/Relocations.cpp                     |   3 +-
 lld/test/ELF/loongarch-tls-gd-pcrel20-s2.s  | 129 ++++++++++++++++++
 lld/test/ELF/loongarch-tls-ld-pcrel20-s2.s  |  82 +++++++++++
 lld/test/ELF/loongarch-tlsdesc-pcrel20-s2.s | 142 ++++++++++++++++++++
 5 files changed, 365 insertions(+), 1 deletion(-)
 create mode 100644 lld/test/ELF/loongarch-tls-gd-pcrel20-s2.s
 create mode 100644 lld/test/ELF/loongarch-tls-ld-pcrel20-s2.s
 create mode 100644 lld/test/ELF/loongarch-tlsdesc-pcrel20-s2.s

diff --git a/lld/ELF/Arch/LoongArch.cpp b/lld/ELF/Arch/LoongArch.cpp
index 9466e8b1ce54d..db0bc6c760096 100644
--- a/lld/ELF/Arch/LoongArch.cpp
+++ b/lld/ELF/Arch/LoongArch.cpp
@@ -511,6 +511,12 @@ RelExpr LoongArch::getRelExpr(const RelType type, const Symbol &s,
     return R_TLSDESC;
   case R_LARCH_TLS_DESC_CALL:
     return R_TLSDESC_CALL;
+  case R_LARCH_TLS_LD_PCREL20_S2:
+    return R_TLSLD_PC;
+  case R_LARCH_TLS_GD_PCREL20_S2:
+    return R_TLSGD_PC;
+  case R_LARCH_TLS_DESC_PCREL20_S2:
+    return R_TLSDESC_PC;
 
   // Other known relocs that are explicitly unimplemented:
   //
@@ -557,7 +563,11 @@ void LoongArch::relocate(uint8_t *loc, const Relocation &rel,
     write64le(loc, val);
     return;
 
+  // Relocs intended for `pcaddi`.
   case R_LARCH_PCREL20_S2:
+  case R_LARCH_TLS_LD_PCREL20_S2:
+  case R_LARCH_TLS_GD_PCREL20_S2:
+  case R_LARCH_TLS_DESC_PCREL20_S2:
     checkInt(loc, val, 22, rel);
     checkAlignment(loc, val, 4, rel);
     write32le(loc, setJ20(read32le(loc), val >> 2));
diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index 713bd15440251..9a799cd286135 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -1309,7 +1309,8 @@ static unsigned handleTlsRelocation(RelType type, Symbol &sym,
   // LoongArch does not yet implement transition from TLSDESC to LE/IE, so
   // generate TLSDESC dynamic relocation for the dynamic linker to handle.
   if (config->emachine == EM_LOONGARCH &&
-      oneof<R_LOONGARCH_TLSDESC_PAGE_PC, R_TLSDESC, R_TLSDESC_CALL>(expr)) {
+      oneof<R_LOONGARCH_TLSDESC_PAGE_PC, R_TLSDESC, R_TLSDESC_PC,
+            R_TLSDESC_CALL>(expr)) {
     if (expr != R_TLSDESC_CALL) {
       sym.setFlags(NEEDS_TLSDESC);
       c.addReloc({expr, type, offset, addend, &sym});
diff --git a/lld/test/ELF/loongarch-tls-gd-pcrel20-s2.s b/lld/test/ELF/loongarch-tls-gd-pcrel20-s2.s
new file mode 100644
index 0000000000000..d4d12b9d4a520
--- /dev/null
+++ b/lld/test/ELF/loongarch-tls-gd-pcrel20-s2.s
@@ -0,0 +1,129 @@
+# REQUIRES: loongarch
+# RUN: rm -rf %t && split-file %s %t
+
+# RUN: llvm-mc --filetype=obj --triple=loongarch32 %t/a.s -o %t/a.32.o
+# RUN: llvm-mc --filetype=obj --triple=loongarch32 %t/bc.s -o %t/bc.32.o
+# RUN: ld.lld -shared -soname=bc.so %t/bc.32.o -o %t/bc.32.so
+# RUN: llvm-mc --filetype=obj --triple=loongarch32 %t/tga.s -o %t/tga.32.o
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 %t/a.s -o %t/a.64.o
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 %t/bc.s -o %t/bc.64.o
+# RUN: ld.lld -shared -soname=bc.so %t/bc.64.o -o %t/bc.64.so
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 %t/tga.s -o %t/tga.64.o
+
+## LA32 GD
+# RUN: ld.lld -shared %t/a.32.o %t/bc.32.o -o %t/gd.32.so
+# RUN: llvm-readobj -r %t/gd.32.so | FileCheck --check-prefix=GD32-REL %s
+# RUN: llvm-objdump -d --no-show-raw-insn %t/gd.32.so | FileCheck --check-prefix=GD32 %s
+
+## LA32 GD -> LE
+# RUN: ld.lld %t/a.32.o %t/bc.32.o %t/tga.32.o -o %t/le.32
+# RUN: llvm-readelf -r %t/le.32 | FileCheck --check-prefix=NOREL %s
+# RUN: llvm-readelf -x .got %t/le.32 | FileCheck --check-prefix=LE32-GOT %s
+# RUN: ld.lld -pie %t/a.32.o %t/bc.32.o %t/tga.32.o -o %t/le-pie.32
+# RUN: llvm-readelf -r %t/le-pie.32 | FileCheck --check-prefix=NOREL %s
+# RUN: llvm-readelf -x .got %t/le-pie.32 | FileCheck --check-prefix=LE32-GOT %s
+
+## LA32 GD -> IE
+# RUN: ld.lld %t/a.32.o %t/bc.32.so %t/tga.32.o -o %t/ie.32
+# RUN: llvm-readobj -r %t/ie.32 | FileCheck --check-prefix=IE32-REL %s
+# RUN: llvm-readelf -x .got %t/ie.32 | FileCheck --check-prefix=IE32-GOT %s
+
+## LA64 GD
+# RUN: ld.lld -shared %t/a.64.o %t/bc.64.o -o %t/gd.64.so
+# RUN: llvm-readobj -r %t/gd.64.so | FileCheck --check-prefix=GD64-REL %s
+# RUN: llvm-objdump -d --no-show-raw-insn %t/gd.64.so | FileCheck --check-prefix=GD64 %s
+
+## LA64 GD -> LE
+# RUN: ld.lld %t/a.64.o %t/bc.64.o %t/tga.64.o -o %t/le.64
+# RUN: llvm-readelf -r %t/le.64 | FileCheck --check-prefix=NOREL %s
+# RUN: llvm-readelf -x .got %t/le.64 | FileCheck --check-prefix=LE64-GOT %s
+# RUN: ld.lld -pie %t/a.64.o %t/bc.64.o %t/tga.64.o -o %t/le-pie.64
+# RUN: llvm-readelf -r %t/le-pie.64 | FileCheck --check-prefix=NOREL %s
+# RUN: llvm-readelf -x .got %t/le-pie.64 | FileCheck --check-prefix=LE64-GOT %s
+
+## LA64 GD -> IE
+# RUN: ld.lld %t/a.64.o %t/bc.64.so %t/tga.64.o -o %t/ie.64
+# RUN: llvm-readobj -r %t/ie.64 | FileCheck --check-prefix=IE64-REL %s
+# RUN: llvm-readelf -x .got %t/ie.64 | FileCheck --check-prefix=IE64-GOT %s
+
+# GD32-REL:      .rela.dyn {
+# GD32-REL-NEXT:   0x20300 R_LARCH_TLS_DTPMOD32 a 0x0
+# GD32-REL-NEXT:   0x20304 R_LARCH_TLS_DTPREL32 a 0x0
+# GD32-REL-NEXT:   0x20308 R_LARCH_TLS_DTPMOD32 b 0x0
+# GD32-REL-NEXT:   0x2030C R_LARCH_TLS_DTPREL32 b 0x0
+# GD32-REL-NEXT: }
+
+## &DTPMOD(a) - . = 0x20300 - 0x10250 = 16428<<2
+# GD32:      10250: pcaddi $a0, 16428
+# GD32-NEXT:        bl 44
+
+## &DTPMOD(b) - . = 0x20308 - 0x10258 = 16428<<2
+# GD32:      10258: pcaddi $a0, 16428
+# GD32-NEXT:        bl 36
+
+# GD64-REL:      .rela.dyn {
+# GD64-REL-NEXT:   0x204C0 R_LARCH_TLS_DTPMOD64 a 0x0
+# GD64-REL-NEXT:   0x204C8 R_LARCH_TLS_DTPREL64 a 0x0
+# GD64-REL-NEXT:   0x204D0 R_LARCH_TLS_DTPMOD64 b 0x0
+# GD64-REL-NEXT:   0x204D8 R_LARCH_TLS_DTPREL64 b 0x0
+# GD64-REL-NEXT: }
+
+## &DTPMOD(a) - . = 0x204c0 - 0x10398 = 16458<<2
+# GD64:      10398: pcaddi $a0, 16458
+# GD64-NEXT:        bl 52
+
+## &DTPMOD(b) - . = 0x204d0 - 0x103a0 = 16460<<2
+# GD64:      103a0: pcaddi $a0, 16460
+# GD64-NEXT:        bl 44
+
+# NOREL: no relocations
+
+## .got contains pre-populated values: [a@dtpmod, a@dtprel, b@dtpmod, b@dtprel]
+## a@dtprel = st_value(a) = 0x8
+## b@dtprel = st_value(b) = 0xc
+# LE32-GOT: section '.got':
+# LE32-GOT-NEXT: 0x[[#%x,A:]] 01000000 08000000 01000000 0c000000
+# LE64-GOT: section '.got':
+# LE64-GOT-NEXT: 0x[[#%x,A:]] 01000000 00000000 08000000 00000000
+# LE64-GOT-NEXT: 0x[[#%x,A:]] 01000000 00000000 0c000000 00000000
+
+## a is local - relaxed to LE - its DTPMOD/DTPREL slots are link-time constants.
+## b is external - DTPMOD/DTPREL dynamic relocations are required.
+# IE32-REL:      .rela.dyn {
+# IE32-REL-NEXT:   0x30220 R_LARCH_TLS_DTPMOD32 b 0x0
+# IE32-REL-NEXT:   0x30224 R_LARCH_TLS_DTPREL32 b 0x0
+# IE32-REL-NEXT: }
+# IE32-GOT:      section '.got':
+# IE32-GOT-NEXT: 0x00030218 01000000 08000000 00000000 00000000
+
+# IE64-REL:      .rela.dyn {
+# IE64-REL-NEXT:   0x30380 R_LARCH_TLS_DTPMOD64 b 0x0
+# IE64-REL-NEXT:   0x30388 R_LARCH_TLS_DTPREL64 b 0x0
+# IE64-REL-NEXT: }
+# IE64-GOT:      section '.got':
+# IE64-GOT-NEXT: 0x00030370 01000000 00000000 08000000 00000000
+# IE64-GOT-NEXT: 0x00030380 00000000 00000000 00000000 00000000
+
+#--- a.s
+pcaddi $a0, %gd_pcrel_20(a)
+bl %plt(__tls_get_addr)
+
+pcaddi $a0, %gd_pcrel_20(b)
+bl %plt(__tls_get_addr)
+
+.section .tbss,"awT",@nobits
+.globl a
+.zero 8
+a:
+.zero 4
+
+#--- bc.s
+.section .tbss,"awT",@nobits
+.globl b, c
+b:
+.zero 4
+c:
+
+#--- tga.s
+.globl __tls_get_addr
+__tls_get_addr:
diff --git a/lld/test/ELF/loongarch-tls-ld-pcrel20-s2.s b/lld/test/ELF/loongarch-tls-ld-pcrel20-s2.s
new file mode 100644
index 0000000000000..70186f5538dfc
--- /dev/null
+++ b/lld/test/ELF/loongarch-tls-ld-pcrel20-s2.s
@@ -0,0 +1,82 @@
+# REQUIRES: loongarch
+# RUN: rm -rf %t && split-file %s %t
+
+# RUN: llvm-mc --filetype=obj --triple=loongarch32 --position-independent %t/a.s -o %t/a.32.o
+# RUN: llvm-mc --filetype=obj --triple=loongarch32 %t/tga.s -o %t/tga.32.o
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 --position-independent %t/a.s -o %t/a.64.o
+# RUN: llvm-mc --filetype=obj --triple=loongarch64 %t/tga.s -o %t/tga.64.o
+
+## LA32 LD
+# RUN: ld.lld -shared %t/a.32.o -o %t/ld.32.so
+# RUN: llvm-readobj -r %t/ld.32.so | FileCheck --check-prefix=LD32-REL %s
+# RUN: llvm-readelf -x .got %t/ld.32.so | FileCheck --check-prefix=LD32-GOT %s
+# RUN: llvm-objdump -d --no-show-raw-insn %t/ld.32.so | FileCheck --check-prefixes=LD32 %s
+
+## LA32 LD -> LE
+# RUN: ld.lld %t/a.32.o %t/tga.32.o -o %t/le.32
+# RUN: llvm-readelf -r %t/le.32 | FileCheck --check-prefix=NOREL %s
+# RUN: llvm-readelf -x .got %t/le.32 | FileCheck --check-prefix=LE32-GOT %s
+# RUN: llvm-objdump -d --no-show-raw-insn %t/le.32 | FileCheck --check-prefixes=LE32 %s
+
+## LA64 LD
+# RUN: ld.lld -shared %t/a.64.o -o %t/ld.64.so
+# RUN: llvm-readobj -r %t/ld.64.so | FileCheck --check-prefix=LD64-REL %s
+# RUN: llvm-readelf -x .got %t/ld.64.so | FileCheck --check-prefix=LD64-GOT %s
+# RUN: llvm-objdump -d --no-show-raw-insn %t/ld.64.so | FileCheck --check-prefixes=LD64 %s
+
+## LA64 LD -> LE
+# RUN: ld.lld %t/a.64.o %t/tga.64.o -o %t/le.64
+# RUN: llvm-readelf -r %t/le.64 | FileCheck --check-prefix=NOREL %s
+# RUN: llvm-readelf -x .got %t/le.64 | FileCheck --check-prefix=LE64-GOT %s
+# RUN: llvm-objdump -d --no-show-raw-insn %t/le.64 | FileCheck --check-prefixes=LE64 %s
+
+## a@dtprel = st_value(a) = 0 is a link-time constant.
+# LD32-REL:      .rela.dyn {
+# LD32-REL-NEXT:   0x20280 R_LARCH_TLS_DTPMOD32 - 0x0
+# LD32-REL-NEXT: }
+# LD32-GOT:      section '.got':
+# LD32-GOT-NEXT: 0x00020280 00000000 00000000
+
+# LD64-REL:      .rela.dyn {
+# LD64-REL-NEXT:   0x20400 R_LARCH_TLS_DTPMOD64 - 0x0
+# LD64-REL-NEXT: }
+# LD64-GOT:      section '.got':
+# LD64-GOT-NEXT: 0x00020400 00000000 00000000 00000000 00000000
+
+## LA32: &DTPMOD(a) - . = 0x20280 - 0x101cc = 16429<<2
+# LD32:      101cc: pcaddi $a0, 16429
+# LD32-NEXT:        bl 48
+
+## LA64: &DTPMOD(a) - . = 0x20400 - 0x102e0 = 16456<<2
+# LD64:      102e0: pcaddi $a0, 16456
+# LD64-NEXT:        bl 44
+
+# NOREL: no relocations
+
+## a is local - its DTPMOD/DTPREL slots are link-time constants.
+## a@dtpmod = 1 (main module)
+# LE32-GOT: section '.got':
+# LE32-GOT-NEXT: 0x0003011c 01000000 00000000
+
+# LE64-GOT: section '.got':
+# LE64-GOT-NEXT: 0x000301d0 01000000 00000000 00000000 00000000
+
+## LA32: DTPMOD(.LANCHOR0) - . = 0x3011c - 0x20114 = 16386<<2
+# LE32:      20114: pcaddi $a0, 16386
+# LE32-NEXT:        bl 4
+
+## LA64: DTPMOD(.LANCHOR0) - . = 0x301d0 - 0x201c8 = 16386<<2
+# LE64:      201c8: pcaddi $a0, 16386
+# LE64-NEXT:        bl 4
+
+#--- a.s
+pcaddi $a0, %ld_pcrel_20(.LANCHOR0)
+bl %plt(__tls_get_addr)
+
+.section .tbss,"awT",@nobits
+.set .LANCHOR0, . + 0
+.zero 8
+
+#--- tga.s
+.globl __tls_get_addr
+__tls_get_addr:
diff --git a/lld/test/ELF/loongarch-tlsdesc-pcrel20-s2.s b/lld/test/ELF/loongarch-tlsdesc-pcrel20-s2.s
new file mode 100644
index 0000000000000..99e21d9935197
--- /dev/null
+++ b/lld/test/ELF/loongarch-tlsdesc-pcrel20-s2.s
@@ -0,0 +1,142 @@
+# REQUIRES: loongarch
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: llvm-mc -filetype=obj -triple=loongarch64 a.s -o a.64.o
+# RUN: llvm-mc -filetype=obj -triple=loongarch64 c.s -o c.64.o
+# RUN: ld.lld -shared -soname=c.64.so c.64.o -o c.64.so
+# RUN: llvm-mc -filetype=obj -triple=loongarch32 --defsym ELF32=1 a.s -o a.32.o
+# RUN: llvm-mc -filetype=obj -triple=loongarch32 --defsym ELF32=1 c.s -o c.32.o
+# RUN: ld.lld -shared -soname=c.32.so c.32.o -o c.32.so
+
+# RUN: ld.lld -shared -z now a.64.o c.64.o -o a.64.so
+# RUN: llvm-readobj -r -x .got a.64.so | FileCheck --check-prefix=GD64-RELA %s
+# RUN: llvm-objdump --no-show-raw-insn -h -d a.64.so | FileCheck %s --check-prefix=GD64
+
+# RUN: ld.lld -shared -z now a.64.o c.64.o -o rel.64.so -z rel
+# RUN: llvm-readobj -r -x .got rel.64.so | FileCheck --check-prefix=GD64-REL %s
+
+## FIXME: The transition from TLSDESC to IE/LE has not yet been implemented.
+## Keep the dynamic relocations and hand them over to dynamic linker.
+
+# RUN: ld.lld -e 0 -z now a.64.o c.64.o -o a.64.le
+# RUN: llvm-readobj -r -x .got a.64.le | FileCheck --check-prefix=LE64-RELA %s
+
+# RUN: ld.lld -e 0 -z now a.64.o c.64.so -o a.64.ie
+# RUN: llvm-readobj -r -x .got a.64.ie | FileCheck --check-prefix=IE64-RELA %s
+
+## 32-bit code is mostly the same. We only test a few variants.
+
+# RUN: ld.lld -shared -z now a.32.o c.32.o -o rel.32.so -z rel
+# RUN: llvm-readobj -r -x .got rel.32.so | FileCheck --check-prefix=GD32-REL %s
+
+# GD64-RELA:      .rela.dyn {
+# GD64-RELA-NEXT:   0x203F0 R_LARCH_TLS_DESC64 - 0x7FF
+# GD64-RELA-NEXT:   0x203D0 R_LARCH_TLS_DESC64 a 0x0
+# GD64-RELA-NEXT:   0x203E0 R_LARCH_TLS_DESC64 c 0x0
+# GD64-RELA-NEXT: }
+# GD64-RELA:      Hex dump of section '.got':
+# GD64-RELA-NEXT: 0x000203d0 00000000 00000000 00000000 00000000 .
+# GD64-RELA-NEXT: 0x000203e0 00000000 00000000 00000000 00000000 .
+# GD64-RELA-NEXT: 0x000203f0 00000000 00000000 00000000 00000000 .
+
+# GD64-REL:      .rel.dyn {
+# GD64-REL-NEXT:   0x203D8 R_LARCH_TLS_DESC64 -
+# GD64-REL-NEXT:   0x203B8 R_LARCH_TLS_DESC64 a
+# GD64-REL-NEXT:   0x203C8 R_LARCH_TLS_DESC64 c
+# GD64-REL-NEXT: }
+# GD64-REL:      Hex dump of section '.got':
+# GD64-REL-NEXT: 0x000203b8 00000000 00000000 00000000 00000000 .
+# GD64-REL-NEXT: 0x000203c8 00000000 00000000 00000000 00000000 .
+# GD64-REL-NEXT: 0x000203d8 00000000 00000000 ff070000 00000000 .
+
+# GD64:      .got     00000030 00000000000203d0
+
+## &.got[a]-. = 0x203d0 - 0x102e0 = 16444<<2
+# GD64:        102e0: pcaddi $a0, 16444
+# GD64-NEXT:          ld.d $ra, $a0, 0
+# GD64-NEXT:          jirl $ra, $ra, 0
+# GD64-NEXT:          add.d $a1, $a0, $tp
+
+## &.got[b]-. = 0x203d0+32 - 0x102f0 = 16448<<2
+# GD64:        102f0: pcaddi $a0, 16448
+# GD64-NEXT:          ld.d $ra, $a0, 0
+# GD64-NEXT:          jirl $ra, $ra, 0
+# GD64-NEXT:          add.d $a2, $a0, $tp
+
+## &.got[c]-. = 0x203d0+16 - 0x10300 = 16440<<2
+# GD64:        10300: pcaddi $a0, 16440
+# GD64-NEXT:          ld.d $ra, $a0, 0
+# GD64-NEXT:          jirl $ra, $ra, 0
+# GD64-NEXT:          add.d $a3, $a0, $tp
+
+# LE64-RELA:      .rela.dyn {
+# LE64-RELA-NEXT:   0x30240 R_LARCH_TLS_DESC64 - 0x8
+# LE64-RELA-NEXT:   0x30250 R_LARCH_TLS_DESC64 - 0x800
+# LE64-RELA-NEXT:   0x30260 R_LARCH_TLS_DESC64 - 0x7FF
+# LE64-RELA-NEXT: }
+# LE64-RELA:      Hex dump of section '.got':
+# LE64-RELA-NEXT: 0x00030240 00000000 00000000 00000000 00000000 .
+# LE64-RELA-NEXT: 0x00030250 00000000 00000000 00000000 00000000 .
+# LE64-RELA-NEXT: 0x00030260 00000000 00000000 00000000 00000000 .
+
+# IE64-RELA:      .rela.dyn {
+# IE64-RELA-NEXT:   0x303C8 R_LARCH_TLS_DESC64 - 0x8
+# IE64-RELA-NEXT:   0x303E8 R_LARCH_TLS_DESC64 - 0x7FF
+# IE64-RELA-NEXT:   0x303D8 R_LARCH_TLS_DESC64 c 0x0
+# IE64-RELA-NEXT: }
+# IE64-RELA:      Hex dump of section '.got':
+# IE64-RELA-NEXT: 0x000303c8 00000000 00000000 00000000 00000000 .
+# IE64-RELA-NEXT: 0x000303d8 00000000 00000000 00000000 00000000 .
+# IE64-RELA-NEXT: 0x000303e8 00000000 00000000 00000000 00000000 .
+
+# GD32-REL:      .rel.dyn {
+# GD32-REL-NEXT:    0x20264 R_LARCH_TLS_DESC32 -
+# GD32-REL-NEXT:    0x20254 R_LARCH_TLS_DESC32 a
+# GD32-REL-NEXT:    0x2025C R_LARCH_TLS_DESC32 c
+# GD32-REL-NEXT: }
+# GD32-REL:      Hex dump of section '.got':
+# GD32-REL-NEXT: 0x00020254 00000000 00000000 00000000 00000000 .
+# GD32-REL-NEXT: 0x00020264 00000000 ff070000                   .
+
+#--- a.s
+.macro add dst, src1, src2
+.ifdef ELF32
+add.w \dst, \src1, \src2
+.else
+add.d \dst, \src1, \src2
+.endif
+.endm
+.macro load dst, src1, src2
+.ifdef ELF32
+ld.w \dst, \src1, \src2
+.else
+ld.d \dst, \src1, \src2
+.endif
+.endm
+
+pcaddi $a0, %desc_pcrel_20(a)
+load $ra, $a0, %desc_ld(a)
+jirl $ra, $ra, %desc_call(a)
+add $a1, $a0, $tp
+
+pcaddi $a0, %desc_pcrel_20(b)
+load $ra, $a0, %desc_ld(b)
+jirl $ra, $ra, %desc_call(b)
+add $a2, $a0, $tp
+
+pcaddi $a0, %desc_pcrel_20(c)
+load $ra, $a0, %desc_ld(c)
+jirl $ra, $ra, %desc_call(c)
+add $a3, $a0, $tp
+
+.section .tbss,"awT",@nobits
+.globl a
+.zero 8
+a:
+.zero 2039  ## Place b at 0x7ff
+b:
+.zero 1
+
+#--- c.s
+.section .tbss,"awT",@nobits
+.globl c
+c: .zero 4

>From bf173ba0ea34a59d3ce76ce7535c8ca186bdf681 Mon Sep 17 00:00:00 2001
From: Rainer Orth <ro at gcc.gnu.org>
Date: Mon, 29 Jul 2024 09:12:15 +0200
Subject: [PATCH 60/91] [compiler-rt][test] Disable lld tests on SPARC
 (#100533)

As detailed in Issue #100320, a considerable number of tests that
explicitly use `-fuse-ld=lld` `FAIL` on Linux/sparc64 due to several
`lld` limitations (no 32-bit SPARC support, lack of support for various
relocations, ...).

To reduce the noise, this patch disables `COMPILER_RT_HAS_LLD` on SPARC
wholesale.

Tested on `sparc64-unknown-linux-gnu`.

(cherry picked from commit 33a50e0eaa80cf3db1b944762db9a37a06f3ac32)
---
 compiler-rt/CMakeLists.txt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/compiler-rt/CMakeLists.txt b/compiler-rt/CMakeLists.txt
index 65063e0057bbc..6b8d9a5ea46d6 100644
--- a/compiler-rt/CMakeLists.txt
+++ b/compiler-rt/CMakeLists.txt
@@ -801,6 +801,10 @@ if(ANDROID)
   append_list_if(COMPILER_RT_HAS_FUSE_LD_LLD_FLAG -fuse-ld=lld SANITIZER_COMMON_LINK_FLAGS)
   append_list_if(COMPILER_RT_HAS_LLD -fuse-ld=lld COMPILER_RT_UNITTEST_LINK_FLAGS)
 endif()
+if(${COMPILER_RT_DEFAULT_TARGET_ARCH} MATCHES sparc)
+  # lld has several bugs/limitations on SPARC, so disable (Issue #100320).
+  set(COMPILER_RT_HAS_LLD FALSE)
+endif()
 pythonize_bool(COMPILER_RT_HAS_LLD)
 pythonize_bool(COMPILER_RT_TEST_USE_LLD)
 

>From 002fcbd82c00d5c402bf2dabea203b294ef3e00b Mon Sep 17 00:00:00 2001
From: Rainer Orth <ro at gcc.gnu.org>
Date: Wed, 24 Jul 2024 10:03:47 +0200
Subject: [PATCH 61/91] [asan][cmake][test] Fix finding dynamic asan runtime
 lib (#100083)

In a `runtimes` build on Solaris/amd64, there are two failures:
```
  AddressSanitizer-Unit :: ./Asan-i386-calls-Dynamic-Test/failed_to_discover_tests_from_gtest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Dynamic-Test/failed_to_discover_tests_from_gtest
```
This happens when `lit` enumerates the tests with `--gtest_list_tests
--gtest_filter=-*DISABLED_*`. The error is twofold:

- The `LD_LIBRARY_PATH*` variables point at the 64-bit directory
(`lib/clang/19/lib/x86_64-pc-solaris2.11`) for a 32-bit test:
  ```
ld.so.1: Asan-i386-calls-Dynamic-Test: fatal:
/var/llvm/local-amd64-release-stage2-A-flang-clang18-runtimes/tools/clang/stage2-bins/./lib/../lib/clang/19/lib/x86_64-pc-solaris2.11/libclang_rt.asan.so:
wrong ELF class: ELFCLASS64
  ```
- While the tests are linked with `-Wl,-rpath`, that path is always the
64-bit directory as well.

Accordingly, the fix consists of two parts:
- The code in `compiler-rt/test/asan/Unit/lit.site.cfg.py.in` to adjust
the `LD_LIBRARY_PATH*` variables is guarded by a `config.target_arch !=
config.host_arch` condition. This is wrong in two ways:
  - The adjustment is always needed independent of the host arch. This is
    what `compiler-rt/test/lit.common.cfg.py` already does.
  - Besides, `config.host_arch` is ultimately set from
    `CMAKE_HOST_SYSTEM_PROCESSOR`. On Linux/x86_64, this is `x86_64` (`uname
    -m`) while on Solaris/amd64 it's `i386` (`uname -p`), explaining why the
    transformation is skipped on Solaris, but not on Linux.
- Besides, `RPATH` needs to be set to the correct subdirectory, so
instead of using the default arch in `compiler-rt/CMakeLists.txt`, this
patch moves the code to a function which takes the test's arch into
account.
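
As a rough illustration of the first part, the library-directory rewrite
can be sketched in isolation. The helper name `adjust_libdir` is
hypothetical and only the `i386` case is covered; the regular expression
is the one used in `lit.site.cfg.py.in`:

```python
import re


def adjust_libdir(libdir: str, target_arch: str) -> str:
    """Hypothetical helper: point an i386 test at the i386 runtime dir."""
    if target_arch == "i386":
        # The lookahead only accepts "/x86_64" when it starts the final
        # path component, so earlier directories are left untouched.
        return re.sub(r"/x86_64(?=-[^/]+$)", "/i386", libdir)
    return libdir


print(adjust_libdir("lib/clang/19/lib/x86_64-pc-solaris2.11", "i386"))
```

With the `config.host_arch` comparison removed, this rewrite applies
whenever per-target runtime directories are enabled, regardless of what
`uname` reports on the host.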

Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`.

(cherry picked from commit c34d673b02ead039acd107f096c1f32c16b61e07)
---
 compiler-rt/CMakeLists.txt                    |  4 ----
 compiler-rt/cmake/config-ix.cmake             | 13 +++++++++++++
 compiler-rt/lib/asan/tests/CMakeLists.txt     |  4 +++-
 compiler-rt/test/asan/Unit/lit.site.cfg.py.in |  2 +-
 4 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/compiler-rt/CMakeLists.txt b/compiler-rt/CMakeLists.txt
index 6b8d9a5ea46d6..2207555b03a03 100644
--- a/compiler-rt/CMakeLists.txt
+++ b/compiler-rt/CMakeLists.txt
@@ -604,10 +604,6 @@ if (COMPILER_RT_TEST_STANDALONE_BUILD_LIBS)
   if ("${COMPILER_RT_TEST_COMPILER_ID}" MATCHES "Clang")
     list(APPEND COMPILER_RT_UNITTEST_LINK_FLAGS "-resource-dir=${COMPILER_RT_OUTPUT_DIR}")
   endif()
-  get_compiler_rt_output_dir(${COMPILER_RT_DEFAULT_TARGET_ARCH} rtlib_dir)
-  if (NOT WIN32)
-    list(APPEND COMPILER_RT_UNITTEST_LINK_FLAGS "-Wl,-rpath,${rtlib_dir}")
-  endif()
 endif()
 
 if(COMPILER_RT_USE_LLVM_UNWINDER)
diff --git a/compiler-rt/cmake/config-ix.cmake b/compiler-rt/cmake/config-ix.cmake
index 3a151772e268a..dad557af2ae8c 100644
--- a/compiler-rt/cmake/config-ix.cmake
+++ b/compiler-rt/cmake/config-ix.cmake
@@ -267,6 +267,19 @@ function(get_target_link_flags_for_arch arch out_var)
   endif()
 endfunction()
 
+# Returns a list of architecture specific dynamic ldflags in @out_var list.
+function(get_dynamic_link_flags_for_arch arch out_var)
+  list(FIND COMPILER_RT_SUPPORTED_ARCH ${arch} ARCH_INDEX)
+  if(ARCH_INDEX EQUAL -1)
+    message(FATAL_ERROR "Unsupported architecture: ${arch}")
+  else()
+    get_compiler_rt_output_dir(${arch} rtlib_dir)
+    if (NOT WIN32)
+      set(${out_var} "-Wl,-rpath,${rtlib_dir}" PARENT_SCOPE)
+    endif()
+  endif()
+endfunction()
+
 # Returns a compiler and CFLAGS that should be used to run tests for the
 # specific architecture.  When cross-compiling, this is controled via
 # COMPILER_RT_TEST_COMPILER and COMPILER_RT_TEST_COMPILER_CFLAGS.
diff --git a/compiler-rt/lib/asan/tests/CMakeLists.txt b/compiler-rt/lib/asan/tests/CMakeLists.txt
index 7abd4c89ac6bc..b489bb99aeff3 100644
--- a/compiler-rt/lib/asan/tests/CMakeLists.txt
+++ b/compiler-rt/lib/asan/tests/CMakeLists.txt
@@ -206,13 +206,15 @@ function(add_asan_tests arch test_runtime)
           -Wl,-nodefaultlib:libcmt,-defaultlib:msvcrt,-defaultlib:oldnames
         )
     else()
+      set(DYNAMIC_LINK_FLAGS)
+      get_dynamic_link_flags_for_arch(${arch} DYNAMIC_LINK_FLAGS)
 
       # Otherwise, reuse ASAN_INST_TEST_OBJECTS.
       add_compiler_rt_test(AsanDynamicUnitTests "${dynamic_test_name}" "${arch}"
         SUBDIR "${CONFIG_NAME_DYNAMIC}"
         OBJECTS ${ASAN_INST_TEST_OBJECTS}
         DEPS asan ${ASAN_INST_TEST_OBJECTS}
-        LINK_FLAGS ${ASAN_DYNAMIC_UNITTEST_INSTRUMENTED_LINK_FLAGS} ${TARGET_LINK_FLAGS}
+        LINK_FLAGS ${ASAN_DYNAMIC_UNITTEST_INSTRUMENTED_LINK_FLAGS} ${TARGET_LINK_FLAGS} ${DYNAMIC_LINK_FLAGS}
         )
     endif()
   endif()
diff --git a/compiler-rt/test/asan/Unit/lit.site.cfg.py.in b/compiler-rt/test/asan/Unit/lit.site.cfg.py.in
index 638e1dedfc1d2..ac652b53dcb9d 100644
--- a/compiler-rt/test/asan/Unit/lit.site.cfg.py.in
+++ b/compiler-rt/test/asan/Unit/lit.site.cfg.py.in
@@ -53,7 +53,7 @@ config.test_source_root = config.test_exec_root
 # host triple as the trailing path component. The value is incorrect for i386
 # tests on x86_64 hosts and vice versa. Adjust config.compiler_rt_libdir
 # accordingly.
-if config.enable_per_target_runtime_dir and config.target_arch != config.host_arch:
+if config.enable_per_target_runtime_dir:
     if config.target_arch == 'i386':
         config.compiler_rt_libdir = re.sub(r'/x86_64(?=-[^/]+$)', '/i386', config.compiler_rt_libdir)
     elif config.target_arch == 'x86_64':

>From 55b063f3f5b00ecdcbbeec85b13fc63b8e4386df Mon Sep 17 00:00:00 2001
From: Joseph Huber <huberjn at outlook.com>
Date: Tue, 23 Jul 2024 21:35:09 -0500
Subject: [PATCH 62/91] [libc] Fix leftover debug commandline argument

Summary:
Fixes https://github.com/llvm/llvm-project/issues/100289

(cherry picked from commit 0420d2f97eac49af5e816b0e3f2a9135d1673168)
---
 libc/cmake/modules/LLVMLibCTestRules.cmake | 1 -
 libc/docs/configure.rst                    | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/libc/cmake/modules/LLVMLibCTestRules.cmake b/libc/cmake/modules/LLVMLibCTestRules.cmake
index 18adbee0bc7ad..96eb065c4a672 100644
--- a/libc/cmake/modules/LLVMLibCTestRules.cmake
+++ b/libc/cmake/modules/LLVMLibCTestRules.cmake
@@ -651,7 +651,6 @@ function(add_libc_hermetic test_name)
     target_link_options(${fq_build_target_name} PRIVATE
       ${LIBC_COMPILE_OPTIONS_DEFAULT} -Wno-multi-gpu
       -mcpu=${LIBC_GPU_TARGET_ARCHITECTURE} -flto
-      "-Wl,-asdfasdfasdf"
       "-Wl,-mllvm,-amdgpu-lower-global-ctor-dtor=0" -nostdlib -static
       "-Wl,-mllvm,-amdhsa-code-object-version=${LIBC_GPU_CODE_OBJECT_VERSION}")
   elseif(LIBC_TARGET_ARCHITECTURE_IS_NVPTX)
diff --git a/libc/docs/configure.rst b/libc/docs/configure.rst
index 5c55e4ab0f181..b81922367d8b7 100644
--- a/libc/docs/configure.rst
+++ b/libc/docs/configure.rst
@@ -29,7 +29,7 @@ to learn about the defaults for your platform and target.
     - ``LIBC_CONF_ENABLE_STRONG_STACK_PROTECTOR``: Enable -fstack-protector-strong to defend against stack smashing attack.
     - ``LIBC_CONF_KEEP_FRAME_POINTER``: Keep frame pointer in functions for better debugging experience.
 * **"errno" options**
-    - ``LIBC_CONF_ERRNO_MODE``: The implementation used for errno, acceptable values are LIBC_ERRNO_MODE_UNDEFINED, LIBC_ERRNO_MODE_THREAD_LOCAL, LIBC_ERRNO_MODE_SHARED, LIBC_ERRNO_MODE_EXTERNAL, and LIBC_ERRNO_MODE_SYSTEM.
+    - ``LIBC_CONF_ERRNO_MODE``: The implementation used for errno, acceptable values are LIBC_ERRNO_MODE_DEFAULT, LIBC_ERRNO_MODE_UNDEFINED, LIBC_ERRNO_MODE_THREAD_LOCAL, LIBC_ERRNO_MODE_SHARED, LIBC_ERRNO_MODE_EXTERNAL, and LIBC_ERRNO_MODE_SYSTEM.
 * **"malloc" options**
     - ``LIBC_CONF_FREELIST_MALLOC_BUFFER_SIZE``: Default size for the constinit freelist buffer used for the freelist malloc implementation (default 1o 1GB).
 * **"math" options**

>From a8b7c809ee20bc6e84f9b6280f30d7d2bcfd0a7c Mon Sep 17 00:00:00 2001
From: Joseph Huber <huberjn at outlook.com>
Date: Tue, 23 Jul 2024 22:04:47 -0500
Subject: [PATCH 63/91] Update libc/docs/configure.rst

---
 libc/docs/configure.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libc/docs/configure.rst b/libc/docs/configure.rst
index b81922367d8b7..5c55e4ab0f181 100644
--- a/libc/docs/configure.rst
+++ b/libc/docs/configure.rst
@@ -29,7 +29,7 @@ to learn about the defaults for your platform and target.
     - ``LIBC_CONF_ENABLE_STRONG_STACK_PROTECTOR``: Enable -fstack-protector-strong to defend against stack smashing attack.
     - ``LIBC_CONF_KEEP_FRAME_POINTER``: Keep frame pointer in functions for better debugging experience.
 * **"errno" options**
-    - ``LIBC_CONF_ERRNO_MODE``: The implementation used for errno, acceptable values are LIBC_ERRNO_MODE_DEFAULT, LIBC_ERRNO_MODE_UNDEFINED, LIBC_ERRNO_MODE_THREAD_LOCAL, LIBC_ERRNO_MODE_SHARED, LIBC_ERRNO_MODE_EXTERNAL, and LIBC_ERRNO_MODE_SYSTEM.
+    - ``LIBC_CONF_ERRNO_MODE``: The implementation used for errno, acceptable values are LIBC_ERRNO_MODE_UNDEFINED, LIBC_ERRNO_MODE_THREAD_LOCAL, LIBC_ERRNO_MODE_SHARED, LIBC_ERRNO_MODE_EXTERNAL, and LIBC_ERRNO_MODE_SYSTEM.
 * **"malloc" options**
     - ``LIBC_CONF_FREELIST_MALLOC_BUFFER_SIZE``: Default size for the constinit freelist buffer used for the freelist malloc implementation (default 1o 1GB).
 * **"math" options**

>From 360df814b029fc6647672bd3a38ab7a888d073eb Mon Sep 17 00:00:00 2001
From: Hari Limaye <hari.limaye at arm.com>
Date: Thu, 25 Jul 2024 09:03:48 +0100
Subject: [PATCH 64/91] [StackFrameLayoutAnalysis] Use target-specific hook for
 SP offsets (#100386)

StackFrameLayoutAnalysis currently calculates SP-relative offsets in a
target-independent way via MachineFrameInfo offsets. This is incorrect
for some Targets, e.g. AArch64, when there are scalable vector stack
slots.

This patch adds a virtual function to TargetFrameLowering to provide
offsets from SP, with a default implementation matching what is
currently used in StackFrameLayoutAnalysis, and refactors
StackFrameLayoutAnalysis to use this function. Only non-zero scalable
offsets are output by the analysis pass.

An implementation of this function is added for AArch64 targets, which
aims to provide correct SP offsets in most cases.
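
The resulting remark format and ordering can be modelled outside LLVM
with a small Python sketch. The names `SlotOffset`, `format_offset` and
`layout_order` are hypothetical and not part of the patch; the sketch
only mirrors how the pass prints a fixed offset plus an optional
scalable part and sorts slots by the sum of both components:

```python
from typing import List, NamedTuple


class SlotOffset(NamedTuple):
    """Hypothetical stand-in for a (fixed, scalable) stack offset pair."""
    fixed: int
    scalable: int = 0


def format_offset(off: SlotOffset) -> str:
    # Negative numbers already carry a leading '-', so only add '+'.
    text = "[SP" + ("" if off.fixed < 0 else "+") + str(off.fixed)
    if off.scalable:
        # The scalable part is only printed when it is non-zero.
        text += ("" if off.scalable < 0 else "+") + str(off.scalable)
        text += " x vscale"
    return text + "]"


def layout_order(offsets: List[SlotOffset]) -> List[SlotOffset]:
    # Highest combined offset first, matching the reversed comparison.
    return sorted(offsets, key=lambda o: o.fixed + o.scalable, reverse=True)


slots = [SlotOffset(-20, -16), SlotOffset(-8), SlotOffset(-16, -16),
         SlotOffset(-32, -16), SlotOffset(-16)]
for slot in layout_order(slots):
    print("Offset:", format_offset(slot))
```

Summing the fixed and scalable components is only a heuristic for
ordering, since the real distance depends on the runtime value of
`vscale`.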

(cherry picked from commit dc1c00f6b13f724154f9883990f8b21fb8dcccef)
---
 .../llvm/CodeGen/TargetFrameLowering.h        |   7 +
 .../CodeGen/StackFrameLayoutAnalysisPass.cpp  |  51 +-
 llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp  |  14 +
 .../Target/AArch64/AArch64FrameLowering.cpp   |  35 ++
 .../lib/Target/AArch64/AArch64FrameLowering.h |   2 +
 .../CodeGen/AArch64/sve-stack-frame-layout.ll | 480 +++++++++++++++++-
 6 files changed, 563 insertions(+), 26 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/TargetFrameLowering.h b/llvm/include/llvm/CodeGen/TargetFrameLowering.h
index 72978b2f746d7..0656c0d739fdf 100644
--- a/llvm/include/llvm/CodeGen/TargetFrameLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetFrameLowering.h
@@ -343,6 +343,13 @@ class TargetFrameLowering {
     return getFrameIndexReference(MF, FI, FrameReg);
   }
 
+  /// getFrameIndexReferenceFromSP - This method returns the offset from the
+  /// stack pointer to the slot of the specified index. This function serves to
+  /// provide a comparable offset from a single reference point (the value of
+  /// the stack-pointer at function entry) that can be used for analysis.
+  virtual StackOffset getFrameIndexReferenceFromSP(const MachineFunction &MF,
+                                                   int FI) const;
+
   /// Returns the callee-saved registers as computed by determineCalleeSaves
   /// in the BitVector \p SavedRegs.
   virtual void getCalleeSaves(const MachineFunction &MF,
diff --git a/llvm/lib/CodeGen/StackFrameLayoutAnalysisPass.cpp b/llvm/lib/CodeGen/StackFrameLayoutAnalysisPass.cpp
index 940aecd1cb363..ff77685f8f354 100644
--- a/llvm/lib/CodeGen/StackFrameLayoutAnalysisPass.cpp
+++ b/llvm/lib/CodeGen/StackFrameLayoutAnalysisPass.cpp
@@ -60,15 +60,15 @@ struct StackFrameLayoutAnalysisPass : public MachineFunctionPass {
     int Slot;
     int Size;
     int Align;
-    int Offset;
+    StackOffset Offset;
     SlotType SlotTy;
     bool Scalable;
 
-    SlotData(const MachineFrameInfo &MFI, const int ValOffset, const int Idx)
+    SlotData(const MachineFrameInfo &MFI, const StackOffset Offset,
+             const int Idx)
         : Slot(Idx), Size(MFI.getObjectSize(Idx)),
-          Align(MFI.getObjectAlign(Idx).value()),
-          Offset(MFI.getObjectOffset(Idx) - ValOffset), SlotTy(Invalid),
-          Scalable(false) {
+          Align(MFI.getObjectAlign(Idx).value()), Offset(Offset),
+          SlotTy(Invalid), Scalable(false) {
       Scalable = MFI.getStackID(Idx) == TargetStackID::ScalableVector;
       if (MFI.isSpillSlotObjectIndex(Idx))
         SlotTy = SlotType::Spill;
@@ -79,10 +79,10 @@ struct StackFrameLayoutAnalysisPass : public MachineFunctionPass {
     }
 
     // We use this to sort in reverse order, so that the layout is displayed
-    // correctly. Scalable slots are sorted to the end of the list.
+    // correctly.
     bool operator<(const SlotData &Rhs) const {
-      return std::make_tuple(!Scalable, Offset) >
-             std::make_tuple(!Rhs.Scalable, Rhs.Offset);
+      return (Offset.getFixed() + Offset.getScalable()) >
+             (Rhs.Offset.getFixed() + Rhs.Offset.getScalable());
     }
   };
 
@@ -149,15 +149,27 @@ struct StackFrameLayoutAnalysisPass : public MachineFunctionPass {
     // For example we store the Offset in YAML as:
     //    ...
     //    - Offset: -8
+    //    - ScalableOffset: -16
+    // Note: the ScalableOffset entries are added only for slots with non-zero
+    // scalable offsets.
     //
-    // But we print it to the CLI as
+    // But we print it to the CLI as:
     //   Offset: [SP-8]
+    //
+    // Or with non-zero scalable offset:
+    //   Offset: [SP-8-16 x vscale]
 
     // Negative offsets will print a leading `-`, so only add `+`
     std::string Prefix =
-        formatv("\nOffset: [SP{0}", (D.Offset < 0) ? "" : "+").str();
-    Rem << Prefix << ore::NV("Offset", D.Offset)
-        << "], Type: " << ore::NV("Type", getTypeString(D.SlotTy))
+        formatv("\nOffset: [SP{0}", (D.Offset.getFixed() < 0) ? "" : "+").str();
+    Rem << Prefix << ore::NV("Offset", D.Offset.getFixed());
+
+    if (D.Offset.getScalable()) {
+      Rem << ((D.Offset.getScalable() < 0) ? "" : "+")
+          << ore::NV("ScalableOffset", D.Offset.getScalable()) << " x vscale";
+    }
+
+    Rem << "], Type: " << ore::NV("Type", getTypeString(D.SlotTy))
         << ", Align: " << ore::NV("Align", D.Align)
         << ", Size: " << ore::NV("Size", ElementCount::get(D.Size, D.Scalable));
   }
@@ -170,17 +182,22 @@ struct StackFrameLayoutAnalysisPass : public MachineFunctionPass {
     Rem << "\n    " << ore::NV("DataLoc", Loc);
   }
 
+  StackOffset getStackOffset(const MachineFunction &MF,
+                             const MachineFrameInfo &MFI,
+                             const TargetFrameLowering *FI, int FrameIdx) {
+    if (!FI)
+      return StackOffset::getFixed(MFI.getObjectOffset(FrameIdx));
+
+    return FI->getFrameIndexReferenceFromSP(MF, FrameIdx);
+  }
+
   void emitStackFrameLayoutRemarks(MachineFunction &MF,
                                    MachineOptimizationRemarkAnalysis &Rem) {
     const MachineFrameInfo &MFI = MF.getFrameInfo();
     if (!MFI.hasStackObjects())
       return;
 
-    // ValOffset is the offset to the local area from the SP at function entry.
-    // To display the true offset from SP, we need to subtract ValOffset from
-    // MFI's ObjectOffset.
     const TargetFrameLowering *FI = MF.getSubtarget().getFrameLowering();
-    const int ValOffset = (FI ? FI->getOffsetOfLocalArea() : 0);
 
     LLVM_DEBUG(dbgs() << "getStackProtectorIndex =="
                       << MFI.getStackProtectorIndex() << "\n");
@@ -194,7 +211,7 @@ struct StackFrameLayoutAnalysisPass : public MachineFunctionPass {
          Idx != EndIdx; ++Idx) {
       if (MFI.isDeadObjectIndex(Idx))
         continue;
-      SlotInfo.emplace_back(MFI, ValOffset, Idx);
+      SlotInfo.emplace_back(MFI, getStackOffset(MF, MFI, FI, Idx), Idx);
     }
 
     // sort the ordering, to match the actual layout in memory
diff --git a/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp b/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp
index 48a2094f5d451..7d054cb7c7c71 100644
--- a/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp
+++ b/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp
@@ -61,6 +61,20 @@ TargetFrameLowering::getFrameIndexReference(const MachineFunction &MF, int FI,
                                MFI.getOffsetAdjustment());
 }
 
+/// Returns the offset from the stack pointer to the slot of the specified
+/// index. This function serves to provide a comparable offset from a single
+/// reference point (the value of the stack-pointer at function entry) that can
+/// be used for analysis. This is the default implementation using
+/// MachineFrameInfo offsets.
+StackOffset
+TargetFrameLowering::getFrameIndexReferenceFromSP(const MachineFunction &MF,
+                                                  int FI) const {
+  // To display the true offset from SP, we need to subtract the offset to the
+  // local area from MFI's ObjectOffset.
+  return StackOffset::getFixed(MF.getFrameInfo().getObjectOffset(FI) -
+                               getOffsetOfLocalArea());
+}
+
 bool TargetFrameLowering::needsFrameIndexResolution(
     const MachineFunction &MF) const {
   return MF.getFrameInfo().hasStackObjects();
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
index b1b83e27c5592..bd530903bb664 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
@@ -2603,6 +2603,41 @@ AArch64FrameLowering::getFrameIndexReference(const MachineFunction &MF, int FI,
       /*ForSimm=*/false);
 }
 
+StackOffset
+AArch64FrameLowering::getFrameIndexReferenceFromSP(const MachineFunction &MF,
+                                                   int FI) const {
+  // This function serves to provide a comparable offset from a single reference
+  // point (the value of SP at function entry) that can be used for analysis,
+  // e.g. the stack-frame-layout analysis pass. It is not guaranteed to be
+  // correct for all objects in the presence of VLA-area objects or dynamic
+  // stack re-alignment.
+
+  const auto &MFI = MF.getFrameInfo();
+
+  int64_t ObjectOffset = MFI.getObjectOffset(FI);
+
+  // This is correct in the absence of any SVE stack objects.
+  StackOffset SVEStackSize = getSVEStackSize(MF);
+  if (!SVEStackSize)
+    return StackOffset::getFixed(ObjectOffset - getOffsetOfLocalArea());
+
+  const auto *AFI = MF.getInfo<AArch64FunctionInfo>();
+  if (MFI.getStackID(FI) == TargetStackID::ScalableVector) {
+    return StackOffset::get(-((int64_t)AFI->getCalleeSavedStackSize()),
+                            ObjectOffset);
+  }
+
+  bool IsFixed = MFI.isFixedObjectIndex(FI);
+  bool IsCSR =
+      !IsFixed && ObjectOffset >= -((int)AFI->getCalleeSavedStackSize(MFI));
+
+  StackOffset ScalableOffset = {};
+  if (!IsFixed && !IsCSR)
+    ScalableOffset = -SVEStackSize;
+
+  return StackOffset::getFixed(ObjectOffset) + ScalableOffset;
+}
+
 StackOffset
 AArch64FrameLowering::getNonLocalFrameIndexReference(const MachineFunction &MF,
                                                      int FI) const {
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.h b/llvm/lib/Target/AArch64/AArch64FrameLowering.h
index da315850d6362..0ebab1700e9ce 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.h
@@ -41,6 +41,8 @@ class AArch64FrameLowering : public TargetFrameLowering {
 
   StackOffset getFrameIndexReference(const MachineFunction &MF, int FI,
                                      Register &FrameReg) const override;
+  StackOffset getFrameIndexReferenceFromSP(const MachineFunction &MF,
+                                           int FI) const override;
   StackOffset resolveFrameIndexReference(const MachineFunction &MF, int FI,
                                          Register &FrameReg, bool PreferFP,
                                          bool ForSimm) const;
diff --git a/llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll b/llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll
index 34d85d1f76086..36bca2ebd4ada 100644
--- a/llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll
+++ b/llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll
@@ -5,9 +5,9 @@
 ; CHECK-FRAMELAYOUT-LABEL: Function: csr_d8_allocnxv4i32i32f64
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-8], Type: Spill, Align: 8, Size: 8
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-20], Type: Variable, Align: 4, Size: 4
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Variable, Align: 8, Size: 8
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Variable, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16-16 x vscale], Type: Variable, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-20-16 x vscale], Type: Variable, Align: 4, Size: 4
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32-16 x vscale], Type: Variable, Align: 8, Size: 8
 
 define i32 @csr_d8_allocnxv4i32i32f64(double %d) "aarch64_pstate_sm_compatible" {
 ; CHECK-LABEL: csr_d8_allocnxv4i32i32f64:
@@ -49,8 +49,8 @@ entry:
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-20], Type: Variable, Align: 4, Size: 4
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Spill, Align: 16, Size: 8
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-40], Type: Variable, Align: 8, Size: 8
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Variable, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32-16 x vscale], Type: Variable, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-40-16 x vscale], Type: Variable, Align: 8, Size: 8
 
 define i32 @csr_d8_allocnxv4i32i32f64_fp(double %d) "aarch64_pstate_sm_compatible" "frame-pointer"="all" {
 ; CHECK-LABEL: csr_d8_allocnxv4i32i32f64_fp:
@@ -90,13 +90,167 @@ entry:
   ret i32 0
 }
 
+; In the presence of dynamic stack-realignment we emit correct offsets for
+; objects which are not realigned. For realigned objects, e.g. the i32 alloca
+; in this test, we emit the correct offset ignoring the re-alignment (i.e. the
+; offset if the alignment requirement is already satisfied).
+
+; CHECK-FRAMELAYOUT-LABEL: Function: csr_d8_allocnxv4i32i32f64_dynamicrealign
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-8], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-24], Type: Variable, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Spill, Align: 16, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32-16 x vscale], Type: Variable, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-128-16 x vscale], Type: Variable, Align: 128, Size: 4
+
+define i32 @csr_d8_allocnxv4i32i32f64_dynamicrealign(double %d) "aarch64_pstate_sm_compatible" {
+; CHECK-LABEL: csr_d8_allocnxv4i32i32f64_dynamicrealign:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    str d8, [sp, #-32]! // 8-byte Folded Spill
+; CHECK-NEXT:    sub x9, sp, #96
+; CHECK-NEXT:    stp x29, x30, [sp, #16] // 16-byte Folded Spill
+; CHECK-NEXT:    add x29, sp, #16
+; CHECK-NEXT:    addvl x9, x9, #-1
+; CHECK-NEXT:    and sp, x9, #0xffffffffffffff80
+; CHECK-NEXT:    .cfi_def_cfa w29, 16
+; CHECK-NEXT:    .cfi_offset w30, -8
+; CHECK-NEXT:    .cfi_offset w29, -16
+; CHECK-NEXT:    .cfi_offset b8, -32
+; CHECK-NEXT:    mov z1.s, #0 // =0x0
+; CHECK-NEXT:    ptrue p0.s
+; CHECK-NEXT:    sub x8, x29, #16
+; CHECK-NEXT:    mov w0, wzr
+; CHECK-NEXT:    //APP
+; CHECK-NEXT:    //NO_APP
+; CHECK-NEXT:    str wzr, [sp]
+; CHECK-NEXT:    stur d0, [x29, #-8]
+; CHECK-NEXT:    st1w { z1.s }, p0, [x8, #-1, mul vl]
+; CHECK-NEXT:    sub sp, x29, #16
+; CHECK-NEXT:    ldp x29, x30, [sp, #16] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr d8, [sp], #32 // 8-byte Folded Reload
+; CHECK-NEXT:    ret
+entry:
+  %a = alloca <vscale x 4 x i32>
+  %b = alloca i32, align 128
+  %c = alloca double
+  tail call void asm sideeffect "", "~{d8}"() #1
+  store <vscale x 4 x i32> zeroinitializer, ptr %a
+  store i32 zeroinitializer, ptr %b
+  store double %d, ptr %c
+  ret i32 0
+}
+
+; In the presence of VLA-area objects, we emit correct offsets for all objects
+; except for these VLA objects.
+
+; CHECK-FRAMELAYOUT-LABEL: Function: csr_d8_allocnxv4i32i32f64_vla
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-8], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-24], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Variable, Align: 1, Size: 0
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32-16 x vscale], Type: Variable, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-40-16 x vscale], Type: Variable, Align: 8, Size: 8
+
+define i32 @csr_d8_allocnxv4i32i32f64_vla(double %d, i32 %i) "aarch64_pstate_sm_compatible" {
+; CHECK-LABEL: csr_d8_allocnxv4i32i32f64_vla:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    str d8, [sp, #-32]! // 8-byte Folded Spill
+; CHECK-NEXT:    stp x29, x30, [sp, #8] // 16-byte Folded Spill
+; CHECK-NEXT:    add x29, sp, #8
+; CHECK-NEXT:    str x19, [sp, #24] // 8-byte Folded Spill
+; CHECK-NEXT:    sub sp, sp, #16
+; CHECK-NEXT:    addvl sp, sp, #-1
+; CHECK-NEXT:    mov x19, sp
+; CHECK-NEXT:    .cfi_def_cfa w29, 24
+; CHECK-NEXT:    .cfi_offset w19, -8
+; CHECK-NEXT:    .cfi_offset w30, -16
+; CHECK-NEXT:    .cfi_offset w29, -24
+; CHECK-NEXT:    .cfi_offset b8, -32
+; CHECK-NEXT:    // kill: def $w0 killed $w0 def $x0
+; CHECK-NEXT:    ubfiz x8, x0, #2, #32
+; CHECK-NEXT:    mov x9, sp
+; CHECK-NEXT:    add x8, x8, #15
+; CHECK-NEXT:    and x8, x8, #0x7fffffff0
+; CHECK-NEXT:    sub x8, x9, x8
+; CHECK-NEXT:    mov sp, x8
+; CHECK-NEXT:    mov z1.s, #0 // =0x0
+; CHECK-NEXT:    ptrue p0.s
+; CHECK-NEXT:    //APP
+; CHECK-NEXT:    //NO_APP
+; CHECK-NEXT:    str wzr, [x8]
+; CHECK-NEXT:    sub x8, x29, #8
+; CHECK-NEXT:    mov w0, wzr
+; CHECK-NEXT:    str d0, [x19, #8]
+; CHECK-NEXT:    st1w { z1.s }, p0, [x8, #-1, mul vl]
+; CHECK-NEXT:    sub sp, x29, #8
+; CHECK-NEXT:    ldp x29, x30, [sp, #8] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr x19, [sp, #24] // 8-byte Folded Reload
+; CHECK-NEXT:    ldr d8, [sp], #32 // 8-byte Folded Reload
+; CHECK-NEXT:    ret
+entry:
+  %a = alloca <vscale x 4 x i32>
+  %0 = zext i32 %i to i64
+  %b = alloca i32, i64 %0
+  %c = alloca double
+  tail call void asm sideeffect "", "~{d8}"() #1
+  store <vscale x 4 x i32> zeroinitializer, ptr %a
+  store i32 zeroinitializer, ptr %b
+  store double %d, ptr %c
+  ret i32 0
+}
+
+; CHECK-FRAMELAYOUT-LABEL: Function: csr_d8_allocnxv4i32i32f64_stackargsi32f64
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP+8], Type: Variable, Align: 8, Size: 4
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP+0], Type: Protector, Align: 16, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-8], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16-16 x vscale], Type: Variable, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-20-16 x vscale], Type: Variable, Align: 4, Size: 4
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32-16 x vscale], Type: Variable, Align: 8, Size: 8
+
+define i32 @csr_d8_allocnxv4i32i32f64_stackargsi32f64(double %d0, double %d1, double %d2, double %d3, double %d4, double %d5, double %d6, double %d7, double %d8, i32 %i0, i32 %i1, i32 %i2, i32 %i3, i32 %i4, i32 %i5, i32 %i6, i32 %i7, i32 %i8) "aarch64_pstate_sm_compatible" {
+; CHECK-LABEL: csr_d8_allocnxv4i32i32f64_stackargsi32f64:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    str d8, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-NEXT:    str x29, [sp, #8] // 8-byte Folded Spill
+; CHECK-NEXT:    sub sp, sp, #16
+; CHECK-NEXT:    addvl sp, sp, #-1
+; CHECK-NEXT:    .cfi_escape 0x0f, 0x0c, 0x8f, 0x00, 0x11, 0x20, 0x22, 0x11, 0x08, 0x92, 0x2e, 0x00, 0x1e, 0x22 // sp + 32 + 8 * VG
+; CHECK-NEXT:    .cfi_offset w29, -8
+; CHECK-NEXT:    .cfi_offset b8, -16
+; CHECK-NEXT:    mov z1.s, #0 // =0x0
+; CHECK-NEXT:    ptrue p0.s
+; CHECK-NEXT:    add x8, sp, #16
+; CHECK-NEXT:    mov w0, wzr
+; CHECK-NEXT:    //APP
+; CHECK-NEXT:    //NO_APP
+; CHECK-NEXT:    str wzr, [sp, #12]
+; CHECK-NEXT:    str d0, [sp]
+; CHECK-NEXT:    st1w { z1.s }, p0, [x8]
+; CHECK-NEXT:    addvl sp, sp, #1
+; CHECK-NEXT:    add sp, sp, #16
+; CHECK-NEXT:    ldr x29, [sp, #8] // 8-byte Folded Reload
+; CHECK-NEXT:    ldr d8, [sp], #16 // 8-byte Folded Reload
+; CHECK-NEXT:    ret
+entry:
+  %a = alloca <vscale x 4 x i32>
+  %b = alloca i32
+  %c = alloca double
+  tail call void asm sideeffect "", "~{d8}"() #1
+  store <vscale x 4 x i32> zeroinitializer, ptr %a
+  store i32 zeroinitializer, ptr %b
+  store double %d0, ptr %c
+  ret i32 0
+}
+
 ; CHECK-FRAMELAYOUT-LABEL: Function: svecc_z8_allocnxv4i32i32f64_fp
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-8], Type: Spill, Align: 8, Size: 8
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-20], Type: Variable, Align: 4, Size: 4
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Variable, Align: 8, Size: 8
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 16, Size: vscale x 16
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Variable, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16-16 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16-32 x vscale], Type: Variable, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-20-32 x vscale], Type: Variable, Align: 4, Size: 4
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32-32 x vscale], Type: Variable, Align: 8, Size: 8
 
 define i32 @svecc_z8_allocnxv4i32i32f64_fp(double %d, <vscale x 4 x i32> %v) "aarch64_pstate_sm_compatible" "frame-pointer"="all" {
 ; CHECK-LABEL: svecc_z8_allocnxv4i32i32f64_fp:
@@ -133,3 +287,311 @@ entry:
   store double %d, ptr %c
   ret i32 0
 }
+
+; CHECK-FRAMELAYOUT-LABEL: Function: svecc_z8_allocnxv4i32i32f64_stackargsi32_fp
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP+0], Type: Protector, Align: 16, Size: 4
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-8], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16-16 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16-32 x vscale], Type: Variable, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-20-32 x vscale], Type: Variable, Align: 4, Size: 4
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32-32 x vscale], Type: Variable, Align: 8, Size: 8
+
+define i32 @svecc_z8_allocnxv4i32i32f64_stackargsi32_fp(double %d, i32 %i0, i32 %i1, i32 %i2, i32 %i3, i32 %i4, i32 %i5, i32 %i6, i32 %i7, i32 %i8, <vscale x 4 x i32> %v) "aarch64_pstate_sm_compatible" "frame-pointer"="all"{
+; CHECK-LABEL: svecc_z8_allocnxv4i32i32f64_stackargsi32_fp:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    stp x29, x30, [sp, #-16]! // 16-byte Folded Spill
+; CHECK-NEXT:    mov x29, sp
+; CHECK-NEXT:    addvl sp, sp, #-1
+; CHECK-NEXT:    str z8, [sp] // 16-byte Folded Spill
+; CHECK-NEXT:    sub sp, sp, #16
+; CHECK-NEXT:    addvl sp, sp, #-1
+; CHECK-NEXT:    .cfi_def_cfa w29, 16
+; CHECK-NEXT:    .cfi_offset w30, -8
+; CHECK-NEXT:    .cfi_offset w29, -16
+; CHECK-NEXT:    .cfi_escape 0x10, 0x48, 0x0a, 0x11, 0x70, 0x22, 0x11, 0x78, 0x92, 0x2e, 0x00, 0x1e, 0x22 // $d8 @ cfa - 16 - 8 * VG
+; CHECK-NEXT:    ptrue p0.s
+; CHECK-NEXT:    mov w0, wzr
+; CHECK-NEXT:    //APP
+; CHECK-NEXT:    //NO_APP
+; CHECK-NEXT:    str wzr, [sp, #12]
+; CHECK-NEXT:    st1w { z1.s }, p0, [x29, #-2, mul vl]
+; CHECK-NEXT:    str d0, [sp], #16
+; CHECK-NEXT:    addvl sp, sp, #1
+; CHECK-NEXT:    ldr z8, [sp] // 16-byte Folded Reload
+; CHECK-NEXT:    addvl sp, sp, #1
+; CHECK-NEXT:    ldp x29, x30, [sp], #16 // 16-byte Folded Reload
+; CHECK-NEXT:    ret
+entry:
+  %a = alloca <vscale x 4 x i32>
+  %b = alloca i32
+  %c = alloca double
+  tail call void asm sideeffect "", "~{d8}"() #1
+  store <vscale x 4 x i32> %v, ptr %a
+  store i32 zeroinitializer, ptr %b
+  store double %d, ptr %c
+  ret i32 0
+}
+
+; CHECK-FRAMELAYOUT-LABEL: Function: svecc_call
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-8], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-24], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-40], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-16 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-32 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-48 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-64 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-80 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-96 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-112 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-128 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-144 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-160 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-176 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-192 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-208 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-224 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-240 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-256 x vscale], Type: Spill, Align: 16, Size: vscale x 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-258 x vscale], Type: Spill, Align: 2, Size: vscale x 2
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-260 x vscale], Type: Spill, Align: 2, Size: vscale x 2
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-262 x vscale], Type: Spill, Align: 2, Size: vscale x 2
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-264 x vscale], Type: Spill, Align: 2, Size: vscale x 2
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-266 x vscale], Type: Spill, Align: 2, Size: vscale x 2
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-268 x vscale], Type: Spill, Align: 2, Size: vscale x 2
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-270 x vscale], Type: Spill, Align: 2, Size: vscale x 2
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-272 x vscale], Type: Spill, Align: 2, Size: vscale x 2
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-274 x vscale], Type: Spill, Align: 2, Size: vscale x 2
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-276 x vscale], Type: Spill, Align: 2, Size: vscale x 2
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-278 x vscale], Type: Spill, Align: 2, Size: vscale x 2
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48-280 x vscale], Type: Spill, Align: 2, Size: vscale x 2
+
+define i32 @svecc_call(<4 x i16> %P0, ptr %P1, i32 %P2, <vscale x 16 x i8> %P3, i16 %P4) "aarch64_pstate_sm_compatible" {
+; CHECK-LABEL: svecc_call:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    stp x29, x30, [sp, #-48]! // 16-byte Folded Spill
+; CHECK-NEXT:    .cfi_def_cfa_offset 48
+; CHECK-NEXT:    cntd x9
+; CHECK-NEXT:    stp x9, x28, [sp, #16] // 16-byte Folded Spill
+; CHECK-NEXT:    stp x27, x19, [sp, #32] // 16-byte Folded Spill
+; CHECK-NEXT:    .cfi_offset w19, -8
+; CHECK-NEXT:    .cfi_offset w27, -16
+; CHECK-NEXT:    .cfi_offset w28, -24
+; CHECK-NEXT:    .cfi_offset w30, -40
+; CHECK-NEXT:    .cfi_offset w29, -48
+; CHECK-NEXT:    addvl sp, sp, #-18
+; CHECK-NEXT:    .cfi_escape 0x0f, 0x0d, 0x8f, 0x00, 0x11, 0x30, 0x22, 0x11, 0x90, 0x01, 0x92, 0x2e, 0x00, 0x1e, 0x22 // sp + 48 + 144 * VG
+; CHECK-NEXT:    str p15, [sp, #4, mul vl] // 2-byte Folded Spill
+; CHECK-NEXT:    str p14, [sp, #5, mul vl] // 2-byte Folded Spill
+; CHECK-NEXT:    str p13, [sp, #6, mul vl] // 2-byte Folded Spill
+; CHECK-NEXT:    str p12, [sp, #7, mul vl] // 2-byte Folded Spill
+; CHECK-NEXT:    str p11, [sp, #8, mul vl] // 2-byte Folded Spill
+; CHECK-NEXT:    str p10, [sp, #9, mul vl] // 2-byte Folded Spill
+; CHECK-NEXT:    str p9, [sp, #10, mul vl] // 2-byte Folded Spill
+; CHECK-NEXT:    str p8, [sp, #11, mul vl] // 2-byte Folded Spill
+; CHECK-NEXT:    str p7, [sp, #12, mul vl] // 2-byte Folded Spill
+; CHECK-NEXT:    str p6, [sp, #13, mul vl] // 2-byte Folded Spill
+; CHECK-NEXT:    str p5, [sp, #14, mul vl] // 2-byte Folded Spill
+; CHECK-NEXT:    str p4, [sp, #15, mul vl] // 2-byte Folded Spill
+; CHECK-NEXT:    str z23, [sp, #2, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z22, [sp, #3, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z21, [sp, #4, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z20, [sp, #5, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z19, [sp, #6, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z18, [sp, #7, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z17, [sp, #8, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z16, [sp, #9, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z15, [sp, #10, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z14, [sp, #11, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z13, [sp, #12, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z12, [sp, #13, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z11, [sp, #14, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z10, [sp, #15, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z9, [sp, #16, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    str z8, [sp, #17, mul vl] // 16-byte Folded Spill
+; CHECK-NEXT:    .cfi_escape 0x10, 0x48, 0x0a, 0x11, 0x50, 0x22, 0x11, 0x78, 0x92, 0x2e, 0x00, 0x1e, 0x22 // $d8 @ cfa - 48 - 8 * VG
+; CHECK-NEXT:    .cfi_escape 0x10, 0x49, 0x0a, 0x11, 0x50, 0x22, 0x11, 0x70, 0x92, 0x2e, 0x00, 0x1e, 0x22 // $d9 @ cfa - 48 - 16 * VG
+; CHECK-NEXT:    .cfi_escape 0x10, 0x4a, 0x0a, 0x11, 0x50, 0x22, 0x11, 0x68, 0x92, 0x2e, 0x00, 0x1e, 0x22 // $d10 @ cfa - 48 - 24 * VG
+; CHECK-NEXT:    .cfi_escape 0x10, 0x4b, 0x0a, 0x11, 0x50, 0x22, 0x11, 0x60, 0x92, 0x2e, 0x00, 0x1e, 0x22 // $d11 @ cfa - 48 - 32 * VG
+; CHECK-NEXT:    .cfi_escape 0x10, 0x4c, 0x0a, 0x11, 0x50, 0x22, 0x11, 0x58, 0x92, 0x2e, 0x00, 0x1e, 0x22 // $d12 @ cfa - 48 - 40 * VG
+; CHECK-NEXT:    .cfi_escape 0x10, 0x4d, 0x0a, 0x11, 0x50, 0x22, 0x11, 0x50, 0x92, 0x2e, 0x00, 0x1e, 0x22 // $d13 @ cfa - 48 - 48 * VG
+; CHECK-NEXT:    .cfi_escape 0x10, 0x4e, 0x0a, 0x11, 0x50, 0x22, 0x11, 0x48, 0x92, 0x2e, 0x00, 0x1e, 0x22 // $d14 @ cfa - 48 - 56 * VG
+; CHECK-NEXT:    .cfi_escape 0x10, 0x4f, 0x0a, 0x11, 0x50, 0x22, 0x11, 0x40, 0x92, 0x2e, 0x00, 0x1e, 0x22 // $d15 @ cfa - 48 - 64 * VG
+; CHECK-NEXT:    mov x8, x0
+; CHECK-NEXT:    //APP
+; CHECK-NEXT:    //NO_APP
+; CHECK-NEXT:    bl __arm_sme_state
+; CHECK-NEXT:    and x19, x0, #0x1
+; CHECK-NEXT:    .cfi_offset vg, -32
+; CHECK-NEXT:    tbz w19, #0, .LBB7_2
+; CHECK-NEXT:  // %bb.1: // %entry
+; CHECK-NEXT:    smstop sm
+; CHECK-NEXT:  .LBB7_2: // %entry
+; CHECK-NEXT:    mov x0, x8
+; CHECK-NEXT:    mov w1, #45 // =0x2d
+; CHECK-NEXT:    mov w2, #37 // =0x25
+; CHECK-NEXT:    bl memset
+; CHECK-NEXT:    tbz w19, #0, .LBB7_4
+; CHECK-NEXT:  // %bb.3: // %entry
+; CHECK-NEXT:    smstart sm
+; CHECK-NEXT:  .LBB7_4: // %entry
+; CHECK-NEXT:    mov w0, #22647 // =0x5877
+; CHECK-NEXT:    movk w0, #59491, lsl #16
+; CHECK-NEXT:    .cfi_restore vg
+; CHECK-NEXT:    ldr z23, [sp, #2, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z22, [sp, #3, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z21, [sp, #4, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z20, [sp, #5, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z19, [sp, #6, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z18, [sp, #7, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z17, [sp, #8, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z16, [sp, #9, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z15, [sp, #10, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z14, [sp, #11, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z13, [sp, #12, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z12, [sp, #13, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z11, [sp, #14, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z10, [sp, #15, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z9, [sp, #16, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr z8, [sp, #17, mul vl] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr p15, [sp, #4, mul vl] // 2-byte Folded Reload
+; CHECK-NEXT:    ldr p14, [sp, #5, mul vl] // 2-byte Folded Reload
+; CHECK-NEXT:    ldr p13, [sp, #6, mul vl] // 2-byte Folded Reload
+; CHECK-NEXT:    ldr p12, [sp, #7, mul vl] // 2-byte Folded Reload
+; CHECK-NEXT:    ldr p11, [sp, #8, mul vl] // 2-byte Folded Reload
+; CHECK-NEXT:    ldr p10, [sp, #9, mul vl] // 2-byte Folded Reload
+; CHECK-NEXT:    ldr p9, [sp, #10, mul vl] // 2-byte Folded Reload
+; CHECK-NEXT:    ldr p8, [sp, #11, mul vl] // 2-byte Folded Reload
+; CHECK-NEXT:    ldr p7, [sp, #12, mul vl] // 2-byte Folded Reload
+; CHECK-NEXT:    ldr p6, [sp, #13, mul vl] // 2-byte Folded Reload
+; CHECK-NEXT:    ldr p5, [sp, #14, mul vl] // 2-byte Folded Reload
+; CHECK-NEXT:    ldr p4, [sp, #15, mul vl] // 2-byte Folded Reload
+; CHECK-NEXT:    addvl sp, sp, #18
+; CHECK-NEXT:    .cfi_def_cfa wsp, 48
+; CHECK-NEXT:    .cfi_restore z8
+; CHECK-NEXT:    .cfi_restore z9
+; CHECK-NEXT:    .cfi_restore z10
+; CHECK-NEXT:    .cfi_restore z11
+; CHECK-NEXT:    .cfi_restore z12
+; CHECK-NEXT:    .cfi_restore z13
+; CHECK-NEXT:    .cfi_restore z14
+; CHECK-NEXT:    .cfi_restore z15
+; CHECK-NEXT:    ldp x27, x19, [sp, #32] // 16-byte Folded Reload
+; CHECK-NEXT:    ldr x28, [sp, #24] // 8-byte Folded Reload
+; CHECK-NEXT:    ldp x29, x30, [sp], #48 // 16-byte Folded Reload
+; CHECK-NEXT:    .cfi_def_cfa_offset 0
+; CHECK-NEXT:    .cfi_restore w19
+; CHECK-NEXT:    .cfi_restore w27
+; CHECK-NEXT:    .cfi_restore w28
+; CHECK-NEXT:    .cfi_restore w30
+; CHECK-NEXT:    .cfi_restore w29
+; CHECK-NEXT:    ret
+entry:
+  tail call void asm sideeffect "", "~{x0},~{x28},~{x27},~{x3}"() #2
+  %call = call ptr @memset(ptr noundef nonnull %P1, i32 noundef 45, i32 noundef 37)
+  ret i32 -396142473
+}
+declare ptr @memset(ptr, i32, i32)
+
+; The VA register currently ends up in VLA space - in the presence of VLA-area
+; objects, we emit correct offsets for all objects except for these VLA objects.
+
+; CHECK-FRAMELAYOUT-LABEL: Function: vastate
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-8], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Spill, Align: 16, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-40], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-48], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-56], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-64], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-72], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-80], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-88], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-96], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-104], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-112], Type: Spill, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-128], Type: Variable, Align: 16, Size: 16
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-128], Type: Variable, Align: 16, Size: 0
+
+define i32 @vastate(i32 %x) "aarch64_inout_za" "aarch64_pstate_sm_enabled" "target-features"="+sme" {
+; CHECK-LABEL: vastate:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    stp d15, d14, [sp, #-112]! // 16-byte Folded Spill
+; CHECK-NEXT:    .cfi_def_cfa_offset 112
+; CHECK-NEXT:    cntd x9
+; CHECK-NEXT:    stp d13, d12, [sp, #16] // 16-byte Folded Spill
+; CHECK-NEXT:    stp d11, d10, [sp, #32] // 16-byte Folded Spill
+; CHECK-NEXT:    stp d9, d8, [sp, #48] // 16-byte Folded Spill
+; CHECK-NEXT:    stp x29, x30, [sp, #64] // 16-byte Folded Spill
+; CHECK-NEXT:    str x9, [sp, #80] // 8-byte Folded Spill
+; CHECK-NEXT:    stp x20, x19, [sp, #96] // 16-byte Folded Spill
+; CHECK-NEXT:    add x29, sp, #64
+; CHECK-NEXT:    .cfi_def_cfa w29, 48
+; CHECK-NEXT:    .cfi_offset w19, -8
+; CHECK-NEXT:    .cfi_offset w20, -16
+; CHECK-NEXT:    .cfi_offset w30, -40
+; CHECK-NEXT:    .cfi_offset w29, -48
+; CHECK-NEXT:    .cfi_offset b8, -56
+; CHECK-NEXT:    .cfi_offset b9, -64
+; CHECK-NEXT:    .cfi_offset b10, -72
+; CHECK-NEXT:    .cfi_offset b11, -80
+; CHECK-NEXT:    .cfi_offset b12, -88
+; CHECK-NEXT:    .cfi_offset b13, -96
+; CHECK-NEXT:    .cfi_offset b14, -104
+; CHECK-NEXT:    .cfi_offset b15, -112
+; CHECK-NEXT:    sub sp, sp, #16
+; CHECK-NEXT:    rdsvl x8, #1
+; CHECK-NEXT:    mov x9, sp
+; CHECK-NEXT:    mov w20, w0
+; CHECK-NEXT:    msub x9, x8, x8, x9
+; CHECK-NEXT:    mov sp, x9
+; CHECK-NEXT:    stur x9, [x29, #-80]
+; CHECK-NEXT:    sub x9, x29, #80
+; CHECK-NEXT:    sturh wzr, [x29, #-70]
+; CHECK-NEXT:    stur wzr, [x29, #-68]
+; CHECK-NEXT:    sturh w8, [x29, #-72]
+; CHECK-NEXT:    msr TPIDR2_EL0, x9
+; CHECK-NEXT:    .cfi_offset vg, -32
+; CHECK-NEXT:    smstop sm
+; CHECK-NEXT:    bl other
+; CHECK-NEXT:    smstart sm
+; CHECK-NEXT:    .cfi_restore vg
+; CHECK-NEXT:    smstart za
+; CHECK-NEXT:    mrs x8, TPIDR2_EL0
+; CHECK-NEXT:    sub x0, x29, #80
+; CHECK-NEXT:    cbnz x8, .LBB8_2
+; CHECK-NEXT:  // %bb.1: // %entry
+; CHECK-NEXT:    bl __arm_tpidr2_restore
+; CHECK-NEXT:  .LBB8_2: // %entry
+; CHECK-NEXT:    mov w0, w20
+; CHECK-NEXT:    msr TPIDR2_EL0, xzr
+; CHECK-NEXT:    sub sp, x29, #64
+; CHECK-NEXT:    .cfi_def_cfa wsp, 112
+; CHECK-NEXT:    ldp x20, x19, [sp, #96] // 16-byte Folded Reload
+; CHECK-NEXT:    ldp x29, x30, [sp, #64] // 16-byte Folded Reload
+; CHECK-NEXT:    ldp d9, d8, [sp, #48] // 16-byte Folded Reload
+; CHECK-NEXT:    ldp d11, d10, [sp, #32] // 16-byte Folded Reload
+; CHECK-NEXT:    ldp d13, d12, [sp, #16] // 16-byte Folded Reload
+; CHECK-NEXT:    ldp d15, d14, [sp], #112 // 16-byte Folded Reload
+; CHECK-NEXT:    .cfi_def_cfa_offset 0
+; CHECK-NEXT:    .cfi_restore w19
+; CHECK-NEXT:    .cfi_restore w20
+; CHECK-NEXT:    .cfi_restore w30
+; CHECK-NEXT:    .cfi_restore w29
+; CHECK-NEXT:    .cfi_restore b8
+; CHECK-NEXT:    .cfi_restore b9
+; CHECK-NEXT:    .cfi_restore b10
+; CHECK-NEXT:    .cfi_restore b11
+; CHECK-NEXT:    .cfi_restore b12
+; CHECK-NEXT:    .cfi_restore b13
+; CHECK-NEXT:    .cfi_restore b14
+; CHECK-NEXT:    .cfi_restore b15
+; CHECK-NEXT:    ret
+entry:
+  tail call void @other()
+  ret i32 %x
+}
+declare void @other()

>From b3a73a1559317019b9c5eb01f19af78bffd48dbe Mon Sep 17 00:00:00 2001
From: Hari Limaye <hari.limaye at arm.com>
Date: Thu, 25 Jul 2024 18:54:24 +0100
Subject: [PATCH 65/91] [StackFrameLayoutAnalysis] Support more SlotTypes
 (#100562)

Add new SlotTypes to StackFrameLayoutAnalysis to disambiguate Fixed and
Variable-Sized stack slots from Variable slots. As Offsets are
unreliable for VLA-area objects, sort these to the end of the list -
using the Frame Index to ensure a deterministic order when Offsets are
equal.

(cherry picked from commit e31794f99d72dd764c4bc5c5583a0a4c89df22c3)
---
 .../CodeGen/StackFrameLayoutAnalysisPass.cpp  | 27 ++++++++++++++++---
 .../CodeGen/AArch64/sve-stack-frame-layout.ll | 25 ++++++++++-------
 .../CodeGen/X86/stack-frame-layout-remarks.ll | 12 ++++-----
 3 files changed, 45 insertions(+), 19 deletions(-)
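
A minimal sketch of the ordering this patch introduces, for readers who want to
try the comparison in isolation. `SlotSketch` and `printedOrder` are
hypothetical names that do not appear in the patch; the struct keeps only the
fields the comparison reads and assumes the fixed and scalable offset parts
have already been summed into one integer. The `operator<` body has the same
shape as the patched one: non-VariableSized slots come first in descending
offset order, and VariableSized slots are pushed to the end, with the frame
index breaking ties.

```cpp
#include <algorithm>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the pass's SlotData (illustration only).
struct SlotSketch {
  int Slot;        // frame index
  int64_t Offset;  // assumed: fixed + scalable parts already summed
  bool VarSized;   // true when the slot would be typed VariableSized

  // Same shape as the patched comparison: the tuples are compared with '>',
  // so a "smaller" element is one that should be printed earlier.
  bool operator<(const SlotSketch &Rhs) const {
    return std::make_tuple(!VarSized, Offset, Slot) >
           std::make_tuple(!Rhs.VarSized, Rhs.Offset, Rhs.Slot);
  }
};

// Returns the frame indices in the order the layout would list them.
inline std::vector<int> printedOrder(std::vector<SlotSketch> Slots) {
  std::sort(Slots.begin(), Slots.end());
  std::vector<int> Order;
  for (const SlotSketch &S : Slots)
    Order.push_back(S.Slot);
  return Order;
}
```

Because the frame index is the last tuple element and the comparison is
reversed, two VariableSized slots with the same offset are listed with the
higher frame index first. The order is deterministic either way, which is the
property the commit message asks for.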

diff --git a/llvm/lib/CodeGen/StackFrameLayoutAnalysisPass.cpp b/llvm/lib/CodeGen/StackFrameLayoutAnalysisPass.cpp
index ff77685f8f354..0a7a6bad4e86d 100644
--- a/llvm/lib/CodeGen/StackFrameLayoutAnalysisPass.cpp
+++ b/llvm/lib/CodeGen/StackFrameLayoutAnalysisPass.cpp
@@ -51,6 +51,8 @@ struct StackFrameLayoutAnalysisPass : public MachineFunctionPass {
 
   enum SlotType {
     Spill,          // a Spill slot
+    Fixed,          // a Fixed slot (e.g. arguments passed on the stack)
+    VariableSized,  // a variable sized object
     StackProtector, // Stack Protector slot
     Variable,       // a slot used to store a local data (could be a tmp)
     Invalid         // It's an error for a slot to have this type
@@ -72,17 +74,30 @@ struct StackFrameLayoutAnalysisPass : public MachineFunctionPass {
       Scalable = MFI.getStackID(Idx) == TargetStackID::ScalableVector;
       if (MFI.isSpillSlotObjectIndex(Idx))
         SlotTy = SlotType::Spill;
-      else if (Idx == MFI.getStackProtectorIndex())
+      else if (MFI.isFixedObjectIndex(Idx))
+        SlotTy = SlotType::Fixed;
+      else if (MFI.isVariableSizedObjectIndex(Idx))
+        SlotTy = SlotType::VariableSized;
+      else if (MFI.hasStackProtectorIndex() &&
+               Idx == MFI.getStackProtectorIndex())
         SlotTy = SlotType::StackProtector;
       else
         SlotTy = SlotType::Variable;
     }
 
+    bool isVarSize() const { return SlotTy == SlotType::VariableSized; }
+
     // We use this to sort in reverse order, so that the layout is displayed
-    // correctly.
+    // correctly. Variable sized slots are sorted to the end of the list, as
+    // offsets are currently incorrect for these but they reside at the end of
+    // the stack frame. The Slot index is used to ensure deterministic order
+    // when offsets are equal.
     bool operator<(const SlotData &Rhs) const {
-      return (Offset.getFixed() + Offset.getScalable()) >
-             (Rhs.Offset.getFixed() + Rhs.Offset.getScalable());
+      return std::make_tuple(!isVarSize(),
+                             Offset.getFixed() + Offset.getScalable(), Slot) >
+             std::make_tuple(!Rhs.isVarSize(),
+                             Rhs.Offset.getFixed() + Rhs.Offset.getScalable(),
+                             Rhs.Slot);
     }
   };
 
@@ -121,6 +136,10 @@ struct StackFrameLayoutAnalysisPass : public MachineFunctionPass {
     switch (Ty) {
     case SlotType::Spill:
       return "Spill";
+    case SlotType::Fixed:
+      return "Fixed";
+    case SlotType::VariableSized:
+      return "VariableSized";
     case SlotType::StackProtector:
       return "Protector";
     case SlotType::Variable:
diff --git a/llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll b/llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll
index 36bca2ebd4ada..431c9dc76508f 100644
--- a/llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll
+++ b/llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll
@@ -147,10 +147,11 @@ entry:
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-8], Type: Spill, Align: 8, Size: 8
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-24], Type: Spill, Align: 8, Size: 8
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Variable, Align: 1, Size: 0
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: Spill, Align: 8, Size: 8
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32-16 x vscale], Type: Variable, Align: 16, Size: vscale x 16
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-40-16 x vscale], Type: Variable, Align: 8, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: VariableSized, Align: 1, Size: 0
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-32], Type: VariableSized, Align: 1, Size: 0
 
 define i32 @csr_d8_allocnxv4i32i32f64_vla(double %d, i32 %i) "aarch64_pstate_sm_compatible" {
 ; CHECK-LABEL: csr_d8_allocnxv4i32i32f64_vla:
@@ -172,7 +173,10 @@ define i32 @csr_d8_allocnxv4i32i32f64_vla(double %d, i32 %i) "aarch64_pstate_sm_
 ; CHECK-NEXT:    mov x9, sp
 ; CHECK-NEXT:    add x8, x8, #15
 ; CHECK-NEXT:    and x8, x8, #0x7fffffff0
-; CHECK-NEXT:    sub x8, x9, x8
+; CHECK-NEXT:    sub x9, x9, x8
+; CHECK-NEXT:    mov sp, x9
+; CHECK-NEXT:    mov x10, sp
+; CHECK-NEXT:    sub x8, x10, x8
 ; CHECK-NEXT:    mov sp, x8
 ; CHECK-NEXT:    mov z1.s, #0 // =0x0
 ; CHECK-NEXT:    ptrue p0.s
@@ -181,8 +185,9 @@ define i32 @csr_d8_allocnxv4i32i32f64_vla(double %d, i32 %i) "aarch64_pstate_sm_
 ; CHECK-NEXT:    str wzr, [x8]
 ; CHECK-NEXT:    sub x8, x29, #8
 ; CHECK-NEXT:    mov w0, wzr
-; CHECK-NEXT:    str d0, [x19, #8]
+; CHECK-NEXT:    str wzr, [x9]
 ; CHECK-NEXT:    st1w { z1.s }, p0, [x8, #-1, mul vl]
+; CHECK-NEXT:    str d0, [x19, #8]
 ; CHECK-NEXT:    sub sp, x29, #8
 ; CHECK-NEXT:    ldp x29, x30, [sp, #8] // 16-byte Folded Reload
 ; CHECK-NEXT:    ldr x19, [sp, #24] // 8-byte Folded Reload
@@ -191,18 +196,20 @@ define i32 @csr_d8_allocnxv4i32i32f64_vla(double %d, i32 %i) "aarch64_pstate_sm_
 entry:
   %a = alloca <vscale x 4 x i32>
   %0 = zext i32 %i to i64
-  %b = alloca i32, i64 %0
+  %vla0 = alloca i32, i64 %0
+  %vla1 = alloca i32, i64 %0
   %c = alloca double
   tail call void asm sideeffect "", "~{d8}"() #1
   store <vscale x 4 x i32> zeroinitializer, ptr %a
-  store i32 zeroinitializer, ptr %b
+  store i32 zeroinitializer, ptr %vla0
+  store i32 zeroinitializer, ptr %vla1
   store double %d, ptr %c
   ret i32 0
 }
 
 ; CHECK-FRAMELAYOUT-LABEL: Function: csr_d8_allocnxv4i32i32f64_stackargsi32f64
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP+8], Type: Variable, Align: 8, Size: 4
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP+0], Type: Protector, Align: 16, Size: 8
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP+8], Type: Fixed, Align: 8, Size: 4
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP+0], Type: Fixed, Align: 16, Size: 8
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-8], Type: Spill, Align: 8, Size: 8
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16-16 x vscale], Type: Variable, Align: 16, Size: vscale x 16
@@ -289,7 +296,7 @@ entry:
 }
 
 ; CHECK-FRAMELAYOUT-LABEL: Function: svecc_z8_allocnxv4i32i32f64_stackargsi32_fp
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP+0], Type: Protector, Align: 16, Size: 4
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP+0], Type: Fixed, Align: 16, Size: 4
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-8], Type: Spill, Align: 8, Size: 8
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16], Type: Spill, Align: 8, Size: 8
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-16-16 x vscale], Type: Spill, Align: 16, Size: vscale x 16
@@ -514,7 +521,7 @@ declare ptr @memset(ptr, i32, i32)
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-104], Type: Spill, Align: 8, Size: 8
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-112], Type: Spill, Align: 8, Size: 8
 ; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-128], Type: Variable, Align: 16, Size: 16
-; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-128], Type: Variable, Align: 16, Size: 0
+; CHECK-FRAMELAYOUT-NEXT: Offset: [SP-128], Type: VariableSized, Align: 16, Size: 0
 
 define i32 @vastate(i32 %x) "aarch64_inout_za" "aarch64_pstate_sm_enabled" "target-features"="+sme" {
 ; CHECK-LABEL: vastate:
diff --git a/llvm/test/CodeGen/X86/stack-frame-layout-remarks.ll b/llvm/test/CodeGen/X86/stack-frame-layout-remarks.ll
index cd5edcf2ae502..d8ce5b041042e 100644
--- a/llvm/test/CodeGen/X86/stack-frame-layout-remarks.ll
+++ b/llvm/test/CodeGen/X86/stack-frame-layout-remarks.ll
@@ -35,7 +35,7 @@ entry:
 declare void @llvm.dbg.declare(metadata, metadata, metadata) #0
 
 ; BOTH: Function: cleanup_array
-; BOTH-NEXT:  Offset: [SP+4], Type: Protector, Align: 16, Size: 4
+; BOTH-NEXT:  Offset: [SP+4], Type: Fixed, Align: 16, Size: 4
 ; DEBUG: a @ dot.c:13
 ; STRIPPED-NOT: a @ dot.c:13
 ; BOTH:  Offset: [SP-4], Type: Spill, Align: 8, Size: 4
@@ -47,7 +47,7 @@ define void @cleanup_array(ptr %0) #1 {
 }
 
 ; BOTH: Function: cleanup_result
-; BOTH:  Offset: [SP+4], Type: Protector, Align: 16, Size: 4
+; BOTH:  Offset: [SP+4], Type: Fixed, Align: 16, Size: 4
 ; DEBUG: res @ dot.c:21
 ; STRIPPED-NOT: res @ dot.c:21
 ; BOTH:  Offset: [SP-4], Type: Spill, Align: 8, Size: 4
@@ -59,11 +59,11 @@ define void @cleanup_result(ptr %0) #1 {
 }
 
 ; BOTH: Function: do_work
-; BOTH:  Offset: [SP+12], Type: Variable, Align: 8, Size: 4
+; BOTH:  Offset: [SP+12], Type: Fixed, Align: 8, Size: 4
 ; DEBUG: out @ dot.c:32
 ; STRIPPED-NOT: out @ dot.c:32
-; BOTH:  Offset: [SP+8], Type: Variable, Align: 4, Size: 4
-; BOTH:  Offset: [SP+4], Type: Protector, Align: 16, Size: 4
+; BOTH:  Offset: [SP+8], Type: Fixed, Align: 4, Size: 4
+; BOTH:  Offset: [SP+4], Type: Fixed, Align: 16, Size: 4
 ; DEBUG: A @ dot.c:32
 ; STRIPPED-NOT: A @ dot.c:32
 ; BOTH:  Offset: [SP-4], Type: Spill, Align: 8, Size: 4
@@ -125,7 +125,7 @@ define i32 @do_work(ptr %0, ptr %1, ptr %2) #2 {
 }
 
 ; BOTH: Function: gen_array
-; BOTH:  Offset: [SP+4], Type: Protector, Align: 16, Size: 4
+; BOTH:  Offset: [SP+4], Type: Fixed, Align: 16, Size: 4
 ; DEBUG: size @ dot.c:62
 ; STRIPPED-NOT: size @ dot.c:62
 ; BOTH:  Offset: [SP-4], Type: Spill, Align: 8, Size: 4

>From cd302f3914a42da49542cf2f33a39f2c968471ee Mon Sep 17 00:00:00 2001
From: Daniil Kovalev <dkovalev at accesssoftek.com>
Date: Fri, 26 Jul 2024 23:00:16 +0300
Subject: [PATCH 66/91] [PAC][test] Add tests against Linux triples for
 auth/resign lowering (#100744)

The lowering implementation and tests against the arm64e-apple-darwin
triple were added previously in #79024.

(cherry picked from commit 53283dc4645ee13f33dd9b98cc935b376bf78232)
---
 llvm/test/CodeGen/AArch64/ptrauth-fpac.ll     | 100 ++++-----
 ...trauth-intrinsic-auth-resign-with-blend.ll | 139 +++++++-----
 .../AArch64/ptrauth-intrinsic-auth-resign.ll  | 205 ++++++++++--------
 3 files changed, 241 insertions(+), 203 deletions(-)

diff --git a/llvm/test/CodeGen/AArch64/ptrauth-fpac.ll b/llvm/test/CodeGen/AArch64/ptrauth-fpac.ll
index 6afe1a93d986e..d5340dcebad57 100644
--- a/llvm/test/CodeGen/AArch64/ptrauth-fpac.ll
+++ b/llvm/test/CodeGen/AArch64/ptrauth-fpac.ll
@@ -1,12 +1,14 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc < %s -mtriple arm64e-apple-darwin                                                -verify-machineinstrs | FileCheck %s --check-prefixes=ALL,NOFPAC
-; RUN: llc < %s -mtriple arm64e-apple-darwin -mattr=+fpac                                   -verify-machineinstrs | FileCheck %s --check-prefixes=ALL,FPAC
+; RUN: llc < %s -mtriple arm64e-apple-darwin                          -verify-machineinstrs | FileCheck %s -DL="L"  --check-prefixes=ALL,NOFPAC
+; RUN: llc < %s -mtriple arm64e-apple-darwin             -mattr=+fpac -verify-machineinstrs | FileCheck %s -DL="L"  --check-prefixes=ALL,FPAC
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth              -verify-machineinstrs | FileCheck %s -DL=".L" --check-prefixes=ALL,NOFPAC
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -mattr=+fpac -verify-machineinstrs | FileCheck %s -DL=".L" --check-prefixes=ALL,FPAC
 
 target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
 
 define i64 @test_auth_ia(i64 %arg, i64 %arg1) {
 ; ALL-LABEL: test_auth_ia:
-; ALL:       ; %bb.0:
+; ALL:       %bb.0:
 ; ALL-NEXT:    mov x16, x0
 ; ALL-NEXT:    autia x16, x1
 ; ALL-NEXT:    mov x0, x16
@@ -17,7 +19,7 @@ define i64 @test_auth_ia(i64 %arg, i64 %arg1) {
 
 define i64 @test_auth_ia_zero(i64 %arg) {
 ; ALL-LABEL: test_auth_ia_zero:
-; ALL:       ; %bb.0:
+; ALL:       %bb.0:
 ; ALL-NEXT:    mov x16, x0
 ; ALL-NEXT:    autiza x16
 ; ALL-NEXT:    mov x0, x16
@@ -28,7 +30,7 @@ define i64 @test_auth_ia_zero(i64 %arg) {
 
 define i64 @test_auth_ib(i64 %arg, i64 %arg1) {
 ; ALL-LABEL: test_auth_ib:
-; ALL:       ; %bb.0:
+; ALL:       %bb.0:
 ; ALL-NEXT:    mov x16, x0
 ; ALL-NEXT:    autib x16, x1
 ; ALL-NEXT:    mov x0, x16
@@ -39,7 +41,7 @@ define i64 @test_auth_ib(i64 %arg, i64 %arg1) {
 
 define i64 @test_auth_ib_zero(i64 %arg) {
 ; ALL-LABEL: test_auth_ib_zero:
-; ALL:       ; %bb.0:
+; ALL:       %bb.0:
 ; ALL-NEXT:    mov x16, x0
 ; ALL-NEXT:    autizb x16
 ; ALL-NEXT:    mov x0, x16
@@ -50,7 +52,7 @@ define i64 @test_auth_ib_zero(i64 %arg) {
 
 define i64 @test_auth_da(i64 %arg, i64 %arg1) {
 ; ALL-LABEL: test_auth_da:
-; ALL:       ; %bb.0:
+; ALL:       %bb.0:
 ; ALL-NEXT:    mov x16, x0
 ; ALL-NEXT:    autda x16, x1
 ; ALL-NEXT:    mov x0, x16
@@ -61,7 +63,7 @@ define i64 @test_auth_da(i64 %arg, i64 %arg1) {
 
 define i64 @test_auth_da_zero(i64 %arg) {
 ; ALL-LABEL: test_auth_da_zero:
-; ALL:       ; %bb.0:
+; ALL:       %bb.0:
 ; ALL-NEXT:    mov x16, x0
 ; ALL-NEXT:    autdza x16
 ; ALL-NEXT:    mov x0, x16
@@ -72,7 +74,7 @@ define i64 @test_auth_da_zero(i64 %arg) {
 
 define i64 @test_auth_db(i64 %arg, i64 %arg1) {
 ; ALL-LABEL: test_auth_db:
-; ALL:       ; %bb.0:
+; ALL:       %bb.0:
 ; ALL-NEXT:    mov x16, x0
 ; ALL-NEXT:    autdb x16, x1
 ; ALL-NEXT:    mov x0, x16
@@ -83,7 +85,7 @@ define i64 @test_auth_db(i64 %arg, i64 %arg1) {
 
 define i64 @test_auth_db_zero(i64 %arg) {
 ; ALL-LABEL: test_auth_db_zero:
-; ALL:       ; %bb.0:
+; ALL:       %bb.0:
 ; ALL-NEXT:    mov x16, x0
 ; ALL-NEXT:    autdzb x16
 ; ALL-NEXT:    mov x0, x16
@@ -96,15 +98,15 @@ define i64 @test_auth_db_zero(i64 %arg) {
 ; the validity of a signature.
 define i64 @test_resign_ia_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-LABEL: test_resign_ia_ia:
-; NOFPAC:       ; %bb.0:
+; NOFPAC:       %bb.0:
 ; NOFPAC-NEXT:    mov x16, x0
 ; NOFPAC-NEXT:    autia x16, x1
 ; NOFPAC-NEXT:    mov x17, x16
 ; NOFPAC-NEXT:    xpaci x17
 ; NOFPAC-NEXT:    cmp x16, x17
-; NOFPAC-NEXT:    b.eq Lauth_success_0
+; NOFPAC-NEXT:    b.eq [[L]]auth_success_0
 ; NOFPAC-NEXT:    mov x16, x17
-; NOFPAC-NEXT:    b Lresign_end_0
+; NOFPAC-NEXT:    b [[L]]resign_end_0
 ; NOFPAC-NEXT:  Lauth_success_0:
 ; NOFPAC-NEXT:    pacia x16, x2
 ; NOFPAC-NEXT:  Lresign_end_0:
@@ -112,7 +114,7 @@ define i64 @test_resign_ia_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-NEXT:    ret
 ;
 ; FPAC-LABEL: test_resign_ia_ia:
-; FPAC:       ; %bb.0:
+; FPAC:       %bb.0:
 ; FPAC-NEXT:    mov x16, x0
 ; FPAC-NEXT:    autia x16, x1
 ; FPAC-NEXT:    pacia x16, x2
@@ -124,15 +126,15 @@ define i64 @test_resign_ia_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_resign_ib_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-LABEL: test_resign_ib_ia:
-; NOFPAC:       ; %bb.0:
+; NOFPAC:       %bb.0:
 ; NOFPAC-NEXT:    mov x16, x0
 ; NOFPAC-NEXT:    autib x16, x1
 ; NOFPAC-NEXT:    mov x17, x16
 ; NOFPAC-NEXT:    xpaci x17
 ; NOFPAC-NEXT:    cmp x16, x17
-; NOFPAC-NEXT:    b.eq Lauth_success_1
+; NOFPAC-NEXT:    b.eq [[L]]auth_success_1
 ; NOFPAC-NEXT:    mov x16, x17
-; NOFPAC-NEXT:    b Lresign_end_1
+; NOFPAC-NEXT:    b [[L]]resign_end_1
 ; NOFPAC-NEXT:  Lauth_success_1:
 ; NOFPAC-NEXT:    pacia x16, x2
 ; NOFPAC-NEXT:  Lresign_end_1:
@@ -140,7 +142,7 @@ define i64 @test_resign_ib_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-NEXT:    ret
 ;
 ; FPAC-LABEL: test_resign_ib_ia:
-; FPAC:       ; %bb.0:
+; FPAC:       %bb.0:
 ; FPAC-NEXT:    mov x16, x0
 ; FPAC-NEXT:    autib x16, x1
 ; FPAC-NEXT:    pacia x16, x2
@@ -152,15 +154,15 @@ define i64 @test_resign_ib_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_resign_da_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-LABEL: test_resign_da_ia:
-; NOFPAC:       ; %bb.0:
+; NOFPAC:       %bb.0:
 ; NOFPAC-NEXT:    mov x16, x0
 ; NOFPAC-NEXT:    autda x16, x1
 ; NOFPAC-NEXT:    mov x17, x16
 ; NOFPAC-NEXT:    xpacd x17
 ; NOFPAC-NEXT:    cmp x16, x17
-; NOFPAC-NEXT:    b.eq Lauth_success_2
+; NOFPAC-NEXT:    b.eq [[L]]auth_success_2
 ; NOFPAC-NEXT:    mov x16, x17
-; NOFPAC-NEXT:    b Lresign_end_2
+; NOFPAC-NEXT:    b [[L]]resign_end_2
 ; NOFPAC-NEXT:  Lauth_success_2:
 ; NOFPAC-NEXT:    pacia x16, x2
 ; NOFPAC-NEXT:  Lresign_end_2:
@@ -168,7 +170,7 @@ define i64 @test_resign_da_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-NEXT:    ret
 ;
 ; FPAC-LABEL: test_resign_da_ia:
-; FPAC:       ; %bb.0:
+; FPAC:       %bb.0:
 ; FPAC-NEXT:    mov x16, x0
 ; FPAC-NEXT:    autda x16, x1
 ; FPAC-NEXT:    pacia x16, x2
@@ -180,15 +182,15 @@ define i64 @test_resign_da_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_resign_db_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-LABEL: test_resign_db_ia:
-; NOFPAC:       ; %bb.0:
+; NOFPAC:       %bb.0:
 ; NOFPAC-NEXT:    mov x16, x0
 ; NOFPAC-NEXT:    autdb x16, x1
 ; NOFPAC-NEXT:    mov x17, x16
 ; NOFPAC-NEXT:    xpacd x17
 ; NOFPAC-NEXT:    cmp x16, x17
-; NOFPAC-NEXT:    b.eq Lauth_success_3
+; NOFPAC-NEXT:    b.eq [[L]]auth_success_3
 ; NOFPAC-NEXT:    mov x16, x17
-; NOFPAC-NEXT:    b Lresign_end_3
+; NOFPAC-NEXT:    b [[L]]resign_end_3
 ; NOFPAC-NEXT:  Lauth_success_3:
 ; NOFPAC-NEXT:    pacia x16, x2
 ; NOFPAC-NEXT:  Lresign_end_3:
@@ -196,7 +198,7 @@ define i64 @test_resign_db_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-NEXT:    ret
 ;
 ; FPAC-LABEL: test_resign_db_ia:
-; FPAC:       ; %bb.0:
+; FPAC:       %bb.0:
 ; FPAC-NEXT:    mov x16, x0
 ; FPAC-NEXT:    autdb x16, x1
 ; FPAC-NEXT:    pacia x16, x2
@@ -208,15 +210,15 @@ define i64 @test_resign_db_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_resign_db_ib(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-LABEL: test_resign_db_ib:
-; NOFPAC:       ; %bb.0:
+; NOFPAC:       %bb.0:
 ; NOFPAC-NEXT:    mov x16, x0
 ; NOFPAC-NEXT:    autdb x16, x1
 ; NOFPAC-NEXT:    mov x17, x16
 ; NOFPAC-NEXT:    xpacd x17
 ; NOFPAC-NEXT:    cmp x16, x17
-; NOFPAC-NEXT:    b.eq Lauth_success_4
+; NOFPAC-NEXT:    b.eq [[L]]auth_success_4
 ; NOFPAC-NEXT:    mov x16, x17
-; NOFPAC-NEXT:    b Lresign_end_4
+; NOFPAC-NEXT:    b [[L]]resign_end_4
 ; NOFPAC-NEXT:  Lauth_success_4:
 ; NOFPAC-NEXT:    pacib x16, x2
 ; NOFPAC-NEXT:  Lresign_end_4:
@@ -224,7 +226,7 @@ define i64 @test_resign_db_ib(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-NEXT:    ret
 ;
 ; FPAC-LABEL: test_resign_db_ib:
-; FPAC:       ; %bb.0:
+; FPAC:       %bb.0:
 ; FPAC-NEXT:    mov x16, x0
 ; FPAC-NEXT:    autdb x16, x1
 ; FPAC-NEXT:    pacib x16, x2
@@ -236,15 +238,15 @@ define i64 @test_resign_db_ib(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_resign_db_da(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-LABEL: test_resign_db_da:
-; NOFPAC:       ; %bb.0:
+; NOFPAC:       %bb.0:
 ; NOFPAC-NEXT:    mov x16, x0
 ; NOFPAC-NEXT:    autdb x16, x1
 ; NOFPAC-NEXT:    mov x17, x16
 ; NOFPAC-NEXT:    xpacd x17
 ; NOFPAC-NEXT:    cmp x16, x17
-; NOFPAC-NEXT:    b.eq Lauth_success_5
+; NOFPAC-NEXT:    b.eq [[L]]auth_success_5
 ; NOFPAC-NEXT:    mov x16, x17
-; NOFPAC-NEXT:    b Lresign_end_5
+; NOFPAC-NEXT:    b [[L]]resign_end_5
 ; NOFPAC-NEXT:  Lauth_success_5:
 ; NOFPAC-NEXT:    pacda x16, x2
 ; NOFPAC-NEXT:  Lresign_end_5:
@@ -252,7 +254,7 @@ define i64 @test_resign_db_da(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-NEXT:    ret
 ;
 ; FPAC-LABEL: test_resign_db_da:
-; FPAC:       ; %bb.0:
+; FPAC:       %bb.0:
 ; FPAC-NEXT:    mov x16, x0
 ; FPAC-NEXT:    autdb x16, x1
 ; FPAC-NEXT:    pacda x16, x2
@@ -264,15 +266,15 @@ define i64 @test_resign_db_da(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_resign_db_db(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-LABEL: test_resign_db_db:
-; NOFPAC:       ; %bb.0:
+; NOFPAC:       %bb.0:
 ; NOFPAC-NEXT:    mov x16, x0
 ; NOFPAC-NEXT:    autdb x16, x1
 ; NOFPAC-NEXT:    mov x17, x16
 ; NOFPAC-NEXT:    xpacd x17
 ; NOFPAC-NEXT:    cmp x16, x17
-; NOFPAC-NEXT:    b.eq Lauth_success_6
+; NOFPAC-NEXT:    b.eq [[L]]auth_success_6
 ; NOFPAC-NEXT:    mov x16, x17
-; NOFPAC-NEXT:    b Lresign_end_6
+; NOFPAC-NEXT:    b [[L]]resign_end_6
 ; NOFPAC-NEXT:  Lauth_success_6:
 ; NOFPAC-NEXT:    pacdb x16, x2
 ; NOFPAC-NEXT:  Lresign_end_6:
@@ -280,7 +282,7 @@ define i64 @test_resign_db_db(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-NEXT:    ret
 ;
 ; FPAC-LABEL: test_resign_db_db:
-; FPAC:       ; %bb.0:
+; FPAC:       %bb.0:
 ; FPAC-NEXT:    mov x16, x0
 ; FPAC-NEXT:    autdb x16, x1
 ; FPAC-NEXT:    pacdb x16, x2
@@ -292,15 +294,15 @@ define i64 @test_resign_db_db(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_resign_iza_db(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-LABEL: test_resign_iza_db:
-; NOFPAC:       ; %bb.0:
+; NOFPAC:       %bb.0:
 ; NOFPAC-NEXT:    mov x16, x0
 ; NOFPAC-NEXT:    autiza x16
 ; NOFPAC-NEXT:    mov x17, x16
 ; NOFPAC-NEXT:    xpaci x17
 ; NOFPAC-NEXT:    cmp x16, x17
-; NOFPAC-NEXT:    b.eq Lauth_success_7
+; NOFPAC-NEXT:    b.eq [[L]]auth_success_7
 ; NOFPAC-NEXT:    mov x16, x17
-; NOFPAC-NEXT:    b Lresign_end_7
+; NOFPAC-NEXT:    b [[L]]resign_end_7
 ; NOFPAC-NEXT:  Lauth_success_7:
 ; NOFPAC-NEXT:    pacdb x16, x2
 ; NOFPAC-NEXT:  Lresign_end_7:
@@ -308,7 +310,7 @@ define i64 @test_resign_iza_db(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-NEXT:    ret
 ;
 ; FPAC-LABEL: test_resign_iza_db:
-; FPAC:       ; %bb.0:
+; FPAC:       %bb.0:
 ; FPAC-NEXT:    mov x16, x0
 ; FPAC-NEXT:    autiza x16
 ; FPAC-NEXT:    pacdb x16, x2
@@ -320,15 +322,15 @@ define i64 @test_resign_iza_db(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_resign_da_dzb(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-LABEL: test_resign_da_dzb:
-; NOFPAC:       ; %bb.0:
+; NOFPAC:       %bb.0:
 ; NOFPAC-NEXT:    mov x16, x0
 ; NOFPAC-NEXT:    autda x16, x1
 ; NOFPAC-NEXT:    mov x17, x16
 ; NOFPAC-NEXT:    xpacd x17
 ; NOFPAC-NEXT:    cmp x16, x17
-; NOFPAC-NEXT:    b.eq Lauth_success_8
+; NOFPAC-NEXT:    b.eq [[L]]auth_success_8
 ; NOFPAC-NEXT:    mov x16, x17
-; NOFPAC-NEXT:    b Lresign_end_8
+; NOFPAC-NEXT:    b [[L]]resign_end_8
 ; NOFPAC-NEXT:  Lauth_success_8:
 ; NOFPAC-NEXT:    pacdzb x16
 ; NOFPAC-NEXT:  Lresign_end_8:
@@ -336,7 +338,7 @@ define i64 @test_resign_da_dzb(i64 %arg, i64 %arg1, i64 %arg2) {
 ; NOFPAC-NEXT:    ret
 ;
 ; FPAC-LABEL: test_resign_da_dzb:
-; FPAC:       ; %bb.0:
+; FPAC:       %bb.0:
 ; FPAC-NEXT:    mov x16, x0
 ; FPAC-NEXT:    autda x16, x1
 ; FPAC-NEXT:    pacdzb x16
@@ -348,20 +350,20 @@ define i64 @test_resign_da_dzb(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_auth_trap_attribute(i64 %arg, i64 %arg1) "ptrauth-auth-traps" {
 ; NOFPAC-LABEL: test_auth_trap_attribute:
-; NOFPAC:       ; %bb.0:
+; NOFPAC:       %bb.0:
 ; NOFPAC-NEXT:    mov x16, x0
 ; NOFPAC-NEXT:    autia x16, x1
 ; NOFPAC-NEXT:    mov x17, x16
 ; NOFPAC-NEXT:    xpaci x17
 ; NOFPAC-NEXT:    cmp x16, x17
-; NOFPAC-NEXT:    b.eq Lauth_success_9
+; NOFPAC-NEXT:    b.eq [[L]]auth_success_9
 ; NOFPAC-NEXT:    brk #0xc470
 ; NOFPAC-NEXT:  Lauth_success_9:
 ; NOFPAC-NEXT:    mov x0, x16
 ; NOFPAC-NEXT:    ret
 ;
 ; FPAC-LABEL: test_auth_trap_attribute:
-; FPAC:       ; %bb.0:
+; FPAC:       %bb.0:
 ; FPAC-NEXT:    mov x16, x0
 ; FPAC-NEXT:    autia x16, x1
 ; FPAC-NEXT:    mov x0, x16
diff --git a/llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign-with-blend.ll b/llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign-with-blend.ll
index 3b93acd8e46f7..74d2370c74c54 100644
--- a/llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign-with-blend.ll
+++ b/llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign-with-blend.ll
@@ -1,24 +1,39 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple arm64e-apple-darwin -global-isel=0                    -verify-machineinstrs \
-; RUN:   -aarch64-ptrauth-auth-checks=none | FileCheck %s --check-prefix=UNCHECKED
+; RUN:   -aarch64-ptrauth-auth-checks=none | FileCheck %s -DL="L" --check-prefixes=UNCHECKED,UNCHECKED-DARWIN
 ; RUN: llc < %s -mtriple arm64e-apple-darwin -global-isel -global-isel-abort=1 -verify-machineinstrs \
-; RUN:   -aarch64-ptrauth-auth-checks=none | FileCheck %s --check-prefix=UNCHECKED
+; RUN:   -aarch64-ptrauth-auth-checks=none | FileCheck %s -DL="L" --check-prefixes=UNCHECKED,UNCHECKED-DARWIN
 
 ; RUN: llc < %s -mtriple arm64e-apple-darwin -global-isel=0                    -verify-machineinstrs \
-; RUN:                                     | FileCheck %s --check-prefix=CHECKED
+; RUN:                                     | FileCheck %s -DL="L" --check-prefixes=CHECKED,CHECKED-DARWIN
 ; RUN: llc < %s -mtriple arm64e-apple-darwin -global-isel -global-isel-abort=1 -verify-machineinstrs \
-; RUN:                                     | FileCheck %s --check-prefix=CHECKED
+; RUN:                                     | FileCheck %s -DL="L" --check-prefixes=CHECKED,CHECKED-DARWIN
 
 ; RUN: llc < %s -mtriple arm64e-apple-darwin -global-isel=0                    -verify-machineinstrs \
-; RUN:   -aarch64-ptrauth-auth-checks=trap | FileCheck %s --check-prefix=TRAP
+; RUN:   -aarch64-ptrauth-auth-checks=trap | FileCheck %s -DL="L" --check-prefixes=TRAP,TRAP-DARWIN
 ; RUN: llc < %s -mtriple arm64e-apple-darwin -global-isel -global-isel-abort=1 -verify-machineinstrs \
-; RUN:   -aarch64-ptrauth-auth-checks=trap | FileCheck %s --check-prefix=TRAP
+; RUN:   -aarch64-ptrauth-auth-checks=trap | FileCheck %s -DL="L" --check-prefixes=TRAP,TRAP-DARWIN
+
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -global-isel=0                    -verify-machineinstrs \
+; RUN:   -aarch64-ptrauth-auth-checks=none | FileCheck %s -DL=".L" --check-prefixes=UNCHECKED,UNCHECKED-ELF
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -global-isel -global-isel-abort=1 -verify-machineinstrs \
+; RUN:   -aarch64-ptrauth-auth-checks=none | FileCheck %s -DL=".L" --check-prefixes=UNCHECKED,UNCHECKED-ELF
+
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -global-isel=0                    -verify-machineinstrs \
+; RUN:                                     | FileCheck %s -DL=".L" --check-prefixes=CHECKED,CHECKED-ELF
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -global-isel -global-isel-abort=1 -verify-machineinstrs \
+; RUN:                                     | FileCheck %s -DL=".L" --check-prefixes=CHECKED,CHECKED-ELF
+
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -global-isel=0                    -verify-machineinstrs \
+; RUN:   -aarch64-ptrauth-auth-checks=trap | FileCheck %s -DL=".L" --check-prefixes=TRAP,TRAP-ELF
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -global-isel -global-isel-abort=1 -verify-machineinstrs \
+; RUN:   -aarch64-ptrauth-auth-checks=trap | FileCheck %s -DL=".L" --check-prefixes=TRAP,TRAP-ELF
 
 target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
 
 define i64 @test_auth_blend(i64 %arg, i64 %arg1) {
 ; UNCHECKED-LABEL: test_auth_blend:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    mov x17, x1
 ; UNCHECKED-NEXT:    movk x17, #65535, lsl #48
@@ -27,7 +42,7 @@ define i64 @test_auth_blend(i64 %arg, i64 %arg1) {
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_auth_blend:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    mov x17, x1
 ; CHECKED-NEXT:    movk x17, #65535, lsl #48
@@ -36,7 +51,7 @@ define i64 @test_auth_blend(i64 %arg, i64 %arg1) {
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_auth_blend:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    mov x17, x1
 ; TRAP-NEXT:    movk x17, #65535, lsl #48
@@ -44,7 +59,7 @@ define i64 @test_auth_blend(i64 %arg, i64 %arg1) {
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpacd x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_0
+; TRAP-NEXT:    b.eq [[L]]auth_success_0
 ; TRAP-NEXT:    brk #0xc472
 ; TRAP-NEXT:  Lauth_success_0:
 ; TRAP-NEXT:    mov x0, x16
@@ -56,7 +71,7 @@ define i64 @test_auth_blend(i64 %arg, i64 %arg1) {
 
 define i64 @test_resign_blend(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-LABEL: test_resign_blend:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    mov x17, x1
 ; UNCHECKED-NEXT:    movk x17, #12345, lsl #48
@@ -68,7 +83,7 @@ define i64 @test_resign_blend(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_resign_blend:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    mov x17, x1
 ; CHECKED-NEXT:    movk x17, #12345, lsl #48
@@ -76,9 +91,9 @@ define i64 @test_resign_blend(i64 %arg, i64 %arg1, i64 %arg2) {
 ; CHECKED-NEXT:    mov x17, x16
 ; CHECKED-NEXT:    xpacd x17
 ; CHECKED-NEXT:    cmp x16, x17
-; CHECKED-NEXT:    b.eq Lauth_success_0
+; CHECKED-NEXT:    b.eq [[L]]auth_success_0
 ; CHECKED-NEXT:    mov x16, x17
-; CHECKED-NEXT:    b Lresign_end_0
+; CHECKED-NEXT:    b [[L]]resign_end_0
 ; CHECKED-NEXT:  Lauth_success_0:
 ; CHECKED-NEXT:    mov x17, x2
 ; CHECKED-NEXT:    movk x17, #56789, lsl #48
@@ -88,7 +103,7 @@ define i64 @test_resign_blend(i64 %arg, i64 %arg1, i64 %arg2) {
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_resign_blend:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    mov x17, x1
 ; TRAP-NEXT:    movk x17, #12345, lsl #48
@@ -96,7 +111,7 @@ define i64 @test_resign_blend(i64 %arg, i64 %arg1, i64 %arg2) {
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpacd x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_1
+; TRAP-NEXT:    b.eq [[L]]auth_success_1
 ; TRAP-NEXT:    brk #0xc472
 ; TRAP-NEXT:  Lauth_success_1:
 ; TRAP-NEXT:    mov x17, x2
@@ -112,18 +127,18 @@ define i64 @test_resign_blend(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_resign_blend_and_const(i64 %arg, i64 %arg1) {
 ; UNCHECKED-LABEL: test_resign_blend_and_const:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    mov x17, x1
 ; UNCHECKED-NEXT:    movk x17, #12345, lsl #48
 ; UNCHECKED-NEXT:    autda x16, x17
-; UNCHECKED-NEXT:    mov x17, #56789 ; =0xddd5
+; UNCHECKED-NEXT:    mov x17, #56789
 ; UNCHECKED-NEXT:    pacdb x16, x17
 ; UNCHECKED-NEXT:    mov x0, x16
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_resign_blend_and_const:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    mov x17, x1
 ; CHECKED-NEXT:    movk x17, #12345, lsl #48
@@ -131,18 +146,18 @@ define i64 @test_resign_blend_and_const(i64 %arg, i64 %arg1) {
 ; CHECKED-NEXT:    mov x17, x16
 ; CHECKED-NEXT:    xpacd x17
 ; CHECKED-NEXT:    cmp x16, x17
-; CHECKED-NEXT:    b.eq Lauth_success_1
+; CHECKED-NEXT:    b.eq [[L]]auth_success_1
 ; CHECKED-NEXT:    mov x16, x17
-; CHECKED-NEXT:    b Lresign_end_1
+; CHECKED-NEXT:    b [[L]]resign_end_1
 ; CHECKED-NEXT:  Lauth_success_1:
-; CHECKED-NEXT:    mov x17, #56789 ; =0xddd5
+; CHECKED-NEXT:    mov x17, #56789
 ; CHECKED-NEXT:    pacdb x16, x17
 ; CHECKED-NEXT:  Lresign_end_1:
 ; CHECKED-NEXT:    mov x0, x16
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_resign_blend_and_const:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    mov x17, x1
 ; TRAP-NEXT:    movk x17, #12345, lsl #48
@@ -150,10 +165,10 @@ define i64 @test_resign_blend_and_const(i64 %arg, i64 %arg1) {
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpacd x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_2
+; TRAP-NEXT:    b.eq [[L]]auth_success_2
 ; TRAP-NEXT:    brk #0xc472
 ; TRAP-NEXT:  Lauth_success_2:
-; TRAP-NEXT:    mov x17, #56789 ; =0xddd5
+; TRAP-NEXT:    mov x17, #56789
 ; TRAP-NEXT:    pacdb x16, x17
 ; TRAP-NEXT:    mov x0, x16
 ; TRAP-NEXT:    ret
@@ -164,7 +179,7 @@ define i64 @test_resign_blend_and_const(i64 %arg, i64 %arg1) {
 
 define i64 @test_resign_blend_and_addr(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-LABEL: test_resign_blend_and_addr:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    mov x17, x1
 ; UNCHECKED-NEXT:    movk x17, #12345, lsl #48
@@ -174,7 +189,7 @@ define i64 @test_resign_blend_and_addr(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_resign_blend_and_addr:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    mov x17, x1
 ; CHECKED-NEXT:    movk x17, #12345, lsl #48
@@ -182,9 +197,9 @@ define i64 @test_resign_blend_and_addr(i64 %arg, i64 %arg1, i64 %arg2) {
 ; CHECKED-NEXT:    mov x17, x16
 ; CHECKED-NEXT:    xpacd x17
 ; CHECKED-NEXT:    cmp x16, x17
-; CHECKED-NEXT:    b.eq Lauth_success_2
+; CHECKED-NEXT:    b.eq [[L]]auth_success_2
 ; CHECKED-NEXT:    mov x16, x17
-; CHECKED-NEXT:    b Lresign_end_2
+; CHECKED-NEXT:    b [[L]]resign_end_2
 ; CHECKED-NEXT:  Lauth_success_2:
 ; CHECKED-NEXT:    pacdb x16, x2
 ; CHECKED-NEXT:  Lresign_end_2:
@@ -192,7 +207,7 @@ define i64 @test_resign_blend_and_addr(i64 %arg, i64 %arg1, i64 %arg2) {
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_resign_blend_and_addr:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    mov x17, x1
 ; TRAP-NEXT:    movk x17, #12345, lsl #48
@@ -200,7 +215,7 @@ define i64 @test_resign_blend_and_addr(i64 %arg, i64 %arg1, i64 %arg2) {
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpacd x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_3
+; TRAP-NEXT:    b.eq [[L]]auth_success_3
 ; TRAP-NEXT:    brk #0xc472
 ; TRAP-NEXT:  Lauth_success_3:
 ; TRAP-NEXT:    pacdb x16, x2
@@ -212,38 +227,44 @@ define i64 @test_resign_blend_and_addr(i64 %arg, i64 %arg1, i64 %arg2) {
 }
 
 define i64 @test_auth_too_large_discriminator(i64 %arg, i64 %arg1) {
-; UNCHECKED-LABEL: test_auth_too_large_discriminator:
-; UNCHECKED:       ; %bb.0:
-; UNCHECKED-NEXT:    mov w8, #65536 ; =0x10000
-; UNCHECKED-NEXT:    bfi x1, x8, #48, #16
-; UNCHECKED-NEXT:    mov x16, x0
-; UNCHECKED-NEXT:    autda x16, x1
-; UNCHECKED-NEXT:    mov x0, x16
-; UNCHECKED-NEXT:    ret
+; UNCHECKED-LABEL:     test_auth_too_large_discriminator:
+; UNCHECKED:           %bb.0:
+; UNCHECKED-NEXT:        mov w8, #65536
+; UNCHECKED-DARWIN-NEXT: bfi x1, x8, #48, #16
+; UNCHECKED-DARWIN-NEXT: mov x16, x0
+; UNCHECKED-ELF-NEXT:    mov x16, x0
+; UNCHECKED-ELF-NEXT:    bfi x1, x8, #48, #16
+; UNCHECKED-NEXT:        autda x16, x1
+; UNCHECKED-NEXT:        mov x0, x16
+; UNCHECKED-NEXT:        ret
 ;
 ; CHECKED-LABEL: test_auth_too_large_discriminator:
-; CHECKED:       ; %bb.0:
-; CHECKED-NEXT:    mov w8, #65536 ; =0x10000
-; CHECKED-NEXT:    bfi x1, x8, #48, #16
-; CHECKED-NEXT:    mov x16, x0
-; CHECKED-NEXT:    autda x16, x1
-; CHECKED-NEXT:    mov x0, x16
-; CHECKED-NEXT:    ret
+; CHECKED:           %bb.0:
+; CHECKED-NEXT:        mov w8, #65536
+; CHECKED-DARWIN-NEXT: bfi x1, x8, #48, #16
+; CHECKED-DARWIN-NEXT: mov x16, x0
+; CHECKED-ELF-NEXT:    mov x16, x0
+; CHECKED-ELF-NEXT:    bfi x1, x8, #48, #16
+; CHECKED-NEXT:        autda x16, x1
+; CHECKED-NEXT:        mov x0, x16
+; CHECKED-NEXT:        ret
 ;
 ; TRAP-LABEL: test_auth_too_large_discriminator:
-; TRAP:       ; %bb.0:
-; TRAP-NEXT:    mov w8, #65536 ; =0x10000
-; TRAP-NEXT:    bfi x1, x8, #48, #16
-; TRAP-NEXT:    mov x16, x0
-; TRAP-NEXT:    autda x16, x1
-; TRAP-NEXT:    mov x17, x16
-; TRAP-NEXT:    xpacd x17
-; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_4
-; TRAP-NEXT:    brk #0xc472
-; TRAP-NEXT:  Lauth_success_4:
-; TRAP-NEXT:    mov x0, x16
-; TRAP-NEXT:    ret
+; TRAP:           %bb.0:
+; TRAP-NEXT:        mov w8, #65536
+; TRAP-DARWIN-NEXT: bfi x1, x8, #48, #16
+; TRAP-DARWIN-NEXT: mov x16, x0
+; TRAP-ELF-NEXT:    mov x16, x0
+; TRAP-ELF-NEXT:    bfi x1, x8, #48, #16
+; TRAP-NEXT:        autda x16, x1
+; TRAP-NEXT:        mov x17, x16
+; TRAP-NEXT:        xpacd x17
+; TRAP-NEXT:        cmp x16, x17
+; TRAP-NEXT:        b.eq [[L]]auth_success_4
+; TRAP-NEXT:        brk #0xc472
+; TRAP-NEXT:      Lauth_success_4:
+; TRAP-NEXT:        mov x0, x16
+; TRAP-NEXT:        ret
   %tmp0 = call i64 @llvm.ptrauth.blend(i64 %arg1, i64 65536)
   %tmp1 = call i64 @llvm.ptrauth.auth(i64 %arg, i32 2, i64 %tmp0)
   ret i64 %tmp1
diff --git a/llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign.ll b/llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign.ll
index 62c9fba853adb..fdd5ae29f35ea 100644
--- a/llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign.ll
+++ b/llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign.ll
@@ -1,44 +1,59 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple arm64e-apple-darwin -global-isel=0                    -verify-machineinstrs \
-; RUN:   -aarch64-ptrauth-auth-checks=none | FileCheck %s --check-prefix=UNCHECKED
+; RUN:   -aarch64-ptrauth-auth-checks=none | FileCheck %s -DL="L" --check-prefix=UNCHECKED
 ; RUN: llc < %s -mtriple arm64e-apple-darwin -global-isel -global-isel-abort=1 -verify-machineinstrs \
-; RUN:   -aarch64-ptrauth-auth-checks=none | FileCheck %s --check-prefix=UNCHECKED
+; RUN:   -aarch64-ptrauth-auth-checks=none | FileCheck %s -DL="L" --check-prefix=UNCHECKED
 
 ; RUN: llc < %s -mtriple arm64e-apple-darwin -global-isel=0                    -verify-machineinstrs \
-; RUN:                                     | FileCheck %s --check-prefix=CHECKED
+; RUN:                                     | FileCheck %s -DL="L" --check-prefix=CHECKED
 ; RUN: llc < %s -mtriple arm64e-apple-darwin -global-isel -global-isel-abort=1 -verify-machineinstrs \
-; RUN:                                     | FileCheck %s --check-prefix=CHECKED
+; RUN:                                     | FileCheck %s -DL="L" --check-prefix=CHECKED
 
 ; RUN: llc < %s -mtriple arm64e-apple-darwin -global-isel=0                    -verify-machineinstrs \
-; RUN:   -aarch64-ptrauth-auth-checks=trap | FileCheck %s --check-prefix=TRAP
+; RUN:   -aarch64-ptrauth-auth-checks=trap | FileCheck %s -DL="L" --check-prefix=TRAP
 ; RUN: llc < %s -mtriple arm64e-apple-darwin -global-isel -global-isel-abort=1 -verify-machineinstrs \
-; RUN:   -aarch64-ptrauth-auth-checks=trap | FileCheck %s --check-prefix=TRAP
+; RUN:   -aarch64-ptrauth-auth-checks=trap | FileCheck %s -DL="L" --check-prefix=TRAP
+
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -global-isel=0                    -verify-machineinstrs \
+; RUN:   -aarch64-ptrauth-auth-checks=none | FileCheck %s -DL=".L" --check-prefix=UNCHECKED
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -global-isel -global-isel-abort=1 -verify-machineinstrs \
+; RUN:   -aarch64-ptrauth-auth-checks=none | FileCheck %s -DL=".L" --check-prefix=UNCHECKED
+
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -global-isel=0                    -verify-machineinstrs \
+; RUN:                                     | FileCheck %s -DL=".L" --check-prefix=CHECKED
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -global-isel -global-isel-abort=1 -verify-machineinstrs \
+; RUN:                                     | FileCheck %s -DL=".L" --check-prefix=CHECKED
+
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -global-isel=0                    -verify-machineinstrs \
+; RUN:   -aarch64-ptrauth-auth-checks=trap | FileCheck %s -DL=".L" --check-prefix=TRAP
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -global-isel -global-isel-abort=1 -verify-machineinstrs \
+; RUN:   -aarch64-ptrauth-auth-checks=trap | FileCheck %s -DL=".L" --check-prefix=TRAP
 
 target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
 
 define i64 @test_auth_ia(i64 %arg, i64 %arg1) {
 ; UNCHECKED-LABEL: test_auth_ia:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autia x16, x1
 ; UNCHECKED-NEXT:    mov x0, x16
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_auth_ia:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autia x16, x1
 ; CHECKED-NEXT:    mov x0, x16
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_auth_ia:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autia x16, x1
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpaci x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_0
+; TRAP-NEXT:    b.eq [[L]]auth_success_0
 ; TRAP-NEXT:    brk #0xc470
 ; TRAP-NEXT:  Lauth_success_0:
 ; TRAP-NEXT:    mov x0, x16
@@ -49,27 +64,27 @@ define i64 @test_auth_ia(i64 %arg, i64 %arg1) {
 
 define i64 @test_auth_ia_zero(i64 %arg) {
 ; UNCHECKED-LABEL: test_auth_ia_zero:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autiza x16
 ; UNCHECKED-NEXT:    mov x0, x16
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_auth_ia_zero:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autiza x16
 ; CHECKED-NEXT:    mov x0, x16
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_auth_ia_zero:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autiza x16
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpaci x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_1
+; TRAP-NEXT:    b.eq [[L]]auth_success_1
 ; TRAP-NEXT:    brk #0xc470
 ; TRAP-NEXT:  Lauth_success_1:
 ; TRAP-NEXT:    mov x0, x16
@@ -80,27 +95,27 @@ define i64 @test_auth_ia_zero(i64 %arg) {
 
 define i64 @test_auth_ib(i64 %arg, i64 %arg1) {
 ; UNCHECKED-LABEL: test_auth_ib:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autib x16, x1
 ; UNCHECKED-NEXT:    mov x0, x16
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_auth_ib:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autib x16, x1
 ; CHECKED-NEXT:    mov x0, x16
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_auth_ib:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autib x16, x1
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpaci x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_2
+; TRAP-NEXT:    b.eq [[L]]auth_success_2
 ; TRAP-NEXT:    brk #0xc471
 ; TRAP-NEXT:  Lauth_success_2:
 ; TRAP-NEXT:    mov x0, x16
@@ -111,27 +126,27 @@ define i64 @test_auth_ib(i64 %arg, i64 %arg1) {
 
 define i64 @test_auth_ib_zero(i64 %arg) {
 ; UNCHECKED-LABEL: test_auth_ib_zero:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autizb x16
 ; UNCHECKED-NEXT:    mov x0, x16
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_auth_ib_zero:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autizb x16
 ; CHECKED-NEXT:    mov x0, x16
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_auth_ib_zero:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autizb x16
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpaci x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_3
+; TRAP-NEXT:    b.eq [[L]]auth_success_3
 ; TRAP-NEXT:    brk #0xc471
 ; TRAP-NEXT:  Lauth_success_3:
 ; TRAP-NEXT:    mov x0, x16
@@ -142,27 +157,27 @@ define i64 @test_auth_ib_zero(i64 %arg) {
 
 define i64 @test_auth_da(i64 %arg, i64 %arg1) {
 ; UNCHECKED-LABEL: test_auth_da:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autda x16, x1
 ; UNCHECKED-NEXT:    mov x0, x16
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_auth_da:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autda x16, x1
 ; CHECKED-NEXT:    mov x0, x16
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_auth_da:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autda x16, x1
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpacd x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_4
+; TRAP-NEXT:    b.eq [[L]]auth_success_4
 ; TRAP-NEXT:    brk #0xc472
 ; TRAP-NEXT:  Lauth_success_4:
 ; TRAP-NEXT:    mov x0, x16
@@ -173,27 +188,27 @@ define i64 @test_auth_da(i64 %arg, i64 %arg1) {
 
 define i64 @test_auth_da_zero(i64 %arg) {
 ; UNCHECKED-LABEL: test_auth_da_zero:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autdza x16
 ; UNCHECKED-NEXT:    mov x0, x16
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_auth_da_zero:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autdza x16
 ; CHECKED-NEXT:    mov x0, x16
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_auth_da_zero:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autdza x16
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpacd x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_5
+; TRAP-NEXT:    b.eq [[L]]auth_success_5
 ; TRAP-NEXT:    brk #0xc472
 ; TRAP-NEXT:  Lauth_success_5:
 ; TRAP-NEXT:    mov x0, x16
@@ -204,27 +219,27 @@ define i64 @test_auth_da_zero(i64 %arg) {
 
 define i64 @test_auth_db(i64 %arg, i64 %arg1) {
 ; UNCHECKED-LABEL: test_auth_db:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autdb x16, x1
 ; UNCHECKED-NEXT:    mov x0, x16
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_auth_db:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autdb x16, x1
 ; CHECKED-NEXT:    mov x0, x16
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_auth_db:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autdb x16, x1
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpacd x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_6
+; TRAP-NEXT:    b.eq [[L]]auth_success_6
 ; TRAP-NEXT:    brk #0xc473
 ; TRAP-NEXT:  Lauth_success_6:
 ; TRAP-NEXT:    mov x0, x16
@@ -235,27 +250,27 @@ define i64 @test_auth_db(i64 %arg, i64 %arg1) {
 
 define i64 @test_auth_db_zero(i64 %arg) {
 ; UNCHECKED-LABEL: test_auth_db_zero:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autdzb x16
 ; UNCHECKED-NEXT:    mov x0, x16
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_auth_db_zero:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autdzb x16
 ; CHECKED-NEXT:    mov x0, x16
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_auth_db_zero:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autdzb x16
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpacd x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_7
+; TRAP-NEXT:    b.eq [[L]]auth_success_7
 ; TRAP-NEXT:    brk #0xc473
 ; TRAP-NEXT:  Lauth_success_7:
 ; TRAP-NEXT:    mov x0, x16
@@ -268,7 +283,7 @@ define i64 @test_auth_db_zero(i64 %arg) {
 ;; the validity of a signature.
 define i64 @test_resign_ia_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-LABEL: test_resign_ia_ia:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autia x16, x1
 ; UNCHECKED-NEXT:    pacia x16, x2
@@ -276,15 +291,15 @@ define i64 @test_resign_ia_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_resign_ia_ia:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autia x16, x1
 ; CHECKED-NEXT:    mov x17, x16
 ; CHECKED-NEXT:    xpaci x17
 ; CHECKED-NEXT:    cmp x16, x17
-; CHECKED-NEXT:    b.eq Lauth_success_0
+; CHECKED-NEXT:    b.eq [[L]]auth_success_0
 ; CHECKED-NEXT:    mov x16, x17
-; CHECKED-NEXT:    b Lresign_end_0
+; CHECKED-NEXT:    b [[L]]resign_end_0
 ; CHECKED-NEXT:  Lauth_success_0:
 ; CHECKED-NEXT:    pacia x16, x2
 ; CHECKED-NEXT:  Lresign_end_0:
@@ -292,13 +307,13 @@ define i64 @test_resign_ia_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_resign_ia_ia:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autia x16, x1
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpaci x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_8
+; TRAP-NEXT:    b.eq [[L]]auth_success_8
 ; TRAP-NEXT:    brk #0xc470
 ; TRAP-NEXT:  Lauth_success_8:
 ; TRAP-NEXT:    pacia x16, x2
@@ -310,7 +325,7 @@ define i64 @test_resign_ia_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_resign_ib_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-LABEL: test_resign_ib_ia:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autib x16, x1
 ; UNCHECKED-NEXT:    pacia x16, x2
@@ -318,15 +333,15 @@ define i64 @test_resign_ib_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_resign_ib_ia:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autib x16, x1
 ; CHECKED-NEXT:    mov x17, x16
 ; CHECKED-NEXT:    xpaci x17
 ; CHECKED-NEXT:    cmp x16, x17
-; CHECKED-NEXT:    b.eq Lauth_success_1
+; CHECKED-NEXT:    b.eq [[L]]auth_success_1
 ; CHECKED-NEXT:    mov x16, x17
-; CHECKED-NEXT:    b Lresign_end_1
+; CHECKED-NEXT:    b [[L]]resign_end_1
 ; CHECKED-NEXT:  Lauth_success_1:
 ; CHECKED-NEXT:    pacia x16, x2
 ; CHECKED-NEXT:  Lresign_end_1:
@@ -334,13 +349,13 @@ define i64 @test_resign_ib_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_resign_ib_ia:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autib x16, x1
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpaci x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_9
+; TRAP-NEXT:    b.eq [[L]]auth_success_9
 ; TRAP-NEXT:    brk #0xc471
 ; TRAP-NEXT:  Lauth_success_9:
 ; TRAP-NEXT:    pacia x16, x2
@@ -352,7 +367,7 @@ define i64 @test_resign_ib_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_resign_da_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-LABEL: test_resign_da_ia:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autda x16, x1
 ; UNCHECKED-NEXT:    pacia x16, x2
@@ -360,15 +375,15 @@ define i64 @test_resign_da_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_resign_da_ia:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autda x16, x1
 ; CHECKED-NEXT:    mov x17, x16
 ; CHECKED-NEXT:    xpacd x17
 ; CHECKED-NEXT:    cmp x16, x17
-; CHECKED-NEXT:    b.eq Lauth_success_2
+; CHECKED-NEXT:    b.eq [[L]]auth_success_2
 ; CHECKED-NEXT:    mov x16, x17
-; CHECKED-NEXT:    b Lresign_end_2
+; CHECKED-NEXT:    b [[L]]resign_end_2
 ; CHECKED-NEXT:  Lauth_success_2:
 ; CHECKED-NEXT:    pacia x16, x2
 ; CHECKED-NEXT:  Lresign_end_2:
@@ -376,13 +391,13 @@ define i64 @test_resign_da_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_resign_da_ia:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autda x16, x1
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpacd x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_10
+; TRAP-NEXT:    b.eq [[L]]auth_success_10
 ; TRAP-NEXT:    brk #0xc472
 ; TRAP-NEXT:  Lauth_success_10:
 ; TRAP-NEXT:    pacia x16, x2
@@ -394,7 +409,7 @@ define i64 @test_resign_da_ia(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_resign_db_da(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-LABEL: test_resign_db_da:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autdb x16, x1
 ; UNCHECKED-NEXT:    pacda x16, x2
@@ -402,15 +417,15 @@ define i64 @test_resign_db_da(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_resign_db_da:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autdb x16, x1
 ; CHECKED-NEXT:    mov x17, x16
 ; CHECKED-NEXT:    xpacd x17
 ; CHECKED-NEXT:    cmp x16, x17
-; CHECKED-NEXT:    b.eq Lauth_success_3
+; CHECKED-NEXT:    b.eq [[L]]auth_success_3
 ; CHECKED-NEXT:    mov x16, x17
-; CHECKED-NEXT:    b Lresign_end_3
+; CHECKED-NEXT:    b [[L]]resign_end_3
 ; CHECKED-NEXT:  Lauth_success_3:
 ; CHECKED-NEXT:    pacda x16, x2
 ; CHECKED-NEXT:  Lresign_end_3:
@@ -418,13 +433,13 @@ define i64 @test_resign_db_da(i64 %arg, i64 %arg1, i64 %arg2) {
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_resign_db_da:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autdb x16, x1
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpacd x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_11
+; TRAP-NEXT:    b.eq [[L]]auth_success_11
 ; TRAP-NEXT:    brk #0xc473
 ; TRAP-NEXT:  Lauth_success_11:
 ; TRAP-NEXT:    pacda x16, x2
@@ -436,7 +451,7 @@ define i64 @test_resign_db_da(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_resign_iza_db(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-LABEL: test_resign_iza_db:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autiza x16
 ; UNCHECKED-NEXT:    pacdb x16, x2
@@ -444,15 +459,15 @@ define i64 @test_resign_iza_db(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_resign_iza_db:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autiza x16
 ; CHECKED-NEXT:    mov x17, x16
 ; CHECKED-NEXT:    xpaci x17
 ; CHECKED-NEXT:    cmp x16, x17
-; CHECKED-NEXT:    b.eq Lauth_success_4
+; CHECKED-NEXT:    b.eq [[L]]auth_success_4
 ; CHECKED-NEXT:    mov x16, x17
-; CHECKED-NEXT:    b Lresign_end_4
+; CHECKED-NEXT:    b [[L]]resign_end_4
 ; CHECKED-NEXT:  Lauth_success_4:
 ; CHECKED-NEXT:    pacdb x16, x2
 ; CHECKED-NEXT:  Lresign_end_4:
@@ -460,13 +475,13 @@ define i64 @test_resign_iza_db(i64 %arg, i64 %arg1, i64 %arg2) {
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_resign_iza_db:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autiza x16
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpaci x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_12
+; TRAP-NEXT:    b.eq [[L]]auth_success_12
 ; TRAP-NEXT:    brk #0xc470
 ; TRAP-NEXT:  Lauth_success_12:
 ; TRAP-NEXT:    pacdb x16, x2
@@ -478,7 +493,7 @@ define i64 @test_resign_iza_db(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_resign_da_dzb(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-LABEL: test_resign_da_dzb:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autda x16, x1
 ; UNCHECKED-NEXT:    pacdzb x16
@@ -486,15 +501,15 @@ define i64 @test_resign_da_dzb(i64 %arg, i64 %arg1, i64 %arg2) {
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_resign_da_dzb:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autda x16, x1
 ; CHECKED-NEXT:    mov x17, x16
 ; CHECKED-NEXT:    xpacd x17
 ; CHECKED-NEXT:    cmp x16, x17
-; CHECKED-NEXT:    b.eq Lauth_success_5
+; CHECKED-NEXT:    b.eq [[L]]auth_success_5
 ; CHECKED-NEXT:    mov x16, x17
-; CHECKED-NEXT:    b Lresign_end_5
+; CHECKED-NEXT:    b [[L]]resign_end_5
 ; CHECKED-NEXT:  Lauth_success_5:
 ; CHECKED-NEXT:    pacdzb x16
 ; CHECKED-NEXT:  Lresign_end_5:
@@ -502,13 +517,13 @@ define i64 @test_resign_da_dzb(i64 %arg, i64 %arg1, i64 %arg2) {
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_resign_da_dzb:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autda x16, x1
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpacd x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_13
+; TRAP-NEXT:    b.eq [[L]]auth_success_13
 ; TRAP-NEXT:    brk #0xc472
 ; TRAP-NEXT:  Lauth_success_13:
 ; TRAP-NEXT:    pacdzb x16
@@ -520,33 +535,33 @@ define i64 @test_resign_da_dzb(i64 %arg, i64 %arg1, i64 %arg2) {
 
 define i64 @test_auth_trap_attribute(i64 %arg, i64 %arg1) "ptrauth-auth-traps" {
 ; UNCHECKED-LABEL: test_auth_trap_attribute:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autia x16, x1
 ; UNCHECKED-NEXT:    mov x0, x16
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_auth_trap_attribute:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autia x16, x1
 ; CHECKED-NEXT:    mov x17, x16
 ; CHECKED-NEXT:    xpaci x17
 ; CHECKED-NEXT:    cmp x16, x17
-; CHECKED-NEXT:    b.eq Lauth_success_6
+; CHECKED-NEXT:    b.eq [[L]]auth_success_6
 ; CHECKED-NEXT:    brk #0xc470
 ; CHECKED-NEXT:  Lauth_success_6:
 ; CHECKED-NEXT:    mov x0, x16
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_auth_trap_attribute:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autia x16, x1
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpaci x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_14
+; TRAP-NEXT:    b.eq [[L]]auth_success_14
 ; TRAP-NEXT:    brk #0xc470
 ; TRAP-NEXT:  Lauth_success_14:
 ; TRAP-NEXT:    mov x0, x16
@@ -557,30 +572,30 @@ define i64 @test_auth_trap_attribute(i64 %arg, i64 %arg1) "ptrauth-auth-traps" {
 
 define i64 @test_auth_ia_constdisc(i64 %arg) {
 ; UNCHECKED-LABEL: test_auth_ia_constdisc:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
-; UNCHECKED-NEXT:    mov x17, #256 ; =0x100
+; UNCHECKED-NEXT:    mov x17, #256
 ; UNCHECKED-NEXT:    autia x16, x17
 ; UNCHECKED-NEXT:    mov x0, x16
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_auth_ia_constdisc:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
-; CHECKED-NEXT:    mov x17, #256 ; =0x100
+; CHECKED-NEXT:    mov x17, #256
 ; CHECKED-NEXT:    autia x16, x17
 ; CHECKED-NEXT:    mov x0, x16
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_auth_ia_constdisc:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
-; TRAP-NEXT:    mov x17, #256 ; =0x100
+; TRAP-NEXT:    mov x17, #256
 ; TRAP-NEXT:    autia x16, x17
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpaci x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_15
+; TRAP-NEXT:    b.eq [[L]]auth_success_15
 ; TRAP-NEXT:    brk #0xc470
 ; TRAP-NEXT:  Lauth_success_15:
 ; TRAP-NEXT:    mov x0, x16
@@ -591,42 +606,42 @@ define i64 @test_auth_ia_constdisc(i64 %arg) {
 
 define i64 @test_resign_da_constdisc(i64 %arg, i64 %arg1) {
 ; UNCHECKED-LABEL: test_resign_da_constdisc:
-; UNCHECKED:       ; %bb.0:
+; UNCHECKED:       %bb.0:
 ; UNCHECKED-NEXT:    mov x16, x0
 ; UNCHECKED-NEXT:    autda x16, x1
-; UNCHECKED-NEXT:    mov x17, #256 ; =0x100
+; UNCHECKED-NEXT:    mov x17, #256
 ; UNCHECKED-NEXT:    pacda x16, x17
 ; UNCHECKED-NEXT:    mov x0, x16
 ; UNCHECKED-NEXT:    ret
 ;
 ; CHECKED-LABEL: test_resign_da_constdisc:
-; CHECKED:       ; %bb.0:
+; CHECKED:       %bb.0:
 ; CHECKED-NEXT:    mov x16, x0
 ; CHECKED-NEXT:    autda x16, x1
 ; CHECKED-NEXT:    mov x17, x16
 ; CHECKED-NEXT:    xpacd x17
 ; CHECKED-NEXT:    cmp x16, x17
-; CHECKED-NEXT:    b.eq Lauth_success_7
+; CHECKED-NEXT:    b.eq [[L]]auth_success_7
 ; CHECKED-NEXT:    mov x16, x17
-; CHECKED-NEXT:    b Lresign_end_6
+; CHECKED-NEXT:    b [[L]]resign_end_6
 ; CHECKED-NEXT:  Lauth_success_7:
-; CHECKED-NEXT:    mov x17, #256 ; =0x100
+; CHECKED-NEXT:    mov x17, #256
 ; CHECKED-NEXT:    pacda x16, x17
 ; CHECKED-NEXT:  Lresign_end_6:
 ; CHECKED-NEXT:    mov x0, x16
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_resign_da_constdisc:
-; TRAP:       ; %bb.0:
+; TRAP:       %bb.0:
 ; TRAP-NEXT:    mov x16, x0
 ; TRAP-NEXT:    autda x16, x1
 ; TRAP-NEXT:    mov x17, x16
 ; TRAP-NEXT:    xpacd x17
 ; TRAP-NEXT:    cmp x16, x17
-; TRAP-NEXT:    b.eq Lauth_success_16
+; TRAP-NEXT:    b.eq [[L]]auth_success_16
 ; TRAP-NEXT:    brk #0xc472
 ; TRAP-NEXT:  Lauth_success_16:
-; TRAP-NEXT:    mov x17, #256 ; =0x100
+; TRAP-NEXT:    mov x17, #256
 ; TRAP-NEXT:    pacda x16, x17
 ; TRAP-NEXT:    mov x0, x16
 ; TRAP-NEXT:    ret

From 1af23c548197ba8325c35e1edd6fa1be456af57f Mon Sep 17 00:00:00 2001
From: Daniil Kovalev <dkovalev at accesssoftek.com>
Date: Thu, 25 Jul 2024 00:24:50 +0300
Subject: [PATCH 67/91] [PAC][clang][test] Implement missing tests for some
 PAuth features (#100206)

Implement tests for the following PAuth-related features:

- driver, preprocessor, and ELF codegen tests for type_info vtable
pointer discrimination #99726;

- driver, preprocessor, ELF codegen (emitting function attributes), and
sema (emitting errors) tests for signing of indirect gotos #97647;

- ELF codegen tests for ubsan type checks + auth #99590;

- ELF codegen tests for constant global init with polymorphic MI #99741;

- ELF codegen tests for C++ member function pointers auth #99576.

(cherry picked from commit 70c6e79e6d3e897418f3556a25e22e66ff018dc4)
---
 clang/lib/Driver/ToolChains/Clang.cpp         |  3 +
 .../CodeGen/ptrauth-function-attributes.c     |  5 +-
 clang/test/CodeGen/ubsan-function.cpp         |  7 +-
 .../ptrauth-global-constant-initializers.cpp  | 77 +++++++++++--------
 .../ptrauth-member-function-pointer.cpp       | 55 +++++++------
 .../CodeGenCXX/ptrauth-type-info-vtable.cpp   | 17 +++-
 clang/test/Driver/aarch64-ptrauth.c           |  9 ++-
 clang/test/Preprocessor/ptrauth_feature.c     | 36 +++++++--
 clang/test/Sema/ptrauth-indirect-goto.c       |  1 +
 9 files changed, 137 insertions(+), 73 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 5de29f1eca614..366b147a052bf 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -1847,6 +1847,9 @@ void Clang::AddAArch64TargetArgs(const ArgList &Args,
   Args.addOptInFlag(
       CmdArgs, options::OPT_fptrauth_vtable_pointer_type_discrimination,
       options::OPT_fno_ptrauth_vtable_pointer_type_discrimination);
+  Args.addOptInFlag(
+      CmdArgs, options::OPT_fptrauth_type_info_vtable_pointer_discrimination,
+      options::OPT_fno_ptrauth_type_info_vtable_pointer_discrimination);
   Args.addOptInFlag(CmdArgs, options::OPT_fptrauth_init_fini,
                     options::OPT_fno_ptrauth_init_fini);
   Args.addOptInFlag(
diff --git a/clang/test/CodeGen/ptrauth-function-attributes.c b/clang/test/CodeGen/ptrauth-function-attributes.c
index 7f93ccc7c4bce..6a09cd37bf485 100644
--- a/clang/test/CodeGen/ptrauth-function-attributes.c
+++ b/clang/test/CodeGen/ptrauth-function-attributes.c
@@ -4,8 +4,9 @@
 // RUN: %clang_cc1 -triple arm64-apple-ios  -fptrauth-calls   -emit-llvm %s  -o - | FileCheck %s --check-prefixes=ALL,CALLS
 // RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-calls  -emit-llvm %s  -o - | FileCheck %s --check-prefixes=ALL,CALLS
 
-// RUN: %clang_cc1 -triple arm64-apple-ios  -fptrauth-indirect-gotos -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,GOTOS
-// RUN: %clang_cc1 -triple arm64e-apple-ios -fptrauth-indirect-gotos -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,GOTOS
+// RUN: %clang_cc1 -triple arm64-apple-ios   -fptrauth-indirect-gotos -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,GOTOS
+// RUN: %clang_cc1 -triple arm64e-apple-ios  -fptrauth-indirect-gotos -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,GOTOS
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-indirect-gotos -emit-llvm %s -o - | FileCheck %s --check-prefixes=ALL,GOTOS
 
 // ALL: define {{(dso_local )?}}void @test() #0
 void test() {
diff --git a/clang/test/CodeGen/ubsan-function.cpp b/clang/test/CodeGen/ubsan-function.cpp
index 8478f05a10b78..76d4237383f83 100644
--- a/clang/test/CodeGen/ubsan-function.cpp
+++ b/clang/test/CodeGen/ubsan-function.cpp
@@ -4,7 +4,8 @@
 // RUN: %clang_cc1 -triple aarch64_be-linux-gnu -emit-llvm -o - %s -fsanitize=function -fno-sanitize-recover=all | FileCheck %s --check-prefixes=CHECK,GNU,64
 // RUN: %clang_cc1 -triple arm-none-eabi -emit-llvm -o - %s -fsanitize=function -fno-sanitize-recover=all | FileCheck %s --check-prefixes=CHECK,ARM,GNU,32
 
-// RUN: %clang_cc1 -triple arm64e-apple-ios -emit-llvm -o - %s -fsanitize=function -fno-sanitize-recover=all -fptrauth-calls | FileCheck %s --check-prefixes=CHECK,GNU,64,64e
+// RUN: %clang_cc1 -triple arm64e-apple-ios  -emit-llvm -o - %s -fsanitize=function -fno-sanitize-recover=all -fptrauth-calls | FileCheck %s --check-prefixes=CHECK,GNU,64,AUTH
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -emit-llvm -o - %s -fsanitize=function -fno-sanitize-recover=all -fptrauth-calls | FileCheck %s --check-prefixes=CHECK,GNU,64,AUTH
 
 // GNU:  define{{.*}} void @_Z3funv() #0 !func_sanitize ![[FUNCSAN:.*]] {
 // MSVC: define{{.*}} void @"?fun@@YAXXZ"() #0 !func_sanitize ![[FUNCSAN:.*]] {
@@ -15,8 +16,8 @@ void fun() {}
 // ARM:   ptrtoint ptr {{.*}} to i32, !nosanitize !5
 // ARM:   and i32 {{.*}}, -2, !nosanitize !5
 // ARM:   inttoptr i32 {{.*}} to ptr, !nosanitize !5
-// 64e:   %[[STRIPPED:.*]] = ptrtoint ptr {{.*}} to i64, !nosanitize
-// 64e:   call i64 @llvm.ptrauth.auth(i64 %[[STRIPPED]], i32 0, i64 0), !nosanitize
+// AUTH:  %[[STRIPPED:.*]] = ptrtoint ptr {{.*}} to i64, !nosanitize
+// AUTH:  call i64 @llvm.ptrauth.auth(i64 %[[STRIPPED]], i32 0, i64 0), !nosanitize
 // CHECK: getelementptr <{ i32, i32 }>, ptr {{.*}}, i32 -1, i32 0, !nosanitize
 // CHECK: load i32, ptr {{.*}}, align {{.*}}, !nosanitize
 // CHECK: icmp eq i32 {{.*}}, -1056584962, !nosanitize
diff --git a/clang/test/CodeGenCXX/ptrauth-global-constant-initializers.cpp b/clang/test/CodeGenCXX/ptrauth-global-constant-initializers.cpp
index f0c3ea83d8958..9ce9def6156ef 100644
--- a/clang/test/CodeGenCXX/ptrauth-global-constant-initializers.cpp
+++ b/clang/test/CodeGenCXX/ptrauth-global-constant-initializers.cpp
@@ -1,4 +1,7 @@
-// RUN: %clang_cc1 -triple arm64-apple-ios -fptrauth-calls -fno-rtti -fptrauth-vtable-pointer-type-discrimination -fptrauth-vtable-pointer-address-discrimination -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple arm64-apple-ios   -fptrauth-calls -fno-rtti -fptrauth-vtable-pointer-type-discrimination \
+// RUN:   -fptrauth-vtable-pointer-address-discrimination -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,DARWIN
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-calls -fno-rtti -fptrauth-vtable-pointer-type-discrimination \
+// RUN:   -fptrauth-vtable-pointer-address-discrimination -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,ELF
 
 // CHECK: %struct.Base1 = type { ptr }
 // CHECK: %struct.Base2 = type { ptr }
@@ -6,27 +9,27 @@
 // CHECK: %struct.Derived2 = type { %struct.Base2, %struct.Base1 }
 // CHECK: %struct.Derived3 = type { %struct.Base1, %struct.Base2 }
 
-// CHECK: @_ZTV5Base1 = linkonce_odr unnamed_addr constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC:38871]], ptr getelementptr inbounds ({ [3 x ptr] }, ptr @_ZTV5Base1, i32 0, i32 0, i32 2))] }, align 8
-// CHECK: @g_b1 = global %struct.Base1 { ptr ptrauth (ptr getelementptr inbounds inrange(-16, 8) ({ [3 x ptr] }, ptr @_ZTV5Base1, i32 0, i32 0, i32 2), i32 2, i64 [[BASE1_VTABLE_DISC:6511]], ptr @g_b1) }, align 8
-// CHECK: @_ZTV5Base2 = linkonce_odr unnamed_addr constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base21bEv, i32 0, i64 [[BASE2_B_DISC:27651]], ptr getelementptr inbounds ({ [3 x ptr] }, ptr @_ZTV5Base2, i32 0, i32 0, i32 2))] }, align 8
-// CHECK: @g_b2 = global %struct.Base2 { ptr ptrauth (ptr getelementptr inbounds inrange(-16, 8) ({ [3 x ptr] }, ptr @_ZTV5Base2, i32 0, i32 0, i32 2), i32 2, i64 [[BASE2_VTABLE_DISC:63631]], ptr @g_b2) }, align 8
-// CHECK: @_ZTV8Derived1 = linkonce_odr unnamed_addr constant { [5 x ptr], [3 x ptr] } { [5 x ptr] [ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived1, i32 0, i32 0, i32 2)), ptr ptrauth (ptr @_ZN8Derived11cEv, i32 0, i64 [[DERIVED1_C_DISC:54092]], ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived1, i32 0, i32 0, i32 3)), ptr ptrauth (ptr @_ZN8Derived11dEv, i32 0, i64 [[DERIVED1_D_DISC:37391]], ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived1, i32 0, i32 0, i32 4))], [3 x ptr] [ptr inttoptr (i64 -8 to ptr), ptr null, ptr ptrauth (ptr @_ZN5Base21bEv, i32 0, i64 [[BASE2_B_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived1, i32 0, i32 1, i32 2))] }, align 8
-// CHECK: @g_d1 = global { ptr, ptr } { ptr ptrauth (ptr getelementptr inbounds inrange(-16, 24) ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived1, i32 0, i32 0, i32 2), i32 2, i64 [[BASE1_VTABLE_DISC]], ptr @g_d1), ptr ptrauth (ptr getelementptr inbounds inrange(-16, 8) ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived1, i32 0, i32 1, i32 2), i32 2, i64 [[BASE2_VTABLE_DISC]], ptr getelementptr inbounds ({ ptr, ptr }, ptr @g_d1, i32 0, i32 1)) }, align 8
-// CHECK: @_ZTV8Derived2 = linkonce_odr unnamed_addr constant { [5 x ptr], [3 x ptr] } { [5 x ptr] [ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base21bEv, i32 0, i64 [[BASE2_B_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived2, i32 0, i32 0, i32 2)), ptr ptrauth (ptr @_ZN8Derived21cEv, i32 0, i64 [[DERIVED2_C_DISC:15537]], ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived2, i32 0, i32 0, i32 3)), ptr ptrauth (ptr @_ZN8Derived21eEv, i32 0, i64 209, ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived2, i32 0, i32 0, i32 4))], [3 x ptr] [ptr inttoptr (i64 -8 to ptr), ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived2, i32 0, i32 1, i32 2))] }, align 8
-// CHECK: @g_d2 = global { ptr, ptr } { ptr ptrauth (ptr getelementptr inbounds inrange(-16, 24) ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived2, i32 0, i32 0, i32 2), i32 2, i64 [[BASE2_VTABLE_DISC]], ptr @g_d2), ptr ptrauth (ptr getelementptr inbounds inrange(-16, 8) ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived2, i32 0, i32 1, i32 2), i32 2, i64 [[BASE1_VTABLE_DISC]], ptr getelementptr inbounds ({ ptr, ptr }, ptr @g_d2, i32 0, i32 1)) }, align 8
-// CHECK: @_ZTV8Derived3 = linkonce_odr unnamed_addr constant { [4 x ptr], [3 x ptr] } { [4 x ptr] [ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV8Derived3, i32 0, i32 0, i32 2)), ptr ptrauth (ptr @_ZN8Derived31iEv, i32 0, i64 [[DERIVED3_I_DISC:19084]], ptr getelementptr inbounds ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV8Derived3, i32 0, i32 0, i32 3))], [3 x ptr] [ptr inttoptr (i64 -8 to ptr), ptr null, ptr ptrauth (ptr @_ZN5Base21bEv, i32 0, i64 [[BASE2_B_DISC]], ptr getelementptr inbounds ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV8Derived3, i32 0, i32 1, i32 2))] }, align 8
-// CHECK: @g_d3 = global { ptr, ptr } { ptr ptrauth (ptr getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV8Derived3, i32 0, i32 0, i32 2), i32 2, i64 [[BASE1_VTABLE_DISC]], ptr @g_d3), ptr ptrauth (ptr getelementptr inbounds inrange(-16, 8) ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV8Derived3, i32 0, i32 1, i32 2), i32 2, i64 [[BASE2_VTABLE_DISC]], ptr getelementptr inbounds ({ ptr, ptr }, ptr @g_d3, i32 0, i32 1)) }, align 8
-// CHECK: @g_vb1 = global %struct.VirtualBase1 zeroinitializer, align 8
-// CHECK: @g_vb2 = global %struct.VirtualBase2 zeroinitializer, align 8
-// CHECK: @g_d4 = global %struct.Derived4 zeroinitializer, align 8
-// CHECK: @_ZTV12VirtualBase1 = linkonce_odr unnamed_addr constant { [6 x ptr] } { [6 x ptr] [ptr null, ptr null, ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [6 x ptr] }, ptr @_ZTV12VirtualBase1, i32 0, i32 0, i32 4)), ptr ptrauth (ptr @_ZN12VirtualBase11fEv, i32 0, i64 [[VIRTUALBASE1_F_DISC:7987]], ptr getelementptr inbounds ({ [6 x ptr] }, ptr @_ZTV12VirtualBase1, i32 0, i32 0, i32 5))] }, align 8
-// CHECK: @_ZTT12VirtualBase1 = linkonce_odr unnamed_addr constant [2 x ptr] [ptr ptrauth (ptr getelementptr inbounds inrange(-32, 16) ({ [6 x ptr] }, ptr @_ZTV12VirtualBase1, i32 0, i32 0, i32 4), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-32, 16) ({ [6 x ptr] }, ptr @_ZTV12VirtualBase1, i32 0, i32 0, i32 4), i32 2)], align 8
-// CHECK: @_ZTV12VirtualBase2 = linkonce_odr unnamed_addr constant { [5 x ptr], [4 x ptr] } { [5 x ptr] [ptr inttoptr (i64 8 to ptr), ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base21bEv, i32 0, i64 [[BASE2_B_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [4 x ptr] }, ptr @_ZTV12VirtualBase2, i32 0, i32 0, i32 3)), ptr ptrauth (ptr @_ZN12VirtualBase21gEv, i32 0, i64 [[VIRTUALBASE2_G_DISC:51224]], ptr getelementptr inbounds ({ [5 x ptr], [4 x ptr] }, ptr @_ZTV12VirtualBase2, i32 0, i32 0, i32 4))], [4 x ptr] [ptr null, ptr inttoptr (i64 -8 to ptr), ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [4 x ptr] }, ptr @_ZTV12VirtualBase2, i32 0, i32 1, i32 3))] }, align 8
-// CHECK: @_ZTT12VirtualBase2 = linkonce_odr unnamed_addr constant [2 x ptr] [ptr ptrauth (ptr getelementptr inbounds inrange(-24, 16) ({ [5 x ptr], [4 x ptr] }, ptr @_ZTV12VirtualBase2, i32 0, i32 0, i32 3), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-24, 8) ({ [5 x ptr], [4 x ptr] }, ptr @_ZTV12VirtualBase2, i32 0, i32 1, i32 3), i32 2)], align 8
-// CHECK: @_ZTV8Derived4 = linkonce_odr unnamed_addr constant { [7 x ptr], [5 x ptr] } { [7 x ptr] [ptr null, ptr null, ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 0, i32 4)), ptr ptrauth (ptr @_ZN12VirtualBase11fEv, i32 0, i64 [[VIRTUALBASE1_F_DISC]], ptr getelementptr inbounds ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 0, i32 5)), ptr ptrauth (ptr @_ZN8Derived41hEv, i32 0, i64 [[DERIVED4_H_DISC:31844]], ptr getelementptr inbounds ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 0, i32 6))], [5 x ptr] [ptr inttoptr (i64 -8 to ptr), ptr inttoptr (i64 -8 to ptr), ptr null, ptr ptrauth (ptr @_ZN5Base21bEv, i32 0, i64 [[BASE2_B_DISC]], ptr getelementptr inbounds ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 1, i32 3)), ptr ptrauth (ptr @_ZN12VirtualBase21gEv, i32 0, i64 [[VIRTUALBASE2_G_DISC]], ptr getelementptr inbounds ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 1, i32 4))] }, align 8
-// CHECK: @_ZTT8Derived4 = linkonce_odr unnamed_addr constant [7 x ptr] [ptr ptrauth (ptr getelementptr inbounds inrange(-32, 24) ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 0, i32 4), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-32, 16) ({ [6 x ptr] }, ptr @_ZTC8Derived40_12VirtualBase1, i32 0, i32 0, i32 4), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-32, 16) ({ [6 x ptr] }, ptr @_ZTC8Derived40_12VirtualBase1, i32 0, i32 0, i32 4), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-24, 16) ({ [5 x ptr], [4 x ptr] }, ptr @_ZTC8Derived48_12VirtualBase2, i32 0, i32 0, i32 3), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-24, 8) ({ [5 x ptr], [4 x ptr] }, ptr @_ZTC8Derived48_12VirtualBase2, i32 0, i32 1, i32 3), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-32, 24) ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 0, i32 4), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-24, 16) ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 1, i32 3), i32 2)], align 8
-// CHECK: @_ZTC8Derived40_12VirtualBase1 = linkonce_odr unnamed_addr constant { [6 x ptr] } { [6 x ptr] [ptr null, ptr null, ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [6 x ptr] }, ptr @_ZTC8Derived40_12VirtualBase1, i32 0, i32 0, i32 4)), ptr ptrauth (ptr @_ZN12VirtualBase11fEv, i32 0, i64 [[VIRTUALBASE1_F_DISC]], ptr getelementptr inbounds ({ [6 x ptr] }, ptr @_ZTC8Derived40_12VirtualBase1, i32 0, i32 0, i32 5))] }, align 8
-// CHECK: @_ZTC8Derived48_12VirtualBase2 = linkonce_odr unnamed_addr constant { [5 x ptr], [4 x ptr] } { [5 x ptr] [ptr inttoptr (i64 -8 to ptr), ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base21bEv, i32 0, i64 [[BASE2_B_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [4 x ptr] }, ptr @_ZTC8Derived48_12VirtualBase2, i32 0, i32 0, i32 3)), ptr ptrauth (ptr @_ZN12VirtualBase21gEv, i32 0, i64 [[VIRTUALBASE2_G_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [4 x ptr] }, ptr @_ZTC8Derived48_12VirtualBase2, i32 0, i32 0, i32 4))], [4 x ptr] [ptr null, ptr inttoptr (i64 8 to ptr), ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [4 x ptr] }, ptr @_ZTC8Derived48_12VirtualBase2, i32 0, i32 1, i32 3))] }, align 8
+// CHECK: @_ZTV5Base1 = linkonce_odr unnamed_addr constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC:38871]], ptr getelementptr inbounds ({ [3 x ptr] }, ptr @_ZTV5Base1, i32 0, i32 0, i32 2))] },{{.*}} align 8
+// CHECK: @g_b1 = global %struct.Base1 { ptr ptrauth (ptr getelementptr inbounds inrange(-16, 8) ({ [3 x ptr] }, ptr @_ZTV5Base1, i32 0, i32 0, i32 2), i32 2, i64 [[BASE1_VTABLE_DISC:6511]], ptr @g_b1) },{{.*}} align 8
+// CHECK: @_ZTV5Base2 = linkonce_odr unnamed_addr constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base21bEv, i32 0, i64 [[BASE2_B_DISC:27651]], ptr getelementptr inbounds ({ [3 x ptr] }, ptr @_ZTV5Base2, i32 0, i32 0, i32 2))] },{{.*}} align 8
+// CHECK: @g_b2 = global %struct.Base2 { ptr ptrauth (ptr getelementptr inbounds inrange(-16, 8) ({ [3 x ptr] }, ptr @_ZTV5Base2, i32 0, i32 0, i32 2), i32 2, i64 [[BASE2_VTABLE_DISC:63631]], ptr @g_b2) },{{.*}} align 8
+// CHECK: @_ZTV8Derived1 = linkonce_odr unnamed_addr constant { [5 x ptr], [3 x ptr] } { [5 x ptr] [ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived1, i32 0, i32 0, i32 2)), ptr ptrauth (ptr @_ZN8Derived11cEv, i32 0, i64 [[DERIVED1_C_DISC:54092]], ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived1, i32 0, i32 0, i32 3)), ptr ptrauth (ptr @_ZN8Derived11dEv, i32 0, i64 [[DERIVED1_D_DISC:37391]], ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived1, i32 0, i32 0, i32 4))], [3 x ptr] [ptr inttoptr (i64 -8 to ptr), ptr null, ptr ptrauth (ptr @_ZN5Base21bEv, i32 0, i64 [[BASE2_B_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived1, i32 0, i32 1, i32 2))] },{{.*}} align 8
+// CHECK: @g_d1 = global { ptr, ptr } { ptr ptrauth (ptr getelementptr inbounds inrange(-16, 24) ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived1, i32 0, i32 0, i32 2), i32 2, i64 [[BASE1_VTABLE_DISC]], ptr @g_d1), ptr ptrauth (ptr getelementptr inbounds inrange(-16, 8) ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived1, i32 0, i32 1, i32 2), i32 2, i64 [[BASE2_VTABLE_DISC]], ptr getelementptr inbounds ({ ptr, ptr }, ptr @g_d1, i32 0, i32 1)) },{{.*}} align 8
+// CHECK: @_ZTV8Derived2 = linkonce_odr unnamed_addr constant { [5 x ptr], [3 x ptr] } { [5 x ptr] [ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base21bEv, i32 0, i64 [[BASE2_B_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived2, i32 0, i32 0, i32 2)), ptr ptrauth (ptr @_ZN8Derived21cEv, i32 0, i64 [[DERIVED2_C_DISC:15537]], ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived2, i32 0, i32 0, i32 3)), ptr ptrauth (ptr @_ZN8Derived21eEv, i32 0, i64 209, ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived2, i32 0, i32 0, i32 4))], [3 x ptr] [ptr inttoptr (i64 -8 to ptr), ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived2, i32 0, i32 1, i32 2))] },{{.*}} align 8
+// CHECK: @g_d2 = global { ptr, ptr } { ptr ptrauth (ptr getelementptr inbounds inrange(-16, 24) ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived2, i32 0, i32 0, i32 2), i32 2, i64 [[BASE2_VTABLE_DISC]], ptr @g_d2), ptr ptrauth (ptr getelementptr inbounds inrange(-16, 8) ({ [5 x ptr], [3 x ptr] }, ptr @_ZTV8Derived2, i32 0, i32 1, i32 2), i32 2, i64 [[BASE1_VTABLE_DISC]], ptr getelementptr inbounds ({ ptr, ptr }, ptr @g_d2, i32 0, i32 1)) },{{.*}} align 8
+// CHECK: @_ZTV8Derived3 = linkonce_odr unnamed_addr constant { [4 x ptr], [3 x ptr] } { [4 x ptr] [ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV8Derived3, i32 0, i32 0, i32 2)), ptr ptrauth (ptr @_ZN8Derived31iEv, i32 0, i64 [[DERIVED3_I_DISC:19084]], ptr getelementptr inbounds ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV8Derived3, i32 0, i32 0, i32 3))], [3 x ptr] [ptr inttoptr (i64 -8 to ptr), ptr null, ptr ptrauth (ptr @_ZN5Base21bEv, i32 0, i64 [[BASE2_B_DISC]], ptr getelementptr inbounds ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV8Derived3, i32 0, i32 1, i32 2))] },{{.*}} align 8
+// CHECK: @g_d3 = global { ptr, ptr } { ptr ptrauth (ptr getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV8Derived3, i32 0, i32 0, i32 2), i32 2, i64 [[BASE1_VTABLE_DISC]], ptr @g_d3), ptr ptrauth (ptr getelementptr inbounds inrange(-16, 8) ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV8Derived3, i32 0, i32 1, i32 2), i32 2, i64 [[BASE2_VTABLE_DISC]], ptr getelementptr inbounds ({ ptr, ptr }, ptr @g_d3, i32 0, i32 1)) },{{.*}} align 8
+// CHECK: @g_vb1 = global %struct.VirtualBase1 zeroinitializer,{{.*}} align 8
+// CHECK: @g_vb2 = global %struct.VirtualBase2 zeroinitializer,{{.*}} align 8
+// CHECK: @g_d4 = global %struct.Derived4 zeroinitializer,{{.*}} align 8
+// CHECK: @_ZTV12VirtualBase1 = linkonce_odr unnamed_addr constant { [6 x ptr] } { [6 x ptr] [ptr null, ptr null, ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [6 x ptr] }, ptr @_ZTV12VirtualBase1, i32 0, i32 0, i32 4)), ptr ptrauth (ptr @_ZN12VirtualBase11fEv, i32 0, i64 [[VIRTUALBASE1_F_DISC:7987]], ptr getelementptr inbounds ({ [6 x ptr] }, ptr @_ZTV12VirtualBase1, i32 0, i32 0, i32 5))] },{{.*}} align 8
+// CHECK: @_ZTT12VirtualBase1 = linkonce_odr unnamed_addr constant [2 x ptr] [ptr ptrauth (ptr getelementptr inbounds inrange(-32, 16) ({ [6 x ptr] }, ptr @_ZTV12VirtualBase1, i32 0, i32 0, i32 4), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-32, 16) ({ [6 x ptr] }, ptr @_ZTV12VirtualBase1, i32 0, i32 0, i32 4), i32 2)],{{.*}} align 8
+// CHECK: @_ZTV12VirtualBase2 = linkonce_odr unnamed_addr constant { [5 x ptr], [4 x ptr] } { [5 x ptr] [ptr inttoptr (i64 8 to ptr), ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base21bEv, i32 0, i64 [[BASE2_B_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [4 x ptr] }, ptr @_ZTV12VirtualBase2, i32 0, i32 0, i32 3)), ptr ptrauth (ptr @_ZN12VirtualBase21gEv, i32 0, i64 [[VIRTUALBASE2_G_DISC:51224]], ptr getelementptr inbounds ({ [5 x ptr], [4 x ptr] }, ptr @_ZTV12VirtualBase2, i32 0, i32 0, i32 4))], [4 x ptr] [ptr null, ptr inttoptr (i64 -8 to ptr), ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [4 x ptr] }, ptr @_ZTV12VirtualBase2, i32 0, i32 1, i32 3))] },{{.*}} align 8
+// CHECK: @_ZTT12VirtualBase2 = linkonce_odr unnamed_addr constant [2 x ptr] [ptr ptrauth (ptr getelementptr inbounds inrange(-24, 16) ({ [5 x ptr], [4 x ptr] }, ptr @_ZTV12VirtualBase2, i32 0, i32 0, i32 3), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-24, 8) ({ [5 x ptr], [4 x ptr] }, ptr @_ZTV12VirtualBase2, i32 0, i32 1, i32 3), i32 2)],{{.*}} align 8
+// CHECK: @_ZTV8Derived4 = linkonce_odr unnamed_addr constant { [7 x ptr], [5 x ptr] } { [7 x ptr] [ptr null, ptr null, ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 0, i32 4)), ptr ptrauth (ptr @_ZN12VirtualBase11fEv, i32 0, i64 [[VIRTUALBASE1_F_DISC]], ptr getelementptr inbounds ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 0, i32 5)), ptr ptrauth (ptr @_ZN8Derived41hEv, i32 0, i64 [[DERIVED4_H_DISC:31844]], ptr getelementptr inbounds ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 0, i32 6))], [5 x ptr] [ptr inttoptr (i64 -8 to ptr), ptr inttoptr (i64 -8 to ptr), ptr null, ptr ptrauth (ptr @_ZN5Base21bEv, i32 0, i64 [[BASE2_B_DISC]], ptr getelementptr inbounds ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 1, i32 3)), ptr ptrauth (ptr @_ZN12VirtualBase21gEv, i32 0, i64 [[VIRTUALBASE2_G_DISC]], ptr getelementptr inbounds ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 1, i32 4))] },{{.*}} align 8
+// CHECK: @_ZTT8Derived4 = linkonce_odr unnamed_addr constant [7 x ptr] [ptr ptrauth (ptr getelementptr inbounds inrange(-32, 24) ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 0, i32 4), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-32, 16) ({ [6 x ptr] }, ptr @_ZTC8Derived40_12VirtualBase1, i32 0, i32 0, i32 4), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-32, 16) ({ [6 x ptr] }, ptr @_ZTC8Derived40_12VirtualBase1, i32 0, i32 0, i32 4), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-24, 16) ({ [5 x ptr], [4 x ptr] }, ptr @_ZTC8Derived48_12VirtualBase2, i32 0, i32 0, i32 3), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-24, 8) ({ [5 x ptr], [4 x ptr] }, ptr @_ZTC8Derived48_12VirtualBase2, i32 0, i32 1, i32 3), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-32, 24) ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 0, i32 4), i32 2), ptr ptrauth (ptr getelementptr inbounds inrange(-24, 16) ({ [7 x ptr], [5 x ptr] }, ptr @_ZTV8Derived4, i32 0, i32 1, i32 3), i32 2)],{{.*}} align 8
+// CHECK: @_ZTC8Derived40_12VirtualBase1 = linkonce_odr unnamed_addr constant { [6 x ptr] } { [6 x ptr] [ptr null, ptr null, ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [6 x ptr] }, ptr @_ZTC8Derived40_12VirtualBase1, i32 0, i32 0, i32 4)), ptr ptrauth (ptr @_ZN12VirtualBase11fEv, i32 0, i64 [[VIRTUALBASE1_F_DISC]], ptr getelementptr inbounds ({ [6 x ptr] }, ptr @_ZTC8Derived40_12VirtualBase1, i32 0, i32 0, i32 5))] },{{.*}} align 8
+// CHECK: @_ZTC8Derived48_12VirtualBase2 = linkonce_odr unnamed_addr constant { [5 x ptr], [4 x ptr] } { [5 x ptr] [ptr inttoptr (i64 -8 to ptr), ptr null, ptr null, ptr ptrauth (ptr @_ZN5Base21bEv, i32 0, i64 [[BASE2_B_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [4 x ptr] }, ptr @_ZTC8Derived48_12VirtualBase2, i32 0, i32 0, i32 3)), ptr ptrauth (ptr @_ZN12VirtualBase21gEv, i32 0, i64 [[VIRTUALBASE2_G_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [4 x ptr] }, ptr @_ZTC8Derived48_12VirtualBase2, i32 0, i32 0, i32 4))], [4 x ptr] [ptr null, ptr inttoptr (i64 8 to ptr), ptr null, ptr ptrauth (ptr @_ZN5Base11aEv, i32 0, i64 [[BASE1_A_DISC]], ptr getelementptr inbounds ({ [5 x ptr], [4 x ptr] }, ptr @_ZTC8Derived48_12VirtualBase2, i32 0, i32 1, i32 3))] },{{.*}} align 8
 
 struct Base1 { virtual void a() {} };
 struct Base2 { virtual void b() {} };
@@ -73,20 +76,24 @@ struct Derived5 : VirtualBase2, VirtualBase1 {
   virtual void h() {}
 };
 
-// CHECK-LABEL: define {{.*}} ptr @_ZN12VirtualBase1C1Ev
+// DARWIN-LABEL: define {{.*}} ptr @_ZN12VirtualBase1C1Ev
+// ELF-LABEL:    define {{.*}} void @_ZN12VirtualBase1C1Ev
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE1_VTABLE_DISC]])
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE1_VTABLE_DISC]])
 
-// CHECK-LABEL: define {{.*}} ptr @_ZN12VirtualBase2C1Ev
+// DARWIN-LABEL: define {{.*}} ptr @_ZN12VirtualBase2C1Ev
+// ELF-LABEL:    define {{.*}} void @_ZN12VirtualBase2C1Ev
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE2_VTABLE_DISC]])
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE1_VTABLE_DISC]])
 
-// CHECK-LABEL: define {{.*}} ptr @_ZN8Derived4C1Ev
+// DARWIN-LABEL: define {{.*}} ptr @_ZN8Derived4C1Ev
+// ELF-LABEL:    define {{.*}} void @_ZN8Derived4C1Ev
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE1_VTABLE_DISC]])
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE1_VTABLE_DISC]])
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE2_VTABLE_DISC]])
 
-// CHECK-LABEL: define {{.*}} ptr @_ZN8Derived5C1Ev
+// DARWIN-LABEL: define {{.*}} ptr @_ZN8Derived5C1Ev
+// ELF-LABEL:    define {{.*}} void @_ZN8Derived5C1Ev
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE2_VTABLE_DISC]])
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE1_VTABLE_DISC]])
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE1_VTABLE_DISC]])
@@ -155,7 +162,7 @@ extern "C" void cross_check_vtables(Base1 *b1,
   d5->h();
 }
 
-// CHECK-LABEL: define void @cross_check_vtables(
+// CHECK-LABEL: define{{.*}} void @cross_check_vtables(
 // CHECK: "; b1->a()",
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE1_VTABLE_DISC]])
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE1_A_DISC]])
@@ -214,21 +221,25 @@ extern "C" void cross_check_vtables(Base1 *b1,
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE1_VTABLE_DISC]])
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[DERIVED4_H_DISC]])
 
-// CHECK-LABEL: define {{.*}} ptr @_ZN5Base1C2Ev
+// DARWIN-LABEL: define {{.*}} ptr @_ZN5Base1C2Ev
+// ELF-LABEL:    define {{.*}} void @_ZN5Base1C2Ev
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE1_VTABLE_DISC]])
 
-// CHECK-LABEL: define {{.*}} ptr @_ZN5Base2C2Ev
+// DARWIN-LABEL: define {{.*}} ptr @_ZN5Base2C2Ev
+// ELF-LABEL:    define {{.*}} void @_ZN5Base2C2Ev
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE2_VTABLE_DISC]])
 
-// CHECK-LABEL: define {{.*}} ptr @_ZN8Derived1C2Ev
+// DARWIN-LABEL: define {{.*}} ptr @_ZN8Derived1C2Ev
+// ELF-LABEL:    define {{.*}} void @_ZN8Derived1C2Ev
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE1_VTABLE_DISC]])
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE2_VTABLE_DISC]])
 
-// CHECK-LABEL: define {{.*}} ptr @_ZN8Derived2C2Ev
+// DARWIN-LABEL: define {{.*}} ptr @_ZN8Derived2C2Ev
+// ELF-LABEL:    define {{.*}} void @_ZN8Derived2C2Ev
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE2_VTABLE_DISC]])
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE1_VTABLE_DISC]])
 
-// CHECK-LABEL: define {{.*}} ptr @_ZN8Derived3C2Ev
+// DARWIN-LABEL: define {{.*}} ptr @_ZN8Derived3C2Ev
+// ELF-LABEL:    define {{.*}} void @_ZN8Derived3C2Ev
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE1_VTABLE_DISC]])
 // CHECK: call i64 @llvm.ptrauth.blend(i64 {{%.*}}, i64 [[BASE2_VTABLE_DISC]])
-
diff --git a/clang/test/CodeGenCXX/ptrauth-member-function-pointer.cpp b/clang/test/CodeGenCXX/ptrauth-member-function-pointer.cpp
index 5e84e3e7bc5e9..0a9ac3fa510f5 100644
--- a/clang/test/CodeGenCXX/ptrauth-member-function-pointer.cpp
+++ b/clang/test/CodeGenCXX/ptrauth-member-function-pointer.cpp
@@ -1,8 +1,14 @@
-// RUN: %clang_cc1 -triple arm64-apple-ios -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -o - %s | FileCheck -check-prefixes=CHECK,NODEBUG %s
-// RUN: %clang_cc1 -triple arm64-apple-ios -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -debug-info-kind=limited -o - %s | FileCheck -check-prefixes=CHECK %s
-// RUN: %clang_cc1 -triple arm64-apple-ios -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -stack-protector 1 -o - %s | FileCheck %s -check-prefix=STACK-PROT
-// RUN: %clang_cc1 -triple arm64-apple-ios -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -stack-protector 2 -o - %s | FileCheck %s -check-prefix=STACK-PROT
-// RUN: %clang_cc1 -triple arm64-apple-ios -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -stack-protector 3 -o - %s | FileCheck %s -check-prefix=STACK-PROT
+// RUN: %clang_cc1 -triple arm64-apple-ios   -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -o - %s | FileCheck -check-prefixes=CHECK,NODEBUG,DARWIN %s
+// RUN: %clang_cc1 -triple arm64-apple-ios   -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -debug-info-kind=limited -o - %s | FileCheck -check-prefixes=CHECK,DARWIN %s
+// RUN: %clang_cc1 -triple arm64-apple-ios   -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -stack-protector 1 -o - %s | FileCheck %s -check-prefix=STACK-PROT
+// RUN: %clang_cc1 -triple arm64-apple-ios   -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -stack-protector 2 -o - %s | FileCheck %s -check-prefix=STACK-PROT
+// RUN: %clang_cc1 -triple arm64-apple-ios   -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -stack-protector 3 -o - %s | FileCheck %s -check-prefix=STACK-PROT
+
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -o - %s | FileCheck -check-prefixes=CHECK,NODEBUG,ELF %s
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -debug-info-kind=limited -o - %s | FileCheck -check-prefixes=CHECK,ELF %s
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -stack-protector 1 -o - %s | FileCheck %s -check-prefix=STACK-PROT
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -stack-protector 2 -o - %s | FileCheck %s -check-prefix=STACK-PROT
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -fptrauth-calls -fptrauth-intrinsics -emit-llvm -std=c++11 -O1 -disable-llvm-passes -stack-protector 3 -o - %s | FileCheck %s -check-prefix=STACK-PROT
 
 
 // CHECK: @gmethod0 = global { i64, i64 } { i64 ptrtoint (ptr ptrauth (ptr @_ZN5Base011nonvirtual0Ev, i32 0, i64 [[TYPEDISC1:35591]]) to i64), i64 0 }, align 8
@@ -78,9 +84,9 @@ struct Class0 {
   MethodTy1 m0;
 };
 
-// CHECK: define void @_ZN5Base08virtual1Ev(
+// CHECK: define{{.*}} void @_ZN5Base08virtual1Ev(
 
-// CHECK: define void @_Z5test0v()
+// CHECK: define{{.*}} void @_Z5test0v()
 // CHECK: %[[METHOD0:.*]] = alloca { i64, i64 }, align 8
 // CHECK-NEXT: %[[VARMETHOD1:.*]] = alloca { i64, i64 }, align 8
 // CHECK-NEXT: %[[METHOD2:.*]] = alloca { i64, i64 }, align 8
@@ -246,7 +252,7 @@ void test0() {
   method7 = &Derived1::virtual1;
 }
 
-// CHECK: define void @_Z5test1P5Base0MS_FvvE(ptr noundef %[[A0:.*]], [2 x i64] %[[A1_COERCE:.*]])
+// CHECK: define{{.*}} void @_Z5test1P5Base0MS_FvvE(ptr noundef %[[A0:.*]], [2 x i64] %[[A1_COERCE:.*]])
 // CHECK: %[[A1:.*]] = alloca { i64, i64 }, align 8
 // CHECK: %[[A0_ADDR:.*]] = alloca ptr, align 8
 // CHECK: %[[A1_ADDR:.*]] = alloca { i64, i64 }, align 8
@@ -264,15 +270,16 @@ void test0() {
 // CHECK: %[[MEMPTR_ISVIRTUAL:.*]] = icmp ne i64 %[[V5]], 0
 // CHECK: br i1 %[[MEMPTR_ISVIRTUAL]]
 
-// CHECK: %[[VTABLE:.*]] = load ptr, ptr %[[V4]], align 8
-// CHECK: %[[V7:.*]] = ptrtoint ptr %[[VTABLE]] to i64
-// CHECK: %[[V8:.*]] = call i64 @llvm.ptrauth.auth(i64 %[[V7]], i32 2, i64 0)
-// CHECK: %[[V9:.*]] = inttoptr i64 %[[V8]] to ptr
-// CHECK: %[[V10:.*]] = trunc i64 %[[MEMPTR_PTR]] to i32
-// CHECK: %[[V11:.*]] = zext i32 %[[V10]] to i64
-// CHECK: %[[V12:.*]] = getelementptr i8, ptr %[[V9]], i64 %[[V11]]
-// CHECK: %[[MEMPTR_VIRTUALFN:.*]] = load ptr, ptr %[[V12]], align 8
-// CHECK: br
+// CHECK:  %[[VTABLE:.*]] = load ptr, ptr %[[V4]], align 8
+// CHECK:  %[[V7:.*]] = ptrtoint ptr %[[VTABLE]] to i64
+// CHECK:  %[[V8:.*]] = call i64 @llvm.ptrauth.auth(i64 %[[V7]], i32 2, i64 0)
+// CHECK:  %[[V9:.*]] = inttoptr i64 %[[V8]] to ptr
+// DARWIN: %[[V10:.*]] = trunc i64 %[[MEMPTR_PTR]] to i32
+// DARWIN: %[[V11:.*]] = zext i32 %[[V10]] to i64
+// DARWIN: %[[V12:.*]] = getelementptr i8, ptr %[[V9]], i64 %[[V11]]
+// ELF:    %[[V12:.*]] = getelementptr i8, ptr %[[V9]], i64 %[[MEMPTR_PTR]]
+// CHECK:  %[[MEMPTR_VIRTUALFN:.*]] = load ptr, ptr %[[V12]], align 8
+// CHECK:  br
 
 // CHECK: %[[MEMPTR_NONVIRTUALFN:.*]] = inttoptr i64 %[[MEMPTR_PTR]] to ptr
 // CHECK: br
@@ -286,7 +293,7 @@ void test1(Base0 *a0, MethodTy0 a1) {
   (a0->*a1)();
 }
 
-// CHECK: define void @_Z15testConversion0M5Base0FvvEM8Derived0FvvE([2 x i64] %[[METHOD0_COERCE:.*]], [2 x i64] %[[METHOD1_COERCE:.*]])
+// CHECK: define{{.*}} void @_Z15testConversion0M5Base0FvvEM8Derived0FvvE([2 x i64] %[[METHOD0_COERCE:.*]], [2 x i64] %[[METHOD1_COERCE:.*]])
 // CHECK: %[[METHOD0:.*]] = alloca { i64, i64 }, align 8
 // CHECK: %[[METHOD1:.*]] = alloca { i64, i64 }, align 8
 // CHECK: %[[METHOD0_ADDR:.*]] = alloca { i64, i64 }, align 8
@@ -326,21 +333,21 @@ void testConversion0(MethodTy0 method0, MethodTy1 method1) {
   method1 = method0;
 }
 
-// CHECK: define void @_Z15testConversion1M5Base0FvvE(
+// CHECK: define{{.*}} void @_Z15testConversion1M5Base0FvvE(
 // CHECK: call i64 @llvm.ptrauth.resign(i64 %{{.*}}, i32 0, i64 [[TYPEDISC0]], i32 0, i64 [[TYPEDISC1]])
 
 void testConversion1(MethodTy0 method0) {
   MethodTy1 method1 = reinterpret_cast<MethodTy1>(method0);
 }
 
-// CHECK: define void @_Z15testConversion2M8Derived0FvvE(
+// CHECK: define{{.*}} void @_Z15testConversion2M8Derived0FvvE(
 // CHECK: call i64 @llvm.ptrauth.resign(i64 %{{.*}}, i32 0, i64 [[TYPEDISC1]], i32 0, i64 [[TYPEDISC0]])
 
 void testConversion2(MethodTy1 method1) {
   MethodTy0 method0 = static_cast<MethodTy0>(method1);
 }
 
-// CHECK: define void @_Z15testConversion3M8Derived0FvvE(
+// CHECK: define{{.*}} void @_Z15testConversion3M8Derived0FvvE(
 // CHECK: call i64 @llvm.ptrauth.resign(i64 %{{.*}}, i32 0, i64 [[TYPEDISC1]], i32 0, i64 [[TYPEDISC0]])
 
 void testConversion3(MethodTy1 method1) {
@@ -350,7 +357,7 @@ void testConversion3(MethodTy1 method1) {
 // No need to call @llvm.ptrauth.resign if the source member function
 // pointer is a constant.
 
-// CHECK: define void @_Z15testConversion4v(
+// CHECK: define{{.*}} void @_Z15testConversion4v(
 // CHECK: %[[METHOD0:.*]] = alloca { i64, i64 }, align 8
 // CHECK: store { i64, i64 } { i64 ptrtoint (ptr ptrauth (ptr @_ZN5Base08virtual1Ev_vfpthunk_, i32 0, i64 [[TYPEDISC0]]) to i64), i64 0 }, ptr %[[METHOD0]], align 8
 // CHECK: ret void
@@ -396,7 +403,7 @@ MethodTy1 gmethod0 = reinterpret_cast<MethodTy1>(&Base0::nonvirtual0);
 MethodTy0 gmethod1 = reinterpret_cast<MethodTy0>(&Derived0::nonvirtual5);
 MethodTy0 gmethod2 = reinterpret_cast<MethodTy0>(&Derived0::virtual1);
 
-// CHECK-LABEL: define void @_Z13testArrayInitv()
+// CHECK-LABEL: define{{.*}} void @_Z13testArrayInitv()
 // CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 8 %p0, ptr align 8 @__const._Z13testArrayInitv.p0, i64 16, i1 false)
 // CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 8 %p1, ptr align 8 @__const._Z13testArrayInitv.p1, i64 16, i1 false)
 // CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 8 %c0, ptr align 8 @__const._Z13testArrayInitv.c0, i64 16, i1 false)
@@ -424,7 +431,7 @@ void testArrayInit() {
 // STACK-PROT-NOT: sspreq
 // STACK-PROT-NEXT: attributes
 
-// CHECK: define void @_Z15testConvertNullv(
+// CHECK: define{{.*}} void @_Z15testConvertNullv(
 // CHECK: %[[T:.*]] = alloca { i64, i64 },
 // store { i64, i64 } zeroinitializer, { i64, i64 }* %[[T]],
 
diff --git a/clang/test/CodeGenCXX/ptrauth-type-info-vtable.cpp b/clang/test/CodeGenCXX/ptrauth-type-info-vtable.cpp
index d5f69e0485140..174aeda89d175 100644
--- a/clang/test/CodeGenCXX/ptrauth-type-info-vtable.cpp
+++ b/clang/test/CodeGenCXX/ptrauth-type-info-vtable.cpp
@@ -4,6 +4,12 @@
 // RUN:   -fptrauth-vtable-pointer-address-discrimination \
 // RUN:   %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK,NODISC
 
+// RUN: %clang_cc1 -DENABLE_TID=0 -I%S -std=c++11 -triple=aarch64-linux-gnu \
+// RUN:   -fptrauth-calls -fptrauth-intrinsics \
+// RUN:   -fptrauth-vtable-pointer-type-discrimination \
+// RUN:   -fptrauth-vtable-pointer-address-discrimination \
+// RUN:   %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK,NODISC
+
 // RUN: %clang_cc1 -DENABLE_TID=1 -I%S -std=c++11 -triple=arm64e-apple-darwin \
 // RUN:   -fptrauth-calls -fptrauth-intrinsics \
 // RUN:   -fptrauth-vtable-pointer-type-discrimination \
@@ -11,6 +17,13 @@
 // RUN:   -fptrauth-type-info-vtable-pointer-discrimination \
 // RUN:   %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK,DISC
 
+// RUN: %clang_cc1 -DENABLE_TID=1 -I%S -std=c++11 -triple=aarch64-linux-gnu \
+// RUN:   -fptrauth-calls -fptrauth-intrinsics \
+// RUN:   -fptrauth-vtable-pointer-type-discrimination \
+// RUN:   -fptrauth-vtable-pointer-address-discrimination \
+// RUN:   -fptrauth-type-info-vtable-pointer-discrimination \
+// RUN:   %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK,DISC
+
 // copied from typeinfo
 namespace std {
 
@@ -64,7 +77,7 @@ TestStruct::~TestStruct(){}
 extern "C" void test_vtable(std::type_info* t) {
   t->test_method();
 }
-// NODISC: define void @test_vtable(ptr noundef %t)
+// NODISC: define{{.*}} void @test_vtable(ptr noundef %t)
 // NODISC: [[T_ADDR:%.*]] = alloca ptr, align 8
 // NODISC: store ptr %t, ptr [[T_ADDR]], align 8
 // NODISC: [[T:%.*]] = load ptr, ptr [[T_ADDR]], align 8
@@ -72,7 +85,7 @@ extern "C" void test_vtable(std::type_info* t) {
 // NODISC: [[CAST_VPTR:%.*]] = ptrtoint ptr [[VPTR]] to i64
 // NODISC: [[AUTHED:%.*]] = call i64 @llvm.ptrauth.auth(i64 [[CAST_VPTR]], i32 2, i64 0)
 
-// DISC: define void @test_vtable(ptr noundef %t)
+// DISC: define{{.*}} void @test_vtable(ptr noundef %t)
 // DISC: [[T_ADDR:%.*]] = alloca ptr, align 8
 // DISC: store ptr %t, ptr [[T_ADDR]], align 8
 // DISC: [[T:%.*]] = load ptr, ptr [[T_ADDR]], align 8
diff --git a/clang/test/Driver/aarch64-ptrauth.c b/clang/test/Driver/aarch64-ptrauth.c
index eeb9500792d75..c8e3aeef1640a 100644
--- a/clang/test/Driver/aarch64-ptrauth.c
+++ b/clang/test/Driver/aarch64-ptrauth.c
@@ -11,9 +11,11 @@
 // RUN:   -fno-ptrauth-auth-traps -fptrauth-auth-traps \
 // RUN:   -fno-ptrauth-vtable-pointer-address-discrimination -fptrauth-vtable-pointer-address-discrimination \
 // RUN:   -fno-ptrauth-vtable-pointer-type-discrimination -fptrauth-vtable-pointer-type-discrimination \
+// RUN:   -fno-ptrauth-type-info-vtable-pointer-discrimination -fptrauth-type-info-vtable-pointer-discrimination \
 // RUN:   -fno-ptrauth-init-fini -fptrauth-init-fini \
+// RUN:   -fno-ptrauth-indirect-gotos -fptrauth-indirect-gotos \
 // RUN:   %s 2>&1 | FileCheck %s --check-prefix=ALL
-// ALL: "-cc1"{{.*}} "-fptrauth-intrinsics" "-fptrauth-calls" "-fptrauth-returns" "-fptrauth-auth-traps" "-fptrauth-vtable-pointer-address-discrimination" "-fptrauth-vtable-pointer-type-discrimination" "-fptrauth-init-fini"
+// ALL: "-cc1"{{.*}} "-fptrauth-intrinsics" "-fptrauth-calls" "-fptrauth-returns" "-fptrauth-auth-traps" "-fptrauth-vtable-pointer-address-discrimination" "-fptrauth-vtable-pointer-type-discrimination" "-fptrauth-type-info-vtable-pointer-discrimination" "-fptrauth-init-fini" "-fptrauth-indirect-gotos"
 
 // RUN: %clang -### -c --target=aarch64-linux -mabi=pauthtest %s 2>&1 | FileCheck %s --check-prefix=PAUTHABI1
 // RUN: %clang -### -c --target=aarch64-linux-pauthtest %s 2>&1 | FileCheck %s --check-prefix=PAUTHABI1
@@ -34,13 +36,16 @@
 
 // RUN: not %clang -### -c --target=x86_64 -fptrauth-intrinsics -fptrauth-calls -fptrauth-returns -fptrauth-auth-traps \
 // RUN:   -fptrauth-vtable-pointer-address-discrimination -fptrauth-vtable-pointer-type-discrimination \
-// RUN:   -fptrauth-init-fini %s 2>&1 | FileCheck %s --check-prefix=ERR1
+// RUN:   -fptrauth-type-info-vtable-pointer-discrimination -fptrauth-indirect-gotos -fptrauth-init-fini %s 2>&1 | \
+// RUN:   FileCheck %s --check-prefix=ERR1
 // ERR1:      error: unsupported option '-fptrauth-intrinsics' for target '{{.*}}'
 // ERR1-NEXT: error: unsupported option '-fptrauth-calls' for target '{{.*}}'
 // ERR1-NEXT: error: unsupported option '-fptrauth-returns' for target '{{.*}}'
 // ERR1-NEXT: error: unsupported option '-fptrauth-auth-traps' for target '{{.*}}'
 // ERR1-NEXT: error: unsupported option '-fptrauth-vtable-pointer-address-discrimination' for target '{{.*}}'
 // ERR1-NEXT: error: unsupported option '-fptrauth-vtable-pointer-type-discrimination' for target '{{.*}}'
+// ERR1-NEXT: error: unsupported option '-fptrauth-type-info-vtable-pointer-discrimination' for target '{{.*}}'
+// ERR1-NEXT: error: unsupported option '-fptrauth-indirect-gotos' for target '{{.*}}'
 // ERR1-NEXT: error: unsupported option '-fptrauth-init-fini' for target '{{.*}}'
 
 //// Only support PAuth ABI for Linux as for now.
diff --git a/clang/test/Preprocessor/ptrauth_feature.c b/clang/test/Preprocessor/ptrauth_feature.c
index 1330ad10b4b47..14059f827b94c 100644
--- a/clang/test/Preprocessor/ptrauth_feature.c
+++ b/clang/test/Preprocessor/ptrauth_feature.c
@@ -2,25 +2,31 @@
 //// For example, -fptrauth-init-fini will not affect codegen without -fptrauth-calls, but the preprocessor feature would be set anyway.
 
 // RUN: %clang_cc1 -E %s -triple=aarch64 -fptrauth-intrinsics | \
-// RUN:   FileCheck %s --check-prefixes=INTRIN,NOCALLS,NORETS,NOVPTR_ADDR_DISCR,NOVPTR_TYPE_DISCR,NOFUNC,NOINITFINI
+// RUN:   FileCheck %s --check-prefixes=INTRIN,NOCALLS,NORETS,NOVPTR_ADDR_DISCR,NOVPTR_TYPE_DISCR,NOTYPE_INFO_DISCR,NOFUNC,NOINITFINI,NOGOTOS
 
 // RUN: %clang_cc1 -E %s -triple=aarch64 -fptrauth-calls | \
-// RUN:   FileCheck %s --check-prefixes=NOINTRIN,CALLS,NORETS,NOVPTR_ADDR_DISCR,NOVPTR_TYPE_DISCR,NOFUNC,NOINITFINI
+// RUN:   FileCheck %s --check-prefixes=NOINTRIN,CALLS,NORETS,NOVPTR_ADDR_DISCR,NOVPTR_TYPE_DISCR,NOTYPE_INFO_DISCR,NOFUNC,NOINITFINI,NOGOTOS
 
 // RUN: %clang_cc1 -E %s -triple=aarch64 -fptrauth-returns | \
-// RUN:   FileCheck %s --check-prefixes=NOINTRIN,NOCALLS,RETS,NOVPTR_ADDR_DISCR,NOVPTR_TYPE_DISCR,NOFUNC,NOINITFINI
+// RUN:   FileCheck %s --check-prefixes=NOINTRIN,NOCALLS,RETS,NOVPTR_ADDR_DISCR,NOVPTR_TYPE_DISCR,NOTYPE_INFO_DISCR,NOFUNC,NOINITFINI,NOGOTOS
 
 // RUN: %clang_cc1 -E %s -triple=aarch64 -fptrauth-vtable-pointer-address-discrimination | \
-// RUN:   FileCheck %s --check-prefixes=NOINTRIN,NOCALLS,NORETS,VPTR_ADDR_DISCR,NOVPTR_TYPE_DISCR,NOFUNC,NOINITFINI
+// RUN:   FileCheck %s --check-prefixes=NOINTRIN,NOCALLS,NORETS,VPTR_ADDR_DISCR,NOVPTR_TYPE_DISCR,NOTYPE_INFO_DISCR,NOFUNC,NOINITFINI,NOGOTOS
 
 // RUN: %clang_cc1 -E %s -triple=aarch64 -fptrauth-vtable-pointer-type-discrimination | \
-// RUN:   FileCheck %s --check-prefixes=NOINTRIN,NOCALLS,NORETS,NOVPTR_ADDR_DISCR,VPTR_TYPE_DISCR,NOFUNC,NOINITFINI
+// RUN:   FileCheck %s --check-prefixes=NOINTRIN,NOCALLS,NORETS,NOVPTR_ADDR_DISCR,VPTR_TYPE_DISCR,NOTYPE_INFO_DISCR,NOFUNC,NOINITFINI,NOGOTOS
+
+// RUN: %clang_cc1 -E %s -triple=aarch64 -fptrauth-type-info-vtable-pointer-discrimination | \
+// RUN:   FileCheck %s --check-prefixes=NOINTRIN,NOCALLS,NORETS,NOVPTR_ADDR_DISCR,NOVPTR_TYPE_DISCR,TYPE_INFO_DISCR,NOFUNC,NOINITFINI,NOGOTOS
 
 // RUN: %clang_cc1 -E %s -triple=aarch64 -fptrauth-function-pointer-type-discrimination | \
-// RUN:   FileCheck %s --check-prefixes=NOINTRIN,NOCALLS,NORETS,NOVPTR_ADDR_DISCR,NOVPTR_TYPE_DISCR,FUNC,NOINITFINI
+// RUN:   FileCheck %s --check-prefixes=NOINTRIN,NOCALLS,NORETS,NOVPTR_ADDR_DISCR,NOVPTR_TYPE_DISCR,NOTYPE_INFO_DISCR,FUNC,NOINITFINI,NOGOTOS
 
 // RUN: %clang_cc1 -E %s -triple=aarch64 -fptrauth-init-fini | \
-// RUN:   FileCheck %s --check-prefixes=NOINTRIN,NOCALLS,NORETS,NOVPTR_ADDR_DISCR,NOVPTR_TYPE_DISCR,NOFUNC,INITFINI
+// RUN:   FileCheck %s --check-prefixes=NOINTRIN,NOCALLS,NORETS,NOVPTR_ADDR_DISCR,NOVPTR_TYPE_DISCR,NOTYPE_INFO_DISCR,NOFUNC,INITFINI,NOGOTOS
+
+// RUN: %clang_cc1 -E %s -triple=aarch64 -fptrauth-indirect-gotos | \
+// RUN:   FileCheck %s --check-prefixes=NOINTRIN,NOCALLS,NORETS,NOVPTR_ADDR_DISCR,NOVPTR_TYPE_DISCR,NOTYPE_INFO_DISCR,NOFUNC,NOINITFINI,GOTOS
 
 #if __has_feature(ptrauth_intrinsics)
 // INTRIN: has_ptrauth_intrinsics
@@ -71,6 +77,14 @@ void has_ptrauth_vtable_pointer_type_discrimination() {}
 void no_ptrauth_vtable_pointer_type_discrimination() {}
 #endif
 
+#if __has_feature(ptrauth_type_info_vtable_pointer_discrimination)
+// TYPE_INFO_DISCR: has_ptrauth_type_info_vtable_pointer_discrimination
+void has_ptrauth_type_info_vtable_pointer_discrimination() {}
+#else
+// NOTYPE_INFO_DISCR: no_ptrauth_type_info_vtable_pointer_discrimination
+void no_ptrauth_type_info_vtable_pointer_discrimination() {}
+#endif
+
 #if __has_feature(ptrauth_function_pointer_type_discrimination)
 // FUNC: has_ptrauth_function_pointer_type_discrimination
 void has_ptrauth_function_pointer_type_discrimination() {}
@@ -86,3 +100,11 @@ void has_ptrauth_init_fini() {}
 // NOINITFINI: no_ptrauth_init_fini
 void no_ptrauth_init_fini() {}
 #endif
+
+#if __has_feature(ptrauth_indirect_gotos)
+// GOTOS: has_ptrauth_indirect_gotos
+void has_ptrauth_indirect_gotos() {}
+#else
+// NOGOTOS: no_ptrauth_indirect_gotos
+void no_ptrauth_indirect_gotos() {}
+#endif
diff --git a/clang/test/Sema/ptrauth-indirect-goto.c b/clang/test/Sema/ptrauth-indirect-goto.c
index 47bc76738d23b..7304f5c30a117 100644
--- a/clang/test/Sema/ptrauth-indirect-goto.c
+++ b/clang/test/Sema/ptrauth-indirect-goto.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 -triple arm64e-apple-darwin -fsyntax-only -verify %s -fptrauth-indirect-gotos
+// RUN: %clang_cc1 -triple aarch64-linux-gnu   -fsyntax-only -verify %s -fptrauth-indirect-gotos
 
 int f() {
   static void *addrs[] = { &&l1, &&l2 };

>From 9536b026ac46c34d607c0a277c8fbdc183d53b9d Mon Sep 17 00:00:00 2001
From: Dimitry Andric <dimitry at andric.com>
Date: Mon, 29 Jul 2024 22:00:07 +0200
Subject: [PATCH 68/91] [compiler-rt] Fix format string warnings in FreeBSD
 DumpAllRegisters (#101072)

On FreeBSD amd64 (aka x86_64), registers are always defined as
`int64_t`, which in turn is equivalent to `long`. This leads to a number
of warnings in `DumpAllRegisters()`:

compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp:2245:31: warning:
format specifies type 'unsigned long long' but the argument has type
'__register_t' (aka 'long') [-Wformat]
     2245 |   Printf("rax = 0x%016llx  ", ucontext->uc_mcontext.mc_rax);
          |                   ~~~~~~~     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
          |                   %016lx
compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp:2246:31: warning:
format specifies type 'unsigned long long' but the argument has type
'__register_t' (aka 'long') [-Wformat]
     2246 |   Printf("rbx = 0x%016llx  ", ucontext->uc_mcontext.mc_rbx);
          |                   ~~~~~~~     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
          |                   %016lx
    ... more of these ...

Fix it by using the `lx` format (`%016lx` instead of `%016llx`), which
matches the `long` type of the register fields.
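
As an illustration only, the mismatch and its fix can be sketched with
the standard `std::snprintf` rather than the sanitizer-internal
`Printf`. The alias `RegisterSketch` and the helper `FormatRegister`
are hypothetical names that do not appear in compiler-rt; the alias
merely stands in for a `long`-typed register field such as `mc_rax`.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for a register field whose type is `long`,
// as the commit message describes for FreeBSD amd64.
using RegisterSketch = long;

// Formats one register the way the patched calls do, but through
// std::snprintf (an assumption made so the sketch is self-contained).
std::string FormatRegister(const char *name, RegisterSketch value) {
  char buf[64];
  // "%016lx" expects an `unsigned long`; "%016llx" would expect an
  // `unsigned long long` and trigger -Wformat for a `long` argument.
  std::snprintf(buf, sizeof(buf), "%s = 0x%016lx", name,
                static_cast<unsigned long>(value));
  return std::string(buf);
}
```

The explicit cast to `unsigned long` is a choice made for this sketch;
the patch itself passes the register fields to `Printf` unchanged.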

(cherry picked from commit 62bd08acedc88d8976a017f7f6818f3167dfa697)
---
 .../lib/sanitizer_common/sanitizer_linux.cpp  | 32 +++++++++----------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp
index 483a1042a6238..76acf591871ab 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp
@@ -2242,25 +2242,25 @@ void SignalContext::DumpAllRegisters(void *context) {
 #  elif SANITIZER_FREEBSD
 #    if defined(__x86_64__)
   Report("Register values:\n");
-  Printf("rax = 0x%016llx  ", ucontext->uc_mcontext.mc_rax);
-  Printf("rbx = 0x%016llx  ", ucontext->uc_mcontext.mc_rbx);
-  Printf("rcx = 0x%016llx  ", ucontext->uc_mcontext.mc_rcx);
-  Printf("rdx = 0x%016llx  ", ucontext->uc_mcontext.mc_rdx);
+  Printf("rax = 0x%016lx  ", ucontext->uc_mcontext.mc_rax);
+  Printf("rbx = 0x%016lx  ", ucontext->uc_mcontext.mc_rbx);
+  Printf("rcx = 0x%016lx  ", ucontext->uc_mcontext.mc_rcx);
+  Printf("rdx = 0x%016lx  ", ucontext->uc_mcontext.mc_rdx);
   Printf("\n");
-  Printf("rdi = 0x%016llx  ", ucontext->uc_mcontext.mc_rdi);
-  Printf("rsi = 0x%016llx  ", ucontext->uc_mcontext.mc_rsi);
-  Printf("rbp = 0x%016llx  ", ucontext->uc_mcontext.mc_rbp);
-  Printf("rsp = 0x%016llx  ", ucontext->uc_mcontext.mc_rsp);
+  Printf("rdi = 0x%016lx  ", ucontext->uc_mcontext.mc_rdi);
+  Printf("rsi = 0x%016lx  ", ucontext->uc_mcontext.mc_rsi);
+  Printf("rbp = 0x%016lx  ", ucontext->uc_mcontext.mc_rbp);
+  Printf("rsp = 0x%016lx  ", ucontext->uc_mcontext.mc_rsp);
   Printf("\n");
-  Printf(" r8 = 0x%016llx  ", ucontext->uc_mcontext.mc_r8);
-  Printf(" r9 = 0x%016llx  ", ucontext->uc_mcontext.mc_r9);
-  Printf("r10 = 0x%016llx  ", ucontext->uc_mcontext.mc_r10);
-  Printf("r11 = 0x%016llx  ", ucontext->uc_mcontext.mc_r11);
+  Printf(" r8 = 0x%016lx  ", ucontext->uc_mcontext.mc_r8);
+  Printf(" r9 = 0x%016lx  ", ucontext->uc_mcontext.mc_r9);
+  Printf("r10 = 0x%016lx  ", ucontext->uc_mcontext.mc_r10);
+  Printf("r11 = 0x%016lx  ", ucontext->uc_mcontext.mc_r11);
   Printf("\n");
-  Printf("r12 = 0x%016llx  ", ucontext->uc_mcontext.mc_r12);
-  Printf("r13 = 0x%016llx  ", ucontext->uc_mcontext.mc_r13);
-  Printf("r14 = 0x%016llx  ", ucontext->uc_mcontext.mc_r14);
-  Printf("r15 = 0x%016llx  ", ucontext->uc_mcontext.mc_r15);
+  Printf("r12 = 0x%016lx  ", ucontext->uc_mcontext.mc_r12);
+  Printf("r13 = 0x%016lx  ", ucontext->uc_mcontext.mc_r13);
+  Printf("r14 = 0x%016lx  ", ucontext->uc_mcontext.mc_r14);
+  Printf("r15 = 0x%016lx  ", ucontext->uc_mcontext.mc_r15);
   Printf("\n");
 #    elif defined(__i386__)
   Report("Register values:\n");

From 404746b9f21bef631eac09469bfcc35e8cfe0e63 Mon Sep 17 00:00:00 2001
From: Daniel Martinez <danielpedromartinez at duck.com>
Date: Mon, 29 Jul 2024 22:20:18 +0000
Subject: [PATCH 69/91] [nsan] Remove mallopt from nsan_interceptors (#101055)

Fixes a build failure on 19.1.0-rc1 when building on Linux with musl as
the libc.

musl does not provide mallopt, whereas glibc does. mallopt has
portability issues with other libc implementations. Just remove the use.

Co-authored-by: Daniel Martinez <danielmartinez at cock.li>
(cherry picked from commit 2c3eb8db057b9d58acd4735999f0f5d5d8d55b0d)
---
 compiler-rt/lib/nsan/nsan_interceptors.cpp | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/compiler-rt/lib/nsan/nsan_interceptors.cpp b/compiler-rt/lib/nsan/nsan_interceptors.cpp
index 544b44f53cc42..852524bd37332 100644
--- a/compiler-rt/lib/nsan/nsan_interceptors.cpp
+++ b/compiler-rt/lib/nsan/nsan_interceptors.cpp
@@ -21,10 +21,6 @@
 
 #include <wchar.h>
 
-#if SANITIZER_LINUX
-extern "C" int mallopt(int param, int value);
-#endif
-
 using namespace __sanitizer;
 using __nsan::nsan_init_is_running;
 using __nsan::nsan_initialized;
@@ -209,12 +205,6 @@ void __nsan::InitializeInterceptors() {
   static bool initialized = false;
   CHECK(!initialized);
 
-  // Instruct libc malloc to consume less memory.
-#if SANITIZER_LINUX
-  mallopt(1, 0);          // M_MXFAST
-  mallopt(-3, 32 * 1024); // M_MMAP_THRESHOLD
-#endif
-
   InitializeMallocInterceptors();
 
   INTERCEPT_FUNCTION(memset);

From 392b77d58a91049a155f3390ec16941a848aa766 Mon Sep 17 00:00:00 2001
From: Owen Pan <owenpiano at gmail.com>
Date: Mon, 29 Jul 2024 18:01:44 -0700
Subject: [PATCH 70/91] [clang-format] Fix misannotations of `<` in ternary
 expressions (#100980)

Fixes #100300.

(cherry picked from commit 73c961a3345c697f40e2148318f34f5f347701c1)
---
 clang/lib/Format/TokenAnnotator.cpp           | 42 ++++++++++++-------
 clang/unittests/Format/TokenAnnotatorTest.cpp | 23 ++++++++++
 2 files changed, 49 insertions(+), 16 deletions(-)

diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp
index 5c11f3cb1a874..63c8699fd62d1 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -154,8 +154,8 @@ class AnnotatingParser {
     if (NonTemplateLess.count(CurrentToken->Previous) > 0)
       return false;
 
-    const FormatToken &Previous = *CurrentToken->Previous; // The '<'.
-    if (Previous.Previous) {
+    if (const auto &Previous = *CurrentToken->Previous; // The '<'.
+        Previous.Previous) {
       if (Previous.Previous->Tok.isLiteral())
         return false;
       if (Previous.Previous->is(tok::r_brace))
@@ -175,11 +175,13 @@ class AnnotatingParser {
     FormatToken *Left = CurrentToken->Previous;
     Left->ParentBracket = Contexts.back().ContextKind;
     ScopedContextCreator ContextCreator(*this, tok::less, 12);
-
     Contexts.back().IsExpression = false;
+
+    const auto *BeforeLess = Left->Previous;
+
     // If there's a template keyword before the opening angle bracket, this is a
     // template parameter, not an argument.
-    if (Left->Previous && Left->Previous->isNot(tok::kw_template))
+    if (BeforeLess && BeforeLess->isNot(tok::kw_template))
       Contexts.back().ContextType = Context::TemplateArgument;
 
     if (Style.Language == FormatStyle::LK_Java &&
@@ -187,19 +189,24 @@ class AnnotatingParser {
       next();
     }
 
-    while (CurrentToken) {
+    for (bool SeenTernaryOperator = false; CurrentToken;) {
+      const bool InExpr = Contexts[Contexts.size() - 2].IsExpression;
       if (CurrentToken->is(tok::greater)) {
+        const auto *Next = CurrentToken->Next;
         // Try to do a better job at looking for ">>" within the condition of
         // a statement. Conservatively insert spaces between consecutive ">"
         // tokens to prevent splitting right bitshift operators and potentially
         // altering program semantics. This check is overly conservative and
         // will prevent spaces from being inserted in select nested template
         // parameter cases, but should not alter program semantics.
-        if (CurrentToken->Next && CurrentToken->Next->is(tok::greater) &&
+        if (Next && Next->is(tok::greater) &&
             Left->ParentBracket != tok::less &&
             CurrentToken->getStartOfNonWhitespace() ==
-                CurrentToken->Next->getStartOfNonWhitespace().getLocWithOffset(
-                    -1)) {
+                Next->getStartOfNonWhitespace().getLocWithOffset(-1)) {
+          return false;
+        }
+        if (InExpr && SeenTernaryOperator &&
+            (!Next || !Next->isOneOf(tok::l_paren, tok::l_brace))) {
           return false;
         }
         Left->MatchingParen = CurrentToken;
@@ -210,14 +217,14 @@ class AnnotatingParser {
         //   msg: < item: data >
         // In TT_TextProto, map<key, value> does not occur.
         if (Style.Language == FormatStyle::LK_TextProto ||
-            (Style.Language == FormatStyle::LK_Proto && Left->Previous &&
-             Left->Previous->isOneOf(TT_SelectorName, TT_DictLiteral))) {
+            (Style.Language == FormatStyle::LK_Proto && BeforeLess &&
+             BeforeLess->isOneOf(TT_SelectorName, TT_DictLiteral))) {
           CurrentToken->setType(TT_DictLiteral);
         } else {
           CurrentToken->setType(TT_TemplateCloser);
           CurrentToken->Tok.setLength(1);
         }
-        if (CurrentToken->Next && CurrentToken->Next->Tok.isLiteral())
+        if (Next && Next->Tok.isLiteral())
           return false;
         next();
         return true;
@@ -229,18 +236,21 @@ class AnnotatingParser {
       }
       if (CurrentToken->isOneOf(tok::r_paren, tok::r_square, tok::r_brace))
         return false;
+      const auto &Prev = *CurrentToken->Previous;
       // If a && or || is found and interpreted as a binary operator, this set
       // of angles is likely part of something like "a < b && c > d". If the
       // angles are inside an expression, the ||/&& might also be a binary
       // operator that was misinterpreted because we are parsing template
       // parameters.
       // FIXME: This is getting out of hand, write a decent parser.
-      if (CurrentToken->Previous->isOneOf(tok::pipepipe, tok::ampamp) &&
-          CurrentToken->Previous->is(TT_BinaryOperator) &&
-          Contexts[Contexts.size() - 2].IsExpression &&
-          !Line.startsWith(tok::kw_template)) {
-        return false;
+      if (InExpr && !Line.startsWith(tok::kw_template) &&
+          Prev.is(TT_BinaryOperator)) {
+        const auto Precedence = Prev.getPrecedence();
+        if (Precedence > prec::Conditional && Precedence < prec::Relational)
+          return false;
       }
+      if (Prev.is(TT_ConditionalExpr))
+        SeenTernaryOperator = true;
       updateParameterCount(Left, CurrentToken);
       if (Style.Language == FormatStyle::LK_Proto) {
         if (FormatToken *Previous = CurrentToken->getPreviousNonComment()) {
diff --git a/clang/unittests/Format/TokenAnnotatorTest.cpp b/clang/unittests/Format/TokenAnnotatorTest.cpp
index 51810ad047a26..386649bb6679f 100644
--- a/clang/unittests/Format/TokenAnnotatorTest.cpp
+++ b/clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -577,12 +577,20 @@ TEST_F(TokenAnnotatorTest, UnderstandsTernaryInTemplate) {
   EXPECT_TOKEN(Tokens[7], tok::greater, TT_TemplateCloser);
 
   // IsExpression = true
+
   Tokens = annotate("return foo<true ? 1 : 2>();");
   ASSERT_EQ(Tokens.size(), 13u) << Tokens;
   EXPECT_TOKEN(Tokens[2], tok::less, TT_TemplateOpener);
   EXPECT_TOKEN(Tokens[4], tok::question, TT_ConditionalExpr);
   EXPECT_TOKEN(Tokens[6], tok::colon, TT_ConditionalExpr);
   EXPECT_TOKEN(Tokens[8], tok::greater, TT_TemplateCloser);
+
+  Tokens = annotate("return foo<true ? 1 : 2>{};");
+  ASSERT_EQ(Tokens.size(), 13u) << Tokens;
+  EXPECT_TOKEN(Tokens[2], tok::less, TT_TemplateOpener);
+  EXPECT_TOKEN(Tokens[4], tok::question, TT_ConditionalExpr);
+  EXPECT_TOKEN(Tokens[6], tok::colon, TT_ConditionalExpr);
+  EXPECT_TOKEN(Tokens[8], tok::greater, TT_TemplateCloser);
 }
 
 TEST_F(TokenAnnotatorTest, UnderstandsNonTemplateAngleBrackets) {
@@ -596,6 +604,21 @@ TEST_F(TokenAnnotatorTest, UnderstandsNonTemplateAngleBrackets) {
   EXPECT_TOKEN(Tokens[1], tok::less, TT_BinaryOperator);
   EXPECT_TOKEN(Tokens[7], tok::greater, TT_BinaryOperator);
 
+  Tokens = annotate("return A < B ? true : A > B;");
+  ASSERT_EQ(Tokens.size(), 12u) << Tokens;
+  EXPECT_TOKEN(Tokens[2], tok::less, TT_BinaryOperator);
+  EXPECT_TOKEN(Tokens[8], tok::greater, TT_BinaryOperator);
+
+  Tokens = annotate("return A < B ? true : A > B ? false : false;");
+  ASSERT_EQ(Tokens.size(), 16u) << Tokens;
+  EXPECT_TOKEN(Tokens[2], tok::less, TT_BinaryOperator);
+  EXPECT_TOKEN(Tokens[8], tok::greater, TT_BinaryOperator);
+
+  Tokens = annotate("return A < B ^ A > B;");
+  ASSERT_EQ(Tokens.size(), 10u) << Tokens;
+  EXPECT_TOKEN(Tokens[2], tok::less, TT_BinaryOperator);
+  EXPECT_TOKEN(Tokens[6], tok::greater, TT_BinaryOperator);
+
   Tokens = annotate("ratio{-1, 2} < ratio{-1, 3} == -1 / 3 > -1 / 2;");
   ASSERT_EQ(Tokens.size(), 27u) << Tokens;
   EXPECT_TOKEN(Tokens[7], tok::less, TT_BinaryOperator);

From 63d44ea32a28ed49e99572ca46b03eb92706433e Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Mon, 29 Jul 2024 22:46:18 +0200
Subject: [PATCH 71/91] [NVPTX] Fix DwarfFrameBase construction (#101000)

The `{0}` here was initializing the first union member `Register`,
rather than the union member used by CFA, which is `Offset`. Prior to
https://github.com/llvm/llvm-project/pull/99263 this was harmless, but
now the two members have different layouts, leading to test failures on
some platforms (at least i686 and s390x).
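
A minimal sketch of the pitfall, using a hypothetical, simplified struct
(`FrameBaseSketch` is not the real `DwarfFrameBase` definition): brace
initialization of a union only initializes its first member, so the member
that matches the tag has to be assigned explicitly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for a tagged frame-base description.
struct FrameBaseSketch {
  enum FrameBaseKind { Register, CFA } Kind;
  union {
    unsigned Register;   // First member: the one `{0}` would initialize.
    std::int64_t Offset; // Member that is meaningful when Kind == CFA.
  } Location;
};

// Mirrors the shape of the fix: set the tag, then assign the union member
// that belongs to that tag instead of relying on `{0}`.
FrameBaseSketch MakeCfaFrameBase() {
  FrameBaseSketch FB;
  FB.Kind = FrameBaseSketch::CFA;
  FB.Location.Offset = 0;
  return FB;
}

// Reads the offset only when the tag says it is the active member.
std::int64_t CfaOffset(const FrameBaseSketch &FB) {
  assert(FB.Kind == FrameBaseSketch::CFA);
  return FB.Location.Offset;
}
```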

(cherry picked from commit 842a332f11f53c698fa0560505e533ecdca28876)
---
 llvm/lib/Target/NVPTX/NVPTXFrameLowering.cpp | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/NVPTX/NVPTXFrameLowering.cpp b/llvm/lib/Target/NVPTX/NVPTXFrameLowering.cpp
index 10ae81e0460e3..9abe0e3186f20 100644
--- a/llvm/lib/Target/NVPTX/NVPTXFrameLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXFrameLowering.cpp
@@ -93,5 +93,8 @@ MachineBasicBlock::iterator NVPTXFrameLowering::eliminateCallFramePseudoInstr(
 
 TargetFrameLowering::DwarfFrameBase
 NVPTXFrameLowering::getDwarfFrameBase(const MachineFunction &MF) const {
-  return {DwarfFrameBase::CFA, {0}};
+  DwarfFrameBase FrameBase;
+  FrameBase.Kind = DwarfFrameBase::CFA;
+  FrameBase.Location.Offset = 0;
+  return FrameBase;
 }

From 146fc62f508ba12026f712d9576c80ea95fc6747 Mon Sep 17 00:00:00 2001
From: Jacek Caban <jacek at codeweavers.com>
Date: Sat, 27 Jul 2024 14:29:05 +0200
Subject: [PATCH 72/91] [clang][ARM64EC] Add support for hybrid_patchable
 attribute. (#99478)

(cherry picked from commit ea98dc8b8f508b8393651992830e5e51d3876728)
---
 clang/docs/ReleaseNotes.rst                   |  3 ++
 clang/include/clang/Basic/Attr.td             |  9 +++++
 clang/include/clang/Basic/AttrDocs.td         | 10 ++++++
 .../clang/Basic/DiagnosticSemaKinds.td        |  3 ++
 clang/lib/CodeGen/CodeGenFunction.cpp         |  3 ++
 clang/lib/Sema/SemaDecl.cpp                   |  5 +++
 clang/lib/Sema/SemaDeclAttr.cpp               |  3 ++
 clang/test/CodeGen/arm64ec-hybrid-patchable.c | 34 +++++++++++++++++++
 ...a-attribute-supported-attributes-list.test |  1 +
 9 files changed, 71 insertions(+)
 create mode 100644 clang/test/CodeGen/arm64ec-hybrid-patchable.c

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 71d615553c613..610061406a1ec 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -629,6 +629,9 @@ Attribute Changes in Clang
   The attributes declare constraints about a function's behavior pertaining to blocking and
   heap memory allocation.
 
+- The ``hybrid_patchable`` attribute is now supported on ARM64EC targets. It can be used to specify
+  that a function requires an additional x86-64 thunk, which may be patched at runtime.
+
 Improvements to Clang's diagnostics
 -----------------------------------
 - Clang now emits an error instead of a warning for ``-Wundefined-internal``
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index 4825979a974d2..46d0a66d59c37 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -477,6 +477,9 @@ def TargetELF : TargetSpec {
 def TargetELFOrMachO : TargetSpec {
   let ObjectFormats = ["ELF", "MachO"];
 }
+def TargetWindowsArm64EC : TargetSpec {
+  let CustomCode = [{ Target.getTriple().isWindowsArm64EC() }];
+}
 
 def TargetSupportsInitPriority : TargetSpec {
   let CustomCode = [{ !Target.getTriple().isOSzOS() }];
@@ -4027,6 +4030,12 @@ def SelectAny : InheritableAttr {
   let SimpleHandler = 1;
 }
 
+def HybridPatchable : InheritableAttr, TargetSpecificAttr<TargetWindowsArm64EC> {
+  let Spellings = [Declspec<"hybrid_patchable">, Clang<"hybrid_patchable">];
+  let Subjects = SubjectList<[Function]>;
+  let Documentation = [HybridPatchableDocs];
+}
+
 def Thread : Attr {
   let Spellings = [Declspec<"thread">];
   let LangOpts = [MicrosoftExt];
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index 99738812c8157..b5d468eb5ec95 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -5985,6 +5985,16 @@ For more information see
 or `msvc documentation <https://docs.microsoft.com/pl-pl/cpp/cpp/selectany>`_.
 }]; }
 
+def HybridPatchableDocs : Documentation {
+  let Category = DocCatFunction;
+  let Content = [{
+The ``hybrid_patchable`` attribute declares an ARM64EC function with an additional
+x86-64 thunk, which may be patched at runtime.
+
+For more information see
+`ARM64EC ABI documentation <https://learn.microsoft.com/en-us/windows/arm/arm64ec-abi>`_.
+}]; }
+
 def WebAssemblyExportNameDocs : Documentation {
   let Category = DocCatFunction;
   let Content = [{
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index eb0506e71fe3f..95ce4166ceb66 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -3677,6 +3677,9 @@ def err_attribute_weak_static : Error<
   "weak declaration cannot have internal linkage">;
 def err_attribute_selectany_non_extern_data : Error<
   "'selectany' can only be applied to data items with external linkage">;
+def warn_attribute_hybrid_patchable_non_extern : Warning<
+  "'hybrid_patchable' is ignored on functions without external linkage">,
+  InGroup<IgnoredAttributes>;
 def err_declspec_thread_on_thread_variable : Error<
   "'__declspec(thread)' applied to variable that already has a "
   "thread-local storage specifier">;
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index d6078696a7d91..af201554898f3 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -991,6 +991,9 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
   if (D && D->hasAttr<NoProfileFunctionAttr>())
     Fn->addFnAttr(llvm::Attribute::NoProfile);
 
+  if (D && D->hasAttr<HybridPatchableAttr>())
+    Fn->addFnAttr(llvm::Attribute::HybridPatchable);
+
   if (D) {
     // Function attributes take precedence over command line flags.
     if (auto *A = D->getAttr<FunctionReturnThunksAttr>()) {
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index bb25a0b3a45ae..f60cc78be4f92 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -6890,6 +6890,11 @@ static void checkAttributesAfterMerging(Sema &S, NamedDecl &ND) {
     }
   }
 
+  if (HybridPatchableAttr *Attr = ND.getAttr<HybridPatchableAttr>()) {
+    if (!ND.isExternallyVisible())
+      S.Diag(Attr->getLocation(),
+             diag::warn_attribute_hybrid_patchable_non_extern);
+  }
   if (const InheritableAttr *Attr = getDLLAttr(&ND)) {
     auto *VD = dyn_cast<VarDecl>(&ND);
     bool IsAnonymousNS = false;
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index 5fd8622c90dd8..10bacc17a07ca 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -6868,6 +6868,9 @@ ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D, const ParsedAttr &AL,
   case ParsedAttr::AT_MSConstexpr:
     handleMSConstexprAttr(S, D, AL);
     break;
+  case ParsedAttr::AT_HybridPatchable:
+    handleSimpleAttribute<HybridPatchableAttr>(S, D, AL);
+    break;
 
   // HLSL attributes:
   case ParsedAttr::AT_HLSLNumThreads:
diff --git a/clang/test/CodeGen/arm64ec-hybrid-patchable.c b/clang/test/CodeGen/arm64ec-hybrid-patchable.c
new file mode 100644
index 0000000000000..4d1fa12afd2aa
--- /dev/null
+++ b/clang/test/CodeGen/arm64ec-hybrid-patchable.c
@@ -0,0 +1,34 @@
+// REQUIRES: aarch64-registered-target
+// RUN: %clang_cc1 -triple arm64ec-pc-windows -fms-extensions -emit-llvm -o - %s -verify | FileCheck %s
+
+// CHECK: ;    Function Attrs: hybrid_patchable noinline nounwind optnone
+// CHECK-NEXT: define dso_local i32 @func() #0 {
+int __attribute__((hybrid_patchable)) func(void) {  return 1; }
+
+// CHECK: ;    Function Attrs: hybrid_patchable noinline nounwind optnone
+// CHECK-NEXT: define dso_local i32 @func2() #0 {
+int __declspec(hybrid_patchable) func2(void) {  return 2; }
+
+// CHECK: ;    Function Attrs: hybrid_patchable noinline nounwind optnone
+// CHECK-NEXT: define dso_local i32 @func3() #0 {
+int __declspec(hybrid_patchable) func3(void);
+int func3(void) {  return 3; }
+
+// CHECK: ;    Function Attrs: hybrid_patchable noinline nounwind optnone
+// CHECK-NEXT: define dso_local i32 @func4() #0 {
+[[clang::hybrid_patchable]] int func4(void);
+int func4(void) {  return 3; }
+
+// CHECK: ; Function Attrs: hybrid_patchable noinline nounwind optnone
+// CHECK-NEXT: define internal void @static_func() #0 {
+// expected-warning@+1 {{'hybrid_patchable' is ignored on functions without external linkage}}
+static void __declspec(hybrid_patchable) static_func(void) {}
+
+// CHECK: ;    Function Attrs: hybrid_patchable noinline nounwind optnone
+// CHECK-NEXT: define linkonce_odr dso_local i32 @func5() #0 comdat {
+int inline __declspec(hybrid_patchable) func5(void) {  return 4; }
+
+void caller(void) {
+  static_func();
+  func5();
+}
diff --git a/clang/test/Misc/pragma-attribute-supported-attributes-list.test b/clang/test/Misc/pragma-attribute-supported-attributes-list.test
index 33f9c2f51363c..e082db698ef0c 100644
--- a/clang/test/Misc/pragma-attribute-supported-attributes-list.test
+++ b/clang/test/Misc/pragma-attribute-supported-attributes-list.test
@@ -83,6 +83,7 @@
 // CHECK-NEXT: HIPManaged (SubjectMatchRule_variable)
 // CHECK-NEXT: HLSLResourceClass (SubjectMatchRule_record_not_is_union)
 // CHECK-NEXT: Hot (SubjectMatchRule_function)
+// CHECK-NEXT: HybridPatchable (SubjectMatchRule_function)
 // CHECK-NEXT: IBAction (SubjectMatchRule_objc_method_is_instance)
 // CHECK-NEXT: IFunc (SubjectMatchRule_function)
 // CHECK-NEXT: InitPriority (SubjectMatchRule_variable)

From 67f509a93be67aab643ab2ca333a2f8149f49be2 Mon Sep 17 00:00:00 2001
From: Rainer Orth <ro at gcc.gnu.org>
Date: Tue, 30 Jul 2024 08:54:10 +0200
Subject: [PATCH 73/91] [sanitizer_common][test] Always skip select allocator
 tests on SPARC V9 (#100530)

Two allocator tests `FAIL` on Linux/sparc64:
```
  SanitizerCommon-Unit :: ./Sanitizer-sparcv9-Test/SanitizerCommon/CombinedAllocator32Compact
  SanitizerCommon-Unit :: ./Sanitizer-sparcv9-Test/SanitizerCommon/SizeClassAllocator32Iteration
```
The failure mode is the same on Solaris/sparcv9, where those tests are
already disabled since 0f69cbe2694a4740e6db5b99bd81a26746403072.
Therefore, this patch skips them on SPARC in general.

Tested on `sparc64-unknown-linux-gnu` and `sparcv9-sun-solaris2.11`.
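
For illustration, a standalone sketch of how the renamed macro changes a test
name. It does not use GoogleTest; `STRINGIFY`, `IsDisabled` and
`EffectiveTestName` are hypothetical helpers that only make the token pasting
visible. GoogleTest skips tests whose names start with `DISABLED_`.

```cpp
#include <cassert>
#include <string>

#if defined(__sparcv9)
#  define SKIP_ON_SPARCV9(x) DISABLED_##x
#else
#  define SKIP_ON_SPARCV9(x) x
#endif

// Two-level stringification so the argument is macro-expanded first.
#define STRINGIFY_IMPL(x) #x
#define STRINGIFY(x) STRINGIFY_IMPL(x)

// True if the name carries the prefix GoogleTest uses for disabled tests.
bool IsDisabled(const std::string &Name) {
  return Name.rfind("DISABLED_", 0) == 0;
}

// Returns the test name as the preprocessor spells it on this target.
std::string EffectiveTestName() {
  std::string Name = STRINGIFY(SKIP_ON_SPARCV9(CombinedAllocator32Compact));
#if defined(__sparcv9)
  assert(IsDisabled(Name));
#else
  assert(!IsDisabled(Name));
#endif
  return Name;
}
```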

(cherry picked from commit 3d149123f46cee5ac8d961c6bf77c5c566f1e410)
---
 .../tests/sanitizer_allocator_test.cpp              | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp b/compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
index 1a1ccce82d259..601897a64f051 100644
--- a/compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
+++ b/compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
@@ -28,12 +28,13 @@
 
 using namespace __sanitizer;
 
-#if SANITIZER_SOLARIS && defined(__sparcv9)
+#if defined(__sparcv9)
 // FIXME: These tests probably fail because Solaris/sparcv9 uses the full
-// 64-bit address space.  Needs more investigation
-#define SKIP_ON_SOLARIS_SPARCV9(x) DISABLED_##x
+// 64-bit address space.  Same on Linux/sparc64, so probably a general SPARC
+// issue.  Needs more investigation
+#  define SKIP_ON_SPARCV9(x) DISABLED_##x
 #else
-#define SKIP_ON_SOLARIS_SPARCV9(x) x
+#  define SKIP_ON_SPARCV9(x) x
 #endif
 
 // On 64-bit systems with small virtual address spaces (e.g. 39-bit) we can't
@@ -781,7 +782,7 @@ TEST(SanitizerCommon, CombinedAllocator64VeryCompact) {
 }
 #endif
 
-TEST(SanitizerCommon, SKIP_ON_SOLARIS_SPARCV9(CombinedAllocator32Compact)) {
+TEST(SanitizerCommon, SKIP_ON_SPARCV9(CombinedAllocator32Compact)) {
   TestCombinedAllocator<Allocator32Compact>();
 }
 
@@ -1028,7 +1029,7 @@ TEST(SanitizerCommon, SizeClassAllocator64DynamicPremappedIteration) {
 #endif
 #endif
 
-TEST(SanitizerCommon, SKIP_ON_SOLARIS_SPARCV9(SizeClassAllocator32Iteration)) {
+TEST(SanitizerCommon, SKIP_ON_SPARCV9(SizeClassAllocator32Iteration)) {
   TestSizeClassAllocatorIteration<Allocator32Compact>();
 }
 

From 3389604cd95d4d12eb975f4057ed21828f5b53ce Mon Sep 17 00:00:00 2001
From: Mark de Wever <koraq at xs4all.nl>
Date: Thu, 25 Jul 2024 18:37:36 +0200
Subject: [PATCH 74/91] [libc++][spaceship] Marks P1614 as complete. (#99375)

Implements parts of:
- P1902R1 Missing feature-test macros 2017-2019

Completes:
- P1614R2 The Mothership has Landed

Fixes #100018
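
As a hedged sketch of what the bumped macro value means for users, code can
test for `201907L` before relying on the complete library support.
`kThreeWayComparison` and `HasFullThreeWayComparison` are hypothetical names
used only for this example:

```cpp
#include <cassert>
#if __has_include(<version>)
#  include <version>
#endif

// The macro value if the library defines it, otherwise 0.
#if defined(__cpp_lib_three_way_comparison)
constexpr long kThreeWayComparison = __cpp_lib_three_way_comparison;
#else
constexpr long kThreeWayComparison = 0;
#endif

// True when the library reports the 201907L level that this patch enables.
bool HasFullThreeWayComparison() {
  // Values are dates (yyyymm); 201711L was the value before this change.
  assert(kThreeWayComparison == 0 || kThreeWayComparison >= 201711L);
  return kThreeWayComparison >= 201907L;
}
```

The result depends on the standard library and the language mode in use, so
the function is meant as a configuration check rather than a fixed answer.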
---
 libcxx/docs/FeatureTestMacroTable.rst              |  2 +-
 libcxx/docs/ReleaseNotes/19.rst                    |  1 +
 libcxx/docs/Status/Cxx20.rst                       |  1 +
 libcxx/docs/Status/Cxx20Papers.csv                 |  2 +-
 libcxx/docs/Status/SpaceshipPapers.csv             |  2 +-
 libcxx/include/version                             |  4 ++--
 .../compare.version.compile.pass.cpp               | 14 +++++++-------
 .../version.version.compile.pass.cpp               | 14 +++++++-------
 .../generate_feature_test_macro_components.py      |  3 +--
 9 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/libcxx/docs/FeatureTestMacroTable.rst b/libcxx/docs/FeatureTestMacroTable.rst
index 262da3f8937d2..a1506e115fe70 100644
--- a/libcxx/docs/FeatureTestMacroTable.rst
+++ b/libcxx/docs/FeatureTestMacroTable.rst
@@ -290,7 +290,7 @@ Status
     ---------------------------------------------------------- -----------------
     ``__cpp_lib_syncbuf``                                      ``201803L``
     ---------------------------------------------------------- -----------------
-    ``__cpp_lib_three_way_comparison``                         ``201711L``
+    ``__cpp_lib_three_way_comparison``                         ``201907L``
     ---------------------------------------------------------- -----------------
     ``__cpp_lib_to_address``                                   ``201711L``
     ---------------------------------------------------------- -----------------
diff --git a/libcxx/docs/ReleaseNotes/19.rst b/libcxx/docs/ReleaseNotes/19.rst
index c2c2bfbed4ac3..92896f6b0d11e 100644
--- a/libcxx/docs/ReleaseNotes/19.rst
+++ b/libcxx/docs/ReleaseNotes/19.rst
@@ -53,6 +53,7 @@ Implemented Papers
 ------------------
 
 - P1132R8 - ``out_ptr`` - a scalable output pointer abstraction
+- P1614R2 - The Mothership has Landed
 - P2637R3 - Member ``visit``
 - P2652R2 - Disallow User Specialization of ``allocator_traits``
 - P2819R2 - Add ``tuple`` protocol to ``complex``
diff --git a/libcxx/docs/Status/Cxx20.rst b/libcxx/docs/Status/Cxx20.rst
index c00d6fb237286..b76e30fbb3712 100644
--- a/libcxx/docs/Status/Cxx20.rst
+++ b/libcxx/docs/Status/Cxx20.rst
@@ -48,6 +48,7 @@ Paper Status
    .. [#note-P0883.1] P0883: shared_ptr and floating-point changes weren't applied as they themselves aren't implemented yet.
    .. [#note-P0883.2] P0883: ``ATOMIC_FLAG_INIT`` was marked deprecated in version 14.0, but was undeprecated with the implementation of LWG3659 in version 15.0.
    .. [#note-P0660] P0660: The paper is implemented but the features are experimental and can be enabled via ``-fexperimental-library``.
+   .. [#note-P1614] P1614: ``std::strong_order(long double, long double)`` is partly implemented.
    .. [#note-P0355] P0355: The implementation status is:
 
       * ``Calendars`` mostly done in Clang 7
diff --git a/libcxx/docs/Status/Cxx20Papers.csv b/libcxx/docs/Status/Cxx20Papers.csv
index 34fc5586f74d9..4015d7ad48b06 100644
--- a/libcxx/docs/Status/Cxx20Papers.csv
+++ b/libcxx/docs/Status/Cxx20Papers.csv
@@ -123,7 +123,7 @@
 "`P1522R1 <https://wg21.link/P1522R1>`__","LWG","Iterator Difference Type and Integer Overflow","Cologne","|Complete|","15.0","|ranges|"
 "`P1523R1 <https://wg21.link/P1523R1>`__","LWG","Views and Size Types","Cologne","|Complete|","15.0","|ranges|"
 "`P1612R1 <https://wg21.link/P1612R1>`__","LWG","Relocate Endian's Specification","Cologne","|Complete|","10.0"
-"`P1614R2 <https://wg21.link/P1614R2>`__","LWG","The Mothership has Landed","Cologne","|In Progress|",""
+"`P1614R2 <https://wg21.link/P1614R2>`__","LWG","The Mothership has Landed","Cologne","|Complete| [#note-P1614]_","19.0"
 "`P1638R1 <https://wg21.link/P1638R1>`__","LWG","basic_istream_view::iterator should not be copyable","Cologne","|Complete|","16.0","|ranges|"
 "`P1643R1 <https://wg21.link/P1643R1>`__","LWG","Add wait/notify to atomic_ref","Cologne","|Complete|","19.0"
 "`P1644R0 <https://wg21.link/P1644R0>`__","LWG","Add wait/notify to atomic<shared_ptr>","Cologne","",""
diff --git a/libcxx/docs/Status/SpaceshipPapers.csv b/libcxx/docs/Status/SpaceshipPapers.csv
index 39e1f968c1754..1ab64a9caf86a 100644
--- a/libcxx/docs/Status/SpaceshipPapers.csv
+++ b/libcxx/docs/Status/SpaceshipPapers.csv
@@ -1,5 +1,5 @@
 "Number","Name","Status","First released version"
-`P1614R2 <https://wg21.link/P1614R2>`_,The Mothership has Landed,|In Progress|,
+`P1614R2 <https://wg21.link/P1614R2>`_,The Mothership has Landed,|Complete|,19.0
 `P2404R3 <https://wg21.link/P2404R3>`_,"Relaxing ``equality_comparable_with``'s, ``totally_ordered_with``'s, and ``three_way_comparable_with``'s common reference requirements to support move-only types",,
 `LWG3330 <https://wg21.link/LWG3330>`_,Include ``<compare>`` from most library headers,"|Complete|","13.0"
 `LWG3347 <https://wg21.link/LWG3347>`_,"``std::pair<T, U>`` now requires ``T`` and ``U`` to be *less-than-comparable*",|Nothing To Do|,
diff --git a/libcxx/include/version b/libcxx/include/version
index 40548098a92d6..fe64343eafbc9 100644
--- a/libcxx/include/version
+++ b/libcxx/include/version
@@ -238,7 +238,7 @@ __cpp_lib_string_view                                   202403L <string> <string
 __cpp_lib_submdspan                                     202306L <mdspan>
 __cpp_lib_syncbuf                                       201803L <syncstream>
 __cpp_lib_text_encoding                                 202306L <text_encoding>
-__cpp_lib_three_way_comparison                          201711L <compare>
+__cpp_lib_three_way_comparison                          201907L <compare>
 __cpp_lib_to_address                                    201711L <memory>
 __cpp_lib_to_array                                      201907L <array>
 __cpp_lib_to_chars                                      202306L <charconv>
@@ -446,7 +446,7 @@ __cpp_lib_void_t                                        201411L <type_traits>
 # if !defined(_LIBCPP_HAS_NO_EXPERIMENTAL_SYNCSTREAM)
 #   define __cpp_lib_syncbuf                            201803L
 # endif
-# define __cpp_lib_three_way_comparison                 201711L
+# define __cpp_lib_three_way_comparison                 201907L
 # define __cpp_lib_to_address                           201711L
 # define __cpp_lib_to_array                             201907L
 # define __cpp_lib_type_identity                        201806L
diff --git a/libcxx/test/std/language.support/support.limits/support.limits.general/compare.version.compile.pass.cpp b/libcxx/test/std/language.support/support.limits/support.limits.general/compare.version.compile.pass.cpp
index aac00f20c7b45..1d61f43f9ee51 100644
--- a/libcxx/test/std/language.support/support.limits/support.limits.general/compare.version.compile.pass.cpp
+++ b/libcxx/test/std/language.support/support.limits/support.limits.general/compare.version.compile.pass.cpp
@@ -16,7 +16,7 @@
 // Test the feature test macros defined by <compare>
 
 /*  Constant                          Value
-    __cpp_lib_three_way_comparison    201711L [C++20]
+    __cpp_lib_three_way_comparison    201907L [C++20]
 */
 
 #include <compare>
@@ -45,8 +45,8 @@
 # ifndef __cpp_lib_three_way_comparison
 #   error "__cpp_lib_three_way_comparison should be defined in c++20"
 # endif
-# if __cpp_lib_three_way_comparison != 201711L
-#   error "__cpp_lib_three_way_comparison should have the value 201711L in c++20"
+# if __cpp_lib_three_way_comparison != 201907L
+#   error "__cpp_lib_three_way_comparison should have the value 201907L in c++20"
 # endif
 
 #elif TEST_STD_VER == 23
@@ -54,8 +54,8 @@
 # ifndef __cpp_lib_three_way_comparison
 #   error "__cpp_lib_three_way_comparison should be defined in c++23"
 # endif
-# if __cpp_lib_three_way_comparison != 201711L
-#   error "__cpp_lib_three_way_comparison should have the value 201711L in c++23"
+# if __cpp_lib_three_way_comparison != 201907L
+#   error "__cpp_lib_three_way_comparison should have the value 201907L in c++23"
 # endif
 
 #elif TEST_STD_VER > 23
@@ -63,8 +63,8 @@
 # ifndef __cpp_lib_three_way_comparison
 #   error "__cpp_lib_three_way_comparison should be defined in c++26"
 # endif
-# if __cpp_lib_three_way_comparison != 201711L
-#   error "__cpp_lib_three_way_comparison should have the value 201711L in c++26"
+# if __cpp_lib_three_way_comparison != 201907L
+#   error "__cpp_lib_three_way_comparison should have the value 201907L in c++26"
 # endif
 
 #endif // TEST_STD_VER > 23
diff --git a/libcxx/test/std/language.support/support.limits/support.limits.general/version.version.compile.pass.cpp b/libcxx/test/std/language.support/support.limits/support.limits.general/version.version.compile.pass.cpp
index f26e7dc4b4c63..b8bad696f1bae 100644
--- a/libcxx/test/std/language.support/support.limits/support.limits.general/version.version.compile.pass.cpp
+++ b/libcxx/test/std/language.support/support.limits/support.limits.general/version.version.compile.pass.cpp
@@ -221,7 +221,7 @@
     __cpp_lib_submdspan                                     202306L [C++26]
     __cpp_lib_syncbuf                                       201803L [C++20]
     __cpp_lib_text_encoding                                 202306L [C++26]
-    __cpp_lib_three_way_comparison                          201711L [C++20]
+    __cpp_lib_three_way_comparison                          201907L [C++20]
     __cpp_lib_to_address                                    201711L [C++20]
     __cpp_lib_to_array                                      201907L [C++20]
     __cpp_lib_to_chars                                      201611L [C++17]
@@ -4438,8 +4438,8 @@
 # ifndef __cpp_lib_three_way_comparison
 #   error "__cpp_lib_three_way_comparison should be defined in c++20"
 # endif
-# if __cpp_lib_three_way_comparison != 201711L
-#   error "__cpp_lib_three_way_comparison should have the value 201711L in c++20"
+# if __cpp_lib_three_way_comparison != 201907L
+#   error "__cpp_lib_three_way_comparison should have the value 201907L in c++20"
 # endif
 
 # ifndef __cpp_lib_to_address
@@ -6037,8 +6037,8 @@
 # ifndef __cpp_lib_three_way_comparison
 #   error "__cpp_lib_three_way_comparison should be defined in c++23"
 # endif
-# if __cpp_lib_three_way_comparison != 201711L
-#   error "__cpp_lib_three_way_comparison should have the value 201711L in c++23"
+# if __cpp_lib_three_way_comparison != 201907L
+#   error "__cpp_lib_three_way_comparison should have the value 201907L in c++23"
 # endif
 
 # ifndef __cpp_lib_to_address
@@ -7960,8 +7960,8 @@
 # ifndef __cpp_lib_three_way_comparison
 #   error "__cpp_lib_three_way_comparison should be defined in c++26"
 # endif
-# if __cpp_lib_three_way_comparison != 201711L
-#   error "__cpp_lib_three_way_comparison should have the value 201711L in c++26"
+# if __cpp_lib_three_way_comparison != 201907L
+#   error "__cpp_lib_three_way_comparison should have the value 201907L in c++26"
 # endif
 
 # ifndef __cpp_lib_to_address
diff --git a/libcxx/utils/generate_feature_test_macro_components.py b/libcxx/utils/generate_feature_test_macro_components.py
index a351112471295..6c42748002aee 100755
--- a/libcxx/utils/generate_feature_test_macro_components.py
+++ b/libcxx/utils/generate_feature_test_macro_components.py
@@ -1302,8 +1302,7 @@ def add_version_header(tc):
         },
         {
             "name": "__cpp_lib_three_way_comparison",
-            "values": {"c++20": 201711},
-            # {"c++20": 201907} # P1614R2 The Mothership has Landed (see P1902R1 Missing feature-test macros 2017-2019)
+            "values": {"c++20": 201907},
             "headers": ["compare"],
         },
         {

>From 63cf3d4fb07a4e2c484ae44cec5df2c273fc7fff Mon Sep 17 00:00:00 2001
From: Stefan Pintilie <stefanp at ca.ibm.com>
Date: Tue, 23 Jul 2024 21:59:27 -0400
Subject: [PATCH 75/91] [RegisterCoalescer] Fix SUBREG_TO_REG handling in the
 RegisterCoalescer. (#96839)

The issue with the handling of SUBREG_TO_REG is that we don't join
the subranges correctly when we join live ranges across the
SUBREG_TO_REG. For example, when joining across this:
```
32B	  %2:gr64_nosp = SUBREG_TO_REG 0, %0:gr32, %subreg.sub_32bit
```
we want to join these live ranges:
```
%0 [16r,32r:0) 0@16r  weight:0.000000e+00
%2 [32r,112r:0) 0@32r  weight:0.000000e+00
```
Before the fix the range for the resulting merged `%2` is:
```
%2 [16r,112r:0) 0@16r  weight:0.000000e+00
```
After the fix it is now this:
```
%2 [16r,112r:0) 0@16r  L000000000000000F [16r,112r:0) 0@16r  weight:0.000000e+00
```

Two tests are added with this fix. The X86 test fails without the patch.
The PowerPC test passes with and without the patch but is added as a way
to track possible future failures when register classes are changed in a
future patch.

(cherry picked from commit 26fa399012da00fbf806f50ad72a3b5f0ee63eab)
---
 llvm/lib/CodeGen/RegisterCoalescer.cpp        |  7 ++++
 .../test/CodeGen/PowerPC/subreg-coalescer.mir | 34 +++++++++++++++++
 llvm/test/CodeGen/X86/subreg-fail.mir         | 37 +++++++++++++++++++
 3 files changed, 78 insertions(+)
 create mode 100644 llvm/test/CodeGen/PowerPC/subreg-coalescer.mir
 create mode 100644 llvm/test/CodeGen/X86/subreg-fail.mir

diff --git a/llvm/lib/CodeGen/RegisterCoalescer.cpp b/llvm/lib/CodeGen/RegisterCoalescer.cpp
index 1c35a88b4dc4a..043ea20191487 100644
--- a/llvm/lib/CodeGen/RegisterCoalescer.cpp
+++ b/llvm/lib/CodeGen/RegisterCoalescer.cpp
@@ -3673,6 +3673,13 @@ bool RegisterCoalescer::joinVirtRegs(CoalescerPair &CP) {
 
     LHSVals.pruneSubRegValues(LHS, ShrinkMask);
     RHSVals.pruneSubRegValues(LHS, ShrinkMask);
+  } else if (TrackSubRegLiveness && !CP.getDstIdx() && CP.getSrcIdx()) {
+    LHS.createSubRangeFrom(LIS->getVNInfoAllocator(),
+                           CP.getNewRC()->getLaneMask(), LHS);
+    mergeSubRangeInto(LHS, RHS, TRI->getSubRegIndexLaneMask(CP.getSrcIdx()), CP,
+                      CP.getDstIdx());
+    LHSVals.pruneMainSegments(LHS, ShrinkMainRange);
+    LHSVals.pruneSubRegValues(LHS, ShrinkMask);
   }
 
   // The merging algorithm in LiveInterval::join() can't handle conflicting
diff --git a/llvm/test/CodeGen/PowerPC/subreg-coalescer.mir b/llvm/test/CodeGen/PowerPC/subreg-coalescer.mir
new file mode 100644
index 0000000000000..39eab1f562e71
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/subreg-coalescer.mir
@@ -0,0 +1,34 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple powerpc64le-unknown-linux-gnu -mcpu=pwr8 %s \
+# RUN:   -verify-coalescing --run-pass=register-coalescer -o - | FileCheck %s
+
+# Check that the register coalescer correctly handles merging live ranges over
+# SUBREG_TO_REG on PowerPC. The -verify-coalescing option will give an error if
+# this is incorrect.
+
+---
+name: check_subregs
+alignment:       16
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $x3
+
+    ; CHECK-LABEL: name: check_subregs
+    ; CHECK: liveins: $x3
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:g8rc_and_g8rc_nox0 = COPY $x3
+    ; CHECK-NEXT: [[LFSUX:%[0-9]+]]:f8rc, dead [[LFSUX1:%[0-9]+]]:g8rc_and_g8rc_nox0 = LFSUX [[COPY]], [[COPY]]
+    ; CHECK-NEXT: undef [[FRSP:%[0-9]+]].sub_64:vslrc = FRSP [[LFSUX]], implicit $rm
+    ; CHECK-NEXT: [[XVCVDPSP:%[0-9]+]]:vrrc = XVCVDPSP [[FRSP]], implicit $rm
+    ; CHECK-NEXT: $v2 = COPY [[XVCVDPSP]]
+    ; CHECK-NEXT: BLR8 implicit $lr8, implicit $rm, implicit $v2
+    %0:g8rc_and_g8rc_nox0 = COPY $x3
+    %1:f8rc, %2:g8rc_and_g8rc_nox0 = LFSUX %0, %0
+    %3:f4rc = FRSP killed %1, implicit $rm
+    %4:vslrc = SUBREG_TO_REG 1, %3, %subreg.sub_64
+    %5:vrrc = XVCVDPSP killed %4, implicit $rm
+    $v2 = COPY %5
+    BLR8 implicit $lr8, implicit $rm, implicit $v2
+...
+
diff --git a/llvm/test/CodeGen/X86/subreg-fail.mir b/llvm/test/CodeGen/X86/subreg-fail.mir
new file mode 100644
index 0000000000000..c8146f099b814
--- /dev/null
+++ b/llvm/test/CodeGen/X86/subreg-fail.mir
@@ -0,0 +1,37 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple x86_64-unknown-unknown %s \
+# RUN:   -verify-coalescing -enable-subreg-liveness \
+# RUN:   --run-pass=register-coalescer -o - | FileCheck %s
+
+# Check that the register coalescer correctly handles merging live ranges over
+# SUBREG_TO_REG on X86. The -verify-coalescing option will give an error if
+# this is incorrect.
+
+---
+name:            test1
+alignment:       16
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    ; CHECK-LABEL: name: test1
+    ; CHECK: undef [[MOV32rm:%[0-9]+]].sub_32bit:gr64_nosp = MOV32rm undef %1:gr64, 1, $noreg, 0, $noreg :: (volatile load (s32) from `ptr undef`)
+    ; CHECK-NEXT: undef [[MOV32rm1:%[0-9]+]].sub_32bit:gr64_with_sub_8bit = MOV32rm undef %4:gr64, 1, $noreg, 0, $noreg :: (volatile load (s32) from `ptr undef`)
+    ; CHECK-NEXT: [[MOV32rm1:%[0-9]+]]:gr64_with_sub_8bit = SHL64ri [[MOV32rm1]], 32, implicit-def dead $eflags
+    ; CHECK-NEXT: [[LEA64r:%[0-9]+]]:gr64_with_sub_8bit = LEA64r [[MOV32rm1]], 1, [[MOV32rm]], 256, $noreg
+    ; CHECK-NEXT: [[LEA64r:%[0-9]+]]:gr64_with_sub_8bit = SHR64ri [[LEA64r]], 8, implicit-def dead $eflags
+    ; CHECK-NEXT: MOV32mr undef %10:gr64, 1, $noreg, 0, $noreg, [[LEA64r]].sub_32bit :: (volatile store (s32) into `ptr undef`)
+    ; CHECK-NEXT: RET 0, undef $eax
+    %0:gr32 = MOV32rm undef %1:gr64, 1, $noreg, 0, $noreg :: (volatile load (s32) from `ptr undef`)
+    %2:gr64_nosp = SUBREG_TO_REG 0, killed %0, %subreg.sub_32bit
+    %3:gr32 = MOV32rm undef %4:gr64, 1, $noreg, 0, $noreg :: (volatile load (s32) from `ptr undef`)
+    %5:gr64 = SUBREG_TO_REG 0, killed %3, %subreg.sub_32bit
+    %6:gr64 = COPY killed %5
+    %6:gr64 = SHL64ri %6, 32, implicit-def dead $eflags
+    %7:gr64 = LEA64r killed %6, 1, killed %2, 256, $noreg
+    %8:gr64 = COPY killed %7
+    %8:gr64 = SHR64ri %8, 8, implicit-def dead $eflags
+    %9:gr32 = COPY killed %8.sub_32bit
+    MOV32mr undef %10:gr64, 1, $noreg, 0, $noreg, killed %9 :: (volatile store (s32) into `ptr undef`)
+    RET 0, undef $eax
+
+...

>From 64699d328a39d3a2cc7c043768111794782ef9f0 Mon Sep 17 00:00:00 2001
From: Xing Xue <xingxue at outlook.com>
Date: Tue, 30 Jul 2024 06:28:59 -0400
Subject: [PATCH 76/91] [libunwind][AIX] Fix the wrong traceback from signal
 handler (#101069)

Patch [llvm#92291](https://github.com/llvm/llvm-project/pull/92291)
causes a wrong traceback from a signal handler on AIX because the AIX
unwinder uses the traceback table at the end of each function instead of
FDE/CIE for unwinding. This patch adds a condition to exclude
traceback-table-based unwinding from the code added by that patch.

(cherry picked from commit d90fa612604b49dfc81c3f42c106fab7401322ec)
---
 libunwind/src/UnwindCursor.hpp             | 3 ++-
 libunwind/test/aix_signal_unwind.pass.sh.S | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/libunwind/src/UnwindCursor.hpp b/libunwind/src/UnwindCursor.hpp
index 2ec60e4c123d5..758557337899e 100644
--- a/libunwind/src/UnwindCursor.hpp
+++ b/libunwind/src/UnwindCursor.hpp
@@ -2589,7 +2589,8 @@ void UnwindCursor<A, R>::setInfoBasedOnIPRegister(bool isReturnAddress) {
     --pc;
 #endif
 
-#if !(defined(_LIBUNWIND_SUPPORT_SEH_UNWIND) && defined(_WIN32))
+#if !(defined(_LIBUNWIND_SUPPORT_SEH_UNWIND) && defined(_WIN32)) &&            \
+    !defined(_LIBUNWIND_SUPPORT_TBTAB_UNWIND)
   // In case of this is frame of signal handler, the IP saved in the signal
   // handler points to first non-executed instruction, while FDE/CIE expects IP
   // to be after the first non-executed instruction.
diff --git a/libunwind/test/aix_signal_unwind.pass.sh.S b/libunwind/test/aix_signal_unwind.pass.sh.S
index 9ca18e9481f4f..a666577d095b1 100644
--- a/libunwind/test/aix_signal_unwind.pass.sh.S
+++ b/libunwind/test/aix_signal_unwind.pass.sh.S
@@ -10,7 +10,7 @@
 // a correct traceback when the function raising the signal does not save
 // the link register or does not store the stack back chain.
 
-// REQUIRES: target=powerpc{{(64)?}}-ibm-aix
+// REQUIRES: target=powerpc{{(64)?}}-ibm-aix{{.*}}
 
 // Test when the function raising the signal does not save the link register
 // RUN: %{cxx} -x c++ %s -o %t.exe -DCXX_CODE %{flags} %{compile_flags}

>From 843ed4b722074466d3c462b8180b5abe25b4b7c8 Mon Sep 17 00:00:00 2001
From: Jacek Caban <jacek at codeweavers.com>
Date: Tue, 30 Jul 2024 14:22:50 +0200
Subject: [PATCH 77/91] [CodeGen][ARM64EC] Use alias symbol for exporting
 hybrid_patchable functions. (#100872)

Exporting the $hp_target symbol doesn't make sense; use the unmangled alias
instead. This is not compatible with MSVC, but it makes it possible to use
dllexport together with the hybrid_patchable attribute.

(cherry picked from commit 41c0f89f5532ec110b927c3a67ceac83448c5d98)
---
 llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp | 5 +++++
 llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
index 310b152ef9817..415edb189e60c 100644
--- a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
@@ -833,6 +833,11 @@ bool AArch64Arm64ECCallLowering::runOnModule(Module &Mod) {
                                               "EXP+" + MangledName.value())));
       A->setAliasee(&F);
 
+      if (F.hasDLLExportStorageClass()) {
+        A->setDLLStorageClass(GlobalValue::DLLExportStorageClass);
+        F.setDLLStorageClass(GlobalValue::DefaultStorageClass);
+      }
+
       FnsMap[A] = GlobalAlias::create(GlobalValue::LinkOnceODRLinkage,
                                       MangledName.value(), &F);
       PatchableFns.insert(A);
diff --git a/llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll b/llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll
index e5387d40b9c64..64fb5b36b2c62 100644
--- a/llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll
+++ b/llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll
@@ -238,7 +238,7 @@ define dso_local void @caller() nounwind {
 ; CHECK-NEXT:      .symidx exp
 ; CHECK-NEXT:      .word   0
 ; CHECK-NEXT:      .section        .drectve,"yni"
-; CHECK-NEXT:      .ascii  " /EXPORT:\"#exp$hp_target,EXPORTAS,exp$hp_target\""
+; CHECK-NEXT:      .ascii  " /EXPORT:exp"
 
 ; CHECK-NEXT:      .def    func;
 ; CHECK-NEXT:      .scl    2;

>From 7f1cd7866ef858bbdb2a4238c81462a0efce5562 Mon Sep 17 00:00:00 2001
From: Hubert Tong <hubert.reinterpretcast at gmail.com>
Date: Tue, 30 Jul 2024 17:56:55 -0400
Subject: [PATCH 78/91] ReleaseNotes.rst: Fix typo "my" for "may"

Replace typo for "may" with "can".
---
 clang/docs/ReleaseNotes.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 610061406a1ec..b4ef1e9672a5d 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -147,7 +147,7 @@ Clang Frontend Potentially Breaking Changes
   that ``none`` means that there is no operating system. As opposed to an unknown
   type of operating system.
 
-  This change my cause clang to not find libraries, or libraries to be built at
+  This change can cause clang to not find libraries, or libraries to be built at
   different file system locations. This can be fixed by changing your builds to
   use the new normalized triple. However, we recommend instead getting the
   normalized triple from clang itself, as this will make your builds more

>From 32b786c92f0ae52201888dcfba5c3ac789afbb3a Mon Sep 17 00:00:00 2001
From: Alexandros Lamprineas <alexandros.lamprineas at arm.com>
Date: Tue, 23 Jul 2024 19:24:41 +0100
Subject: [PATCH 79/91] [clang][FMV][AArch64] Improve streaming mode
 compatibility.

* Allow arm-streaming if all the function versions adhere to it.
* Allow arm-streaming-compatible if all the function versions adhere to it.
* Allow arm-locally-streaming regardless of the other function versions.

When the caller needs to toggle the streaming mode, all the function versions
of the callee must adhere to the same mode; otherwise the call will yield a
runtime error.

Imagine the versions of the callee live in separate TUs. The version that
is visible to the caller will determine the calling convention used when
generating code for the callsite. Therefore we cannot support mixing
streaming with non-streaming function versions. Imagine TU1 has a streaming
caller and calls foo._sme which is streaming-compatible. The codegen for
the callsite will not switch off the streaming mode. Then in TU2 we have
a version which is non-streaming and could potentially be called in
streaming mode. Similarly if the caller is non-streaming and the called
version is streaming-compatible the codegen for the callsite will not
switch on the streaming mode, but other versions may be streaming.
---
 .../clang/Basic/DiagnosticSemaKinds.td        |   2 -
 clang/lib/Sema/SemaDecl.cpp                   |  24 +++-
 clang/lib/Sema/SemaDeclAttr.cpp               |   7 --
 clang/test/CodeGen/aarch64-fmv-streaming.c    | 107 ++++++++++++++++++
 clang/test/Sema/aarch64-fmv-streaming.c       |  46 ++++++++
 clang/test/Sema/aarch64-sme-func-attrs.c      |  42 -------
 6 files changed, 173 insertions(+), 55 deletions(-)
 create mode 100644 clang/test/CodeGen/aarch64-fmv-streaming.c
 create mode 100644 clang/test/Sema/aarch64-fmv-streaming.c

diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 95ce4166ceb66..8a00fe21a08ce 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -3811,8 +3811,6 @@ def warn_sme_locally_streaming_has_vl_args_returns : Warning<
   InGroup<AArch64SMEAttributes>, DefaultIgnore;
 def err_conflicting_attributes_arm_state : Error<
   "conflicting attributes for state '%0'">;
-def err_sme_streaming_cannot_be_multiversioned : Error<
-  "streaming function cannot be multi-versioned">;
 def err_unknown_arm_state : Error<
   "unknown state '%0'">;
 def err_missing_arm_state : Error<
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index f60cc78be4f92..01231f8e385ef 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -11014,6 +11014,9 @@ static bool AttrCompatibleWithMultiVersion(attr::Kind Kind,
   switch (Kind) {
   default:
     return false;
+  case attr::ArmLocallyStreaming:
+    return MVKind == MultiVersionKind::TargetVersion ||
+           MVKind == MultiVersionKind::TargetClones;
   case attr::Used:
     return MVKind == MultiVersionKind::Target;
   case attr::NonNull:
@@ -11150,7 +11153,21 @@ bool Sema::areMultiversionVariantFunctionsCompatible(
     FunctionType::ExtInfo OldTypeInfo = OldType->getExtInfo();
     FunctionType::ExtInfo NewTypeInfo = NewType->getExtInfo();
 
-    if (OldTypeInfo.getCC() != NewTypeInfo.getCC())
+    const auto *OldFPT = OldFD->getType()->getAs<FunctionProtoType>();
+    const auto *NewFPT = NewFD->getType()->getAs<FunctionProtoType>();
+
+    bool ArmStreamingCCMismatched = false;
+    if (OldFPT && NewFPT) {
+      unsigned Diff =
+          OldFPT->getAArch64SMEAttributes() ^ NewFPT->getAArch64SMEAttributes();
+      // Arm-streaming, arm-streaming-compatible and non-streaming versions
+      // cannot be mixed.
+      if (Diff & (FunctionType::SME_PStateSMEnabledMask |
+                  FunctionType::SME_PStateSMCompatibleMask))
+        ArmStreamingCCMismatched = true;
+    }
+
+    if (OldTypeInfo.getCC() != NewTypeInfo.getCC() || ArmStreamingCCMismatched)
       return Diag(DiffDiagIDAt.first, DiffDiagIDAt.second) << CallingConv;
 
     QualType OldReturnType = OldType->getReturnType();
@@ -11170,9 +11187,8 @@ bool Sema::areMultiversionVariantFunctionsCompatible(
     if (!CLinkageMayDiffer && OldFD->isExternC() != NewFD->isExternC())
       return Diag(DiffDiagIDAt.first, DiffDiagIDAt.second) << LanguageLinkage;
 
-    if (CheckEquivalentExceptionSpec(
-            OldFD->getType()->getAs<FunctionProtoType>(), OldFD->getLocation(),
-            NewFD->getType()->getAs<FunctionProtoType>(), NewFD->getLocation()))
+    if (CheckEquivalentExceptionSpec(OldFPT, OldFD->getLocation(), NewFPT,
+                                     NewFD->getLocation()))
       return true;
   }
   return false;
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index 10bacc17a07ca..e2eada24f9fcc 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -3024,9 +3024,6 @@ bool Sema::checkTargetVersionAttr(SourceLocation LiteralLoc, Decl *D,
       return Diag(LiteralLoc, diag::warn_unsupported_target_attribute)
              << Unsupported << None << CurFeature << TargetVersion;
   }
-  if (IsArmStreamingFunction(cast<FunctionDecl>(D),
-                             /*IncludeLocallyStreaming=*/false))
-    return Diag(LiteralLoc, diag::err_sme_streaming_cannot_be_multiversioned);
   return false;
 }
 
@@ -3123,10 +3120,6 @@ bool Sema::checkTargetClonesAttrString(
           HasNotDefault = true;
         }
       }
-      if (IsArmStreamingFunction(cast<FunctionDecl>(D),
-                                 /*IncludeLocallyStreaming=*/false))
-        return Diag(LiteralLoc,
-                    diag::err_sme_streaming_cannot_be_multiversioned);
     } else {
       // Other targets ( currently X86 )
       if (Cur.starts_with("arch=")) {
diff --git a/clang/test/CodeGen/aarch64-fmv-streaming.c b/clang/test/CodeGen/aarch64-fmv-streaming.c
new file mode 100644
index 0000000000000..e549ccda59ad8
--- /dev/null
+++ b/clang/test/CodeGen/aarch64-fmv-streaming.c
@@ -0,0 +1,107 @@
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme -emit-llvm -o - %s | FileCheck %s
+
+
+// CHECK-LABEL: define {{[^@]+}}@n_callee._Msve
+// CHECK-SAME: () #[[ATTR0:[0-9]+]] {
+//
+// CHECK-LABEL: define {{[^@]+}}@n_callee._Msimd
+// CHECK-SAME: () #[[ATTR1:[0-9]+]] {
+//
+__arm_locally_streaming __attribute__((target_clones("sve", "simd"))) void n_callee(void) {}
+// CHECK-LABEL: define {{[^@]+}}@n_callee._Msme2
+// CHECK-SAME: () #[[ATTR2:[0-9]+]] {
+//
+__attribute__((target_version("sme2"))) void n_callee(void) {}
+// CHECK-LABEL: define {{[^@]+}}@n_callee.default
+// CHECK-SAME: () #[[ATTR3:[0-9]+]] {
+//
+__attribute__((target_version("default"))) void n_callee(void) {}
+
+
+// CHECK-LABEL: define {{[^@]+}}@s_callee._Msve
+// CHECK-SAME: () #[[ATTR4:[0-9]+]] {
+//
+// CHECK-LABEL: define {{[^@]+}}@s_callee._Msimd
+// CHECK-SAME: () #[[ATTR5:[0-9]+]] {
+//
+__attribute__((target_clones("sve", "simd"))) void s_callee(void) __arm_streaming {}
+// CHECK-LABEL: define {{[^@]+}}@s_callee._Msme2
+// CHECK-SAME: () #[[ATTR6:[0-9]+]] {
+//
+__arm_locally_streaming __attribute__((target_version("sme2"))) void s_callee(void) __arm_streaming {}
+// CHECK-LABEL: define {{[^@]+}}@s_callee.default
+// CHECK-SAME: () #[[ATTR7:[0-9]+]] {
+//
+__attribute__((target_version("default"))) void s_callee(void) __arm_streaming {}
+
+
+// CHECK-LABEL: define {{[^@]+}}@sc_callee._Msve
+// CHECK-SAME: () #[[ATTR8:[0-9]+]] {
+//
+// CHECK-LABEL: define {{[^@]+}}@sc_callee._Msimd
+// CHECK-SAME: () #[[ATTR9:[0-9]+]] {
+//
+__attribute__((target_clones("sve", "simd"))) void sc_callee(void) __arm_streaming_compatible {}
+// CHECK-LABEL: define {{[^@]+}}@sc_callee._Msme2
+// CHECK-SAME: () #[[ATTR10:[0-9]+]] {
+//
+__arm_locally_streaming __attribute__((target_version("sme2"))) void sc_callee(void) __arm_streaming_compatible {}
+// CHECK-LABEL: define {{[^@]+}}@sc_callee.default
+// CHECK-SAME: () #[[ATTR11:[0-9]+]] {
+//
+__attribute__((target_version("default"))) void sc_callee(void) __arm_streaming_compatible {}
+
+
+// CHECK-LABEL: define {{[^@]+}}@n_caller
+// CHECK-SAME: () #[[ATTR3:[0-9]+]] {
+// CHECK:    call void @n_callee()
+// CHECK:    call void @s_callee() #[[ATTR12:[0-9]+]]
+// CHECK:    call void @sc_callee() #[[ATTR13:[0-9]+]]
+//
+void n_caller(void) {
+  n_callee();
+  s_callee();
+  sc_callee();
+}
+
+
+// CHECK-LABEL: define {{[^@]+}}@s_caller
+// CHECK-SAME: () #[[ATTR7:[0-9]+]] {
+// CHECK:    call void @n_callee()
+// CHECK:    call void @s_callee() #[[ATTR12]]
+// CHECK:    call void @sc_callee() #[[ATTR13]]
+//
+void s_caller(void) __arm_streaming {
+  n_callee();
+  s_callee();
+  sc_callee();
+}
+
+
+// CHECK-LABEL: define {{[^@]+}}@sc_caller
+// CHECK-SAME: () #[[ATTR11:[0-9]+]] {
+// CHECK:    call void @n_callee()
+// CHECK:    call void @s_callee() #[[ATTR12]]
+// CHECK:    call void @sc_callee() #[[ATTR13]]
+//
+void sc_caller(void) __arm_streaming_compatible {
+  n_callee();
+  s_callee();
+  sc_callee();
+}
+
+
+// CHECK: attributes #[[ATTR0:[0-9]+]] = {{.*}} "aarch64_pstate_sm_body"
+// CHECK: attributes #[[ATTR1:[0-9]+]] = {{.*}} "aarch64_pstate_sm_body"
+// CHECK: attributes #[[ATTR2:[0-9]+]] = {{.*}}
+// CHECK: attributes #[[ATTR3]] = {{.*}}
+// CHECK: attributes #[[ATTR4:[0-9]+]] = {{.*}} "aarch64_pstate_sm_enabled"
+// CHECK: attributes #[[ATTR5:[0-9]+]] = {{.*}} "aarch64_pstate_sm_enabled"
+// CHECK: attributes #[[ATTR6:[0-9]+]] = {{.*}} "aarch64_pstate_sm_body" "aarch64_pstate_sm_enabled"
+// CHECK: attributes #[[ATTR7]] = {{.*}} "aarch64_pstate_sm_enabled"
+// CHECK: attributes #[[ATTR8:[0-9]+]] = {{.*}} "aarch64_pstate_sm_compatible"
+// CHECK: attributes #[[ATTR9:[0-9]+]] = {{.*}} "aarch64_pstate_sm_compatible"
+// CHECK: attributes #[[ATTR10]] = {{.*}} "aarch64_pstate_sm_body" "aarch64_pstate_sm_compatible"
+// CHECK: attributes #[[ATTR11]] = {{.*}} "aarch64_pstate_sm_compatible"
+// CHECK: attributes #[[ATTR12]] = {{.*}} "aarch64_pstate_sm_enabled"
+// CHECK: attributes #[[ATTR13]] = {{.*}} "aarch64_pstate_sm_compatible"
diff --git a/clang/test/Sema/aarch64-fmv-streaming.c b/clang/test/Sema/aarch64-fmv-streaming.c
new file mode 100644
index 0000000000000..93b7656216c0c
--- /dev/null
+++ b/clang/test/Sema/aarch64-fmv-streaming.c
@@ -0,0 +1,46 @@
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme -Waarch64-sme-attributes -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme -Waarch64-sme-attributes -fsyntax-only -verify=expected-cpp -x c++ %s
+
+__attribute__((target_clones("sve", "simd"))) void ok_arm_streaming(void) __arm_streaming {}
+__arm_locally_streaming __attribute__((target_version("sme2"))) void ok_arm_streaming(void) __arm_streaming {}
+__attribute__((target_version("default"))) void ok_arm_streaming(void) __arm_streaming {}
+
+__attribute__((target_clones("sve", "simd"))) void ok_arm_streaming_compatible(void) __arm_streaming_compatible {}
+__arm_locally_streaming __attribute__((target_version("sme2"))) void ok_arm_streaming_compatible(void) __arm_streaming_compatible {}
+__attribute__((target_version("default"))) void ok_arm_streaming_compatible(void) __arm_streaming_compatible {}
+
+__arm_locally_streaming __attribute__((target_clones("sve", "simd"))) void ok_no_streaming(void) {}
+__attribute__((target_version("sme2"))) void ok_no_streaming(void) {}
+__attribute__((target_version("default"))) void ok_no_streaming(void) {}
+
+__attribute__((target_clones("sve", "simd"))) void bad_mixed_streaming(void) {}
+// expected-cpp-error@+2 {{multiversioned function declaration has a different calling convention}}
+// expected-error@+1 {{multiversioned function declaration has a different calling convention}}
+__attribute__((target_version("sme2"))) void bad_mixed_streaming(void) __arm_streaming {}
+// expected-cpp-error@+2 {{multiversioned function declaration has a different calling convention}}
+// expected-error@+1 {{multiversioned function declaration has a different calling convention}}
+__attribute__((target_version("default"))) void bad_mixed_streaming(void) __arm_streaming_compatible {}
+// expected-cpp-error@+2 {{multiversioned function declaration has a different calling convention}}
+// expected-error@+1 {{multiversioned function declaration has a different calling convention}}
+__arm_locally_streaming __attribute__((target_version("dotprod"))) void bad_mixed_streaming(void) __arm_streaming {}
+
+void n_caller(void) {
+  ok_arm_streaming();
+  ok_arm_streaming_compatible();
+  ok_no_streaming();
+  bad_mixed_streaming();
+}
+
+void s_caller(void) __arm_streaming {
+  ok_arm_streaming();
+  ok_arm_streaming_compatible();
+  ok_no_streaming();
+  bad_mixed_streaming();
+}
+
+void sc_caller(void) __arm_streaming_compatible {
+  ok_arm_streaming();
+  ok_arm_streaming_compatible();
+  ok_no_streaming();
+  bad_mixed_streaming();
+}
diff --git a/clang/test/Sema/aarch64-sme-func-attrs.c b/clang/test/Sema/aarch64-sme-func-attrs.c
index 6db39d6a71e36..0c263eb2610cf 100644
--- a/clang/test/Sema/aarch64-sme-func-attrs.c
+++ b/clang/test/Sema/aarch64-sme-func-attrs.c
@@ -455,48 +455,6 @@ void unimplemented_spill_fill_za(void (*share_zt0_only)(void) __arm_inout("zt0")
   share_zt0_only();
 }
 
-// expected-cpp-error@+2 {{streaming function cannot be multi-versioned}}
-// expected-error@+1 {{streaming function cannot be multi-versioned}}
-__attribute__((target_version("sme2")))
-void cannot_work_version(void) __arm_streaming {}
-// expected-cpp-error@+5 {{function declared 'void ()' was previously declared 'void () __arm_streaming', which has different SME function attributes}}
-// expected-cpp-note@-2 {{previous declaration is here}}
-// expected-error@+3 {{function declared 'void (void)' was previously declared 'void (void) __arm_streaming', which has different SME function attributes}}
-// expected-note@-4 {{previous declaration is here}}
-__attribute__((target_version("default")))
-void cannot_work_version(void) {}
-
-
-// expected-cpp-error@+2 {{streaming function cannot be multi-versioned}}
-// expected-error@+1 {{streaming function cannot be multi-versioned}}
-__attribute__((target_clones("sme2")))
-void cannot_work_clones(void) __arm_streaming {}
-
-
-__attribute__((target("sme2")))
-void just_fine_streaming(void) __arm_streaming {}
-__attribute__((target_version("sme2")))
-void just_fine(void) { just_fine_streaming(); }
-__attribute__((target_version("default")))
-void just_fine(void) {}
-
-
-__arm_locally_streaming
-__attribute__((target_version("sme2")))
-void incompatible_locally_streaming(void) {}
-// expected-error@-1 {{attribute 'target_version' multiversioning cannot be combined with attribute '__arm_locally_streaming'}}
-// expected-cpp-error@-2 {{attribute 'target_version' multiversioning cannot be combined with attribute '__arm_locally_streaming'}}
-__attribute__((target_version("default")))
-void incompatible_locally_streaming(void) {}
-
-
-void fmv_caller() {
-    cannot_work_version();
-    cannot_work_clones();
-    just_fine();
-    incompatible_locally_streaming();
-}
-
 void sme_streaming_with_vl_arg(__SVInt8_t a) __arm_streaming { }
 
 __SVInt8_t sme_streaming_returns_vl(void) __arm_streaming { __SVInt8_t r; return r; }

>From 742576dc3b332d0f67e883b445f482a51ea1feec Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Tue, 30 Jul 2024 09:25:03 +0200
Subject: [PATCH 80/91] [Sanitizers] Avoid overload ambiguity for interceptors
 (#100986)

Since glibc 2.40 some functions like openat make use of overloads when
built with `-D_FORTIFY_SOURCE=2`, see:
https://github.com/bminor/glibc/blob/master/io/bits/fcntl2.h

This means that doing something like `(uintptr_t) openat` or `(void *)
openat` is now ambiguous, breaking the compiler-rt build on new glibc
versions.

Fix this by explicitly casting the symbol to the expected function type
before casting it to an intptr. The expected type is obtained as
`decltype(REAL(func))` so we don't have to repeat the signature from
INTERCEPTOR in the INTERCEPT_FUNCTION macro.
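
As a rough illustration of the disambiguation (not the compiler-rt code
itself), the sketch below uses ordinary C++ overloads as a stand-in for the
fortified glibc declarations. The names `my_openat`, `real_my_openat`,
`selected_overload_address` and `call_selected_overload` are hypothetical;
`real_my_openat` plays the role of `REAL(func)`.

```cpp
#include <cstdint>

// Hypothetical overload set, standing in for a fortified libc function.
int my_openat(int, const char *, int) { return 3; }
int my_openat(int, const char *, int, unsigned) { return 4; }

// Hypothetical stand-in for REAL(func): a pointer of the expected type.
int (*real_my_openat)(int, const char *, int) = nullptr;

// (std::uintptr_t)&my_openat would not compile here, because the name refers
// to an overload set. Casting to the expected pointer type first selects the
// overload whose signature matches that type.
inline std::uintptr_t selected_overload_address() {
  return (std::uintptr_t)(decltype(real_my_openat))&my_openat;
}

// Calls through the selected pointer to show which overload was chosen.
inline int call_selected_overload() {
  auto fp = (decltype(real_my_openat))&my_openat;
  return fp(0, "", 0);
}
```

Because the target type comes from `decltype` of the existing pointer, the
signature is written only once, which is the same reason the patch reuses
`decltype(REAL(func))` rather than repeating the interceptor's signature.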

Fixes https://github.com/llvm/llvm-project/issues/100754.

(cherry picked from commit 155b7a12820ec45095988b6aa6e057afaf2bc892)
---
 .../lib/interception/interception_linux.h        | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/compiler-rt/lib/interception/interception_linux.h b/compiler-rt/lib/interception/interception_linux.h
index 433a3d9bd7fa7..2e01ff44578c3 100644
--- a/compiler-rt/lib/interception/interception_linux.h
+++ b/compiler-rt/lib/interception/interception_linux.h
@@ -28,12 +28,14 @@ bool InterceptFunction(const char *name, const char *ver, uptr *ptr_to_real,
                        uptr func, uptr trampoline);
 }  // namespace __interception
 
-#define INTERCEPT_FUNCTION_LINUX_OR_FREEBSD(func) \
-  ::__interception::InterceptFunction(            \
-      #func,                                      \
-      (::__interception::uptr *)&REAL(func),      \
-      (::__interception::uptr)&(func),            \
-      (::__interception::uptr)&TRAMPOLINE(func))
+// Cast func to type of REAL(func) before casting to uptr in case it is an
+// overloaded function, which is the case for some glibc functions when
+// _FORTIFY_SOURCE is used. This disambiguates which overload to use.
+#define INTERCEPT_FUNCTION_LINUX_OR_FREEBSD(func)            \
+  ::__interception::InterceptFunction(                       \
+      #func, (::__interception::uptr *)&REAL(func),          \
+      (::__interception::uptr)(decltype(REAL(func)))&(func), \
+      (::__interception::uptr) &TRAMPOLINE(func))
 
 // dlvsym is a GNU extension supported by some other platforms.
 #if SANITIZER_GLIBC || SANITIZER_FREEBSD || SANITIZER_NETBSD
@@ -41,7 +43,7 @@ bool InterceptFunction(const char *name, const char *ver, uptr *ptr_to_real,
   ::__interception::InterceptFunction(                        \
       #func, symver,                                          \
       (::__interception::uptr *)&REAL(func),                  \
-      (::__interception::uptr)&(func),                        \
+      (::__interception::uptr)(decltype(REAL(func)))&(func),  \
       (::__interception::uptr)&TRAMPOLINE(func))
 #else
 #define INTERCEPT_FUNCTION_VER_LINUX_OR_FREEBSD(func, symver) \

>From 03ae9f9fc62b0283505d2d363118b04dd5d947a8 Mon Sep 17 00:00:00 2001
From: Fangrui Song <i at maskray.me>
Date: Tue, 30 Jul 2024 14:52:29 -0700
Subject: [PATCH 81/91] Revert "[MC] Compute fragment offsets eagerly"

This reverts commit 1a47f3f3db66589c11f8ddacfeaecc03fb80c510.

Fix #100283

This commit is actually a trigger of other preexisting problems:

* Size change of fill fragments does not influence the fixed-point iteration.
* The `invalid number of bytes` error is reported too early, since
  `.zero A-B` might be temporarily negative in the first few
  iterations.

However, the problems appeared at least "benign" (did not affect the
Linux kernel builds) before this commit.
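
For orientation, the lazy scheme this revert restores (a per-section
`HasLayout` flag checked by `ensureValid`, as in the diff below) can be
sketched in isolation. This is a simplified model, not the MC code: the
`Fragment` and `Section` types and the `grow_fragment` helper are
hypothetical, and bundling is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Fragment {
  std::uint64_t size;
  std::uint64_t offset;
};

struct Section {
  std::vector<Fragment> fragments;
  bool has_layout = false; // cached offsets are valid only when true
};

// Recompute offsets only if the cached layout was invalidated.
inline void ensure_valid(Section &sec) {
  if (sec.has_layout)
    return;
  sec.has_layout = true;
  std::uint64_t offset = 0;
  for (Fragment &f : sec.fragments) {
    f.offset = offset;
    offset += f.size;
  }
}

// Every offset query goes through ensure_valid, so stale offsets are
// never observed.
inline std::uint64_t fragment_offset(Section &sec, std::size_t index) {
  ensure_valid(sec);
  return sec.fragments[index].offset;
}

// Hypothetical relaxation step: changing a size invalidates the layout.
inline void grow_fragment(Section &sec, std::size_t index,
                          std::uint64_t new_size) {
  sec.fragments[index].size = new_size;
  sec.has_layout = false;
}
```

In this model a size change costs nothing until the next offset query,
which is when the section is laid out again.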

(cherry picked from commit 4eb5450f630849ee0518487de38d857fbe5b1aee)
---
 llvm/include/llvm/MC/MCAsmBackend.h           |  5 +-
 llvm/include/llvm/MC/MCAssembler.h            |  4 +-
 llvm/include/llvm/MC/MCSection.h              |  5 ++
 llvm/lib/MC/MCAssembler.cpp                   | 77 +++++++++----------
 llvm/lib/MC/MCSection.cpp                     |  4 +-
 .../MCTargetDesc/HexagonAsmBackend.cpp        |  4 +-
 .../Target/X86/MCTargetDesc/X86AsmBackend.cpp | 26 +++++--
 7 files changed, 71 insertions(+), 54 deletions(-)

diff --git a/llvm/include/llvm/MC/MCAsmBackend.h b/llvm/include/llvm/MC/MCAsmBackend.h
index d1d1814dd8b52..3f88ac02cd92a 100644
--- a/llvm/include/llvm/MC/MCAsmBackend.h
+++ b/llvm/include/llvm/MC/MCAsmBackend.h
@@ -217,9 +217,8 @@ class MCAsmBackend {
   virtual bool writeNopData(raw_ostream &OS, uint64_t Count,
                             const MCSubtargetInfo *STI) const = 0;
 
-  // Return true if fragment offsets have been adjusted and an extra layout
-  // iteration is needed.
-  virtual bool finishLayout(const MCAssembler &Asm) const { return false; }
+  /// Give backend an opportunity to finish layout after relaxation
+  virtual void finishLayout(MCAssembler const &Asm) const {}
 
   /// Handle any target-specific assembler flags. By default, do nothing.
   virtual void handleAssemblerFlag(MCAssemblerFlag Flag) {}
diff --git a/llvm/include/llvm/MC/MCAssembler.h b/llvm/include/llvm/MC/MCAssembler.h
index d9752912ee66a..c6fa48128d189 100644
--- a/llvm/include/llvm/MC/MCAssembler.h
+++ b/llvm/include/llvm/MC/MCAssembler.h
@@ -111,7 +111,6 @@ class MCAssembler {
   /// Check whether the given fragment needs relaxation.
   bool fragmentNeedsRelaxation(const MCRelaxableFragment *IF) const;
 
-  void layoutSection(MCSection &Sec);
   /// Perform one layout iteration and return true if any offsets
   /// were adjusted.
   bool layoutOnce();
@@ -148,9 +147,10 @@ class MCAssembler {
   uint64_t computeFragmentSize(const MCFragment &F) const;
 
   void layoutBundle(MCFragment *Prev, MCFragment *F) const;
+  void ensureValid(MCSection &Sec) const;
 
   // Get the offset of the given fragment inside its containing section.
-  uint64_t getFragmentOffset(const MCFragment &F) const { return F.Offset; }
+  uint64_t getFragmentOffset(const MCFragment &F) const;
 
   uint64_t getSectionAddressSize(const MCSection &Sec) const;
   uint64_t getSectionFileSize(const MCSection &Sec) const;
diff --git a/llvm/include/llvm/MC/MCSection.h b/llvm/include/llvm/MC/MCSection.h
index 1289d6f6f9f65..dcdcd094fa17b 100644
--- a/llvm/include/llvm/MC/MCSection.h
+++ b/llvm/include/llvm/MC/MCSection.h
@@ -99,6 +99,8 @@ class MCSection {
   /// Whether this section has had instructions emitted into it.
   bool HasInstructions : 1;
 
+  bool HasLayout : 1;
+
   bool IsRegistered : 1;
 
   bool IsText : 1;
@@ -167,6 +169,9 @@ class MCSection {
   bool hasInstructions() const { return HasInstructions; }
   void setHasInstructions(bool Value) { HasInstructions = Value; }
 
+  bool hasLayout() const { return HasLayout; }
+  void setHasLayout(bool Value) { HasLayout = Value; }
+
   bool isRegistered() const { return IsRegistered; }
   void setIsRegistered(bool Value) { IsRegistered = Value; }
 
diff --git a/llvm/lib/MC/MCAssembler.cpp b/llvm/lib/MC/MCAssembler.cpp
index ceeb7af0fecc4..c3da4bb5cc363 100644
--- a/llvm/lib/MC/MCAssembler.cpp
+++ b/llvm/lib/MC/MCAssembler.cpp
@@ -432,6 +432,28 @@ void MCAssembler::layoutBundle(MCFragment *Prev, MCFragment *F) const {
       DF->Offset = EF->Offset;
 }
 
+void MCAssembler::ensureValid(MCSection &Sec) const {
+  if (Sec.hasLayout())
+    return;
+  Sec.setHasLayout(true);
+  MCFragment *Prev = nullptr;
+  uint64_t Offset = 0;
+  for (MCFragment &F : Sec) {
+    F.Offset = Offset;
+    if (isBundlingEnabled() && F.hasInstructions()) {
+      layoutBundle(Prev, &F);
+      Offset = F.Offset;
+    }
+    Offset += computeFragmentSize(F);
+    Prev = &F;
+  }
+}
+
+uint64_t MCAssembler::getFragmentOffset(const MCFragment &F) const {
+  ensureValid(*F.getParent());
+  return F.Offset;
+}
+
 // Simple getSymbolOffset helper for the non-variable case.
 static bool getLabelOffset(const MCAssembler &Asm, const MCSymbol &S,
                            bool ReportError, uint64_t &Val) {
@@ -916,20 +938,22 @@ void MCAssembler::layout() {
 
   // Layout until everything fits.
   this->HasLayout = true;
-  for (MCSection &Sec : *this)
-    layoutSection(Sec);
   while (layoutOnce()) {
+    if (getContext().hadError())
+      return;
+    // Size of fragments in one section can depend on the size of fragments in
+    // another. If any fragment has changed size, we have to re-layout (and
+    // as a result possibly further relax) all.
+    for (MCSection &Sec : *this)
+      Sec.setHasLayout(false);
   }
 
   DEBUG_WITH_TYPE("mc-dump", {
       errs() << "assembler backend - post-relaxation\n--\n";
       dump(); });
 
-  // Some targets might want to adjust fragment offsets. If so, perform another
-  // layout loop.
-  if (getBackend().finishLayout(*this))
-    for (MCSection &Sec : *this)
-      layoutSection(Sec);
+  // Finalize the layout, including fragment lowering.
+  getBackend().finishLayout(*this);
 
   DEBUG_WITH_TYPE("mc-dump", {
       errs() << "assembler backend - final-layout\n--\n";
@@ -1282,42 +1306,15 @@ bool MCAssembler::relaxFragment(MCFragment &F) {
   }
 }
 
-void MCAssembler::layoutSection(MCSection &Sec) {
-  MCFragment *Prev = nullptr;
-  uint64_t Offset = 0;
-  for (MCFragment &F : Sec) {
-    F.Offset = Offset;
-    if (LLVM_UNLIKELY(isBundlingEnabled())) {
-      if (F.hasInstructions()) {
-        layoutBundle(Prev, &F);
-        Offset = F.Offset;
-      }
-      Prev = &F;
-    }
-    Offset += computeFragmentSize(F);
-  }
-}
-
 bool MCAssembler::layoutOnce() {
   ++stats::RelaxationSteps;
 
-  // Size of fragments in one section can depend on the size of fragments in
-  // another. If any fragment has changed size, we have to re-layout (and
-  // as a result possibly further relax) all.
-  bool ChangedAny = false;
-  for (MCSection &Sec : *this) {
-    for (;;) {
-      bool Changed = false;
-      for (MCFragment &F : Sec)
-        if (relaxFragment(F))
-          Changed = true;
-      ChangedAny |= Changed;
-      if (!Changed)
-        break;
-      layoutSection(Sec);
-    }
-  }
-  return ChangedAny;
+  bool Changed = false;
+  for (MCSection &Sec : *this)
+    for (MCFragment &Frag : Sec)
+      if (relaxFragment(Frag))
+        Changed = true;
+  return Changed;
 }
 
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
diff --git a/llvm/lib/MC/MCSection.cpp b/llvm/lib/MC/MCSection.cpp
index 97e87a41c8ce5..8c2ee5635a49c 100644
--- a/llvm/lib/MC/MCSection.cpp
+++ b/llvm/lib/MC/MCSection.cpp
@@ -23,8 +23,8 @@ using namespace llvm;
 MCSection::MCSection(SectionVariant V, StringRef Name, bool IsText,
                      bool IsVirtual, MCSymbol *Begin)
     : Begin(Begin), BundleGroupBeforeFirstInst(false), HasInstructions(false),
-      IsRegistered(false), IsText(IsText), IsVirtual(IsVirtual), Name(Name),
-      Variant(V) {
+      HasLayout(false), IsRegistered(false), IsText(IsText),
+      IsVirtual(IsVirtual), Name(Name), Variant(V) {
   DummyFragment.setParent(this);
   // The initial subsection number is 0. Create a fragment list.
   CurFragList = &Subsections.emplace_back(0u, FragList{}).second;
diff --git a/llvm/lib/Target/Hexagon/MCTargetDesc/HexagonAsmBackend.cpp b/llvm/lib/Target/Hexagon/MCTargetDesc/HexagonAsmBackend.cpp
index 1570493b765ca..6acc37e599f2e 100644
--- a/llvm/lib/Target/Hexagon/MCTargetDesc/HexagonAsmBackend.cpp
+++ b/llvm/lib/Target/Hexagon/MCTargetDesc/HexagonAsmBackend.cpp
@@ -702,7 +702,7 @@ class HexagonAsmBackend : public MCAsmBackend {
     return true;
   }
 
-  bool finishLayout(const MCAssembler &Asm) const override {
+  void finishLayout(MCAssembler const &Asm) const override {
     SmallVector<MCFragment *> Frags;
     for (MCSection &Sec : Asm) {
       Frags.clear();
@@ -747,6 +747,7 @@ class HexagonAsmBackend : public MCAsmBackend {
               //assert(!Error);
               (void)Error;
               ReplaceInstruction(Asm.getEmitter(), RF, Inst);
+              Sec.setHasLayout(false);
               Size = 0; // Only look back one instruction
               break;
             }
@@ -756,7 +757,6 @@ class HexagonAsmBackend : public MCAsmBackend {
         }
       }
     }
-    return true;
   }
 }; // class HexagonAsmBackend
 
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
index fcc61d0a5e2f6..67d993a51ad97 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
@@ -201,7 +201,7 @@ class X86AsmBackend : public MCAsmBackend {
   bool padInstructionEncoding(MCRelaxableFragment &RF, MCCodeEmitter &Emitter,
                               unsigned &RemainingSize) const;
 
-  bool finishLayout(const MCAssembler &Asm) const override;
+  void finishLayout(const MCAssembler &Asm) const override;
 
   unsigned getMaximumNopSize(const MCSubtargetInfo &STI) const override;
 
@@ -856,7 +856,7 @@ bool X86AsmBackend::padInstructionEncoding(MCRelaxableFragment &RF,
   return Changed;
 }
 
-bool X86AsmBackend::finishLayout(const MCAssembler &Asm) const {
+void X86AsmBackend::finishLayout(MCAssembler const &Asm) const {
   // See if we can further relax some instructions to cut down on the number of
   // nop bytes required for code alignment.  The actual win is in reducing
   // instruction count, not number of bytes.  Modern X86-64 can easily end up
@@ -864,7 +864,7 @@ bool X86AsmBackend::finishLayout(const MCAssembler &Asm) const {
   // (i.e. eliminate nops) even at the cost of increasing the size and
   // complexity of others.
   if (!X86PadForAlign && !X86PadForBranchAlign)
-    return false;
+    return;
 
   // The processed regions are delimitered by LabeledFragments. -g may have more
   // MCSymbols and therefore different relaxation results. X86PadForAlign is
@@ -911,6 +911,9 @@ bool X86AsmBackend::finishLayout(const MCAssembler &Asm) const {
         continue;
       }
 
+#ifndef NDEBUG
+      const uint64_t OrigOffset = Asm.getFragmentOffset(F);
+#endif
       const uint64_t OrigSize = Asm.computeFragmentSize(F);
 
       // To keep the effects local, prefer to relax instructions closest to
@@ -923,7 +926,8 @@ bool X86AsmBackend::finishLayout(const MCAssembler &Asm) const {
         // Give the backend a chance to play any tricks it wishes to increase
         // the encoding size of the given instruction.  Target independent code
         // will try further relaxation, but target's may play further tricks.
-        padInstructionEncoding(RF, Asm.getEmitter(), RemainingSize);
+        if (padInstructionEncoding(RF, Asm.getEmitter(), RemainingSize))
+          Sec.setHasLayout(false);
 
         // If we have an instruction which hasn't been fully relaxed, we can't
         // skip past it and insert bytes before it.  Changing its starting
@@ -940,6 +944,14 @@ bool X86AsmBackend::finishLayout(const MCAssembler &Asm) const {
       if (F.getKind() == MCFragment::FT_BoundaryAlign)
         cast<MCBoundaryAlignFragment>(F).setSize(RemainingSize);
 
+#ifndef NDEBUG
+      const uint64_t FinalOffset = Asm.getFragmentOffset(F);
+      const uint64_t FinalSize = Asm.computeFragmentSize(F);
+      assert(OrigOffset + OrigSize == FinalOffset + FinalSize &&
+             "can't move start of next fragment!");
+      assert(FinalSize == RemainingSize && "inconsistent size computation?");
+#endif
+
       // If we're looking at a boundary align, make sure we don't try to pad
       // its target instructions for some following directive.  Doing so would
       // break the alignment of the current boundary align.
@@ -953,7 +965,11 @@ bool X86AsmBackend::finishLayout(const MCAssembler &Asm) const {
     }
   }
 
-  return true;
+  // The layout is done. Mark every fragment as valid.
+  for (MCSection &Section : Asm) {
+    Asm.getFragmentOffset(*Section.curFragList()->Tail);
+    Asm.computeFragmentSize(*Section.curFragList()->Tail);
+  }
 }
 
 unsigned X86AsmBackend::getMaximumNopSize(const MCSubtargetInfo &STI) const {

>From b14801954e346a3d2f89f4047f0b0bf457bb0194 Mon Sep 17 00:00:00 2001
From: Piyou Chen <piyou.chen at sifive.com>
Date: Wed, 31 Jul 2024 00:54:03 -0700
Subject: [PATCH 82/91] Revert "[compiler-rt][RISCV] Implement
 __init_riscv_feature_bits (#85790)"

This reverts commit a41a4ac78294c728fb70a51623c602ea7f3e308a.
---
 compiler-rt/lib/builtins/CMakeLists.txt       |   1 -
 compiler-rt/lib/builtins/riscv/feature_bits.c | 298 ------------------
 2 files changed, 299 deletions(-)
 delete mode 100644 compiler-rt/lib/builtins/riscv/feature_bits.c

diff --git a/compiler-rt/lib/builtins/CMakeLists.txt b/compiler-rt/lib/builtins/CMakeLists.txt
index 88a5998fd4610..abea8c498f7bd 100644
--- a/compiler-rt/lib/builtins/CMakeLists.txt
+++ b/compiler-rt/lib/builtins/CMakeLists.txt
@@ -739,7 +739,6 @@ endif()
 set(powerpc64le_SOURCES ${powerpc64_SOURCES})
 
 set(riscv_SOURCES
-  riscv/feature_bits.c
   riscv/fp_mode.c
   riscv/save.S
   riscv/restore.S
diff --git a/compiler-rt/lib/builtins/riscv/feature_bits.c b/compiler-rt/lib/builtins/riscv/feature_bits.c
deleted file mode 100644
index 77422935bd2d3..0000000000000
--- a/compiler-rt/lib/builtins/riscv/feature_bits.c
+++ /dev/null
@@ -1,298 +0,0 @@
-//=== feature_bits.c - Update RISC-V Feature Bits Structure -*- C -*-=========//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-
-#define RISCV_FEATURE_BITS_LENGTH 1
-struct {
-  unsigned length;
-  unsigned long long features[RISCV_FEATURE_BITS_LENGTH];
-} __riscv_feature_bits __attribute__((visibility("hidden"), nocommon));
-
-#define RISCV_VENDOR_FEATURE_BITS_LENGTH 1
-struct {
-  unsigned vendorID;
-  unsigned length;
-  unsigned long long features[RISCV_VENDOR_FEATURE_BITS_LENGTH];
-} __riscv_vendor_feature_bits __attribute__((visibility("hidden"), nocommon));
-
-// NOTE: Should sync-up with RISCVFeatures.td
-// TODO: Maybe generate a header from tablegen then include it.
-#define A_GROUPID 0
-#define A_BITMASK (1ULL << 0)
-#define C_GROUPID 0
-#define C_BITMASK (1ULL << 2)
-#define D_GROUPID 0
-#define D_BITMASK (1ULL << 3)
-#define F_GROUPID 0
-#define F_BITMASK (1ULL << 5)
-#define I_GROUPID 0
-#define I_BITMASK (1ULL << 8)
-#define M_GROUPID 0
-#define M_BITMASK (1ULL << 12)
-#define V_GROUPID 0
-#define V_BITMASK (1ULL << 21)
-#define ZACAS_GROUPID 0
-#define ZACAS_BITMASK (1ULL << 26)
-#define ZBA_GROUPID 0
-#define ZBA_BITMASK (1ULL << 27)
-#define ZBB_GROUPID 0
-#define ZBB_BITMASK (1ULL << 28)
-#define ZBC_GROUPID 0
-#define ZBC_BITMASK (1ULL << 29)
-#define ZBKB_GROUPID 0
-#define ZBKB_BITMASK (1ULL << 30)
-#define ZBKC_GROUPID 0
-#define ZBKC_BITMASK (1ULL << 31)
-#define ZBKX_GROUPID 0
-#define ZBKX_BITMASK (1ULL << 32)
-#define ZBS_GROUPID 0
-#define ZBS_BITMASK (1ULL << 33)
-#define ZFA_GROUPID 0
-#define ZFA_BITMASK (1ULL << 34)
-#define ZFH_GROUPID 0
-#define ZFH_BITMASK (1ULL << 35)
-#define ZFHMIN_GROUPID 0
-#define ZFHMIN_BITMASK (1ULL << 36)
-#define ZICBOZ_GROUPID 0
-#define ZICBOZ_BITMASK (1ULL << 37)
-#define ZICOND_GROUPID 0
-#define ZICOND_BITMASK (1ULL << 38)
-#define ZIHINTNTL_GROUPID 0
-#define ZIHINTNTL_BITMASK (1ULL << 39)
-#define ZIHINTPAUSE_GROUPID 0
-#define ZIHINTPAUSE_BITMASK (1ULL << 40)
-#define ZKND_GROUPID 0
-#define ZKND_BITMASK (1ULL << 41)
-#define ZKNE_GROUPID 0
-#define ZKNE_BITMASK (1ULL << 42)
-#define ZKNH_GROUPID 0
-#define ZKNH_BITMASK (1ULL << 43)
-#define ZKSED_GROUPID 0
-#define ZKSED_BITMASK (1ULL << 44)
-#define ZKSH_GROUPID 0
-#define ZKSH_BITMASK (1ULL << 45)
-#define ZKT_GROUPID 0
-#define ZKT_BITMASK (1ULL << 46)
-#define ZTSO_GROUPID 0
-#define ZTSO_BITMASK (1ULL << 47)
-#define ZVBB_GROUPID 0
-#define ZVBB_BITMASK (1ULL << 48)
-#define ZVBC_GROUPID 0
-#define ZVBC_BITMASK (1ULL << 49)
-#define ZVFH_GROUPID 0
-#define ZVFH_BITMASK (1ULL << 50)
-#define ZVFHMIN_GROUPID 0
-#define ZVFHMIN_BITMASK (1ULL << 51)
-#define ZVKB_GROUPID 0
-#define ZVKB_BITMASK (1ULL << 52)
-#define ZVKG_GROUPID 0
-#define ZVKG_BITMASK (1ULL << 53)
-#define ZVKNED_GROUPID 0
-#define ZVKNED_BITMASK (1ULL << 54)
-#define ZVKNHA_GROUPID 0
-#define ZVKNHA_BITMASK (1ULL << 55)
-#define ZVKNHB_GROUPID 0
-#define ZVKNHB_BITMASK (1ULL << 56)
-#define ZVKSED_GROUPID 0
-#define ZVKSED_BITMASK (1ULL << 57)
-#define ZVKSH_GROUPID 0
-#define ZVKSH_BITMASK (1ULL << 58)
-#define ZVKT_GROUPID 0
-#define ZVKT_BITMASK (1ULL << 59)
-
-#if defined(__linux__)
-
-static long syscall_impl_5_args(long number, long arg1, long arg2, long arg3,
-                                long arg4, long arg5) {
-  register long a7 __asm__("a7") = number;
-  register long a0 __asm__("a0") = arg1;
-  register long a1 __asm__("a1") = arg2;
-  register long a2 __asm__("a2") = arg3;
-  register long a3 __asm__("a3") = arg4;
-  register long a4 __asm__("a4") = arg5;
-  __asm__ __volatile__("ecall\n\t"
-                       : "=r"(a0)
-                       : "r"(a7), "r"(a0), "r"(a1), "r"(a2), "r"(a3), "r"(a4)
-                       : "memory");
-  return a0;
-}
-
-#define RISCV_HWPROBE_KEY_MVENDORID 0
-#define RISCV_HWPROBE_KEY_MARCHID 1
-#define RISCV_HWPROBE_KEY_MIMPID 2
-#define RISCV_HWPROBE_KEY_BASE_BEHAVIOR 3
-#define RISCV_HWPROBE_BASE_BEHAVIOR_IMA (1ULL << 0)
-#define RISCV_HWPROBE_KEY_IMA_EXT_0 4
-#define RISCV_HWPROBE_IMA_FD (1ULL << 0)
-#define RISCV_HWPROBE_IMA_C (1ULL << 1)
-#define RISCV_HWPROBE_IMA_V (1ULL << 2)
-#define RISCV_HWPROBE_EXT_ZBA (1ULL << 3)
-#define RISCV_HWPROBE_EXT_ZBB (1ULL << 4)
-#define RISCV_HWPROBE_EXT_ZBS (1ULL << 5)
-#define RISCV_HWPROBE_EXT_ZICBOZ (1ULL << 6)
-#define RISCV_HWPROBE_EXT_ZBC (1ULL << 7)
-#define RISCV_HWPROBE_EXT_ZBKB (1ULL << 8)
-#define RISCV_HWPROBE_EXT_ZBKC (1ULL << 9)
-#define RISCV_HWPROBE_EXT_ZBKX (1ULL << 10)
-#define RISCV_HWPROBE_EXT_ZKND (1ULL << 11)
-#define RISCV_HWPROBE_EXT_ZKNE (1ULL << 12)
-#define RISCV_HWPROBE_EXT_ZKNH (1ULL << 13)
-#define RISCV_HWPROBE_EXT_ZKSED (1ULL << 14)
-#define RISCV_HWPROBE_EXT_ZKSH (1ULL << 15)
-#define RISCV_HWPROBE_EXT_ZKT (1ULL << 16)
-#define RISCV_HWPROBE_EXT_ZVBB (1ULL << 17)
-#define RISCV_HWPROBE_EXT_ZVBC (1ULL << 18)
-#define RISCV_HWPROBE_EXT_ZVKB (1ULL << 19)
-#define RISCV_HWPROBE_EXT_ZVKG (1ULL << 20)
-#define RISCV_HWPROBE_EXT_ZVKNED (1ULL << 21)
-#define RISCV_HWPROBE_EXT_ZVKNHA (1ULL << 22)
-#define RISCV_HWPROBE_EXT_ZVKNHB (1ULL << 23)
-#define RISCV_HWPROBE_EXT_ZVKSED (1ULL << 24)
-#define RISCV_HWPROBE_EXT_ZVKSH (1ULL << 25)
-#define RISCV_HWPROBE_EXT_ZVKT (1ULL << 26)
-#define RISCV_HWPROBE_EXT_ZFH (1ULL << 27)
-#define RISCV_HWPROBE_EXT_ZFHMIN (1ULL << 28)
-#define RISCV_HWPROBE_EXT_ZIHINTNTL (1ULL << 29)
-#define RISCV_HWPROBE_EXT_ZVFH (1ULL << 30)
-#define RISCV_HWPROBE_EXT_ZVFHMIN (1ULL << 31)
-#define RISCV_HWPROBE_EXT_ZFA (1ULL << 32)
-#define RISCV_HWPROBE_EXT_ZTSO (1ULL << 33)
-#define RISCV_HWPROBE_EXT_ZACAS (1ULL << 34)
-#define RISCV_HWPROBE_EXT_ZICOND (1ULL << 35)
-#define RISCV_HWPROBE_EXT_ZIHINTPAUSE (1ULL << 36)
-#define RISCV_HWPROBE_KEY_CPUPERF_0 5
-#define RISCV_HWPROBE_MISALIGNED_UNKNOWN (0 << 0)
-#define RISCV_HWPROBE_MISALIGNED_EMULATED (1ULL << 0)
-#define RISCV_HWPROBE_MISALIGNED_SLOW (2 << 0)
-#define RISCV_HWPROBE_MISALIGNED_FAST (3 << 0)
-#define RISCV_HWPROBE_MISALIGNED_UNSUPPORTED (4 << 0)
-#define RISCV_HWPROBE_MISALIGNED_MASK (7 << 0)
-#define RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE 6
-/* Increase RISCV_HWPROBE_MAX_KEY when adding items. */
-
-struct riscv_hwprobe {
-  long long key;
-  unsigned long long value;
-};
-
-#define __NR_riscv_hwprobe 258
-static long initHwProbe(struct riscv_hwprobe *Hwprobes, int len) {
-  return syscall_impl_5_args(__NR_riscv_hwprobe, (long)Hwprobes, len, 0, 0, 0);
-}
-
-#define SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(EXTNAME)                    \
-  SET_SINGLE_IMAEXT_RISCV_FEATURE(RISCV_HWPROBE_EXT_##EXTNAME, EXTNAME)
-
-#define SET_SINGLE_IMAEXT_RISCV_FEATURE(HWPROBE_BITMASK, EXT)                  \
-  SET_SINGLE_RISCV_FEATURE(IMAEXT0Value &HWPROBE_BITMASK, EXT)
-
-#define SET_SINGLE_RISCV_FEATURE(COND, EXT)                                    \
-  if (COND) {                                                                  \
-    SET_RISCV_FEATURE(EXT);                                                    \
-  }
-
-#define SET_RISCV_FEATURE(EXT) features[EXT##_GROUPID] |= EXT##_BITMASK
-
-static void initRISCVFeature(struct riscv_hwprobe Hwprobes[]) {
-
-  // Note: If a hwprobe key is unknown to the kernel, its key field
-  // will be cleared to -1, and its value set to 0.
-  // This unsets all extension bitmask bits.
-
-  // Init vendor extension
-  __riscv_vendor_feature_bits.length = 0;
-  __riscv_vendor_feature_bits.vendorID = Hwprobes[2].value;
-
-  // Init standard extension
-  // TODO: Maybe Extension implied generate from tablegen?
-  __riscv_feature_bits.length = RISCV_FEATURE_BITS_LENGTH;
-
-  unsigned long long features[RISCV_FEATURE_BITS_LENGTH];
-  int i;
-
-  for (i = 0; i < RISCV_FEATURE_BITS_LENGTH; i++)
-    features[i] = 0;
-
-  // Check RISCV_HWPROBE_KEY_BASE_BEHAVIOR
-  unsigned long long BaseValue = Hwprobes[0].value;
-  if (BaseValue & RISCV_HWPROBE_BASE_BEHAVIOR_IMA) {
-    SET_RISCV_FEATURE(I);
-    SET_RISCV_FEATURE(M);
-    SET_RISCV_FEATURE(A);
-  }
-
-  // Check RISCV_HWPROBE_KEY_IMA_EXT_0
-  unsigned long long IMAEXT0Value = Hwprobes[1].value;
-  if (IMAEXT0Value & RISCV_HWPROBE_IMA_FD) {
-    SET_RISCV_FEATURE(F);
-    SET_RISCV_FEATURE(D);
-  }
-
-  SET_SINGLE_IMAEXT_RISCV_FEATURE(RISCV_HWPROBE_IMA_C, C);
-  SET_SINGLE_IMAEXT_RISCV_FEATURE(RISCV_HWPROBE_IMA_V, V);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZBA);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZBB);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZBS);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZICBOZ);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZBC);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZBKB);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZBKC);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZBKX);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZKND);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZKNE);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZKNH);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZKSED);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZKSH);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZKT);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZVBB);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZVBC);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZVKB);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZVKG);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZVKNED);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZVKNHA);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZVKNHB);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZVKSED);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZVKSH);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZVKT);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZFH);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZFHMIN);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZIHINTNTL);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZIHINTPAUSE);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZVFH);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZVFHMIN);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZFA);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZTSO);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZACAS);
-  SET_RISCV_HWPROBE_EXT_SINGLE_RISCV_FEATURE(ZICOND);
-
-  for (i = 0; i < RISCV_FEATURE_BITS_LENGTH; i++)
-    __riscv_feature_bits.features[i] = features[i];
-}
-
-#endif // defined(__linux__)
-
-static int FeaturesBitCached = 0;
-
-void __init_riscv_feature_bits() {
-
-  if (FeaturesBitCached)
-    return;
-
-#if defined(__linux__)
-  struct riscv_hwprobe Hwprobes[] = {
-      {RISCV_HWPROBE_KEY_BASE_BEHAVIOR, 0},
-      {RISCV_HWPROBE_KEY_IMA_EXT_0, 0},
-      {RISCV_HWPROBE_KEY_MVENDORID, 0},
-  };
-  if (initHwProbe(Hwprobes, sizeof(Hwprobes) / sizeof(Hwprobes[0])))
-    return;
-
-  initRISCVFeature(Hwprobes);
-#endif // defined(__linux__)
-
-  FeaturesBitCached = 1;
-}

>From 0e615206e3b2c5f329cd612c09f3237c6060c06e Mon Sep 17 00:00:00 2001
From: Louis Dionne <ldionne.2 at gmail.com>
Date: Wed, 31 Jul 2024 10:40:14 -0400
Subject: [PATCH 83/91] [libc++] Revert "Use GCC type traits builtins for
 remove_cv and remove_cvref (#81386)"

This reverts commit 55357160d0e151c32f86e1d6683b4bddbb706aa1.
This is only being reverted from the LLVM 19 branch as a
convenience to avoid breaking some IDEs which were not ready
for that change.

Fixes #99464
---
 libcxx/include/__type_traits/remove_cv.h    | 15 +++++++++++----
 libcxx/include/__type_traits/remove_cvref.h | 15 +++++----------
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/libcxx/include/__type_traits/remove_cv.h b/libcxx/include/__type_traits/remove_cv.h
index 50e9f3e8aa78d..c4bf612794bd5 100644
--- a/libcxx/include/__type_traits/remove_cv.h
+++ b/libcxx/include/__type_traits/remove_cv.h
@@ -10,6 +10,8 @@
 #define _LIBCPP___TYPE_TRAITS_REMOVE_CV_H
 
 #include <__config>
+#include <__type_traits/remove_const.h>
+#include <__type_traits/remove_volatile.h>
 
 #if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
 #  pragma GCC system_header
@@ -17,18 +19,23 @@
 
 _LIBCPP_BEGIN_NAMESPACE_STD
 
+#if __has_builtin(__remove_cv) && !defined(_LIBCPP_COMPILER_GCC)
 template <class _Tp>
 struct remove_cv {
   using type _LIBCPP_NODEBUG = __remove_cv(_Tp);
 };
 
-#if defined(_LIBCPP_COMPILER_GCC)
 template <class _Tp>
-using __remove_cv_t = typename remove_cv<_Tp>::type;
+using __remove_cv_t = __remove_cv(_Tp);
 #else
 template <class _Tp>
-using __remove_cv_t = __remove_cv(_Tp);
-#endif
+struct _LIBCPP_TEMPLATE_VIS remove_cv {
+  typedef __remove_volatile_t<__remove_const_t<_Tp> > type;
+};
+
+template <class _Tp>
+using __remove_cv_t = __remove_volatile_t<__remove_const_t<_Tp> >;
+#endif // __has_builtin(__remove_cv)
 
 #if _LIBCPP_STD_VER >= 14
 template <class _Tp>
diff --git a/libcxx/include/__type_traits/remove_cvref.h b/libcxx/include/__type_traits/remove_cvref.h
index 55f894dbd1d81..e8e8745ab0960 100644
--- a/libcxx/include/__type_traits/remove_cvref.h
+++ b/libcxx/include/__type_traits/remove_cvref.h
@@ -20,26 +20,21 @@
 
 _LIBCPP_BEGIN_NAMESPACE_STD
 
-#if defined(_LIBCPP_COMPILER_GCC)
+#if __has_builtin(__remove_cvref) && !defined(_LIBCPP_COMPILER_GCC)
 template <class _Tp>
-struct __remove_cvref_gcc {
-  using type = __remove_cvref(_Tp);
-};
-
-template <class _Tp>
-using __remove_cvref_t _LIBCPP_NODEBUG = typename __remove_cvref_gcc<_Tp>::type;
+using __remove_cvref_t _LIBCPP_NODEBUG = __remove_cvref(_Tp);
 #else
 template <class _Tp>
-using __remove_cvref_t _LIBCPP_NODEBUG = __remove_cvref(_Tp);
+using __remove_cvref_t _LIBCPP_NODEBUG = __remove_cv_t<__libcpp_remove_reference_t<_Tp> >;
 #endif // __has_builtin(__remove_cvref)
 
 template <class _Tp, class _Up>
-using __is_same_uncvref = _IsSame<__remove_cvref_t<_Tp>, __remove_cvref_t<_Up> >;
+struct __is_same_uncvref : _IsSame<__remove_cvref_t<_Tp>, __remove_cvref_t<_Up> > {};
 
 #if _LIBCPP_STD_VER >= 20
 template <class _Tp>
 struct remove_cvref {
-  using type _LIBCPP_NODEBUG = __remove_cvref(_Tp);
+  using type _LIBCPP_NODEBUG = __remove_cvref_t<_Tp>;
 };
 
 template <class _Tp>

>From c3004032c244cb5264790dc535437b9c3b93acb6 Mon Sep 17 00:00:00 2001
From: Alexandre Ganea <aganea at havenstudios.com>
Date: Tue, 30 Jul 2024 19:06:03 -0400
Subject: [PATCH 84/91] [Support] Silence warnings when retrieving exported
 functions (#97905)

Since functions exported from DLLs are type-erased, before this patch I
was seeing the new Clang 19 warning `-Wcast-function-type-mismatch`.

This happens when building LLVM on Windows.

Following discussion in
https://github.com/llvm/llvm-project/commit/593f708118aef792f434185547f74fedeaf51dd4#commitcomment-143905744
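
The pattern applied in the diff below, an intermediate cast through
`void *`, can be sketched without the Windows headers. Everything here is
a hypothetical stand-in: `GenericProc` plays the role of the type-erased
pointer returned by `GetProcAddress`, and `lookup_export` plays the role
of the lookup itself.

```cpp
#include <cassert>

// Hypothetical type-erased function pointer, standing in for FARPROC.
using GenericProc = long long (*)();
// The signature the caller actually knows the export to have.
using AddFn = int (*)(int, int);

inline int add(int a, int b) { return a + b; }

// Hypothetical stand-in for ::GetProcAddress(hLib, "...").
inline GenericProc lookup_export() {
  return reinterpret_cast<GenericProc>(reinterpret_cast<void *>(&add));
}

inline AddFn resolve_add() {
  // A direct (AddFn)lookup_export() converts between two mismatched
  // function-pointer types, which is what the warning flags. Going
  // through void * first avoids that direct conversion.
  return (AddFn)(void *)lookup_export();
}
```

Converting between function pointers and `void *` is only conditionally
supported by the C++ standard, although mainstream compilers accept it.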

(cherry picked from commit 39e192b379362e9e645427631c35450d55ed517d)
---
 llvm/lib/Support/Windows/Process.inc |  3 ++-
 llvm/lib/Support/Windows/Signals.inc | 38 +++++++++++++++-------------
 2 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/llvm/lib/Support/Windows/Process.inc b/llvm/lib/Support/Windows/Process.inc
index 34d294b232c32..d525f5b16e862 100644
--- a/llvm/lib/Support/Windows/Process.inc
+++ b/llvm/lib/Support/Windows/Process.inc
@@ -482,7 +482,8 @@ static RTL_OSVERSIONINFOEXW GetWindowsVer() {
     HMODULE hMod = ::GetModuleHandleW(L"ntdll.dll");
     assert(hMod);
 
-    auto getVer = (RtlGetVersionPtr)::GetProcAddress(hMod, "RtlGetVersion");
+    auto getVer =
+        (RtlGetVersionPtr)(void *)::GetProcAddress(hMod, "RtlGetVersion");
     assert(getVer);
 
     RTL_OSVERSIONINFOEXW info{};
diff --git a/llvm/lib/Support/Windows/Signals.inc b/llvm/lib/Support/Windows/Signals.inc
index 29ebf7c696e04..f11ad09f37139 100644
--- a/llvm/lib/Support/Windows/Signals.inc
+++ b/llvm/lib/Support/Windows/Signals.inc
@@ -171,23 +171,27 @@ static bool load64BitDebugHelp(void) {
   HMODULE hLib =
       ::LoadLibraryExA("Dbghelp.dll", NULL, LOAD_LIBRARY_SEARCH_SYSTEM32);
   if (hLib) {
-    fMiniDumpWriteDump =
-        (fpMiniDumpWriteDump)::GetProcAddress(hLib, "MiniDumpWriteDump");
-    fStackWalk64 = (fpStackWalk64)::GetProcAddress(hLib, "StackWalk64");
-    fSymGetModuleBase64 =
-        (fpSymGetModuleBase64)::GetProcAddress(hLib, "SymGetModuleBase64");
-    fSymGetSymFromAddr64 =
-        (fpSymGetSymFromAddr64)::GetProcAddress(hLib, "SymGetSymFromAddr64");
-    fSymGetLineFromAddr64 =
-        (fpSymGetLineFromAddr64)::GetProcAddress(hLib, "SymGetLineFromAddr64");
-    fSymGetModuleInfo64 =
-        (fpSymGetModuleInfo64)::GetProcAddress(hLib, "SymGetModuleInfo64");
-    fSymFunctionTableAccess64 = (fpSymFunctionTableAccess64)::GetProcAddress(
-        hLib, "SymFunctionTableAccess64");
-    fSymSetOptions = (fpSymSetOptions)::GetProcAddress(hLib, "SymSetOptions");
-    fSymInitialize = (fpSymInitialize)::GetProcAddress(hLib, "SymInitialize");
-    fEnumerateLoadedModules = (fpEnumerateLoadedModules)::GetProcAddress(
-        hLib, "EnumerateLoadedModules64");
+    fMiniDumpWriteDump = (fpMiniDumpWriteDump)(void *)::GetProcAddress(
+        hLib, "MiniDumpWriteDump");
+    fStackWalk64 = (fpStackWalk64)(void *)::GetProcAddress(hLib, "StackWalk64");
+    fSymGetModuleBase64 = (fpSymGetModuleBase64)(void *)::GetProcAddress(
+        hLib, "SymGetModuleBase64");
+    fSymGetSymFromAddr64 = (fpSymGetSymFromAddr64)(void *)::GetProcAddress(
+        hLib, "SymGetSymFromAddr64");
+    fSymGetLineFromAddr64 = (fpSymGetLineFromAddr64)(void *)::GetProcAddress(
+        hLib, "SymGetLineFromAddr64");
+    fSymGetModuleInfo64 = (fpSymGetModuleInfo64)(void *)::GetProcAddress(
+        hLib, "SymGetModuleInfo64");
+    fSymFunctionTableAccess64 =
+        (fpSymFunctionTableAccess64)(void *)::GetProcAddress(
+            hLib, "SymFunctionTableAccess64");
+    fSymSetOptions =
+        (fpSymSetOptions)(void *)::GetProcAddress(hLib, "SymSetOptions");
+    fSymInitialize =
+        (fpSymInitialize)(void *)::GetProcAddress(hLib, "SymInitialize");
+    fEnumerateLoadedModules =
+        (fpEnumerateLoadedModules)(void *)::GetProcAddress(
+            hLib, "EnumerateLoadedModules64");
   }
   return isDebugHelpInitialized();
 }

>From 19ebcf8685f2ef010a0bc8474b4bf732024a3576 Mon Sep 17 00:00:00 2001
From: Dimitry Andric <dimitry at andric.com>
Date: Mon, 29 Jul 2024 20:34:01 +0200
Subject: [PATCH 85/91] [InstrProf] Remove duplicate definition of IntPtrT

In 16e74fd48988a (for #82711) a duplicate definition of `IntPtrT` was
added to `InstrProfiling.h`, leading to warnings:

    compiler-rt/lib/profile/InstrProfiling.h:52:15: warning: redefinition of typedef 'IntPtrT' is a C11 feature [-Wtypedef-redefinition]
       52 | typedef void *IntPtrT;
          |               ^
    compiler-rt/lib/profile/InstrProfiling.h:34:15: note: previous definition is here
       34 | typedef void *IntPtrT;
          |               ^

Fix the warnings by removing the duplicate typedef.

(cherry picked from commit 2c376fe96c83443c15e6485d043ebe321904546b)
---
 compiler-rt/lib/profile/InstrProfiling.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/compiler-rt/lib/profile/InstrProfiling.h b/compiler-rt/lib/profile/InstrProfiling.h
index d424a22c212c3..6906d52eacaf1 100644
--- a/compiler-rt/lib/profile/InstrProfiling.h
+++ b/compiler-rt/lib/profile/InstrProfiling.h
@@ -49,7 +49,6 @@ typedef struct ValueProfNode {
 #include "profile/InstrProfData.inc"
 } ValueProfNode;
 
-typedef void *IntPtrT;
 typedef struct COMPILER_RT_ALIGNAS(INSTR_PROF_DATA_ALIGNMENT) VTableProfData {
 #define INSTR_PROF_VTABLE_DATA(Type, LLVMType, Name, Initializer) Type Name;
 #include "profile/InstrProfData.inc"

>From 2d7539381c278dd47c9dd6ecb9943d9685ab66f4 Mon Sep 17 00:00:00 2001
From: Tom Stellard <tstellar at redhat.com>
Date: Thu, 1 Aug 2024 11:23:03 -0700
Subject: [PATCH 86/91] workflows: Fix libclc-tests (#101524)

The old out-of-tree build configuration stopped working and in-tree
builds are supported now, so we should use the in-tree configuration.
The only downside is we can't run the tests any more, but at least we
will be able to test the build again.

(cherry picked from commit 0512ba0a435a9d693cb61f182fc9e3eb7f6dbd6a)
---
 .github/workflows/llvm-project-tests.yml | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/.github/workflows/llvm-project-tests.yml b/.github/workflows/llvm-project-tests.yml
index 0a228c41f354e..17a54be16badc 100644
--- a/.github/workflows/llvm-project-tests.yml
+++ b/.github/workflows/llvm-project-tests.yml
@@ -131,6 +131,7 @@ jobs:
                 -DCMAKE_BUILD_TYPE=Release \
                 -DLLVM_ENABLE_ASSERTIONS=ON \
                 -DLLDB_INCLUDE_TESTS=OFF \
+                -DLIBCLC_TARGETS_TO_BUILD="amdgcn--;amdgcn--amdhsa;r600--;nvptx--;nvptx64--;nvptx--nvidiacl;nvptx64--nvidiacl" \
                 -DCMAKE_C_COMPILER_LAUNCHER=sccache \
                 -DCMAKE_CXX_COMPILER_LAUNCHER=sccache \
                 $extra_cmake_args \
@@ -142,8 +143,6 @@ jobs:
         env:
           LLVM_BUILDDIR: ${{ steps.build-llvm.outputs.llvm-builddir }}
         run: |
-          # Make sure all of LLVM libraries that llvm-config needs are built.
+          # The libclc tests don't have a generated check target so all we can
+          # do is build it.
           ninja -C "$LLVM_BUILDDIR"
-          cmake -G Ninja -S libclc -B libclc-build -DLLVM_DIR="$LLVM_BUILDDIR"/lib/cmake/llvm -DLIBCLC_TARGETS_TO_BUILD="amdgcn--;amdgcn--amdhsa;r600--;nvptx--;nvptx64--;nvptx--nvidiacl;nvptx64--nvidiacl"
-          ninja -C libclc-build
-          ninja -C libclc-build test

>From 23f3b64082ecd06fcfbfbc2098fcaa008862545b Mon Sep 17 00:00:00 2001
From: Dimitry Andric <dimitry at andric.com>
Date: Thu, 1 Aug 2024 09:28:29 +0200
Subject: [PATCH 87/91] [lldb][FreeBSD] Fix
 NativeRegisterContextFreeBSD_{arm,mips64,powerpc} declarations (#101403)

Similar to #97796, fix the type of the `native_thread` parameter for the
arm, mips64 and powerpc variants of `NativeRegisterContextFreeBSD_*`.

Otherwise, this leads to compile errors similar to:

```
lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.cpp:85:39: error: out-of-line definition of 'NativeRegisterContextFreeBSD_powerpc' does not match any declaration in 'lldb_private::process_freebsd::NativeRegisterContextFreeBSD_powerpc'
   85 | NativeRegisterContextFreeBSD_powerpc::NativeRegisterContextFreeBSD_powerpc(
      |                                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
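
A reduced sketch of the mismatch, assuming heavily simplified stand-ins for the lldb classes (`ThreadProtocol`, `ThreadFreeBSD` and `RegisterContextSketch` are hypothetical names): the in-class declaration and the out-of-line definition must agree on the parameter type.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the lldb classes named above.
struct ThreadProtocol {
  virtual ~ThreadProtocol() = default;
};
struct ThreadFreeBSD : ThreadProtocol {
  int tid = 7;
};

class RegisterContextSketch {
public:
  // The declaration must use the same parameter type as the out-of-line
  // definition below; declaring `ThreadProtocol &` here instead would
  // reproduce the "does not match any declaration" error.
  explicit RegisterContextSketch(ThreadFreeBSD &thread);
  int threadId() const { return tid_; }

private:
  int tid_;
};

RegisterContextSketch::RegisterContextSketch(ThreadFreeBSD &thread)
    : tid_(thread.tid) {}

// Small self-check: construction works once both signatures agree.
void checkRegisterContextSketch() {
  ThreadFreeBSD thread;
  RegisterContextSketch ctx(thread);
  assert(ctx.threadId() == 7);
}
```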

(cherry picked from commit 7088a5ed880f29129ec844c66068e8cb61ca98bf)
---
 .../Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h  | 2 +-
 .../Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h       | 2 +-
 .../Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h      | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h
index 89ffa617294aa..b9537e6952f6c 100644
--- a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h
+++ b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h
@@ -30,7 +30,7 @@ class NativeProcessFreeBSD;
 class NativeRegisterContextFreeBSD_arm : public NativeRegisterContextFreeBSD {
 public:
   NativeRegisterContextFreeBSD_arm(const ArchSpec &target_arch,
-                                   NativeThreadProtocol &native_thread);
+                                   NativeThreadFreeBSD &native_thread);
 
   uint32_t GetRegisterSetCount() const override;
 
diff --git a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h
index 0b4a508a7d5dd..286b4fd8d8b99 100644
--- a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h
+++ b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h
@@ -31,7 +31,7 @@ class NativeRegisterContextFreeBSD_mips64
     : public NativeRegisterContextFreeBSD {
 public:
   NativeRegisterContextFreeBSD_mips64(const ArchSpec &target_arch,
-                                      NativeThreadProtocol &native_thread);
+                                      NativeThreadFreeBSD &native_thread);
 
   uint32_t GetRegisterSetCount() const override;
 
diff --git a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h
index 3df371036f915..420db822acc0f 100644
--- a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h
+++ b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h
@@ -31,7 +31,7 @@ class NativeRegisterContextFreeBSD_powerpc
     : public NativeRegisterContextFreeBSD {
 public:
   NativeRegisterContextFreeBSD_powerpc(const ArchSpec &target_arch,
-                                       NativeThreadProtocol &native_thread);
+                                       NativeThreadFreeBSD &native_thread);
 
   uint32_t GetRegisterSetCount() const override;
 

>From 39e8e7797ae868e82b4184fbadf4572ff9bd3aa3 Mon Sep 17 00:00:00 2001
From: Damien L-G <dalg24 at gmail.com>
Date: Thu, 1 Aug 2024 10:39:27 -0400
Subject: [PATCH 88/91] [libc++] Increase atomic_ref's required alignment for
 small types (#99654)

This patch increases the alignment requirement for std::atomic_ref
such that we can guarantee lockfree operations more often. Specifically,
we require types that are 1, 2, 4, 8, or 16 bytes in size to be aligned
to at least their size to be used with std::atomic_ref.

This is the case for most types; however, a notable exception is
`long long` on x86, which is 8 bytes in length but has an alignment
of 4.

As a result of this patch, one has to be more careful about the
alignment of objects used with std::atomic_ref. Failure to provide
a properly-aligned object to std::atomic_ref is a precondition
violation and is technically UB. On the flipside, this allows us
to provide an atomic_ref that is actually lockfree more often,
which is an important QOI property.
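
The rule can be sketched outside libc++ as follows. The two helpers mirror the `__min_alignment` and `required_alignment` expressions from the diff below; `minAlignment`, `requiredAlignment`, `ThreeBytes` and `TwoInts` are hypothetical names chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirrors the __min_alignment expression from the diff: sizeof(T) if it is a
// power of two no larger than 16, otherwise 0 (no extra requirement).
template <class T>
constexpr std::size_t minAlignment() {
  return ((sizeof(T) & (sizeof(T) - 1)) || sizeof(T) > 16) ? 0 : sizeof(T);
}

// Mirrors required_alignment: the larger of alignof(T) and minAlignment<T>().
template <class T>
constexpr std::size_t requiredAlignment() {
  return alignof(T) > minAlignment<T>() ? alignof(T) : minAlignment<T>();
}

// Hypothetical example types.
struct ThreeBytes { char c[3]; };      // size 3: not a power of two
struct TwoInts { std::int32_t a, b; }; // size 8, natural alignment usually 4

void checkRequiredAlignment() {
  assert(requiredAlignment<char>() == 1);
  assert(requiredAlignment<ThreeBytes>() == alignof(ThreeBytes));
  assert(requiredAlignment<TwoInts>() == 8);
  // An object meant for std::atomic_ref would be declared over-aligned,
  // much like the updated test macro does with alignas.
  alignas(requiredAlignment<TwoInts>()) TwoInts obj{};
  assert(reinterpret_cast<std::uintptr_t>(&obj) % 8 == 0);
}
```

In user code the same effect is obtained by declaring the object with `alignas(std::atomic_ref<T>::required_alignment)`, as the updated test in this patch does.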

More information in the discussion at https://github.com/llvm/llvm-project/pull/99570#issuecomment-2237668661.

Co-authored-by: Louis Dionne <ldionne.2 at gmail.com>
(cherry picked from commit 59ca618e3b7aec8c32e24d781bae436dc99b2727)
---
 libcxx/include/__atomic/atomic_ref.h            | 17 +++++++++++------
 .../atomics.ref/is_always_lock_free.pass.cpp    |  2 +-
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/libcxx/include/__atomic/atomic_ref.h b/libcxx/include/__atomic/atomic_ref.h
index 2849b82e1a3dd..b0180a37ab500 100644
--- a/libcxx/include/__atomic/atomic_ref.h
+++ b/libcxx/include/__atomic/atomic_ref.h
@@ -57,11 +57,6 @@ struct __get_aligner_instance {
 
 template <class _Tp>
 struct __atomic_ref_base {
-protected:
-  _Tp* __ptr_;
-
-  _LIBCPP_HIDE_FROM_ABI __atomic_ref_base(_Tp& __obj) : __ptr_(std::addressof(__obj)) {}
-
 private:
   _LIBCPP_HIDE_FROM_ABI static _Tp* __clear_padding(_Tp& __val) noexcept {
     _Tp* __ptr = std::addressof(__val);
@@ -108,10 +103,14 @@ struct __atomic_ref_base {
 
   friend struct __atomic_waitable_traits<__atomic_ref_base<_Tp>>;
 
+  // require types that are 1, 2, 4, 8, or 16 bytes in length to be aligned to at least their size to be potentially
+  // used lock-free
+  static constexpr size_t __min_alignment = (sizeof(_Tp) & (sizeof(_Tp) - 1)) || (sizeof(_Tp) > 16) ? 0 : sizeof(_Tp);
+
 public:
   using value_type = _Tp;
 
-  static constexpr size_t required_alignment = alignof(_Tp);
+  static constexpr size_t required_alignment = alignof(_Tp) > __min_alignment ? alignof(_Tp) : __min_alignment;
 
   // The __atomic_always_lock_free builtin takes into account the alignment of the pointer if provided,
   // so we create a fake pointer with a suitable alignment when querying it. Note that we are guaranteed
@@ -218,6 +217,12 @@ struct __atomic_ref_base {
   }
   _LIBCPP_HIDE_FROM_ABI void notify_one() const noexcept { std::__atomic_notify_one(*this); }
   _LIBCPP_HIDE_FROM_ABI void notify_all() const noexcept { std::__atomic_notify_all(*this); }
+
+protected:
+  typedef _Tp _Aligned_Tp __attribute__((aligned(required_alignment)));
+  _Aligned_Tp* __ptr_;
+
+  _LIBCPP_HIDE_FROM_ABI __atomic_ref_base(_Tp& __obj) : __ptr_(std::addressof(__obj)) {}
 };
 
 template <class _Tp>
diff --git a/libcxx/test/std/atomics/atomics.ref/is_always_lock_free.pass.cpp b/libcxx/test/std/atomics/atomics.ref/is_always_lock_free.pass.cpp
index acdbf63a24d85..78e46c0397951 100644
--- a/libcxx/test/std/atomics/atomics.ref/is_always_lock_free.pass.cpp
+++ b/libcxx/test/std/atomics/atomics.ref/is_always_lock_free.pass.cpp
@@ -54,7 +54,7 @@ void check_always_lock_free(std::atomic_ref<T> const& a) {
 #define CHECK_ALWAYS_LOCK_FREE(T)                                                                                      \
   do {                                                                                                                 \
     typedef T type;                                                                                                    \
-    type obj{};                                                                                                        \
+    alignas(std::atomic_ref<type>::required_alignment) type obj{};                                                     \
     std::atomic_ref<type> a(obj);                                                                                      \
     check_always_lock_free(a);                                                                                         \
   } while (0)

>From 3ee69f240579430c0c0abdc4641ccdf85b4efe92 Mon Sep 17 00:00:00 2001
From: Xing Xue <xingxue at outlook.com>
Date: Thu, 1 Aug 2024 07:25:01 -0400
Subject: [PATCH 89/91] [NFC][libc++][libc++abi][libunwind][test] Fix/unify AIX
 triples used in LIT tests (#101196)

This patch fixes/unifies AIX target triples used in libc++, libc++abi,
and libunwind LIT tests.
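
To illustrate why the trailing `{{.*}}` matters, here is a small sketch using `std::regex`. It assumes that lit treats each `{{...}}` chunk as a regular expression and requires the whole feature string to match; `featureMatches` and the versioned triple are hypothetical and only serve the example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Assumption: lit turns each {{...}} chunk of a REQUIRES clause into a regex
// and requires the whole feature string to match; std::regex_match has the
// same whole-string semantics.
bool featureMatches(const std::string &pattern, const std::string &feature) {
  return std::regex_match(feature, std::regex(pattern));
}

void checkAixPatterns() {
  // Hypothetical versioned triple, of the kind an AIX host may report.
  const std::string feature = "target=powerpc64-ibm-aix7.2.0.0";
  // Old form `target=powerpc64-ibm-aix`: no room for the OS version suffix.
  assert(!featureMatches("target=powerpc64-ibm-aix", feature));
  // New forms from the diff, with the {{...}} delimiters removed.
  assert(featureMatches("target=powerpc64-ibm-aix.*", feature));
  assert(featureMatches("target=.+-aix.*", feature));
  // A non-AIX triple is still rejected by the unified pattern.
  assert(!featureMatches("target=.+-aix.*", "target=x86_64-unknown-linux-gnu"));
}
```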

(cherry picked from commit 2d3655037ccfa276cb0949c2ce0cff56985f6637)
---
 libcxx/test/libcxx/vendor/ibm/bad_function_call.pass.cpp        | 2 +-
 libcxxabi/test/vendor/ibm/aix_xlclang_nested_excp_32.pass.sh.s  | 2 +-
 libcxxabi/test/vendor/ibm/aix_xlclang_nested_excp_64.pass.sh.s  | 2 +-
 .../test/vendor/ibm/aix_xlclang_passing_excp_obj_32.pass.sh.S   | 2 +-
 .../test/vendor/ibm/aix_xlclang_passing_excp_obj_64.pass.sh.S   | 2 +-
 libcxxabi/test/vendor/ibm/cond_reg_restore.pass.cpp             | 2 +-
 libcxxabi/test/vendor/ibm/vec_reg_restore.pass.cpp              | 2 +-
 libunwind/test/aix_signal_unwind.pass.sh.S                      | 2 +-
 8 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/libcxx/test/libcxx/vendor/ibm/bad_function_call.pass.cpp b/libcxx/test/libcxx/vendor/ibm/bad_function_call.pass.cpp
index 2b684465650fa..3714e4037a2dc 100644
--- a/libcxx/test/libcxx/vendor/ibm/bad_function_call.pass.cpp
+++ b/libcxx/test/libcxx/vendor/ibm/bad_function_call.pass.cpp
@@ -6,7 +6,7 @@
 //
 //===----------------------------------------------------------------------===//
 
-// REQUIRES: target={{powerpc.*-ibm-aix.*}}
+// REQUIRES: target={{.+}}-aix{{.*}}
 // ADDITIONAL_COMPILE_FLAGS: -fvisibility-inlines-hidden
 
 // When there is a weak hidden symbol in user code and a strong definition
diff --git a/libcxxabi/test/vendor/ibm/aix_xlclang_nested_excp_32.pass.sh.s b/libcxxabi/test/vendor/ibm/aix_xlclang_nested_excp_32.pass.sh.s
index ce90045586082..b35c999e6e50d 100644
--- a/libcxxabi/test/vendor/ibm/aix_xlclang_nested_excp_32.pass.sh.s
+++ b/libcxxabi/test/vendor/ibm/aix_xlclang_nested_excp_32.pass.sh.s
@@ -9,7 +9,7 @@
 # Test that a nested exception is thrown by a destructor inside a try-block
 # when the code is generated by the legacy AIX xlclang compiler.
 
-# REQUIRES: target=powerpc-ibm-aix
+# REQUIRES: target=powerpc-ibm-aix{{.*}}
 # UNSUPPORTED: no-exceptions
 
 # RUN: %{cxx} %{flags} %s %{link_flags} \
diff --git a/libcxxabi/test/vendor/ibm/aix_xlclang_nested_excp_64.pass.sh.s b/libcxxabi/test/vendor/ibm/aix_xlclang_nested_excp_64.pass.sh.s
index 7b0afb9ebae38..16754db2837ca 100644
--- a/libcxxabi/test/vendor/ibm/aix_xlclang_nested_excp_64.pass.sh.s
+++ b/libcxxabi/test/vendor/ibm/aix_xlclang_nested_excp_64.pass.sh.s
@@ -8,7 +8,7 @@
 # Test that a nested exception is thrown by a destructor inside a try-block
 # when the code is generated by the legacy AIX xlclang compiler.
 
-# REQUIRES: target=powerpc64-ibm-aix
+# REQUIRES: target=powerpc64-ibm-aix{{.*}}
 # UNSUPPORTED: no-exceptions
 
 # RUN: %{cxx} %{flags} %s %{link_flags} \
diff --git a/libcxxabi/test/vendor/ibm/aix_xlclang_passing_excp_obj_32.pass.sh.S b/libcxxabi/test/vendor/ibm/aix_xlclang_passing_excp_obj_32.pass.sh.S
index 71c3ab9409a81..8b92e4febf562 100644
--- a/libcxxabi/test/vendor/ibm/aix_xlclang_passing_excp_obj_32.pass.sh.S
+++ b/libcxxabi/test/vendor/ibm/aix_xlclang_passing_excp_obj_32.pass.sh.S
@@ -14,7 +14,7 @@
 // xlclang++ compiler included in this file. This file tests for the 32-bit
 // mode.
 
-# REQUIRES: target=powerpc-ibm-aix
+# REQUIRES: target=powerpc-ibm-aix{{.*}}
 # UNSUPPORTED: no-exceptions
 
 // RUN: %{cxx} -c %s -o %t1_32.o -DT1_CPP_CODE %{flags} %{compile_flags}
diff --git a/libcxxabi/test/vendor/ibm/aix_xlclang_passing_excp_obj_64.pass.sh.S b/libcxxabi/test/vendor/ibm/aix_xlclang_passing_excp_obj_64.pass.sh.S
index da413577bd38f..64d7c80e9e6dd 100644
--- a/libcxxabi/test/vendor/ibm/aix_xlclang_passing_excp_obj_64.pass.sh.S
+++ b/libcxxabi/test/vendor/ibm/aix_xlclang_passing_excp_obj_64.pass.sh.S
@@ -14,7 +14,7 @@
 // xlclang++ compiler included in this file. This file tests for the 64-bit
 // mode.
 
-# REQUIRES: target=powerpc64-ibm-aix
+# REQUIRES: target=powerpc64-ibm-aix{{.*}}
 # UNSUPPORTED: no-exceptions
 
 // RUN: %{cxx} -c %s -o %t1_64.o -DT1_CPP_CODE %{flags} %{compile_flags}
diff --git a/libcxxabi/test/vendor/ibm/cond_reg_restore.pass.cpp b/libcxxabi/test/vendor/ibm/cond_reg_restore.pass.cpp
index 63817e1b13a25..a5eb3c20534a3 100644
--- a/libcxxabi/test/vendor/ibm/cond_reg_restore.pass.cpp
+++ b/libcxxabi/test/vendor/ibm/cond_reg_restore.pass.cpp
@@ -10,7 +10,7 @@
 // on AIX. Option -O3 is required so that the compiler will re-use the value
 // in the condition register instead of re-evaluating the condition expression.
 
-// REQUIRES: target=powerpc{{(64)?}}-ibm-aix
+// REQUIRES: target={{.+}}-aix{{.*}}
 // ADDITIONAL_COMPILE_FLAGS: -O3
 // UNSUPPORTED: no-exceptions
 
diff --git a/libcxxabi/test/vendor/ibm/vec_reg_restore.pass.cpp b/libcxxabi/test/vendor/ibm/vec_reg_restore.pass.cpp
index 703c311dae392..7c31970546993 100644
--- a/libcxxabi/test/vendor/ibm/vec_reg_restore.pass.cpp
+++ b/libcxxabi/test/vendor/ibm/vec_reg_restore.pass.cpp
@@ -9,7 +9,7 @@
 // Check that the PowerPC vector registers are restored properly during
 // unwinding. Option -mabi=vec-extabi is required to compile the test case.
 
-// REQUIRES: target=powerpc{{(64)?}}-ibm-aix
+// REQUIRES: target={{.+}}-aix{{.*}}
 // ADDITIONAL_COMPILE_FLAGS: -mabi=vec-extabi
 // UNSUPPORTED: no-exceptions
 
diff --git a/libunwind/test/aix_signal_unwind.pass.sh.S b/libunwind/test/aix_signal_unwind.pass.sh.S
index a666577d095b1..2c0cf140fe267 100644
--- a/libunwind/test/aix_signal_unwind.pass.sh.S
+++ b/libunwind/test/aix_signal_unwind.pass.sh.S
@@ -10,7 +10,7 @@
 // a correct traceback when the function raising the signal does not save
 // the link register or does not store the stack back chain.
 
-// REQUIRES: target=powerpc{{(64)?}}-ibm-aix{{.*}}
+// REQUIRES: target={{.+}}-aix{{.*}}
 
 // Test when the function raising the signal does not save the link register
 // RUN: %{cxx} -x c++ %s -o %t.exe -DCXX_CODE %{flags} %{compile_flags}

>From 142499d9a21309c7c5bacf34c35bb42fbffb7a8f Mon Sep 17 00:00:00 2001
From: Fangrui Song <i at maskray.me>
Date: Thu, 1 Aug 2024 10:22:03 -0700
Subject: [PATCH 90/91] [ELF] Support relocatable files using CREL with
 explicit addends

... using the temporary section type code 0x40000020
(`clang -c -Wa,--crel,--allow-experimental-crel`). LLVM will change the
code and break compatibility (Clang and lld of different versions are
not guaranteed to cooperate, unlike other features). CREL with implicit
addends is not supported.

---

Introduce `RelsOrRelas::crels` to iterate over SHT_CREL sections and
update users to check `crels`.

(The decoding performance is critical and error checking is difficult.
Follow `skipLeb` and `R_*LEB128` handling, do not use
`llvm::decodeULEB128`, which compiles to a lot of code.)
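
For readers unfamiliar with the encoding, a compact ULEB128 decoder without error checking might look like the sketch below. This is not lld's actual helper; `readUleb128` is a hypothetical name, and the function assumes well-formed input of at most ten bytes.

```cpp
#include <cassert>
#include <cstdint>

// Decodes one ULEB128 value starting at p and advances p past it.
// No error checking: malformed input (an unterminated or overlong sequence)
// is not detected, which is the trade-off described above.
std::uint64_t readUleb128(const std::uint8_t *&p) {
  std::uint64_t value = 0;
  unsigned shift = 0;
  std::uint8_t byte;
  do {
    byte = *p++;
    // The low 7 bits carry payload; the high bit says "more bytes follow".
    value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return value;
}

// Small self-check using the classic three-byte encoding of 624485.
void checkReadUleb128() {
  const std::uint8_t buf[] = {0xe5, 0x8e, 0x26, 0x7f};
  const std::uint8_t *p = buf;
  assert(readUleb128(p) == 624485);
  assert(p == buf + 3);
  assert(readUleb128(p) == 127);
  assert(p == buf + 4);
}
```

A checked decoder would additionally need the end of the buffer and a way to report failure, which is where most of the extra code size comes from.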

A few users (e.g. .eh_frame, LLDDwarfObj, s390x) require random access. Pass
`/*supportsCrel=*/false` to `relsOrRelas` to allocate a buffer and
convert CREL to RELA (`relas` instead of `crels` will be used). Since
allocating a buffer increases memory usage, the conversion is only
performed when absolutely necessary.

---

Non-alloc SHT_CREL sections may be created in -r and --emit-relocs
links. SHT_CREL and SHT_RELA components need reencoding since
r_offset/r_symidx/r_type/r_addend may change. (r_type may change because
relocations referencing a symbol in a discarded section are converted to
`R_*_NONE`).

* SHT_CREL components: decode with `RelsOrRelas` and re-encode (`OutputSection::finalizeNonAllocCrel`)
* SHT_RELA components: convert to CREL (`relToCrel`). An output section can only have one relocation section.
* SHT_REL components: print an error for now.

SHT_REL to SHT_CREL conversion for -r/--emit-relocs is complex and
not yet supported.

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

Pull Request: https://github.com/llvm/llvm-project/pull/98115

(cherry picked from commit 0af07c078798b7c427e2981377781b5cc555a568)
---
 lld/ELF/DWARF.cpp                          |   3 +-
 lld/ELF/ICF.cpp                            |   8 +-
 lld/ELF/InputFiles.cpp                     |   1 +
 lld/ELF/InputFiles.h                       |   1 +
 lld/ELF/InputSection.cpp                   |  67 ++++++++---
 lld/ELF/InputSection.h                     |  14 ++-
 lld/ELF/LinkerScript.cpp                   |   2 +
 lld/ELF/MarkLive.cpp                       |  12 +-
 lld/ELF/OutputSections.cpp                 | 132 ++++++++++++++++++++-
 lld/ELF/OutputSections.h                   |   6 +
 lld/ELF/Relocations.cpp                    |  38 ++++--
 lld/ELF/Relocations.h                      |  92 ++++++++++++++
 lld/ELF/SyntheticSections.cpp              |   6 +-
 lld/ELF/Writer.cpp                         |  13 +-
 lld/test/ELF/crel-rel-mixed.s              |  22 ++++
 lld/test/ELF/crel.s                        |  90 ++++++++++++++
 lld/test/ELF/debug-names.s                 |   2 +-
 lld/test/ELF/gc-sections.s                 |   4 +
 lld/test/ELF/icf1.s                        |   3 +
 lld/test/ELF/icf4.s                        |   2 +-
 lld/test/ELF/linkerscript/nocrossrefs.test |   4 +-
 lld/test/ELF/relocatable-crel-32.s         |  71 +++++++++++
 lld/test/ELF/relocatable-crel.s            | 107 +++++++++++++++++
 23 files changed, 656 insertions(+), 44 deletions(-)
 create mode 100644 lld/test/ELF/crel-rel-mixed.s
 create mode 100644 lld/test/ELF/crel.s
 create mode 100644 lld/test/ELF/relocatable-crel-32.s
 create mode 100644 lld/test/ELF/relocatable-crel.s

diff --git a/lld/ELF/DWARF.cpp b/lld/ELF/DWARF.cpp
index 5d58e0c60a952..517d26810a378 100644
--- a/lld/ELF/DWARF.cpp
+++ b/lld/ELF/DWARF.cpp
@@ -136,7 +136,8 @@ template <class ELFT>
 std::optional<RelocAddrEntry>
 LLDDwarfObj<ELFT>::find(const llvm::DWARFSection &s, uint64_t pos) const {
   auto &sec = static_cast<const LLDDWARFSection &>(s);
-  const RelsOrRelas<ELFT> rels = sec.sec->template relsOrRelas<ELFT>();
+  const RelsOrRelas<ELFT> rels =
+      sec.sec->template relsOrRelas<ELFT>(/*supportsCrel=*/false);
   if (rels.areRelocsRel())
     return findAux(*sec.sec, pos, rels.rels);
   return findAux(*sec.sec, pos, rels.relas);
diff --git a/lld/ELF/ICF.cpp b/lld/ELF/ICF.cpp
index a6b52d78fa806..44e8a71cc6286 100644
--- a/lld/ELF/ICF.cpp
+++ b/lld/ELF/ICF.cpp
@@ -324,6 +324,8 @@ bool ICF<ELFT>::equalsConstant(const InputSection *a, const InputSection *b) {
 
   const RelsOrRelas<ELFT> ra = a->template relsOrRelas<ELFT>();
   const RelsOrRelas<ELFT> rb = b->template relsOrRelas<ELFT>();
+  if (ra.areRelocsCrel())
+    return constantEq(a, ra.crels, b, rb.crels);
   return ra.areRelocsRel() || rb.areRelocsRel()
              ? constantEq(a, ra.rels, b, rb.rels)
              : constantEq(a, ra.relas, b, rb.relas);
@@ -374,6 +376,8 @@ template <class ELFT>
 bool ICF<ELFT>::equalsVariable(const InputSection *a, const InputSection *b) {
   const RelsOrRelas<ELFT> ra = a->template relsOrRelas<ELFT>();
   const RelsOrRelas<ELFT> rb = b->template relsOrRelas<ELFT>();
+  if (ra.areRelocsCrel())
+    return variableEq(a, ra.crels, b, rb.crels);
   return ra.areRelocsRel() || rb.areRelocsRel()
              ? variableEq(a, ra.rels, b, rb.rels)
              : variableEq(a, ra.relas, b, rb.relas);
@@ -505,7 +509,9 @@ template <class ELFT> void ICF<ELFT>::run() {
   for (unsigned cnt = 0; cnt != 2; ++cnt) {
     parallelForEach(sections, [&](InputSection *s) {
       const RelsOrRelas<ELFT> rels = s->template relsOrRelas<ELFT>();
-      if (rels.areRelocsRel())
+      if (rels.areRelocsCrel())
+        combineRelocHashes(cnt, s, rels.crels);
+      else if (rels.areRelocsRel())
         combineRelocHashes(cnt, s, rels.rels);
       else
         combineRelocHashes(cnt, s, rels.relas);
diff --git a/lld/ELF/InputFiles.cpp b/lld/ELF/InputFiles.cpp
index 03ff4eadfe670..f1c0eb292361b 100644
--- a/lld/ELF/InputFiles.cpp
+++ b/lld/ELF/InputFiles.cpp
@@ -834,6 +834,7 @@ void ObjFile<ELFT>::initializeSections(bool ignoreComdats,
     case SHT_STRTAB:
     case SHT_REL:
     case SHT_RELA:
+    case SHT_CREL:
     case SHT_NULL:
       break;
     case SHT_PROGBITS:
diff --git a/lld/ELF/InputFiles.h b/lld/ELF/InputFiles.h
index 0617f41e1e13a..8566baf61e1ab 100644
--- a/lld/ELF/InputFiles.h
+++ b/lld/ELF/InputFiles.h
@@ -84,6 +84,7 @@ class InputFile {
     assert(fileKind == ObjKind || fileKind == BinaryKind);
     return sections;
   }
+  void cacheDecodedCrel(size_t i, InputSectionBase *s) { sections[i] = s; }
 
   // Returns object file symbols. It is a runtime error to call this
   // function on files of other types.
diff --git a/lld/ELF/InputSection.cpp b/lld/ELF/InputSection.cpp
index 7857d857488c0..570e485455bad 100644
--- a/lld/ELF/InputSection.cpp
+++ b/lld/ELF/InputSection.cpp
@@ -133,21 +133,56 @@ void InputSectionBase::decompress() const {
   compressed = false;
 }
 
-template <class ELFT> RelsOrRelas<ELFT> InputSectionBase::relsOrRelas() const {
+template <class ELFT>
+RelsOrRelas<ELFT> InputSectionBase::relsOrRelas(bool supportsCrel) const {
   if (relSecIdx == 0)
     return {};
   RelsOrRelas<ELFT> ret;
-  typename ELFT::Shdr shdr =
-      cast<ELFFileBase>(file)->getELFShdrs<ELFT>()[relSecIdx];
+  auto *f = cast<ObjFile<ELFT>>(file);
+  typename ELFT::Shdr shdr = f->template getELFShdrs<ELFT>()[relSecIdx];
+  if (shdr.sh_type == SHT_CREL) {
+    // Return an iterator if supported by caller.
+    if (supportsCrel) {
+      ret.crels = Relocs<typename ELFT::Crel>(
+          (const uint8_t *)f->mb.getBufferStart() + shdr.sh_offset);
+      return ret;
+    }
+    InputSectionBase *const &relSec = f->getSections()[relSecIdx];
+    // Otherwise, allocate a buffer to hold the decoded RELA relocations. When
+    // called for the first time, relSec is null (without --emit-relocs) or an
+    // InputSection with zero eqClass[0].
+    if (!relSec || !cast<InputSection>(relSec)->eqClass[0]) {
+      auto *sec = makeThreadLocal<InputSection>(*f, shdr, name);
+      f->cacheDecodedCrel(relSecIdx, sec);
+      sec->type = SHT_RELA;
+      sec->eqClass[0] = SHT_RELA;
+
+      RelocsCrel<ELFT::Is64Bits> entries(sec->content_);
+      sec->size = entries.size() * sizeof(typename ELFT::Rela);
+      auto *relas = makeThreadLocalN<typename ELFT::Rela>(entries.size());
+      sec->content_ = reinterpret_cast<uint8_t *>(relas);
+      for (auto [i, r] : llvm::enumerate(entries)) {
+        relas[i].r_offset = r.r_offset;
+        relas[i].setSymbolAndType(r.r_symidx, r.r_type, false);
+        relas[i].r_addend = r.r_addend;
+      }
+    }
+    ret.relas = {ArrayRef(
+        reinterpret_cast<const typename ELFT::Rela *>(relSec->content_),
+        relSec->size / sizeof(typename ELFT::Rela))};
+    return ret;
+  }
+
+  const void *content = f->mb.getBufferStart() + shdr.sh_offset;
+  size_t size = shdr.sh_size;
   if (shdr.sh_type == SHT_REL) {
-    ret.rels = ArrayRef(reinterpret_cast<const typename ELFT::Rel *>(
-                            file->mb.getBufferStart() + shdr.sh_offset),
-                        shdr.sh_size / sizeof(typename ELFT::Rel));
+    ret.rels = {ArrayRef(reinterpret_cast<const typename ELFT::Rel *>(content),
+                         size / sizeof(typename ELFT::Rel))};
   } else {
     assert(shdr.sh_type == SHT_RELA);
-    ret.relas = ArrayRef(reinterpret_cast<const typename ELFT::Rela *>(
-                             file->mb.getBufferStart() + shdr.sh_offset),
-                         shdr.sh_size / sizeof(typename ELFT::Rela));
+    ret.relas = {
+        ArrayRef(reinterpret_cast<const typename ELFT::Rela *>(content),
+                 size / sizeof(typename ELFT::Rela))};
   }
   return ret;
 }
@@ -1248,7 +1283,7 @@ SyntheticSection *EhInputSection::getParent() const {
 // .eh_frame is a sequence of CIE or FDE records.
 // This function splits an input section into records and returns them.
 template <class ELFT> void EhInputSection::split() {
-  const RelsOrRelas<ELFT> rels = relsOrRelas<ELFT>();
+  const RelsOrRelas<ELFT> rels = relsOrRelas<ELFT>(/*supportsCrel=*/false);
   // getReloc expects the relocations to be sorted by r_offset. See the comment
   // in scanRelocs.
   if (rels.areRelocsRel()) {
@@ -1414,10 +1449,14 @@ template void InputSection::writeTo<ELF32BE>(uint8_t *);
 template void InputSection::writeTo<ELF64LE>(uint8_t *);
 template void InputSection::writeTo<ELF64BE>(uint8_t *);
 
-template RelsOrRelas<ELF32LE> InputSectionBase::relsOrRelas<ELF32LE>() const;
-template RelsOrRelas<ELF32BE> InputSectionBase::relsOrRelas<ELF32BE>() const;
-template RelsOrRelas<ELF64LE> InputSectionBase::relsOrRelas<ELF64LE>() const;
-template RelsOrRelas<ELF64BE> InputSectionBase::relsOrRelas<ELF64BE>() const;
+template RelsOrRelas<ELF32LE>
+InputSectionBase::relsOrRelas<ELF32LE>(bool) const;
+template RelsOrRelas<ELF32BE>
+InputSectionBase::relsOrRelas<ELF32BE>(bool) const;
+template RelsOrRelas<ELF64LE>
+InputSectionBase::relsOrRelas<ELF64LE>(bool) const;
+template RelsOrRelas<ELF64BE>
+InputSectionBase::relsOrRelas<ELF64BE>(bool) const;
 
 template MergeInputSection::MergeInputSection(ObjFile<ELF32LE> &,
                                               const ELF32LE::Shdr &, StringRef);
diff --git a/lld/ELF/InputSection.h b/lld/ELF/InputSection.h
index c89a545e1543f..6659530a9c9c2 100644
--- a/lld/ELF/InputSection.h
+++ b/lld/ELF/InputSection.h
@@ -35,17 +35,21 @@ class OutputSection;
 
 LLVM_LIBRARY_VISIBILITY extern std::vector<Partition> partitions;
 
-// Returned by InputSectionBase::relsOrRelas. At least one member is empty.
+// Returned by InputSectionBase::relsOrRelas. At most one member is empty.
 template <class ELFT> struct RelsOrRelas {
   Relocs<typename ELFT::Rel> rels;
   Relocs<typename ELFT::Rela> relas;
+  Relocs<typename ELFT::Crel> crels;
   bool areRelocsRel() const { return rels.size(); }
+  bool areRelocsCrel() const { return crels.size(); }
 };
 
 #define invokeOnRelocs(sec, f, ...)                                            \
   {                                                                            \
     const RelsOrRelas<ELFT> rs = (sec).template relsOrRelas<ELFT>();           \
-    if (rs.areRelocsRel())                                                     \
+    if (rs.areRelocsCrel())                                                    \
+      f(__VA_ARGS__, rs.crels);                                                \
+    else if (rs.areRelocsRel())                                                \
       f(__VA_ARGS__, rs.rels);                                                 \
     else                                                                       \
       f(__VA_ARGS__, rs.relas);                                                \
@@ -209,7 +213,8 @@ class InputSectionBase : public SectionBase {
   // used by --gc-sections.
   InputSectionBase *nextInSectionGroup = nullptr;
 
-  template <class ELFT> RelsOrRelas<ELFT> relsOrRelas() const;
+  template <class ELFT>
+  RelsOrRelas<ELFT> relsOrRelas(bool supportsCrel = true) const;
 
   // InputSections that are dependent on us (reverse dependency for GC)
   llvm::TinyPtrVector<InputSection *> dependentSections;
@@ -483,7 +488,8 @@ class SyntheticSection : public InputSection {
 };
 
 inline bool isStaticRelSecType(uint32_t type) {
-  return type == llvm::ELF::SHT_RELA || type == llvm::ELF::SHT_REL;
+  return type == llvm::ELF::SHT_RELA || type == llvm::ELF::SHT_CREL ||
+         type == llvm::ELF::SHT_REL;
 }
 
 inline bool isDebugSection(const InputSectionBase &sec) {
diff --git a/lld/ELF/LinkerScript.cpp b/lld/ELF/LinkerScript.cpp
index e2208da18dce0..055fa21d44ca6 100644
--- a/lld/ELF/LinkerScript.cpp
+++ b/lld/ELF/LinkerScript.cpp
@@ -61,6 +61,8 @@ static StringRef getOutputSectionName(const InputSectionBase *s) {
         assert(config->relocatable && (rel->flags & SHF_LINK_ORDER));
         return s->name;
       }
+      if (s->type == SHT_CREL)
+        return saver().save(".crel" + out->name);
       if (s->type == SHT_RELA)
         return saver().save(".rela" + out->name);
       return saver().save(".rel" + out->name);
diff --git a/lld/ELF/MarkLive.cpp b/lld/ELF/MarkLive.cpp
index 45431e44a6c8c..16e5883c2002c 100644
--- a/lld/ELF/MarkLive.cpp
+++ b/lld/ELF/MarkLive.cpp
@@ -85,6 +85,13 @@ static uint64_t getAddend(InputSectionBase &sec,
   return rel.r_addend;
 }
 
+// Currently, we assume all input CREL relocations have an explicit addend.
+template <class ELFT>
+static uint64_t getAddend(InputSectionBase &sec,
+                          const typename ELFT::Crel &rel) {
+  return rel.r_addend;
+}
+
 template <class ELFT>
 template <class RelTy>
 void MarkLive<ELFT>::resolveReloc(InputSectionBase &sec, RelTy &rel,
@@ -239,7 +246,8 @@ template <class ELFT> void MarkLive<ELFT>::run() {
   // all of them. We also want to preserve personality routines and LSDA
   // referenced by .eh_frame sections, so we scan them for that here.
   for (EhInputSection *eh : ctx.ehInputSections) {
-    const RelsOrRelas<ELFT> rels = eh->template relsOrRelas<ELFT>();
+    const RelsOrRelas<ELFT> rels =
+        eh->template relsOrRelas<ELFT>(/*supportsCrel=*/false);
     if (rels.areRelocsRel())
       scanEhFrameSection(*eh, rels.rels);
     else if (rels.relas.size())
@@ -310,6 +318,8 @@ template <class ELFT> void MarkLive<ELFT>::mark() {
       resolveReloc(sec, rel, false);
     for (const typename ELFT::Rela &rel : rels.relas)
       resolveReloc(sec, rel, false);
+    for (const typename ELFT::Crel &rel : rels.crels)
+      resolveReloc(sec, rel, false);
 
     for (InputSectionBase *isec : sec.dependentSections)
       enqueue(isec, 0);
diff --git a/lld/ELF/OutputSections.cpp b/lld/ELF/OutputSections.cpp
index 60de10061c53d..29f18f89274f3 100644
--- a/lld/ELF/OutputSections.cpp
+++ b/lld/ELF/OutputSections.cpp
@@ -18,6 +18,7 @@
 #include "llvm/BinaryFormat/Dwarf.h"
 #include "llvm/Config/llvm-config.h" // LLVM_ENABLE_ZLIB
 #include "llvm/Support/Compression.h"
+#include "llvm/Support/LEB128.h"
 #include "llvm/Support/Parallel.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/TimeProfiler.h"
@@ -115,7 +116,19 @@ void OutputSection::recordSection(InputSectionBase *isec) {
 // other InputSections.
 void OutputSection::commitSection(InputSection *isec) {
   if (LLVM_UNLIKELY(type != isec->type)) {
-    if (hasInputSections || typeIsSet) {
+    if (!hasInputSections && !typeIsSet) {
+      type = isec->type;
+    } else if (isStaticRelSecType(type) && isStaticRelSecType(isec->type) &&
+               (type == SHT_CREL) != (isec->type == SHT_CREL)) {
+      // Combine mixed SHT_REL[A] and SHT_CREL to SHT_CREL.
+      type = SHT_CREL;
+      if (type == SHT_REL) {
+        if (name.consume_front(".rel"))
+          name = saver().save(".crel" + name);
+      } else if (name.consume_front(".rela")) {
+        name = saver().save(".crel" + name);
+      }
+    } else {
       if (typeIsSet || !canMergeToProgbits(type) ||
           !canMergeToProgbits(isec->type)) {
         // The (NOLOAD) changes the section type to SHT_NOBITS, the intention is
@@ -133,8 +146,6 @@ void OutputSection::commitSection(InputSection *isec) {
       }
       if (!typeIsSet)
         type = SHT_PROGBITS;
-    } else {
-      type = isec->type;
     }
   }
   if (!hasInputSections) {
@@ -470,6 +481,11 @@ void OutputSection::writeTo(uint8_t *buf, parallel::TaskGroup &tg) {
   llvm::TimeTraceScope timeScope("Write sections", name);
   if (type == SHT_NOBITS)
     return;
+  if (type == SHT_CREL && !(flags & SHF_ALLOC)) {
+    buf += encodeULEB128(crelHeader, buf);
+    memcpy(buf, crelBody.data(), crelBody.size());
+    return;
+  }
 
   // If the section is compressed due to
   // --compress-debug-section/--compress-sections, the content is already known.
@@ -505,6 +521,12 @@ void OutputSection::writeTo(uint8_t *buf, parallel::TaskGroup &tg) {
   if (nonZeroFiller)
     fill(buf, sections.empty() ? size : sections[0]->outSecOff, filler);
 
+  if (type == SHT_CREL && !(flags & SHF_ALLOC)) {
+    buf += encodeULEB128(crelHeader, buf);
+    memcpy(buf, crelBody.data(), crelBody.size());
+    return;
+  }
+
   auto fn = [=](size_t begin, size_t end) {
     size_t numSections = sections.size();
     for (size_t i = begin; i != end; ++i) {
@@ -592,6 +614,103 @@ static void finalizeShtGroup(OutputSection *os, InputSection *section) {
   os->size = (1 + seen.size()) * sizeof(uint32_t);
 }
 
+template <class uint>
+LLVM_ATTRIBUTE_ALWAYS_INLINE static void
+encodeOneCrel(raw_svector_ostream &os, Elf_Crel<sizeof(uint) == 8> &out,
+              uint offset, const Symbol &sym, uint32_t type, uint addend) {
+  const auto deltaOffset = static_cast<uint64_t>(offset - out.r_offset);
+  out.r_offset = offset;
+  int64_t symidx = in.symTab->getSymbolIndex(sym);
+  if (sym.type == STT_SECTION) {
+    auto *d = dyn_cast<Defined>(&sym);
+    if (d) {
+      SectionBase *section = d->section;
+      assert(section->isLive());
+      addend = sym.getVA(addend) - section->getOutputSection()->addr;
+    } else {
+      // Encode R_*_NONE(symidx=0).
+      symidx = type = addend = 0;
+    }
+  }
+
+  // Similar to llvm::ELF::encodeCrel.
+  uint8_t b = deltaOffset * 8 + (out.r_symidx != symidx) +
+              (out.r_type != type ? 2 : 0) +
+              (uint(out.r_addend) != addend ? 4 : 0);
+  if (deltaOffset < 0x10) {
+    os << char(b);
+  } else {
+    os << char(b | 0x80);
+    encodeULEB128(deltaOffset >> 4, os);
+  }
+  if (b & 1) {
+    encodeSLEB128(static_cast<int32_t>(symidx - out.r_symidx), os);
+    out.r_symidx = symidx;
+  }
+  if (b & 2) {
+    encodeSLEB128(static_cast<int32_t>(type - out.r_type), os);
+    out.r_type = type;
+  }
+  if (b & 4) {
+    encodeSLEB128(std::make_signed_t<uint>(addend - out.r_addend), os);
+    out.r_addend = addend;
+  }
+}
+
+template <class ELFT>
+static size_t relToCrel(raw_svector_ostream &os, Elf_Crel<ELFT::Is64Bits> &out,
+                        InputSection *relSec, InputSectionBase *sec) {
+  const auto &file = *cast<ELFFileBase>(relSec->file);
+  if (relSec->type == SHT_REL) {
+    // REL conversion is complex and unsupported yet.
+    errorOrWarn(toString(relSec) + ": REL cannot be converted to CREL");
+    return 0;
+  }
+  auto rels = relSec->getDataAs<typename ELFT::Rela>();
+  for (auto rel : rels) {
+    encodeOneCrel<typename ELFT::uint>(
+        os, out, sec->getVA(rel.r_offset), file.getRelocTargetSym(rel),
+        rel.getType(config->isMips64EL), getAddend<ELFT>(rel));
+  }
+  return rels.size();
+}
+
+// Compute the content of a non-alloc CREL section due to -r or --emit-relocs.
+// Input CREL sections are decoded while REL[A] need to be converted.
+template <bool is64> void OutputSection::finalizeNonAllocCrel() {
+  using uint = typename Elf_Crel_Impl<is64>::uint;
+  raw_svector_ostream os(crelBody);
+  uint64_t totalCount = 0;
+  Elf_Crel<is64> out{};
+  assert(commands.size() == 1);
+  auto *isd = cast<InputSectionDescription>(commands[0]);
+  for (InputSection *relSec : isd->sections) {
+    const auto &file = *cast<ELFFileBase>(relSec->file);
+    InputSectionBase *sec = relSec->getRelocatedSection();
+    if (relSec->type == SHT_CREL) {
+      RelocsCrel<is64> entries(relSec->content_);
+      totalCount += entries.size();
+      for (Elf_Crel_Impl<is64> r : entries) {
+        encodeOneCrel<uint>(os, out, uint(sec->getVA(r.r_offset)),
+                            file.getSymbol(r.r_symidx), r.r_type, r.r_addend);
+      }
+      continue;
+    }
+
+    // Convert REL[A] to CREL.
+    if constexpr (is64) {
+      totalCount += config->isLE ? relToCrel<ELF64LE>(os, out, relSec, sec)
+                                 : relToCrel<ELF64BE>(os, out, relSec, sec);
+    } else {
+      totalCount += config->isLE ? relToCrel<ELF32LE>(os, out, relSec, sec)
+                                 : relToCrel<ELF32BE>(os, out, relSec, sec);
+    }
+  }
+
+  crelHeader = totalCount * 8 + 4;
+  size = getULEB128Size(crelHeader) + crelBody.size();
+}
+
 void OutputSection::finalize() {
   InputSection *first = getFirstInputSection(this);
 
@@ -628,6 +747,13 @@ void OutputSection::finalize() {
   InputSectionBase *s = first->getRelocatedSection();
   info = s->getOutputSection()->sectionIndex;
   flags |= SHF_INFO_LINK;
+  // Finalize the content of non-alloc CREL.
+  if (type == SHT_CREL) {
+    if (config->is64)
+      finalizeNonAllocCrel<true>();
+    else
+      finalizeNonAllocCrel<false>();
+  }
 }
 
 // Returns true if S is in one of the many forms the compiler driver may pass
diff --git a/lld/ELF/OutputSections.h b/lld/ELF/OutputSections.h
index 78fede48a23f2..8c0c52f34ac9f 100644
--- a/lld/ELF/OutputSections.h
+++ b/lld/ELF/OutputSections.h
@@ -84,6 +84,11 @@ class OutputSection final : public SectionBase {
   Expr alignExpr;
   Expr lmaExpr;
   Expr subalignExpr;
+
+  // Used by non-alloc SHT_CREL to hold the header and content byte stream.
+  uint64_t crelHeader = 0;
+  SmallVector<char, 0> crelBody;
+
   SmallVector<SectionCommand *, 0> commands;
   SmallVector<StringRef, 0> phdrs;
   std::optional<std::array<uint8_t, 4>> filler;
@@ -106,6 +111,7 @@ class OutputSection final : public SectionBase {
   // DATA_RELRO_END.
   bool relro = false;
 
+  template <bool is64> void finalizeNonAllocCrel();
   void finalize();
   template <class ELFT>
   void writeTo(uint8_t *buf, llvm::parallel::TaskGroup &tg);
diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index 9a799cd286135..e19b1e6c8efb8 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -1441,10 +1441,11 @@ void RelocationScanner::scanOne(typename Relocs<RelTy>::const_iterator &i) {
   uint32_t symIndex = rel.getSymbol(config->isMips64EL);
   Symbol &sym = sec->getFile<ELFT>()->getSymbol(symIndex);
   RelType type;
-  if constexpr (ELFT::Is64Bits) {
+  if constexpr (ELFT::Is64Bits || RelTy::IsCrel) {
     type = rel.getType(config->isMips64EL);
     ++i;
   } else {
+    // CREL is unsupported for MIPS N32.
     if (config->mipsN32Abi) {
       type = getMipsN32RelType(i);
     } else {
@@ -1497,15 +1498,18 @@ void RelocationScanner::scanOne(typename Relocs<RelTy>::const_iterator &i) {
 
     if ((type == R_PPC64_TLSGD && expr == R_TLSDESC_CALL) ||
         (type == R_PPC64_TLSLD && expr == R_TLSLD_HINT)) {
-      if (i == end) {
-        errorOrWarn("R_PPC64_TLSGD/R_PPC64_TLSLD may not be the last "
-                    "relocation" +
-                    getLocation(*sec, sym, offset));
-        return;
+      // Skip the error check for CREL, which does not set `end`.
+      if constexpr (!RelTy::IsCrel) {
+        if (i == end) {
+          errorOrWarn("R_PPC64_TLSGD/R_PPC64_TLSLD may not be the last "
+                      "relocation" +
+                      getLocation(*sec, sym, offset));
+          return;
+        }
       }
 
-      // Offset the 4-byte aligned R_PPC64_TLSGD by one byte in the NOTOC case,
-      // so we can discern it later from the toc-case.
+      // Offset the 4-byte aligned R_PPC64_TLSGD by one byte in the NOTOC
+      // case, so we can discern it later from the toc-case.
       if (i->getType(/*isMips64EL=*/false) == R_PPC64_REL24_NOTOC)
         ++offset;
     }
@@ -1545,7 +1549,7 @@ void RelocationScanner::scanOne(typename Relocs<RelTy>::const_iterator &i) {
 // instructions are generated by very old IBM XL compilers. Work around the
 // issue by disabling GD/LD to IE/LE relaxation.
 template <class RelTy>
-static void checkPPC64TLSRelax(InputSectionBase &sec, ArrayRef<RelTy> rels) {
+static void checkPPC64TLSRelax(InputSectionBase &sec, Relocs<RelTy> rels) {
   // Skip if sec is synthetic (sec.file is null) or if sec has been marked.
   if (!sec.file || sec.file->ppc64DisableTLSRelax)
     return;
@@ -1593,9 +1597,15 @@ void RelocationScanner::scan(Relocs<RelTy> rels) {
   if (isa<EhInputSection>(sec) || config->emachine == EM_S390)
     rels = sortRels(rels, storage);
 
-  end = static_cast<const void *>(rels.end());
-  for (auto i = rels.begin(); i != end;)
-    scanOne<ELFT, RelTy>(i);
+  if constexpr (RelTy::IsCrel) {
+    for (auto i = rels.begin(); i != rels.end();)
+      scanOne<ELFT, RelTy>(i);
+  } else {
+    // The non-CREL code path has additional check for PPC64 TLS.
+    end = static_cast<const void *>(rels.end());
+    for (auto i = rels.begin(); i != end;)
+      scanOne<ELFT, RelTy>(i);
+  }
 
   // Sort relocations by offset for more efficient searching for
   // R_RISCV_PCREL_HI20 and R_PPC64_ADDR64.
@@ -1611,7 +1621,9 @@ template <class ELFT> void RelocationScanner::scanSection(InputSectionBase &s) {
   sec = &s;
   getter = OffsetGetter(s);
   const RelsOrRelas<ELFT> rels = s.template relsOrRelas<ELFT>();
-  if (rels.areRelocsRel())
+  if (rels.areRelocsCrel())
+    scan<ELFT>(rels.crels);
+  else if (rels.areRelocsRel())
     scan<ELFT>(rels.rels);
   else
     scan<ELFT>(rels.relas);
diff --git a/lld/ELF/Relocations.h b/lld/ELF/Relocations.h
index 77d8d52ca3d3f..aaa4581490a28 100644
--- a/lld/ELF/Relocations.h
+++ b/lld/ELF/Relocations.h
@@ -12,6 +12,7 @@
 #include "lld/Common/LLVM.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/Object/ELFTypes.h"
 #include <vector>
 
 namespace lld::elf {
@@ -205,11 +206,91 @@ class ThunkCreator {
   uint32_t pass = 0;
 };
 
+// Decode LEB128 without error checking. Only used by performance critical code
+// like RelocsCrel.
+inline uint64_t readLEB128(const uint8_t *&p, uint64_t leb) {
+  uint64_t acc = 0, shift = 0, byte;
+  do {
+    byte = *p++;
+    acc |= (byte - 128 * (byte >= leb)) << shift;
+    shift += 7;
+  } while (byte >= 128);
+  return acc;
+}
+inline uint64_t readULEB128(const uint8_t *&p) { return readLEB128(p, 128); }
+inline int64_t readSLEB128(const uint8_t *&p) { return readLEB128(p, 64); }
+
+// This class implements a CREL iterator that does not allocate extra memory.
+template <bool is64> struct RelocsCrel {
+  using uint = std::conditional_t<is64, uint64_t, uint32_t>;
+  struct const_iterator {
+    using iterator_category = std::forward_iterator_tag;
+    using value_type = llvm::object::Elf_Crel_Impl<is64>;
+    using difference_type = ptrdiff_t;
+    using pointer = value_type *;
+    using reference = const value_type &;
+    uint32_t count;
+    uint8_t flagBits, shift;
+    const uint8_t *p;
+    llvm::object::Elf_Crel_Impl<is64> crel{};
+    const_iterator(size_t hdr, const uint8_t *p)
+        : count(hdr / 8), flagBits(hdr & 4 ? 3 : 2), shift(hdr % 4), p(p) {
+      if (count)
+        step();
+    }
+    void step() {
+      // See object::decodeCrel.
+      const uint8_t b = *p++;
+      crel.r_offset += b >> flagBits << shift;
+      if (b >= 0x80)
+        crel.r_offset +=
+            ((readULEB128(p) << (7 - flagBits)) - (0x80 >> flagBits)) << shift;
+      if (b & 1)
+        crel.r_symidx += readSLEB128(p);
+      if (b & 2)
+        crel.r_type += readSLEB128(p);
+      if (b & 4 && flagBits == 3)
+        crel.r_addend += static_cast<uint>(readSLEB128(p));
+    }
+    llvm::object::Elf_Crel_Impl<is64> operator*() const { return crel; };
+    const llvm::object::Elf_Crel_Impl<is64> *operator->() const {
+      return &crel;
+    }
+    // For llvm::enumerate.
+    bool operator==(const const_iterator &r) const { return count == r.count; }
+    bool operator!=(const const_iterator &r) const { return count != r.count; }
+    const_iterator &operator++() {
+      if (--count)
+        step();
+      return *this;
+    }
+    // For RelocationScanner::scanOne.
+    void operator+=(size_t n) {
+      for (; n; --n)
+        operator++();
+    }
+  };
+
+  size_t hdr = 0;
+  const uint8_t *p = nullptr;
+
+  constexpr RelocsCrel() = default;
+  RelocsCrel(const uint8_t *p) : hdr(readULEB128(p)) { this->p = p; }
+  size_t size() const { return hdr / 8; }
+  const_iterator begin() const { return {hdr, p}; }
+  const_iterator end() const { return {0, nullptr}; }
+};
+
 template <class RelTy> struct Relocs : ArrayRef<RelTy> {
   Relocs() = default;
   Relocs(ArrayRef<RelTy> a) : ArrayRef<RelTy>(a) {}
 };
 
+template <bool is64>
+struct Relocs<llvm::object::Elf_Crel_Impl<is64>> : RelocsCrel<is64> {
+  using RelocsCrel<is64>::RelocsCrel;
+};
+
 // Return a int64_t to make sure we get the sign extension out of the way as
 // early as possible.
 template <class ELFT>
@@ -220,6 +301,10 @@ template <class ELFT>
 static inline int64_t getAddend(const typename ELFT::Rela &rel) {
   return rel.r_addend;
 }
+template <class ELFT>
+static inline int64_t getAddend(const typename ELFT::Crel &rel) {
+  return rel.r_addend;
+}
 
 template <typename RelTy>
 inline Relocs<RelTy> sortRels(Relocs<RelTy> rels,
@@ -235,6 +320,13 @@ inline Relocs<RelTy> sortRels(Relocs<RelTy> rels,
   return rels;
 }
 
+template <bool is64>
+inline Relocs<llvm::object::Elf_Crel_Impl<is64>>
+sortRels(Relocs<llvm::object::Elf_Crel_Impl<is64>> rels,
+         SmallVector<llvm::object::Elf_Crel_Impl<is64>, 0> &storage) {
+  return {};
+}
+
 // Returns true if Expr refers a GOT entry. Note that this function returns
 // false for TLS variables even though they need GOT, because TLS variables uses
 // GOT differently than the regular variables.
diff --git a/lld/ELF/SyntheticSections.cpp b/lld/ELF/SyntheticSections.cpp
index b40ff0bc3cb03..41053c6472751 100644
--- a/lld/ELF/SyntheticSections.cpp
+++ b/lld/ELF/SyntheticSections.cpp
@@ -455,7 +455,8 @@ template <class ELFT>
 void EhFrameSection::addSectionAux(EhInputSection *sec) {
   if (!sec->isLive())
     return;
-  const RelsOrRelas<ELFT> rels = sec->template relsOrRelas<ELFT>();
+  const RelsOrRelas<ELFT> rels =
+      sec->template relsOrRelas<ELFT>(/*supportsCrel=*/false);
   if (rels.areRelocsRel())
     addRecords<ELFT>(sec, rels.rels);
   else
@@ -489,7 +490,8 @@ void EhFrameSection::iterateFDEWithLSDA(
   DenseSet<size_t> ciesWithLSDA;
   for (EhInputSection *sec : sections) {
     ciesWithLSDA.clear();
-    const RelsOrRelas<ELFT> rels = sec->template relsOrRelas<ELFT>();
+    const RelsOrRelas<ELFT> rels =
+        sec->template relsOrRelas<ELFT>(/*supportsCrel=*/false);
     if (rels.areRelocsRel())
       iterateFDEWithLSDAAux<ELFT>(*sec, rels.rels, ciesWithLSDA, fn);
     else
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index 5cffdb771a738..8e3a746a08eb2 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -401,10 +401,19 @@ template <class ELFT> static void markUsedLocalSymbols() {
       InputSection *isec = dyn_cast_or_null<InputSection>(s);
       if (!isec)
         continue;
-      if (isec->type == SHT_REL)
+      if (isec->type == SHT_REL) {
         markUsedLocalSymbolsImpl(f, isec->getDataAs<typename ELFT::Rel>());
-      else if (isec->type == SHT_RELA)
+      } else if (isec->type == SHT_RELA) {
         markUsedLocalSymbolsImpl(f, isec->getDataAs<typename ELFT::Rela>());
+      } else if (isec->type == SHT_CREL) {
+        // The is64=true variant also works with ELF32 since only the r_symidx
+        // member is used.
+        for (Elf_Crel_Impl<true> r : RelocsCrel<true>(isec->content_)) {
+          Symbol &sym = file->getSymbol(r.r_symidx);
+          if (sym.isLocal())
+            sym.used = true;
+        }
+      }
     }
   }
 }
diff --git a/lld/test/ELF/crel-rel-mixed.s b/lld/test/ELF/crel-rel-mixed.s
new file mode 100644
index 0000000000000..a69fa1c09b436
--- /dev/null
+++ b/lld/test/ELF/crel-rel-mixed.s
@@ -0,0 +1,22 @@
+# REQUIRES: arm
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: llvm-mc -filetype=obj -triple=armv7a -crel a.s -o a.o
+# RUN: llvm-mc -filetype=obj -triple=armv7a b.s -o b.o
+# RUN: not ld.lld -r a.o b.o 2>&1 | FileCheck %s --check-prefix=ERR
+
+# ERR: error: b.o:(.rel.text): REL cannot be converted to CREL
+
+#--- a.s
+.global _start, foo
+_start:
+  bl foo
+  bl .text.foo
+
+.section .text.foo,"ax"
+foo:
+  nop
+
+#--- b.s
+.globl fb
+fb:
+  bl fb
diff --git a/lld/test/ELF/crel.s b/lld/test/ELF/crel.s
new file mode 100644
index 0000000000000..d7c87be9a5402
--- /dev/null
+++ b/lld/test/ELF/crel.s
@@ -0,0 +1,90 @@
+# REQUIRES: x86
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: llvm-mc -filetype=obj -triple=x86_64 -crel a.s -o a.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64 -crel b.s -o b.o
+# RUN: ld.lld -pie a.o b.o -o out
+# RUN: llvm-objdump -d out | FileCheck %s
+# RUN: llvm-readelf -Srs out | FileCheck %s --check-prefix=RELOC
+
+# CHECK:       <_start>:
+# CHECK-NEXT:    callq {{.*}} <foo>
+# CHECK-NEXT:    callq {{.*}} <foo>
+# CHECK-EMPTY:
+# CHECK-NEXT:  <foo>:
+# CHECK-NEXT:    leaq {{.*}}  # 0x27c
+# CHECK-NEXT:    leaq {{.*}}  # 0x278
+
+# RELOC:  .data             PROGBITS        {{0*}}[[#%x,DATA:]]
+
+# RELOC:  {{0*}}[[#DATA+8]]  0000000000000008 R_X86_64_RELATIVE [[#%x,DATA+0x8000000000000000]]
+
+# RUN: ld.lld -pie --emit-relocs a.o b.o -o out1
+# RUN: llvm-objdump -dr out1 | FileCheck %s --check-prefix=CHECKE
+# RUN: llvm-readelf -Sr out1 | FileCheck %s --check-prefix=RELOCE
+
+# CHECKE:       <_start>:
+# CHECKE-NEXT:    callq {{.*}} <foo>
+# CHECKE-NEXT:      R_X86_64_PLT32 foo-0x4
+# CHECKE-NEXT:    callq {{.*}} <foo>
+# CHECKE-NEXT:      R_X86_64_PLT32 .text+0x6
+# CHECKE-EMPTY:
+# CHECKE-NEXT:  <foo>:
+# CHECKE-NEXT:    leaq {{.*}}
+# CHECKE-NEXT:      R_X86_64_PC32 .L.str-0x4
+# CHECKE-NEXT:    leaq {{.*}}
+# CHECKE-NEXT:      R_X86_64_PC32 .L.str1-0x4
+
+# RELOCE:      .rodata             PROGBITS        {{0*}}[[#%x,RO:]]
+# RELOCE:      .eh_frame           PROGBITS        {{0*}}[[#%x,EHFRAME:]]
+# RELOCE:      .data               PROGBITS        {{0*}}[[#%x,DATA:]]
+
+# RELOCE:      Relocation section '.crel.data' at offset {{.*}} contains 2 entries:
+# RELOCE-NEXT:     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
+# RELOCE-NEXT: {{0*}}[[#DATA+8]] {{.*}}           R_X86_64_64            {{.*}}           .data - 8000000000000000
+# RELOCE-NEXT: {{0*}}[[#DATA+24]]{{.*}}           R_X86_64_64            {{.*}}           .data - 1
+# RELOCE:      Relocation section '.crel.eh_frame' at offset {{.*}} contains 2 entries:
+# RELOCE-NEXT:     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
+# RELOCE-NEXT: {{0*}}[[#EHFRAME+32]] {{.*}}       R_X86_64_PC32          {{.*}}           .text + 0
+# RELOCE-NEXT: {{0*}}[[#EHFRAME+52]] {{.*}}       R_X86_64_PC32          {{.*}}           .text + a
+# RELOCE:      Relocation section '.crel.rodata' at offset {{.*}} contains 4 entries:
+# RELOCE-NEXT:     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
+# RELOCE-NEXT: {{0*}}[[#RO+8]]   {{.*}}           R_X86_64_PC32          {{.*}}           foo + 0
+# RELOCE-NEXT: {{0*}}[[#RO+23]]  {{.*}}           R_X86_64_PC32          {{.*}}           foo + 3f
+# RELOCE-NEXT: {{0*}}[[#RO+39]]  {{.*}}           R_X86_64_PC64          {{.*}}           foo + 7f
+# RELOCE-NEXT: {{0*}}[[#RO+47]]  {{.*}}           R_X86_64_PC32          {{.*}}           _start - 1f81
+
+#--- a.s
+.global _start, foo
+_start:
+  .cfi_startproc # Test .eh_frame
+  call foo
+  call .text.foo
+  .cfi_endproc
+
+.section .text.foo,"ax"
+foo:
+  .cfi_startproc
+  leaq .L.str(%rip), %rsi
+  leaq .L.str1(%rip), %rsi
+  .cfi_endproc
+
+.section .rodata.str1.1,"aMS",@progbits,1
+.L.str:
+  .asciz  "abc"
+.L.str1:
+  .asciz  "def"
+
+.data
+.quad 0
+.quad .data - 0x8000000000000000
+.quad 0
+.quad .data - 1
+
+#--- b.s
+.section .rodata,"a"
+.long foo - .
+.space 15-4
+.long foo - . + 63  # offset+=15
+.space 16-4
+.quad foo - . + 127  # offset+=16
+.long _start - . - 8065
diff --git a/lld/test/ELF/debug-names.s b/lld/test/ELF/debug-names.s
index 888dd9007ed12..1bbb07b065e33 100644
--- a/lld/test/ELF/debug-names.s
+++ b/lld/test/ELF/debug-names.s
@@ -10,7 +10,7 @@
 
 # REQUIRES: x86
 # RUN: rm -rf %t && split-file %s %t && cd %t
-# RUN: llvm-mc -filetype=obj -triple=x86_64 a.s -o a.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64 --crel a.s -o a.o
 # RUN: llvm-mc -filetype=obj -triple=x86_64 b.s -o b.o
 
 # RUN: ld.lld --debug-names --no-debug-names a.o b.o -o out0
diff --git a/lld/test/ELF/gc-sections.s b/lld/test/ELF/gc-sections.s
index 94adc8210b4bc..31e00d495146a 100644
--- a/lld/test/ELF/gc-sections.s
+++ b/lld/test/ELF/gc-sections.s
@@ -8,6 +8,10 @@
 # RUN: ld.lld --export-dynamic --gc-sections %t -o %t2
 # RUN: llvm-readobj --sections --symbols %t2 | FileCheck -check-prefix=GC2 %s
 
+# RUN: llvm-mc -filetype=obj -triple=x86_64 --crel %s -o %t.o
+# RUN: ld.lld --gc-sections --print-gc-sections %t.o -o %t2 | FileCheck --check-prefix=GC1-DISCARD %s
+# RUN: llvm-readobj --sections --symbols %t2 | FileCheck -check-prefix=GC1 %s
+
 # NOGC: Name: .eh_frame
 # NOGC: Name: .text
 # NOGC: Name: .init
diff --git a/lld/test/ELF/icf1.s b/lld/test/ELF/icf1.s
index 5c6e667d53c78..9682b06f4606f 100644
--- a/lld/test/ELF/icf1.s
+++ b/lld/test/ELF/icf1.s
@@ -3,6 +3,9 @@
 # RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t
 # RUN: ld.lld %t -o /dev/null --icf=all --print-icf-sections | FileCheck %s
 
+# RUN: llvm-mc -filetype=obj -triple=x86_64 --crel %s -o %t
+# RUN: ld.lld %t -o /dev/null --icf=all --print-icf-sections | FileCheck %s
+
 # CHECK: selected section {{.*}}:(.text.f1)
 # CHECK:   removing identical section {{.*}}:(.text.f2)
 
diff --git a/lld/test/ELF/icf4.s b/lld/test/ELF/icf4.s
index ff13a7ebff3da..310577a55c0d8 100644
--- a/lld/test/ELF/icf4.s
+++ b/lld/test/ELF/icf4.s
@@ -1,6 +1,6 @@
 # REQUIRES: x86
 
-# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t
+# RUN: llvm-mc --crel -filetype=obj -triple=x86_64-unknown-linux %s -o %t
 # RUN: ld.lld %t -o /dev/null --icf=all --print-icf-sections | count 0
 
 .globl _start, f1, f2
diff --git a/lld/test/ELF/linkerscript/nocrossrefs.test b/lld/test/ELF/linkerscript/nocrossrefs.test
index f13d50a03be87..5eb56190fe63b 100644
--- a/lld/test/ELF/linkerscript/nocrossrefs.test
+++ b/lld/test/ELF/linkerscript/nocrossrefs.test
@@ -2,6 +2,7 @@
 # RUN: rm -rf %t && split-file %s %t && cd %t
 
 # RUN: llvm-mc --triple=x86_64 -filetype=obj a.s -o a.o
+# RUN: llvm-mc --triple=x86_64 -filetype=obj -crel a.s -o ac.o
 # RUN: llvm-mc --triple=x86_64 -filetype=obj data.s -o data.o
 # RUN: ld.lld a.o data.o -T 0.t 2>&1 | FileCheck %s --check-prefix=CHECK0 --implicit-check-not=warning:
 
@@ -9,7 +10,8 @@
 # CHECK0-NEXT: warning: 0.t:4: ignored with fewer than 2 output sections
 
 # RUN: not ld.lld a.o data.o -T 1.t 2>&1 | FileCheck %s --check-prefix=CHECK1 --implicit-check-not=error:
-# CHECK1:      error: a.o:(.text.start+0x11): prohibited cross reference from '.text' to 'data' in '.data'
+# RUN: not ld.lld ac.o data.o -T 1.t 2>&1 | FileCheck %s --check-prefix=CHECK1 --implicit-check-not=error:
+# CHECK1:      error: a{{.?}}.o:(.text.start+0x11): prohibited cross reference from '.text' to 'data' in '.data'
 
 ## .text and .text1 are in two NOCROSSREFS commands. Violations are reported twice.
 # RUN: not ld.lld --threads=1 a.o data.o -T 2.t 2>&1 | FileCheck %s --check-prefix=CHECK2 --implicit-check-not=error:
diff --git a/lld/test/ELF/relocatable-crel-32.s b/lld/test/ELF/relocatable-crel-32.s
new file mode 100644
index 0000000000000..8fbf236d77452
--- /dev/null
+++ b/lld/test/ELF/relocatable-crel-32.s
@@ -0,0 +1,71 @@
+# REQUIRES: ppc
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: llvm-mc -filetype=obj -triple=powerpc -crel a.s -o a.o
+# RUN: llvm-mc -filetype=obj -triple=powerpc -crel b.s -o b.o
+# RUN: ld.lld -r b.o a.o -o out
+# RUN: llvm-readobj -r out | FileCheck %s --check-prefixes=CHECK,CRELFOO
+
+# RUN: llvm-mc -filetype=obj -triple=powerpc a.s -o a1.o
+# RUN: ld.lld -r b.o a1.o -o out1
+# RUN: llvm-readobj -r out1 | FileCheck %s --check-prefixes=CHECK,RELAFOO
+# RUN: ld.lld -r a1.o b.o -o out2
+# RUN: llvm-readobj -r out2 | FileCheck %s --check-prefixes=CHECK2
+
+# CHECK:      Relocations [
+# CHECK-NEXT:   Section (2) .crel.text {
+# CHECK-NEXT:     0x0 R_PPC_REL24 fb 0x0
+# CHECK-NEXT:     0x4 R_PPC_REL24 foo 0x0
+# CHECK-NEXT:     0x8 R_PPC_REL24 .text.foo 0x0
+# CHECK-NEXT:     0xE R_PPC_ADDR16_HA .rodata.str1.1 0x4
+# CHECK-NEXT:     0x12 R_PPC_ADDR16_LO .rodata.str1.1 0x4
+# CHECK-NEXT:     0x16 R_PPC_ADDR16_HA .rodata.str1.1 0x0
+# CHECK-NEXT:     0x1A R_PPC_ADDR16_LO .rodata.str1.1 0x0
+# CHECK-NEXT:   }
+# CRELFOO-NEXT: Section (4) .crel.text.foo {
+# RELAFOO-NEXT: Section (4) .rela.text.foo {
+# CHECK-NEXT:     0x0 R_PPC_REL24 g 0x0
+# CHECK-NEXT:     0x4 R_PPC_REL24 g 0x0
+# CHECK-NEXT:   }
+# CHECK-NEXT: ]
+
+# CHECK2:      Relocations [
+# CHECK2-NEXT:   Section (2) .crel.text {
+# CHECK2-NEXT:     0x0 R_PPC_REL24 foo 0x0
+# CHECK2-NEXT:     0x4 R_PPC_REL24 .text.foo 0x0
+# CHECK2-NEXT:     0xA R_PPC_ADDR16_HA .rodata.str1.1 0x4
+# CHECK2-NEXT:     0xE R_PPC_ADDR16_LO .rodata.str1.1 0x4
+# CHECK2-NEXT:     0x12 R_PPC_ADDR16_HA .rodata.str1.1 0x0
+# CHECK2-NEXT:     0x16 R_PPC_ADDR16_LO .rodata.str1.1 0x0
+# CHECK2-NEXT:     0x18 R_PPC_REL24 fb 0x0
+# CHECK2-NEXT:   }
+# CHECK2-NEXT:   Section (4) .rela.text.foo {
+# CHECK2-NEXT:     0x0 R_PPC_REL24 g 0x0
+# CHECK2-NEXT:     0x4 R_PPC_REL24 g 0x0
+# CHECK2-NEXT:   }
+# CHECK2-NEXT: ]
+
+#--- a.s
+.global _start, foo
+_start:
+  bl foo
+  bl .text.foo
+  lis 3, .L.str@ha
+  la 3, .L.str@l(3)
+  lis 3, .L.str1@ha
+  la 3, .L.str1@l(3)
+
+.section .text.foo,"ax"
+foo:
+  bl g
+  bl g
+
+.section .rodata.str1.1,"aMS",@progbits,1
+.L.str:
+  .asciz  "abc"
+.L.str1:
+  .asciz  "def"
+
+#--- b.s
+.globl fb
+fb:
+  bl fb
diff --git a/lld/test/ELF/relocatable-crel.s b/lld/test/ELF/relocatable-crel.s
new file mode 100644
index 0000000000000..6e97c3e24d66c
--- /dev/null
+++ b/lld/test/ELF/relocatable-crel.s
@@ -0,0 +1,107 @@
+# REQUIRES: x86
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: llvm-mc -filetype=obj -triple=x86_64 -crel a.s -o a.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64 -crel b.s -o b.o
+# RUN: ld.lld -r b.o a.o -o out
+# RUN: llvm-readobj -r out | FileCheck %s --check-prefixes=CHECK,CRELFOO
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64 a.s -o a1.o
+# RUN: ld.lld -r b.o a1.o -o out1
+# RUN: llvm-readobj -r out1 | FileCheck %s --check-prefixes=CHECK,RELAFOO
+# RUN: ld.lld -r a1.o b.o -o out2
+# RUN: llvm-readobj -r out2 | FileCheck %s --check-prefixes=CHECK2
+
+# CHECK:      Relocations [
+# CHECK-NEXT:   .crel.text {
+# CHECK-NEXT:     0x1 R_X86_64_PLT32 fb 0xFFFFFFFFFFFFFFFC
+# CHECK-NEXT:     0x9 R_X86_64_PLT32 foo 0xFFFFFFFFFFFFFFFC
+# CHECK-NEXT:     0xE R_X86_64_PLT32 .text.foo 0xFFFFFFFFFFFFFFFC
+# CHECK-NEXT:   }
+# CHECK-NEXT:   .crel.rodata {
+# CHECK-NEXT:     0x0 R_X86_64_PC32 foo 0x0
+# CHECK-NEXT:     0xF R_X86_64_PC32 foo 0x3F
+# CHECK-NEXT:     0x1F R_X86_64_PC64 foo 0x7F
+# CHECK-NEXT:     0x27 R_X86_64_PC32 _start 0xFFFFFFFFFFFFE07F
+# CHECK-COUNT-12:      R_X86_64_32 _start 0x0
+# CHECK-NEXT:   }
+# CRELFOO-NEXT: .crel.text.foo {
+# RELAFOO-NEXT: .rela.text.foo {
+# CHECK-NEXT:     0x3 R_X86_64_PC32 .L.str 0xFFFFFFFFFFFFFFFC
+# CHECK-NEXT:     0xA R_X86_64_PC32 .L.str1 0xFFFFFFFFFFFFFFFC
+# CHECK-NEXT:     0xF R_X86_64_PLT32 g 0xFFFFFFFFFFFFFFFC
+# CHECK-NEXT:     0x14 R_X86_64_PLT32 g 0xFFFFFFFFFFFFFFFC
+# CHECK-NEXT:   }
+# CRELFOO-NEXT: .crel.data {
+# RELAFOO-NEXT: .rela.data {
+# CHECK-NEXT:     0x8 R_X86_64_64 _start 0x8000000000000000
+# CHECK-NEXT:     0x18 R_X86_64_64 _start 0xFFFFFFFFFFFFFFFF
+# CHECK-NEXT:   }
+# CHECK-NEXT: ]
+
+# CHECK2:      Relocations [
+# CHECK2-NEXT:   .crel.text {
+# CHECK2-NEXT:     0x1 R_X86_64_PLT32 foo 0xFFFFFFFFFFFFFFFC
+# CHECK2-NEXT:     0x6 R_X86_64_PLT32 .text.foo 0xFFFFFFFFFFFFFFFC
+# CHECK2-NEXT:     0xD R_X86_64_PLT32 fb 0xFFFFFFFFFFFFFFFC
+# CHECK2-NEXT:   }
+# CHECK2-NEXT:   .rela.text.foo {
+# CHECK2-NEXT:     0x3 R_X86_64_PC32 .L.str 0xFFFFFFFFFFFFFFFC
+# CHECK2-NEXT:     0xA R_X86_64_PC32 .L.str1 0xFFFFFFFFFFFFFFFC
+# CHECK2-NEXT:     0xF R_X86_64_PLT32 g 0xFFFFFFFFFFFFFFFC
+# CHECK2-NEXT:     0x14 R_X86_64_PLT32 g 0xFFFFFFFFFFFFFFFC
+# CHECK2-NEXT:   }
+# CHECK2-NEXT:   .rela.data {
+# CHECK2-NEXT:     0x8 R_X86_64_64 _start 0x8000000000000000
+# CHECK2-NEXT:     0x18 R_X86_64_64 _start 0xFFFFFFFFFFFFFFFF
+# CHECK2-NEXT:   }
+# CHECK2-NEXT:   .crel.rodata {
+# CHECK2-NEXT:     0x0 R_X86_64_PC32 foo 0x0
+# CHECK2-NEXT:     0xF R_X86_64_PC32 foo 0x3F
+# CHECK2-NEXT:     0x1F R_X86_64_PC64 foo 0x7F
+# CHECK2-NEXT:     0x27 R_X86_64_PC32 _start 0xFFFFFFFFFFFFE07F
+# CHECK2-COUNT-12:      R_X86_64_32 _start 0x0
+# CHECK2-NEXT:   }
+# CHECK2-NEXT: ]
+
+#--- a.s
+.global _start, foo
+_start:
+  call foo
+  call .text.foo
+
+.section .text.foo,"ax"
+foo:
+  leaq .L.str(%rip), %rsi
+  leaq .L.str1(%rip), %rsi
+  call g
+  call g
+
+.section .rodata.str1.1,"aMS",@progbits,1
+.L.str:
+  .asciz  "abc"
+.L.str1:
+  .asciz  "def"
+
+.data
+.quad 0
+.quad _start - 0x8000000000000000
+.quad 0
+.quad _start - 1
+
+#--- b.s
+.globl fb
+fb:
+  call fb
+
+.section .rodata,"a"
+.long foo - .
+.space 15-4
+.long foo - . + 63  # offset+=15
+.space 16-4
+.quad foo - . + 127  # offset+=16
+.long _start - . - 8065
+
+## Ensure .crel.rodata contains 16 relocations so that getULEB128Size(crelHeader) > 1.
+.rept 12
+.long _start
+.endr

>From b2eab3486499656ec6ef30ace5033f80d4d9dfc9 Mon Sep 17 00:00:00 2001
From: Nikolas Klauser <nikolasklauser at berlin.de>
Date: Fri, 2 Aug 2024 10:53:33 +0200
Subject: [PATCH 91/91] [Clang] Add a release note deprecating __is_nullptr

---
 clang/docs/ReleaseNotes.rst | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
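
Note for readers migrating away from ``__is_nullptr``: the replacement named in
the release note can be sketched as below. This is only an illustration, not
part of the patch. The names `is_nullptr_type` and `EXAMPLE_HAS_TRAIT_BUILTINS`
are hypothetical, and the standard-library fallback is an assumption added so
the snippet also builds on compilers that lack the `__is_same` or `__remove_cv`
builtins.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical feature macro: set only when the compiler reports both builtins.
#if defined(__has_builtin)
#if __has_builtin(__is_same) && __has_builtin(__remove_cv)
#define EXAMPLE_HAS_TRAIT_BUILTINS 1
#endif
#endif

// Hypothetical helper: true when T is std::nullptr_t, ignoring cv-qualifiers.
template <class T>
constexpr bool is_nullptr_type() {
#ifdef EXAMPLE_HAS_TRAIT_BUILTINS
  // Spelling suggested by the release note.
  return __is_same(__remove_cv(T), decltype(nullptr));
#else
  // Portable equivalent built from standard type traits.
  return std::is_same<typename std::remove_cv<T>::type, std::nullptr_t>::value;
#endif
}

static_assert(is_nullptr_type<decltype(nullptr)>(), "nullptr_t is detected");
static_assert(is_nullptr_type<const std::nullptr_t>(), "cv-qualifiers are ignored");
static_assert(!is_nullptr_type<int *>(), "ordinary pointers are not nullptr_t");
```

Code that does not need the builtin spelling can probably use
`std::is_null_pointer` (C++14) directly instead.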

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index b4ef1e9672a5d..c42cb9932f3f7 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -447,6 +447,10 @@ Non-comprehensive list of changes in this release
   type of the pointer was taken into account. This improves
   compatibility with GCC's libstdc++.
 
+- The type traits builtin ``__is_nullptr`` is deprecated in Clang 19 and will be
+  removed in Clang 20. ``__is_same(__remove_cv(T), decltype(nullptr))`` can be
+  used instead to check whether a type ``T`` is ``std::nullptr_t``.
+
 New Compiler Flags
 ------------------
 - ``-fsanitize=implicit-bitfield-conversion`` checks implicit truncation and
@@ -754,7 +758,7 @@ Improvements to Clang's diagnostics
 
 - Clang now diagnoses dangling assignments for pointer-like objects (annotated with `[[gsl::Pointer]]`) under `-Wdangling-assignment-gsl` (off by default)
   Fixes #GH63310.
-  
+
 - Clang now diagnoses uses of alias templates with a deprecated attribute. (Fixes #GH18236).
 
   .. code-block:: c++