[llvm] [Transforms] Introduce BuildBuiltins.h atomic helpers (PR #134455)

via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 7 05:56:42 PDT 2025


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-llvm-transforms

Author: Michael Kruse (Meinersbur)

<details>
<summary>Changes</summary>

Introduce BuildBuiltins.h atomic helpers and use them in OpenMPIRBuilder

The idea is for it to contain code generation functions to be used by frontends, akin to BuildLibCalls.h. While the BuildLibCalls functions each emit a call to a specific libcall function, the BuildBuiltins helpers can choose between multiple implementations. Three builtins for emitting atomic accesses have been implemented:
 * atomic load (`emitAtomicLoadBuiltin`)
 * atomic store (`emitAtomicStoreBuiltin`)
 * atomic compare & exchange (`emitAtomicCompareExchangeBuiltin`)

The functions in this file are intended to handle the complexity of builtins so frontends do not need to care about the details. A major difference between the cases is that the IR instructions take values directly as an llvm::Value, but the libcall functions almost always take pointers to those values. This abstraction passes everything as pointers. The caller is responsible for emitting a temporary AllocaInst and a store if it needs to pass an llvm::Value. Mem2Reg/SROA will easily remove any unnecessary store/load pairs. As a side effect, thanks to opaque pointers, no explicit cast is ever needed.
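
For illustration, here is a minimal sketch of how a frontend might use this pointer-based convention with `emitAtomicLoadBuiltin` (the call follows the signature in the new header below; the wrapper function and its names are made up for the example):

```cpp
#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/Transforms/Utils/BuildBuiltins.h"

using namespace llvm;

// Hypothetical frontend helper: atomically load an i32 from AtomicPtr and
// rematerialize it as an llvm::Value.
static Expected<Value *> loadAtomicI32(IRBuilderBase &Builder, Value *AtomicPtr,
                                       const DataLayout &DL,
                                       const TargetLibraryInfo *TLI) {
  Type *I32 = Builder.getInt32Ty();
  // The builtin helpers take pointers, so reserve a slot for the result
  // (ideally placed in the function's entry block).
  Value *RetSlot = Builder.CreateAlloca(I32, /*ArraySize=*/nullptr, "ret.slot");
  if (Error E = emitAtomicLoadBuiltin(
          AtomicPtr, RetSlot, /*TypeOrSize=*/I32, /*IsVolatile=*/false,
          AtomicOrdering::SequentiallyConsistent, SyncScope::System,
          /*Align=*/MaybeAlign(), Builder, AtomicEmitOptions(DL, TLI)))
    return std::move(E);
  // Mem2Reg/SROA will later fold this load together with the slot.
  return Builder.CreateLoad(I32, RetSlot, "atomic.val");
}
```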

This is meant to be a potential substitute for the implementations in Clang's CGAtomic, LLVM's AtomicExpandPass, LLVM's FrontendAtomic, LLVM's OpenMPIRBuilder's handling of atomics, as well as the lowering of atomics from MLIR and Flang (and any other frontend); potentially also LLVM's LowerAtomicPass and LLVM's NVPTXAtomicLower. So there is a lot of redundancy. Longer-term, it could be possible to introduce atomic intrinsics that work like atomic LoadInst, StoreInst, and AtomicCmpXchgInst but without their limitations. They would be lowered by an LLVM pass. This is rather invasive and a departure from how it is currently done. But when introduced, emitAtomicLoadBuiltin/emitAtomicStoreBuiltin/emitAtomicCompareExchangeBuiltin can just be updated to emit the intrinsic instead.

Also see the discussion at https://discourse.llvm.org/t/rfc-refactoring-cgatomic-into-llvmfrontend/80168/6

This PR is split off from #101966 to only introduce the helper functions, without the OpenMPIRBuilder changes (the changes in this PR just move code so it is accessible from OpenMPIRBuilder and BuildBuiltins). It is therefore not used by anything yet, but is self-contained with its own unittests. Since GitHub does not allow changing the source branch of #101966, and this will be a stacked PR, the helper functions had to become the new PR.

### Internal Details

In case of `emitAtomicCompareExchangeBuiltin`, it may emit either a single `cmpxchg` instruction, multiple `cmpxchg` instructions behind a switch to fulfil the constant-argument requirement (as sketched below), a call to a sized `__atomic_compare_exchange_n` libcall function, or a call to the generic `__atomic_compare_exchange` libcall function. Future enhancements may also add a fallback lowering using locks, which is necessary for platforms where no libcall equivalent is available (see https://discourse.llvm.org/t/atomics-on-windows/84329).
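
To sketch the switch-based variant: `cmpxchg` requires its ordering to be a constant, so when the memory order is only known at run time, one `cmpxchg` per ordering is emitted and dispatched over. Roughly (this is a simplified sketch, not the PR's actual implementation; the helper name is invented, the failure ordering is fixed to monotonic, and the run-time `Memorder` is assumed to be an `i32` with the C ABI encoding):

```cpp
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/Support/AtomicOrdering.h"

using namespace llvm;

// Emit one cmpxchg per C ABI memory order, selected by a switch on the
// run-time Memorder value. Returns the i1 success flag of the exchange.
static Value *emitCmpXchgMemorderSwitch(IRBuilderBase &B, Value *Ptr,
                                        Value *Expected, Value *Desired,
                                        Value *Memorder) {
  Function *F = B.GetInsertBlock()->getParent();
  LLVMContext &Ctx = B.getContext();
  BasicBlock *ContBB = BasicBlock::Create(Ctx, "cmpxchg.cont", F);

  // Emit a cmpxchg with a constant ordering and branch to the join block.
  auto EmitCase = [&](BasicBlock *BB, AtomicOrdering AO) -> Value * {
    B.SetInsertPoint(BB);
    Value *Pair = B.CreateAtomicCmpXchg(Ptr, Expected, Desired, MaybeAlign(),
                                        AO, AtomicOrdering::Monotonic);
    Value *Ok = B.CreateExtractValue(Pair, 1, "success");
    B.CreateBr(ContBB);
    return Ok;
  };

  // Unknown order values fall back to monotonic, mirroring CGAtomic.
  BasicBlock *DefaultBB = BasicBlock::Create(Ctx, "cmpxchg.monotonic", F);
  SwitchInst *SI = B.CreateSwitch(Memorder, DefaultBB, /*NumCases=*/6);
  SmallVector<std::pair<BasicBlock *, Value *>, 7> Incoming;
  Incoming.push_back(
      {DefaultBB, EmitCase(DefaultBB, AtomicOrdering::Monotonic)});

  const std::pair<AtomicOrderingCABI, AtomicOrdering> Cases[] = {
      {AtomicOrderingCABI::relaxed, AtomicOrdering::Monotonic},
      {AtomicOrderingCABI::consume, AtomicOrdering::Acquire},
      {AtomicOrderingCABI::acquire, AtomicOrdering::Acquire},
      {AtomicOrderingCABI::release, AtomicOrdering::Release},
      {AtomicOrderingCABI::acq_rel, AtomicOrdering::AcquireRelease},
      {AtomicOrderingCABI::seq_cst, AtomicOrdering::SequentiallyConsistent}};
  for (auto [ABI, AO] : Cases) {
    BasicBlock *BB = BasicBlock::Create(Ctx, "cmpxchg.case", F);
    SI->addCase(B.getInt32(static_cast<uint32_t>(ABI)), BB);
    Incoming.push_back({BB, EmitCase(BB, AO)});
  }

  // Join the per-ordering results.
  B.SetInsertPoint(ContBB);
  PHINode *Success = B.CreatePHI(B.getInt1Ty(), Incoming.size(), "success");
  for (auto &[BB, V] : Incoming)
    Success->addIncoming(V, BB);
  return Success;
}
```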

A lot of ABI information is represented redundantly by Clang as well as by similar LLVM objects. Without access to Clang's ABI information, one must at least provide access to `TargetLibraryInfo`. Clang previously only created these objects when invoking the LLVM pass pipeline, but not when only emitting LLVM-IR. To make the ABI information available, these objects need to be created earlier.

The different implementations of atomic lowering do not agree on the details. For instance, lowering of libcall functions is done in Clang with the help of `CodeGenModule::CreateRuntimeFunction`, in LLVM with the help of `TargetLibraryInfo`, and once more in LLVM using the `llvm::RTLIB` utilities. Unfortunately, they disagree on how to call libcall functions. For instance, Clang emits boolean return values as `i1`, but `TargetLibraryInfo` uses `i8`. `TargetLibraryInfo` adds function attributes to known libcall functions, while CGAtomic only uses the same generic properties for all libcall functions. `TargetLibraryInfo` has a list of libcall functions supported by a target, but `RTLIB` does not. While `RTLIB` is more Machine-IR-centric, `AtomicExpandPass` still uses it and is happy to emit libcall functions that `TargetLibraryInfo` thinks are not supported. `AtomicExpandPass` can lower to sized `__atomic_compare_exchange_n` functions; CGAtomic does not; etc. Clang thinks fp80 "long double" has 12 bytes, while for `AtomicCompareExchange` it is 10.

---

Patch is 309.41 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/134455.diff


13 Files Affected:

- (modified) llvm/include/llvm/Analysis/TargetLibraryInfo.def (+80) 
- (modified) llvm/include/llvm/Support/AtomicOrdering.h (+22) 
- (modified) llvm/include/llvm/Testing/Support/Error.h (+49) 
- (added) llvm/include/llvm/Transforms/Utils/BuildBuiltins.h (+278) 
- (modified) llvm/include/llvm/Transforms/Utils/BuildLibCalls.h (+47) 
- (modified) llvm/lib/Analysis/TargetLibraryInfo.cpp (+21) 
- (added) llvm/lib/Transforms/Utils/BuildBuiltins.cpp (+913) 
- (modified) llvm/lib/Transforms/Utils/BuildLibCalls.cpp (+215-1) 
- (modified) llvm/lib/Transforms/Utils/CMakeLists.txt (+1) 
- (modified) llvm/test/tools/llvm-tli-checker/ps4-tli-check.yaml (+2-2) 
- (modified) llvm/unittests/Analysis/TargetLibraryInfoTest.cpp (+18) 
- (added) llvm/unittests/Transforms/Utils/BuildBuiltinsTest.cpp (+4462) 
- (modified) llvm/unittests/Transforms/Utils/CMakeLists.txt (+2) 


``````````diff
diff --git a/llvm/include/llvm/Analysis/TargetLibraryInfo.def b/llvm/include/llvm/Analysis/TargetLibraryInfo.def
index db566b8ee610e..53fb11aff8a44 100644
--- a/llvm/include/llvm/Analysis/TargetLibraryInfo.def
+++ b/llvm/include/llvm/Analysis/TargetLibraryInfo.def
@@ -462,11 +462,91 @@ TLI_DEFINE_ENUM_INTERNAL(atomic_load)
 TLI_DEFINE_STRING_INTERNAL("__atomic_load")
 TLI_DEFINE_SIG_INTERNAL(Void, SizeT, Ptr, Ptr, Int)
 
+/// int8_t __atomic_load_1(void *ptr, int memorder);
+TLI_DEFINE_ENUM_INTERNAL(atomic_load_1)
+TLI_DEFINE_STRING_INTERNAL("__atomic_load_1")
+TLI_DEFINE_SIG_INTERNAL(Int8, Ptr, Int)
+
+/// int16_t __atomic_load_2(void *ptr, int memorder);
+TLI_DEFINE_ENUM_INTERNAL(atomic_load_2)
+TLI_DEFINE_STRING_INTERNAL("__atomic_load_2")
+TLI_DEFINE_SIG_INTERNAL(Int16, Ptr, Int)
+
+/// int32_t __atomic_load_4(void *ptr, int memorder);
+TLI_DEFINE_ENUM_INTERNAL(atomic_load_4)
+TLI_DEFINE_STRING_INTERNAL("__atomic_load_4")
+TLI_DEFINE_SIG_INTERNAL(Int32, Ptr, Int)
+
+/// int64_t __atomic_load_8(void *ptr, int memorder);
+TLI_DEFINE_ENUM_INTERNAL(atomic_load_8)
+TLI_DEFINE_STRING_INTERNAL("__atomic_load_8")
+TLI_DEFINE_SIG_INTERNAL(Int64, Ptr, Int)
+
+/// int128_t __atomic_load_16(void *ptr, int memorder);
+TLI_DEFINE_ENUM_INTERNAL(atomic_load_16)
+TLI_DEFINE_STRING_INTERNAL("__atomic_load_16")
+TLI_DEFINE_SIG_INTERNAL(Int128, Ptr, Int)
+
 /// void __atomic_store(size_t size, void *mptr, void *vptr, int smodel);
 TLI_DEFINE_ENUM_INTERNAL(atomic_store)
 TLI_DEFINE_STRING_INTERNAL("__atomic_store")
 TLI_DEFINE_SIG_INTERNAL(Void, SizeT, Ptr, Ptr, Int)
 
+/// void __atomic_store_1(void *ptr, int8_t val, int smodel);
+TLI_DEFINE_ENUM_INTERNAL(atomic_store_1)
+TLI_DEFINE_STRING_INTERNAL("__atomic_store_1")
+TLI_DEFINE_SIG_INTERNAL(Void, Ptr, Int8, Int)
+
+/// void __atomic_store_2(void *ptr, int16_t val, int smodel);
+TLI_DEFINE_ENUM_INTERNAL(atomic_store_2)
+TLI_DEFINE_STRING_INTERNAL("__atomic_store_2")
+TLI_DEFINE_SIG_INTERNAL(Void, Ptr, Int16, Int)
+
+/// void __atomic_store_4(void *ptr, int32_t val, int smodel);
+TLI_DEFINE_ENUM_INTERNAL(atomic_store_4)
+TLI_DEFINE_STRING_INTERNAL("__atomic_store_4")
+TLI_DEFINE_SIG_INTERNAL(Void, Ptr, Int32, Int)
+
+/// void __atomic_store_8(void *ptr, int64_t val, int smodel);
+TLI_DEFINE_ENUM_INTERNAL(atomic_store_8)
+TLI_DEFINE_STRING_INTERNAL("__atomic_store_8")
+TLI_DEFINE_SIG_INTERNAL(Void, Ptr, Int64, Int)
+
+/// void __atomic_store_16(void *ptr, int128_t val, int smodel);
+TLI_DEFINE_ENUM_INTERNAL(atomic_store_16)
+TLI_DEFINE_STRING_INTERNAL("__atomic_store_16")
+TLI_DEFINE_SIG_INTERNAL(Void, Ptr, Int128, Int)
+
+/// bool __atomic_compare_exchange(size_t size, void *ptr, void *expected, void *desired, int success, int failure);
+TLI_DEFINE_ENUM_INTERNAL(atomic_compare_exchange)
+TLI_DEFINE_STRING_INTERNAL("__atomic_compare_exchange")
+TLI_DEFINE_SIG_INTERNAL(Bool, SizeT, Ptr, Ptr, Ptr, Int, Int)
+
+/// bool __atomic_compare_exchange_1(void *ptr, void *expected, uint8_t desired, int success, int failure);
+TLI_DEFINE_ENUM_INTERNAL(atomic_compare_exchange_1)
+TLI_DEFINE_STRING_INTERNAL("__atomic_compare_exchange_1")
+TLI_DEFINE_SIG_INTERNAL(Bool, Ptr, Ptr, Int8, Int, Int)
+
+/// bool __atomic_compare_exchange_2(void *ptr, void *expected, uint16_t desired, int success, int failure);
+TLI_DEFINE_ENUM_INTERNAL(atomic_compare_exchange_2)
+TLI_DEFINE_STRING_INTERNAL("__atomic_compare_exchange_2")
+TLI_DEFINE_SIG_INTERNAL(Bool, Ptr, Ptr, Int16, Int, Int)
+
+/// bool __atomic_compare_exchange_4(void *ptr, void *expected, uint32_t desired, int success, int failure);
+TLI_DEFINE_ENUM_INTERNAL(atomic_compare_exchange_4)
+TLI_DEFINE_STRING_INTERNAL("__atomic_compare_exchange_4")
+TLI_DEFINE_SIG_INTERNAL(Bool, Ptr, Ptr, Int32, Int, Int)
+
+/// bool __atomic_compare_exchange_8(void *ptr, void *expected, uint64_t desired, int success, int failure);
+TLI_DEFINE_ENUM_INTERNAL(atomic_compare_exchange_8)
+TLI_DEFINE_STRING_INTERNAL("__atomic_compare_exchange_8")
+TLI_DEFINE_SIG_INTERNAL(Bool, Ptr, Ptr, Int64, Int, Int)
+
+/// bool __atomic_compare_exchange_16(void *ptr, void *expected, uint128_t desired, int success, int failure);
+TLI_DEFINE_ENUM_INTERNAL(atomic_compare_exchange_16)
+TLI_DEFINE_STRING_INTERNAL("__atomic_compare_exchange_16")
+TLI_DEFINE_SIG_INTERNAL(Bool, Ptr, Ptr, Int128, Int, Int)
+
 /// double __cosh_finite(double x);
 TLI_DEFINE_ENUM_INTERNAL(cosh_finite)
 TLI_DEFINE_STRING_INTERNAL("__cosh_finite")
diff --git a/llvm/include/llvm/Support/AtomicOrdering.h b/llvm/include/llvm/Support/AtomicOrdering.h
index e08c1b262a92b..010bc06bb8570 100644
--- a/llvm/include/llvm/Support/AtomicOrdering.h
+++ b/llvm/include/llvm/Support/AtomicOrdering.h
@@ -158,6 +158,28 @@ inline AtomicOrderingCABI toCABI(AtomicOrdering AO) {
   return lookup[static_cast<size_t>(AO)];
 }
 
+inline AtomicOrdering fromCABI(AtomicOrderingCABI AO) {
+  // Acquire is the closest ordering that is still stronger than consume.
+  static const AtomicOrdering lookup[6] = {
+      /* relaxed */ AtomicOrdering::Monotonic,
+      /* consume */ AtomicOrdering::Acquire,
+      /* acquire */ AtomicOrdering::Acquire,
+      /* release */ AtomicOrdering::Release,
+      /* acq_rel */ AtomicOrdering::AcquireRelease,
+      /* seq_cst */ AtomicOrdering::SequentiallyConsistent,
+  };
+  return lookup[static_cast<size_t>(AO)];
+}
+
+inline AtomicOrdering fromCABI(int64_t AO) {
+  if (!isValidAtomicOrderingCABI(AO)) {
+    // This fallback matches what CGAtomic does.
+    return AtomicOrdering::Monotonic;
+  }
+  return fromCABI(static_cast<AtomicOrderingCABI>(AO));
+}
+
 } // end namespace llvm
 
 #endif // LLVM_SUPPORT_ATOMICORDERING_H
diff --git a/llvm/include/llvm/Testing/Support/Error.h b/llvm/include/llvm/Testing/Support/Error.h
index 5ed8f11e6189b..05a50a74c06ec 100644
--- a/llvm/include/llvm/Testing/Support/Error.h
+++ b/llvm/include/llvm/Testing/Support/Error.h
@@ -80,6 +80,48 @@ class ValueMatchesPoly {
   M Matcher;
 };
 
+template <typename RefT> class StoreResultMatcher {
+  class Impl : public testing::MatcherInterface<
+                   const llvm::detail::ExpectedHolder<RefT> &> {
+  public:
+    explicit Impl(RefT &Ref) : Ref(Ref) {}
+
+    bool
+    MatchAndExplain(const llvm::detail::ExpectedHolder<RefT> &Holder,
+                    testing::MatchResultListener *listener) const override {
+      // If failed to get a value, fail the ASSERT/EXPECT and do not store any
+      // value
+      if (!Holder.Success())
+        return false;
+
+      // Succeeded with a value, remember it
+      Ref = *Holder.Exp;
+
+      return true;
+    }
+
+    void DescribeTo(std::ostream *OS) const override { *OS << "succeeded"; }
+
+    void DescribeNegationTo(std::ostream *OS) const override {
+      *OS << "failed";
+    }
+
+  private:
+    RefT &Ref;
+  };
+
+public:
+  explicit StoreResultMatcher(RefT &Ref) : Ref(Ref) {}
+
+  template <typename T>
+  operator testing::Matcher<const llvm::detail::ExpectedHolder<T> &>() const {
+    return MakeMatcher(new Impl(Ref));
+  }
+
+private:
+  RefT &Ref;
+};
+
 template <typename InfoT>
 class ErrorMatchesMono : public testing::MatcherInterface<const ErrorHolder &> {
 public:
@@ -222,6 +264,13 @@ detail::ValueMatchesPoly<M> HasValue(M Matcher) {
   return detail::ValueMatchesPoly<M>(Matcher);
 }
 
+/// Matches on Expected<T> values that succeed, but also stores its value into a
+/// variable.
+template <typename RefT>
+detail::StoreResultMatcher<RefT> StoreResult(RefT &Ref) {
+  return detail::StoreResultMatcher<RefT>(Ref);
+}
+
 } // namespace llvm
 
 #endif
diff --git a/llvm/include/llvm/Transforms/Utils/BuildBuiltins.h b/llvm/include/llvm/Transforms/Utils/BuildBuiltins.h
new file mode 100644
index 0000000000000..65765adc297ea
--- /dev/null
+++ b/llvm/include/llvm/Transforms/Utils/BuildBuiltins.h
@@ -0,0 +1,278 @@
+//===- BuildBuiltins.h - Utility builder for builtins ---------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements some functions for lowering compiler builtins,
+// specifically for atomics. Currently, LLVM-IR has no representation of atomics
+// that can be used independently of their arguments:
+//
+// * The instructions load atomic, store atomic, atomicrmw, and cmpxchg can
+//   only be used with a constant memory model, sync scope, data size (which
+//   must be a power of 2), and volatile/weak property, and should not be used
+//   with atypically large data types, which may slow down the compiler.
+//
+// * libcall (in GCC's case: libatomic; LLVM: Compiler-RT) functions work with
+//   any data size, but are slower. Specialized functions for a selected number
+//   of data sizes exist as well. They do not support sync scopes or the
+//   volatile and weak properties. These functions may be implemented using a
+//   lock, and their availability depends on the target triple (e.g. GPU
+//   devices cannot implement a global lock by design).
+//
+// We want to mimic Clang's behaviour:
+//
+// * Prefer atomic instructions over libcall functions whenever possible. When a
+//   target backend does not support atomic instructions natively,
+//   AtomicExpandPass, LowerAtomicPass, or some backend-specific lowering pass
+//   will convert such instructions to a libcall function call. The reverse is
+//   not the case, i.e. once a libcall function is emitted, there is no pass
+//   that optimizes it into an instruction.
+//
+// * When passed a non-constant enum argument that the instruction requires to
+//   be constant, emit a switch with one case per enum value.
+//
+// Clang currently doesn't check whether the target actually supports the
+// atomic libcall functions, so it will always fall back to a libcall function
+// even if the target does not support it. That is, emitting an atomic builtin
+// may fail and a frontend needs to handle this case.
+//
+// Clang also assumes that the maximum supported data size of atomic
+// instructions is 16, even though this is target-dependent and should be
+// queried using TargetLowering::getMaxAtomicSizeInBitsSupported(). However,
+// TargetMachine (which is a factory for TargetLowering) is not available
+// during Clang's CodeGen phase; it is only created for the LLVM pass pipeline.
+//
+// The functions in this file are intended to handle the complexity of builtins
+// so frontends do not need to care about the details. A major difference
+// between the cases is that the IR instructions take values directly as an
+// llvm::Value (except the atomic address, of course), but the libcall functions
+// almost always take pointers to those values. Since we cannot assume that
+// everything can be passed as an llvm::Value (LLVM does not handle large types
+// such as i4096 well), our abstraction passes everything as pointers, which are
+// loaded when needed. The caller is responsible for emitting a temporary
+// AllocaInst and a store if it needs to pass an llvm::Value. Mem2Reg/SROA will
+// easily remove any unnecessary store/load pairs.
+//
+// In the future, LLVM may introduce more generic atomic constructs that are
+// lowered by an LLVM pass, such as AtomicExpandPass. Once these exist, the
+// emitBuiltin functions in this file become trivial.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TRANSFORMS_UTILS_BUILDBUILTINS_H
+#define LLVM_TRANSFORMS_UTILS_BUILDBUILTINS_H
+
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/Twine.h"
+#include "llvm/Support/Alignment.h"
+#include "llvm/Support/AtomicOrdering.h"
+#include "llvm/Support/Error.h"
+#include <cstdint>
+#include <variant>
+
+namespace llvm {
+class Value;
+class TargetLibraryInfo;
+class DataLayout;
+class IRBuilderBase;
+class Type;
+class TargetLowering;
+
+namespace SyncScope {
+typedef uint8_t ID;
+}
+
+/// Options for controlling atomic builtins.
+struct AtomicEmitOptions {
+  AtomicEmitOptions(const DataLayout &DL, const TargetLibraryInfo *TLI,
+                    const TargetLowering *TL = nullptr)
+      : DL(DL), TLI(TLI), TL(TL) {}
+
+  /// The target's data layout.
+  const DataLayout &DL;
+
+  /// The target's libcall library availability.
+  const TargetLibraryInfo *TLI;
+
+  /// Used to determine which instructions the target supports. If omitted,
+  /// assumes all accesses up to a size of 16 bytes are supported.
+  const TargetLowering *TL = nullptr;
+
+  /// Whether an LLVM instruction can be emitted. LLVM instructions include:
+  ///  * load atomic
+  ///  * store atomic
+  ///  * cmpxchg
+  ///  * atomicrmw
+  ///
+  /// Atomic LLVM instructions have several restrictions on when they can be
+  /// used, including:
+  ///  * Properties such as IsWeak, Memorder, and Scope must be constant.
+  ///  * Must be an integer or pointer type. Some cases also allow float types.
+  ///  * Size must be a power-of-two number of bytes.
+  ///  * Size must be at most the size of atomics supported by the target.
+  ///  * Size should not be too large (e.g. i4096) since LLVM does not scale
+  ///    well with huge types.
+  ///
+  /// Even with all these limitations adhered to, AtomicExpandPass may still
+  /// lower the instruction to a libcall function if the target does not support
+  /// it.
+  ///
+  /// See also:
+  ///  * https://llvm.org/docs/Atomics.html
+  ///  * https://llvm.org/docs/LangRef.html#i-load
+  ///  * https://llvm.org/docs/LangRef.html#i-store
+  ///  * https://llvm.org/docs/LangRef.html#cmpxchg-instruction
+  ///  * https://llvm.org/docs/LangRef.html#i-atomicrmw
+  bool AllowInstruction = true;
+
+  /// Whether a switch can be emitted to work around the requirement that
+  /// properties of an instruction must be constant. That is, for each possible
+  /// value of the property, jump to a version of that instruction encoding that
+  /// property.
+  bool AllowSwitch = true;
+
+  /// Allow emitting calls to constant-sized libcall functions, such as
+  ///  * __atomic_load_n
+  ///  * __atomic_store_n
+  ///  * __atomic_compare_exchange_n
+  ///
+  /// where n is a size supported by the target, typically 1, 2, 4, 8, or 16.
+  ///
+  /// See also:
+  ///  * https://llvm.org/docs/Atomics.html
+  ///  * https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary#GCC_intrinsics
+  bool AllowSizedLibcall = true;
+
+  /// Allow emitting calls to variable-sized libcall functions, such as
+  ///  * __atomic_load
+  ///  * __atomic_store
+  ///  * __atomic_compare_exchange
+  ///
+  /// Note that the signatures of these libcall functions are different from the
+  /// compiler builtins of the same name.
+  ///
+  /// See also:
+  ///  * https://llvm.org/docs/Atomics.html
+  ///  * https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary#GCC_intrinsics
+  bool AllowLibcall = true;
+
+  // TODO: Add additional lowerings:
+  //  * __sync_* libcalls
+  //  * Differently named atomic primitives
+  //    (e.g. InterlockedCompareExchange, C11 primitives on Windows)
+  //  * Using a lock implementation as last resort
+};
+
+/// Emit the __atomic_load builtin. This may either be lowered to the load LLVM
+/// instruction, or to one of the following libcall functions: __atomic_load_1,
+/// __atomic_load_2, __atomic_load_4, __atomic_load_8, __atomic_load_16,
+/// __atomic_load.
+///
+/// Also see:
+/// * https://llvm.org/docs/Atomics.html
+/// * https://llvm.org/docs/LangRef.html#load-instruction
+/// * https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
+/// * https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary#GCC_intrinsics
+Error emitAtomicLoadBuiltin(
+    Value *AtomicPtr, Value *RetPtr, std::variant<Type *, uint64_t> TypeOrSize,
+    bool IsVolatile,
+    std::variant<Value *, AtomicOrdering, AtomicOrderingCABI> Memorder,
+    SyncScope::ID Scope, MaybeAlign Align, IRBuilderBase &Builder,
+    AtomicEmitOptions EmitOptions, const Twine &Name = Twine());
+
+/// Emit the __atomic_store builtin. It may either be lowered to the store LLVM
+/// instruction, or to one of the following libcall functions: __atomic_store_1,
+/// __atomic_store_2, __atomic_store_4, __atomic_store_8, __atomic_store_16,
+/// __atomic_store.
+///
+/// Also see:
+/// * https://llvm.org/docs/Atomics.html
+/// * https://llvm.org/docs/LangRef.html#store-instruction
+/// * https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
+/// * https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary#GCC_intrinsics
+Error emitAtomicStoreBuiltin(
+    Value *AtomicPtr, Value *ValPtr, std::variant<Type *, uint64_t> TypeOrSize,
+    bool IsVolatile,
+    std::variant<Value *, AtomicOrdering, AtomicOrderingCABI> Memorder,
+    SyncScope::ID Scope, MaybeAlign Align, IRBuilderBase &Builder,
+    AtomicEmitOptions EmitOptions, const Twine &Name = Twine());
+
+/// Emit the __atomic_compare_exchange builtin. This may either be
+/// lowered to the cmpxchg LLVM instruction, or to one of the following libcall
+/// functions: __atomic_compare_exchange_1, __atomic_compare_exchange_2,
+/// __atomic_compare_exchange_4, __atomic_compare_exchange_8,
+/// __atomic_compare_exchange_16, __atomic_compare_exchange.
+///
+/// Also see:
+///  * https://llvm.org/docs/Atomics.html
+///  * https://llvm.org/docs/LangRef.html#cmpxchg-instruction
+///  * https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
+///  * https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary#GCC_intrinsics
+///
+/// @param AtomicPtr   The memory location accessed atomically.
+/// @param ExpectedPtr Pointer to the data expected at \p Ptr. The exchange will
+///                    only happen if the value at \p Ptr is equal to this
+///                    (unless IsWeak is set). Data at \p ExpectedPtr may or may
+///                    not be overwritten, so do not use it after this call.
+/// @param DesiredPtr  Pointer to the data that the data at \p Ptr is replaced
+///                    with.
+/// @param TypeOrSize  Type of the value to be accessed. cmpxchg
+///                    supports integers and pointers only, other atomics also
+///                    support floats. If any other type is given, or the type
+///                    is omitted, the access is type-punned to an integer that
+///                    holds at least \p DataSize bytes. Alternatively, the
+///                    number of bytes can be specified, in which case an
+///                    integer is also used.
+/// @param IsWeak      If true, the exchange may not happen even if the data at
+///                    \p Ptr equals the data at \p ExpectedPtr.
+/// @param IsVolatile  Whether to mark the access as volatile.
+/// @param SuccessMemorder If the exchange succeeds, memory is affected
+///                    according to the memory model.
+/// @param FailureMemorder If the exchange fails, memory is affected according
+///                    to the memory model. It is considered an atomic "read"
+///                    for the purpose of identifying release sequences. Must
+///                    not be release or acquire-release, and must be at most as
+///                    strong as \p SuccessMemorder.
+/// @param Scope       (optional) The synchronization scope (domain of threads
+///                    where this access has to be atomic, e.g. CUDA
+///                    warp/block/grid-level atomics) of this access. Defaults
+///                    to system scope.
+/// @param ActualPtr   (optional) Receives the value at \p Ptr before the atomic
+///                    exchange is attempted. This means:
+///                    In case of success:
+///                      The value at \p Ptr before the update. That is, the
+///                      value passed behind \p ExpectedPtr.
+///                    In case of failure
+///                    (including spurious failures if IsWeak):
+///                      The current value at \p Ptr, i.e. the operation
+///                      effectively was an atomic load of that value using
+///                      FailureMemorder semantics.
+///                    Can be the same as ExpectedPtr in which case after the
+///                    call returns \p ExpectedPtr/\p ActualPtr will be the
+///                    value as defined above (in contrast to being undefined).
+/// @param Align       (optional) Known alignment of \p Ptr. If omitted,
+///                    alignment is inferred from \p Ptr itself and falls back
+///                    to no...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/134455


More information about the llvm-commits mailing list