[llvm] [Transforms] Introduce BuildBuiltins.h atomic helpers (PR #134455)

Sun Apr 13 03:10:57 PDT 2025

================
@@ -0,0 +1,278 @@
+//===- BuildBuiltins.h - Utility builder for builtins ---------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements some functions for lowering compiler builtins,
+// specifically for atomics. Currently, LLVM-IR has no representation of atomics
+// that can be used independent of its arguments:
+//
+// * The instructions load atomic, store atomic, atomicrmw, and cmpxchg can only
+//   be used with constant memory model, sync scope, data sizes (that must be
+//   power-of-2), volatile and weak property, and should not be used with data
+//   types that are atypically large, which may slow down the compiler.
+//
+// * libcall (in GCC's case: libatomic; LLVM: Compiler-RT) functions work with
+//   any data size, but are slower. Specialized functions for a selected number
+//   of data sizes exist as well. They do not support sync scopes, the volatile
+//   or weakness property. These functions may be implemented using a lock and
+//   availability depends on the target triple (e.g. GPU devices cannot
+//   implement a global lock by design).
+//
+// We want to mimic Clang's behaviour:
+//
+// * Prefer atomic instructions over libcall functions whenever possible. When a
+//   target backend does not support atomic instructions natively,
+//   AtomicExpandPass, LowerAtomicPass, or some backend-specific lowering pass
+//   will convert such instructions to a libcall function call. The reverse is
+//   not the case, i.e. once a libcall function is emitted, there is no pass
+//   that optimizes it into an instruction.
+//
+// * When passed a non-constant enum argument which the instruction requires to
+//   be constant, then emit a switch case for each enum case.
+//
+// Clang currently doesn't check whether the target actually supports atomic
+// libcall functions, so it will always fall back to a libcall function even
+// if the target does not support it. That is, emitting an atomic builtin may
+// fail and a frontend needs to handle this case.
+//
+// Clang also assumes that the maximum supported data size of atomic
+// instructions is 16 bytes, even though this is target-dependent and should be
+// queried using TargetLowering::getMaxAtomicSizeInBitsSupported(). However,
+// TargetMachine (which is a factory for TargetLowering) is not available during
+// Clang's CodeGen phase; it is only created for the LLVM pass pipeline.
+//
+// The functions in this file are intended to handle the complexity of builtins
+// so frontends do not need to care about the details. A major difference
+// between the cases is that the IR instructions take values directly as an
+// llvm::Value (except the atomic address of course), but the libcall functions
+// almost always take pointers to those values. Since we cannot assume that
+// everything can be passed as an llvm::Value (LLVM does not handle large types
+// such as i4096 well), our abstraction passes everything as a pointer which is
+// loaded when needed. The caller is responsible for emitting a temporary
+// AllocaInst and store if it needs to pass an llvm::Value. Mem2Reg/SROA will
+// easily remove any unnecessary store/load pairs.
+//
+// In the future LLVM may introduce more generic atomic constructs that are
+// lowered by an LLVM pass, such as AtomicExpandPass. Once these exist, the
+// emitBuiltin functions in this file become trivial.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TRANSFORMS_UTILS_BUILDBUILTINS_H
+#define LLVM_TRANSFORMS_UTILS_BUILDBUILTINS_H
+
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/Twine.h"
+#include "llvm/Support/Alignment.h"
+#include "llvm/Support/AtomicOrdering.h"
+#include "llvm/Support/Error.h"
+#include <cstdint>
+#include <variant>
+
+namespace llvm {
+class Value;
+class TargetLibraryInfo;
+class DataLayout;
+class IRBuilderBase;
+class Type;
+class TargetLowering;
+
+namespace SyncScope {
+typedef uint8_t ID;
+}
+
+/// Options for controlling atomic builtins.
+struct AtomicEmitOptions {
+  AtomicEmitOptions(const DataLayout &DL, const TargetLibraryInfo *TLI,
+                    const TargetLowering *TL = nullptr)
+      : DL(DL), TLI(TLI), TL(TL) {}
+
+  /// The target's data layout.
+  const DataLayout &DL;
+
+  /// The target's libcall library availability.
+  const TargetLibraryInfo *TLI;
+
+  /// Used to determine which instructions the target supports. If omitted,
+  /// assumes all accesses up to a size of 16 bytes are supported.
+  const TargetLowering *TL = nullptr;
+
+  /// Whether an LLVM instruction can be emitted. LLVM instructions include:
+  ///  * load atomic
+  ///  * store atomic
+  ///  * cmpxchg
+  ///  * atomicrmw
+  ///
+  /// Atomic LLVM instructions have several restrictions on when they can be
+  /// used, including:
+  ///  * Properties such as IsWeak, Memorder, Scope must be constant.
+  ///  * Must be an integer or pointer type. Some cases also allow float types.
+  ///  * Size must be a power-of-two number of bytes.
+  ///  * Size must be at most the size of atomics supported by the target.
----------------
jyknight wrote:

You can use the IR instruction regardless of whether the target supports it. The Clang frontend always emits an IR instruction for 1-, 2-, 4-, 8-, and 16-byte atomics.

I think this code should do the same.
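
For illustration only, the size-based rule suggested above could be sketched roughly as follows. This is not code from the patch: `AtomicLowering`, `isPowerOfTwo`, and `chooseAtomicLowering` are hypothetical names, and the 16-byte cap is an assumption taken from the sizes listed in the comment. The sketch deliberately ignores the target, since the point is that the decision does not have to depend on it.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the patch: pick between an IR atomic
// instruction and a libcall purely from the access size in bytes.
enum class AtomicLowering { Instruction, Libcall };

constexpr bool isPowerOfTwo(uint64_t N) {
  return N != 0 && (N & (N - 1)) == 0;
}

constexpr AtomicLowering chooseAtomicLowering(uint64_t SizeInBytes) {
  // Assumed cap of 16 bytes, mirroring the 1,2,4,8,16-byte sizes named above.
  constexpr uint64_t MaxInstructionSize = 16;
  if (isPowerOfTwo(SizeInBytes) && SizeInBytes <= MaxInstructionSize)
    return AtomicLowering::Instruction;
  // Anything else (odd sizes, or larger than 16 bytes) goes to a libcall.
  return AtomicLowering::Libcall;
}
```

With a rule like this, a 4-byte access maps to an instruction and a 12-byte or 32-byte access maps to a libcall, and no TargetLowering query is needed at emission time; a later pass such as AtomicExpandPass would still be free to turn unsupported instructions into libcalls.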

https://github.com/llvm/llvm-project/pull/134455