[clang] [ClangIR] Add ABI Lowering Design Document (PR #178326)

Andy Kaylor via cfe-commits cfe-commits at lists.llvm.org
Fri Feb 13 14:57:39 PST 2026


================
@@ -0,0 +1,602 @@
+# ClangIR ABI Lowering - Design Document
+
+## 1. Introduction
+
+This design describes calling convention lowering that **builds on the GSoC ABI
+Lowering Library** (PR #140112): we use its `abi::Type*` and target ABI logic
+and add an MLIR integration layer (MLIRTypeMapper, ABI lowering pass, and
+dialect rewriters).  The framework relies on this library, in
+`llvm/lib/ABI/`, as the single source of truth for ABI classification; MLIR
+dialects use it via an adapter layer.  The design enables CIR to perform
+ABI-compliant calling convention lowering, is reusable by other MLIR dialects
+(particularly FIR), and targets parity with the CIR incubator for x86_64 and
+AArch64.
+
+**What the design is, in concrete terms:** inputs are high-level function
+signatures in CIR, FIR, or other MLIR dialects; outputs are ABI-lowered
+signatures and call sites; lowering runs as an MLIR pass in the compilation
+pipeline, before dialect lowering to LLVM IR or other back ends.
+
+### 1.1 Problem Statement
+
+Calling convention lowering is currently implemented separately for each MLIR
+dialect that needs it.  The CIR incubator has a partial implementation, but it's
+tightly coupled to CIR-specific types and operations, making it unsuitable for
+reuse by other dialects.  This means that FIR (Fortran IR) and future MLIR
+dialects would need to duplicate this complex logic.  While Classic Clang
+CodeGen contains mature ABI lowering code, it cannot be reused directly because
+it's tightly coupled to Clang's AST representation and LLVM IR generation.
+
+### 1.2 Design Goals
+
+Building on the GSoC library and adding an MLIR integration layer avoids
+duplicating complex ABI logic across MLIR dialects, reduces maintenance, and
+keeps a single source of ABI compliance in `llvm/lib/ABI/`.  The separation
+between the GSoC library (classification) and the dialect-specific
+`ABIRewriteContext` (rewriting) enables clearer testing and a straightforward
+migration path from the CIR incubator: useful algorithms are ported into the
+GSoC library where appropriate.
+
+A central goal is that generated code be **call-compatible with Classic Clang
+CodeGen** (and other compilers).  Parity is with Classic Clang CodeGen output,
+not only with the incubator.  Success means CIR correctly lowers x86_64 and
+AArch64 calling conventions with full ABI compliance using the GSoC library
+and MLIR integration layer; FIR can adopt the same infrastructure with minimal
+dialect-specific adaptation (e.g. cdecl when calling C from Fortran).  ABI
+compliance will be validated through differential testing against Classic Clang
+CodeGen, and performance overhead should remain under 5% compared to a direct,
+dialect-specific implementation.  Initial scope focuses on fixed-argument
+functions; variadic support (varargs) is deferred.
+
+## 2. Background and Context
+
+### 2.1 What is Calling Convention Lowering?
+
+Calling convention lowering transforms high-level function signatures to match
+target ABI (Application Binary Interface) requirements.  When a function is
+declared at the source level with convenient, language-level types, these types
+must be translated into the specific register assignments, memory layouts, and
+calling sequences that the target architecture expects.  For example, under
+the x86_64 System V ABI, a struct containing two 64-bit integers might be
+"expanded" into two separate arguments passed in registers, rather than being
+passed as a single aggregate:
+
+```
+// High-level CIR
+func @foo(i32, struct<i64, i64>) -> i32
+
+// After ABI lowering
+func @foo(i32 %arg0, i64 %arg1, i64 %arg2) -> i32
+//        ^          ^          ^
+//        |          |          +---- second struct field, passed in register
+//        |          +---- first struct field, passed in register
+//        +---- small integer passed in register
+```
+
+Calling convention lowering is complex for several reasons: it is highly
+target-specific (each architecture has different rules for registers vs.
+memory), type-dependent (rules differ for integers, floats, structs, unions,
+arrays), and context-sensitive (varargs, virtual calls, conventions like
+vectorcall or preserve_most).  The same target may have multiple ABI variants
+(e.g. x86_64 System V vs. Windows x64), adding further complexity.
+
+### 2.2 Existing Implementations
+
+#### Classic Clang CodeGen
+
+Classic Clang CodeGen (located in `clang/lib/CodeGen/`) transforms calling
+conventions during the AST-to-LLVM-IR lowering process.  This implementation is
+mature and well-tested, handling all supported targets with comprehensive ABI
+coverage.  However, it's tightly coupled to both Clang's AST representation and
+LLVM IR, making it difficult to reuse for MLIR-based frontends.
+
+#### CIR Incubator
+
+The CIR incubator includes a calling convention lowering pass in
+`clang/lib/CIR/Dialect/Transforms/TargetLowering/` that transforms CIR
+operations into ABI-lowered CIR operations as an MLIR pass.  This implementation
+successfully adapted logic from Classic Clang CodeGen to work within the MLIR
+framework.  However, it relies on CIR-specific types and operations, preventing
+reuse by other MLIR dialects.
+
+#### GSoC ABI Lowering Library
+
+A 2025 Google Summer of Code project produced [PR
+#140112](https://github.com/llvm/llvm-project/pull/140112), which proposes
+extracting Clang's ABI logic into a reusable library in `llvm/lib/ABI/`.  The
+design centers on a shadow type system (`abi::Type*`) separate from both Clang's
+AST types and LLVM IR types, enabling the ABI classification algorithms to work
+independently of any specific frontend representation.  The library includes
+abstract `ABIInfo` base classes and target-specific implementations (e.g.
+x86_64, BPF) and provides QualTypeMapper for Clang to map `QualType` to
+`abi::Type*`.
+
+Our approach is to complete and extend this library and use it as the single
+source of truth for ABI classification.  One implementation in one place reduces
+duplication, simplifies bug fixes, and creates a path for Classic Clang CodeGen
+to use the same logic in the future.  MLIR dialects (CIR, FIR, and others) will
+use the library via an adapter layer rather than reimplementing ABI logic.
+
+**Current state.** The x86_64 implementation is largely complete and under
+review.  AArch64 and some other targets are not yet implemented; there is no
+MLIR integration today.  The work is being upstreamed in smaller parts (e.g.
+[PR 158329](https://github.com/llvm/llvm-project/pull/158329)); progress is
+limited by reviewer bandwidth.  The overhead of the shadow type system
+(converting to and from `abi::Type*`) has been measured at under 0.1% for
+`clang -O0`, so it should be negligible for CIR.  Our approach depends on the
+GSoC library being merged upstream or our contributions to it being accepted.
+
+**Our approach.** We will complete and extend the GSoC library (e.g. AArch64
+support, review feedback, tests) and add an **MLIR integration layer** so that
+MLIR dialects can use it:
+
+- **MLIRTypeMapper**: maps `mlir::Type` to `abi::Type*`, analogous to
+  QualTypeMapper for Clang.
+
+- **MLIR ABI lowering pass**: uses the library's `ABIInfo` for classification,
+  then performs dialect-specific rewriting via `ABIRewriteContext` for CIR, FIR,
+  and other dialects.
+
+The CIR incubator serves as a **reference only** (e.g. for AArch64 algorithms).
+We do not upstream the incubator's CIR-specific ABI implementation as the
+long-term solution; we port useful algorithms into the GSoC library where
+appropriate.
+
+### 2.3 Requirements for MLIR Dialects
+
+CIR needs to lower C/C++ calling conventions correctly, with initial support for
+x86_64 and AArch64 targets.  It must handle structs, unions, and complex types,
+as well as support instance methods and virtual calls.  FIR's initial need is
+**cdecl for calling C from Fortran** (C interop); that is in scope.
+Fortran-specific ABI semantics (e.g. CHARACTER hidden length parameters, array
+descriptors) are out of initial scope; full Fortran ABI lowering is a broader
+goal.  Both dialects share common requirements: strict target ABI compliance,
+efficient lowering with minimal overhead, extensibility for adding new target
+architectures, and comprehensive testability and validation capabilities.
+
+## 3. Proposed Solution
+
+**Core.** The GSoC library in `llvm/lib/ABI/` performs ABI classification on
+`abi::Type*`.  It provides `ABIInfo` and target-specific implementations
+(x86_64, BPF, and eventually AArch64 and others).  This is the single place
+where ABI rules are implemented.
+
+**MLIR side.** To use this library from MLIR dialects we add an integration
+layer: (1) **MLIRTypeMapper** maps `mlir::Type` to `abi::Type*` (analogous to
+QualTypeMapper for Clang).  (2) A **generic ABI lowering pass** invokes the
+library's `ABIInfo` for classification, then (3) performs **dialect-specific
+rewriting** via the `ABIRewriteContext` interface—each dialect (CIR, FIR, etc.)
+implements only the glue to create its own operations (e.g. `cir.call`,
+`fir.call`).  Classification logic is shared; operation creation is
+dialect-specific.
+
+The following diagram shows the layering.  At the top, the GSoC library holds
+the ABI logic.  In the middle, adapters connect frontends to it: Classic Clang
+CodeGen uses QualTypeMapper; MLIR uses MLIRTypeMapper and the ABI lowering pass.
+At the bottom, each dialect implements `ABIRewriteContext` only; FIR is shown as
+a consumer for cdecl/C interop (e.g. calling C from Fortran).
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│  GSoC ABI Library (llvm/lib/ABI/)                               │
+│  ABIInfo, abi::Type*, target implementations (X86, AArch64,…)   │
+└─────────────────────────────────────────────────────────────────┘
+                              │
+            ┌─────────────────┴─────────────────┐
+            │                                   │
+            ▼                                   ▼
+┌───────────────────────┐         ┌───────────────────────────────┐
+│  Classic CodeGen      │         │  MLIR adapter                 │
+│  QualTypeMapper       │         │  MLIRTypeMapper + ABI pass    │
+└───────────────────────┘         └───────────────────────────────┘
+                                                │
+                               ┌────────────────┼────────────────┐
+                               │                │                │
+                               ▼                ▼                ▼
+                         ┌────────────┐   ┌────────────┐   ┌────────────┐
+                         │ CIR        │   │ FIR        │   │ Future     │
+                         │ ABIRewrite │   │ (cdecl/C   │   │ Dialects   │
+                         │ Context    │   │  interop)  │   │            │
+                         └────────────┘   └────────────┘   └────────────┘
+```
+
+## 4. Design Overview
+
+### 4.1 Architecture Diagram
+
+The following diagram shows how the design builds on the GSoC library (Section
+3).  At the top, GSoC holds the ABI classification logic.  The middle layer
+adapts MLIR to GSoC: MLIRTypeMapper converts `mlir::Type` to `abi::Type*`, and
----------------
andykaylor wrote:

MLIRTypeMapper feels like a very general name. We might want something to make it clear that it's being used for ABI type mapping.
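
To make the naming point concrete, here is a minimal, self-contained sketch of a mapper whose name states its purpose. Every name in it (`ABITypeMapper`, `DialectTy`, `AbiTy`, `lowerArgs`, `checkExample`) is hypothetical and not taken from the PR or from `llvm/lib/ABI/`. The toy types stand in for `mlir::Type` and `abi::Type*`, and the 128-bit expansion rule is a deliberate simplification, not the real x86_64 System V classification.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for dialect-level types (not real mlir::Type classes).
struct IntTy { unsigned bits; };
struct StructTy { std::vector<IntTy> fields; };
using DialectTy = std::variant<IntTy, StructTy>;

// Hypothetical stand-in for an ABI-level type: scalar field widths only,
// with padding and alignment ignored for brevity.
struct AbiTy {
  bool isAggregate;
  std::vector<unsigned> fieldBits;
  unsigned sizeInBits() const {
    unsigned total = 0;
    for (unsigned bits : fieldBits) total += bits;
    return total;
  }
};

// The class name says what the mapping is for: dialect types -> ABI types.
class ABITypeMapper {
public:
  AbiTy map(const DialectTy &ty) const {
    if (const IntTy *i = std::get_if<IntTy>(&ty))
      return AbiTy{false, {i->bits}};
    AbiTy out{true, {}};
    for (const IntTy &f : std::get<StructTy>(ty).fields)
      out.fieldBits.push_back(f.bits);
    return out;
  }
};

// Toy lowering rule (an assumption for illustration): scalars pass through,
// aggregates of at most 128 bits are expanded into their fields, and larger
// aggregates are passed indirectly as a pointer.
std::vector<std::string> lowerArgs(const std::vector<DialectTy> &args) {
  ABITypeMapper mapper;
  std::vector<std::string> lowered;
  for (const DialectTy &arg : args) {
    AbiTy t = mapper.map(arg);
    if (!t.isAggregate || t.sizeInBits() <= 128) {
      for (unsigned bits : t.fieldBits)
        lowered.push_back("i" + std::to_string(bits));
    } else {
      lowered.push_back("ptr");
    }
  }
  return lowered;
}

// Mirrors the document's example: foo(i32, struct<i64, i64>).
void checkExample() {
  std::vector<DialectTy> args{IntTy{32}, StructTy{{IntTy{64}, IntTy{64}}}};
  std::vector<std::string> expected{"i32", "i64", "i64"};
  assert(lowerArgs(args) == expected);
}
```

With a name along these lines, a call site such as `mapper.map(arg)` reads as ABI type mapping without any surrounding context.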

https://github.com/llvm/llvm-project/pull/178326

