[llvm] [DataLayout] Introduce DataLayout::getPointerAddressSize(AS) (PR #137412)
Alexander Richardson via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 25 15:56:14 PDT 2025
https://github.com/arichardson created https://github.com/llvm/llvm-project/pull/137412
This function retrieves the number of bits that can be used for address
arithmetic in a given address space (i.e. the range of the address
space). For most in-tree targets this should make no difference, but
distinguishing the size of a pointer in bits from the address range is
essential for targets such as CHERI-enabled ones, where pointers carry
additional metadata such as bounds and permissions and only a subset
of the pointer bits is used as the address.
Importantly, this is not always the same as the index width since some
architectures have an address space that is wider than the GEP indexing
allows (e.g. because pointers are represented as base+offset).
The only in-tree pointers that this currently affects are AMDGPU buffer
fat pointers, which consist of a 48-bit base address and a 32-bit offset.
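To make the three widths concrete, here is a minimal, self-contained sketch.
It does not use the LLVM API: PointerWidths and bitsToBytes are hypothetical
names invented for this illustration. The CHERI numbers follow the doc
comment in the patch (128-bit pointers, 64-bit address); the 64-bit CHERI
index width and the 160-bit total size of an AMDGPU buffer fat pointer are
assumptions that the description above does not state.

```cpp
#include <cstdint>

// Hypothetical stand-in for a per-address-space pointer description.
// This is NOT llvm::DataLayout; it only models the three widths above.
struct PointerWidths {
  uint32_t PointerBits; // size of the whole pointer representation
  uint32_t IndexBits;   // width used for GEP-style offset arithmetic
  uint32_t AddressBits; // width of the address range itself
};

// Round a bit width up to a whole number of bytes.
constexpr uint32_t bitsToBytes(uint32_t Bits) { return (Bits + 7) / 8; }

// 64-bit CHERI-style target: 128-bit pointers, of which 64 bits are address.
// The index width of 64 is an assumption made for this sketch.
constexpr PointerWidths Cheri64 = {128, 64, 64};

// AMDGPU buffer fat pointer: 48-bit base address, 32-bit offset.
// The total size of 160 bits is an assumption made for this sketch.
constexpr PointerWidths BufferFatPtr = {160, 32, 48};

// Pointer size and address size differ on the CHERI-style target...
static_assert(bitsToBytes(Cheri64.PointerBits) == 16, "whole pointer");
static_assert(bitsToBytes(Cheri64.AddressBits) == 8, "address only");
// ...and address size and index size differ for buffer fat pointers.
static_assert(BufferFatPtr.AddressBits > BufferFatPtr.IndexBits,
              "address range is wider than the index");
```

The second case is the one that motivates a separate field: neither the
pointer size nor the index width describes the 48-bit address range.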
Currently, this new API is not used anywhere, but the plan is to introduce
a ptrtoaddr operation in the future. See the discussion on
https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/38
Originally uploaded as https://reviews.llvm.org/D135158
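
The patch below extends the pointer specification in the layout string to
p[n]:<size>:<abi>[:<pref>][:<idx>][:<addr>]. The following is a rough,
standalone sketch of how such a string could be split and defaulted. It is
not the LLVM parser: ParsedPointerSpec, parseField, parsePointerSpec and
addressBitsOf are hypothetical names, and alignment validation is omitted.
The sketch follows the parsePointerSpec hunk, which defaults the address
width to the pointer size and rejects anything larger; note that the LangRef
hunk describes the default as the index size instead. The spec string
"p7:160:256:256:32:48" is only an illustration, since the patch does not
change any target's layout string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for this sketch; not LLVM's PointerSpec.
struct ParsedPointerSpec {
  uint32_t AddrSpace = 0;
  uint32_t PointerBits = 0;
  uint32_t AbiAlignBits = 0;
  uint32_t PrefAlignBits = 0;
  uint32_t IndexBits = 0;
  uint32_t AddressBits = 0;
};

// Parse a short decimal field; empty or non-numeric input is rejected.
static bool parseField(const std::string &S, uint32_t &Out) {
  if (S.empty() || S.size() > 9)
    return false;
  uint32_t V = 0;
  for (char C : S) {
    if (C < '0' || C > '9')
      return false;
    V = V * 10 + static_cast<uint32_t>(C - '0');
  }
  Out = V;
  return true;
}

// Parse "p[n]:<size>:<abi>[:<pref>][:<idx>][:<addr>]".
bool parsePointerSpec(const std::string &Spec, ParsedPointerSpec &R) {
  if (Spec.empty() || Spec[0] != 'p')
    return false;
  std::vector<std::string> Parts;
  std::stringstream SS(Spec.substr(1));
  for (std::string Part; std::getline(SS, Part, ':');)
    Parts.push_back(Part);
  if (Parts.size() < 3 || Parts.size() > 6)
    return false;

  R = ParsedPointerSpec();
  // An empty address-space component means address space 0.
  if (!Parts[0].empty() && !parseField(Parts[0], R.AddrSpace))
    return false;
  if (!parseField(Parts[1], R.PointerBits) ||
      !parseField(Parts[2], R.AbiAlignBits))
    return false;

  R.PrefAlignBits = R.AbiAlignBits; // <pref> defaults to <abi>
  R.IndexBits = R.PointerBits;      // <idx> defaults to the pointer size
  R.AddressBits = R.PointerBits;    // <addr> default as in the code hunk

  uint32_t *OptionalFields[] = {&R.PrefAlignBits, &R.IndexBits,
                                &R.AddressBits};
  for (std::size_t I = 3; I < Parts.size(); ++I)
    if (!parseField(Parts[I], *OptionalFields[I - 3]))
      return false;

  // Neither the index nor the address may be wider than the pointer.
  return R.IndexBits <= R.PointerBits && R.AddressBits <= R.PointerBits;
}

// Convenience wrapper for the sketch: address width, or 0 on a parse error.
uint32_t addressBitsOf(const std::string &Spec) {
  ParsedPointerSpec R;
  return parsePointerSpec(Spec, R) ? R.AddressBits : 0;
}
```

With this sketch, addressBitsOf("p7:160:256:256:32:48") yields 48, while a
spec without the fifth component, such as "p:64:64", falls back to the
pointer size of 64.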
From 110540d7e9104f14cc2ed41c48fb4b8f11c3c38e Mon Sep 17 00:00:00 2001
From: Alex Richardson <alexrichardson at google.com>
Date: Fri, 25 Apr 2025 15:55:57 -0700
Subject: [PATCH] [𝘀𝗽𝗿] initial version
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Created using spr 1.3.6-beta.1
---
llvm/docs/LangRef.rst | 25 +++++++++++++++++--------
llvm/include/llvm/IR/DataLayout.h | 29 ++++++++++++++++++++++++++---
llvm/lib/IR/DataLayout.cpp | 26 +++++++++++++++++++++-----
3 files changed, 64 insertions(+), 16 deletions(-)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 33c85c7ba9d29..b89fb3ba56e8b 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3134,16 +3134,25 @@ as follows:
``A<address space>``
Specifies the address space of objects created by '``alloca``'.
Defaults to the default address space of 0.
-``p[n]:<size>:<abi>[:<pref>][:<idx>]``
+``p[n]:<size>:<abi>[:<pref>][:<idx>][:<addr>]``
This specifies the *size* of a pointer and its ``<abi>`` and
``<pref>``\erred alignments for address space ``n``. ``<pref>`` is optional
- and defaults to ``<abi>``. The fourth parameter ``<idx>`` is the size of the
- index that used for address calculation, which must be less than or equal
- to the pointer size. If not
- specified, the default index size is equal to the pointer size. All sizes
- are in bits. The address space, ``n``, is optional, and if not specified,
+ and defaults to ``<abi>``.
+ The fourth parameter ``<idx>`` is the size of the index that used for
+ address calculations such as :ref:`getelementptr <i_getelementptr>`.
+ It must be less than or equal to the pointer size. If not specified, the
+ default index size is equal to the pointer size.
+ The fifth parameter ``<addr>`` specifies the width of addresses in this
+ address space. If not specified, the default address size is equal to the
+ index size. The address size may be wider than either the index or pointer
+ size as it could be a value relative to a base address. For example AMDGPU
+ buffer fat pointers use a 48-bit address range, but only allow for 32 bits
+ of indexing.
+ All sizes are in bits.
+ The address space, ``n``, is optional, and if not specified,
denotes the default address space 0. The value of ``n`` must be
in the range [1,2^24).
+
``i<size>:<abi>[:<pref>]``
This specifies the alignment for an integer type of a given bit
``<size>``. The value of ``<size>`` must be in the range [1,2^24).
@@ -12996,9 +13005,9 @@ This instruction requires several arguments:
- Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
- The call is in tail position (ret immediately follows call and ret
uses value of call or is void).
- - Option ``-tailcallopt`` is enabled, ``llvm::GuaranteedTailCallOpt`` is
+ - Option ``-tailcallopt`` is enabled, ``llvm::GuaranteedTailCallOpt`` is
``true``, or the calling convention is ``tailcc``.
- - `Platform-specific constraints are met.
+ - `Platform-specific constraints are met.
<CodeGenerator.html#tail-call-optimization>`_
#. The optional ``notail`` marker indicates that the optimizers should not add
diff --git a/llvm/include/llvm/IR/DataLayout.h b/llvm/include/llvm/IR/DataLayout.h
index 2ad080e6d0cd2..b6d788f4db66c 100644
--- a/llvm/include/llvm/IR/DataLayout.h
+++ b/llvm/include/llvm/IR/DataLayout.h
@@ -78,6 +78,7 @@ class DataLayout {
Align ABIAlign;
Align PrefAlign;
uint32_t IndexBitWidth;
+ uint32_t AddressBitWidth;
/// Pointers in this address space don't have a well-defined bitwise
/// representation (e.g. may be relocated by a copying garbage collector).
/// Additionally, they may also be non-integral (i.e. containing additional
@@ -148,7 +149,7 @@ class DataLayout {
/// Sets or updates the specification for pointer in the given address space.
void setPointerSpec(uint32_t AddrSpace, uint32_t BitWidth, Align ABIAlign,
Align PrefAlign, uint32_t IndexBitWidth,
- bool IsNonIntegral);
+ uint32_t AddressBitWidth, bool IsNonIntegral);
/// Internal helper to get alignment for integer of given bitwidth.
Align getIntegerAlignment(uint32_t BitWidth, bool abi_or_pref) const;
@@ -324,12 +325,26 @@ class DataLayout {
/// the backends/clients are updated.
Align getPointerPrefAlignment(unsigned AS = 0) const;
- /// Layout pointer size in bytes, rounded up to a whole
- /// number of bytes.
+ /// Layout pointer size in bytes, rounded up to a whole number of bytes. The
+ /// difference between this function and getPointerAddressSize() is this one
+ /// returns the size of the entire pointer type (this includes metadata bits
+ /// for fat pointers) and the latter only returns the number of address bits.
+ /// \sa DataLayout::getPointerAddressSizeInBits
/// FIXME: The defaults need to be removed once all of
/// the backends/clients are updated.
unsigned getPointerSize(unsigned AS = 0) const;
+ /// Returns the integral size of a pointer in a given address space in bytes.
+ /// For targets that store bits in pointers that are not part of the address,
+ /// this returns the number of bits that can be manipulated using operations
+ /// that change the address (e.g. addition/subtraction).
+ /// For example, a 64-bit CHERI-enabled target has 128-bit pointers of which
+ /// only 64 are used to represent the address and the remaining ones are used
+ /// for metadata such as bounds and access permissions. In this case
+ /// getPointerSize() returns 16, but getPointerAddressSize() returns 8.
+ /// \sa DataLayout::getPointerSize
+ unsigned getPointerAddressSize(unsigned AS) const;
+
// Index size in bytes used for address calculation,
/// rounded up to a whole number of bytes.
unsigned getIndexSize(unsigned AS) const;
@@ -365,6 +380,14 @@ class DataLayout {
return getPointerSpec(AS).BitWidth;
}
+ unsigned getPointerAddressSizeInBits(unsigned AS) const {
+ // Currently, this returns the same value as getIndexSizeInBits() as this
+ // is correct for all currently known LLVM targets. If another target is
+ // added that has pointer size != pointer range != GEP index width, we can
+ // add a new datalayout field for pointer integral range.
+ return getPointerSpec(AS).AddressBitWidth;
+ }
+
/// Size in bits of index used for address calculation in getelementptr.
unsigned getIndexSizeInBits(unsigned AS) const {
return getPointerSpec(AS).IndexBitWidth;
diff --git a/llvm/lib/IR/DataLayout.cpp b/llvm/lib/IR/DataLayout.cpp
index 0cf0bfc9702d3..fe618939ead63 100644
--- a/llvm/lib/IR/DataLayout.cpp
+++ b/llvm/lib/IR/DataLayout.cpp
@@ -208,7 +208,7 @@ constexpr DataLayout::PrimitiveSpec DefaultVectorSpecs[] = {
// Default pointer type specifications.
constexpr DataLayout::PointerSpec DefaultPointerSpecs[] = {
// p0:64:64:64:64
- {0, 64, Align::Constant<8>(), Align::Constant<8>(), 64, false},
+ {0, 64, Align::Constant<8>(), Align::Constant<8>(), 64, 64, false},
};
DataLayout::DataLayout()
@@ -454,8 +454,17 @@ Error DataLayout::parsePointerSpec(StringRef Spec) {
return createStringError(
"index size cannot be larger than the pointer size");
+ unsigned AddressBitWidth = BitWidth;
+ if (Components.size() > 4)
+ if (Error Err = parseSize(Components[4], AddressBitWidth, "address size"))
+ return Err;
+
+ if (AddressBitWidth > BitWidth)
+ return createStringError(
+ "address size cannot be larger than the pointer size");
+
setPointerSpec(AddrSpace, BitWidth, ABIAlign, PrefAlign, IndexBitWidth,
- false);
+ AddressBitWidth, false);
return Error::success();
}
@@ -631,7 +640,7 @@ Error DataLayout::parseLayoutString(StringRef LayoutString) {
// the spec for AS0, and we then update that to mark it non-integral.
const PointerSpec &PS = getPointerSpec(AS);
setPointerSpec(AS, PS.BitWidth, PS.ABIAlign, PS.PrefAlign, PS.IndexBitWidth,
- true);
+ PS.AddressBitWidth, true);
}
return Error::success();
@@ -679,16 +688,19 @@ DataLayout::getPointerSpec(uint32_t AddrSpace) const {
void DataLayout::setPointerSpec(uint32_t AddrSpace, uint32_t BitWidth,
Align ABIAlign, Align PrefAlign,
- uint32_t IndexBitWidth, bool IsNonIntegral) {
+ uint32_t IndexBitWidth,
+ uint32_t AddressBitWidth, bool IsNonIntegral) {
auto I = lower_bound(PointerSpecs, AddrSpace, LessPointerAddrSpace());
if (I == PointerSpecs.end() || I->AddrSpace != AddrSpace) {
PointerSpecs.insert(I, PointerSpec{AddrSpace, BitWidth, ABIAlign, PrefAlign,
- IndexBitWidth, IsNonIntegral});
+ IndexBitWidth, AddressBitWidth,
+ IsNonIntegral});
} else {
I->BitWidth = BitWidth;
I->ABIAlign = ABIAlign;
I->PrefAlign = PrefAlign;
I->IndexBitWidth = IndexBitWidth;
+ I->AddressBitWidth = AddressBitWidth;
I->IsNonIntegral = IsNonIntegral;
}
}
@@ -728,6 +740,10 @@ const StructLayout *DataLayout::getStructLayout(StructType *Ty) const {
return L;
}
+unsigned DataLayout::getPointerAddressSize(unsigned AS) const {
+ return divideCeil(getPointerAddressSizeInBits(AS), 8);
+}
+
Align DataLayout::getPointerABIAlignment(unsigned AS) const {
return getPointerSpec(AS).ABIAlign;
}