[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)
Alexander Richardson via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Thu Jan  2 16:50:47 PST 2025

https://github.com/arichardson updated https://github.com/llvm/llvm-project/pull/105735
>From e4bd1181d160b8728e7d4158417a83e183bd1709 Mon Sep 17 00:00:00 2001
From: Alex Richardson <alexrichardson at google.com>
Date: Thu, 22 Aug 2024 14:36:04 -0700
Subject: [PATCH 1/3] fix indentation in langref
Created using spr 1.3.6-beta.1
---
 llvm/docs/LangRef.rst | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 200224c78be004..1a59fba65815cc 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3103,19 +3103,19 @@ as follows:
 ``A<address space>``
     Specifies the address space of objects created by '``alloca``'.
     Defaults to the default address space of 0.
-``p[<flags>][n]:<size>:<abi>[:<pref>][:<idx>]``
+``p[<flags>][<address space>]:<size>:<abi>[:<pref>][:<idx>]``
     This specifies the *size* of a pointer and its ``<abi>`` and
     ``<pref>``\erred alignments for address space ``n``. ``<pref>`` is optional
     and defaults to ``<abi>``. The fourth parameter ``<idx>`` is the size of the
     index that used for address calculation, which must be less than or equal
     to the pointer size. If not
     specified, the default index size is equal to the pointer size. All sizes
-    are in bits. The address space, ``n``, is optional, and if not specified,
-    denotes the default address space 0. The value of ``n`` must be
-    in the range [1,2^24).
+    are in bits. The ``<address space>`` is optional, and if not specified,
+    denotes the default address space 0. The value of ``<address space>`` must
+    be in the range [1,2^24).
     The optional``<flags>`` are used to specify properties of pointers in this
-address space: the character ``u`` marks pointers as having an unstable
-    representation and ```n`` marks pointers as non-integral (i.e. having
+    address space: the character ``u`` marks pointers as having an unstable
+    representation and ``n`` marks pointers as non-integral (i.e. having
     additional metadata). See :ref:`Non-Integral Pointer Types <nointptrtype>`.
 
 ``i<size>:<abi>[:<pref>]``
>From db97145d3a653f2999b5935f9b1cb4550230689d Mon Sep 17 00:00:00 2001
From: Alex Richardson <alexrichardson at google.com>
Date: Fri, 25 Oct 2024 12:51:11 -0700
Subject: [PATCH 2/3] include feedback
Created using spr 1.3.6-beta.1
---
 llvm/docs/LangRef.rst             | 30 +++++++++++++++++-------------
 llvm/include/llvm/IR/DataLayout.h |  8 ++++----
 2 files changed, 21 insertions(+), 17 deletions(-)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index c137318af678b6..3c3d0e0b4ab8ee 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -659,7 +659,7 @@ LLVM IR optionally allows the frontend to denote pointers in certain address
 spaces as "non-integral" or "unstable" (or both "non-integral" and "unstable")
 via the :ref:`datalayout string<langref_datalayout>`.
 
-These exact implications of these properties are target-specific, but the
+The exact implications of these properties are target-specific, but the
 following IR semantics and restrictions to optimization passes apply:
 
 Unstable pointer representation
@@ -668,7 +668,7 @@ Unstable pointer representation
 Pointers in this address space have an *unspecified* bitwise representation
 (i.e. not backed by a fixed integer). The bitwise pattern of such pointers is
 allowed to change in a target-specific way. For example, this could be a pointer
-type used for with copying garbage collection where the garbage collector could
+type used with copying garbage collection where the garbage collector could
 update the pointer at any time in the collection sweep.
 
 ``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
@@ -705,10 +705,10 @@ representation of the pointer.
 Non-integral pointer representation
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-Pointers are not represented as an address, but may instead include
+Pointers are not represented as just an address, but may instead include
 additional metadata such as bounds information or a temporal identifier.
 Examples include AMDGPU buffer descriptors with a 128-bit fat pointer and a
-32-bit offset or CHERI capabilities that contain bounds, permissions and an
+32-bit offset, or CHERI capabilities that contain bounds, permissions and an
 out-of-band validity bit. In general, these pointers cannot be re-created
 from just an integer value.
 
@@ -716,23 +716,25 @@ In most cases pointers with a non-integral representation behave exactly the
 same as an integral pointer, the only difference is that it is not possible to
 create a pointer just from an address.
 
-"Non-integral" pointers also impose restrictions on the optimizer, but in
-general these are less restrictive than for "unstable" pointers. The main
+"Non-integral" pointers also impose restrictions on transformation passes, but
+in general these are less restrictive than for "unstable" pointers. The main
 difference compared to integral pointers is that ``inttoptr`` instructions
 should not be inserted by passes as they may not be able to create a valid
 pointer. This property also means that ``inttoptr(ptrtoint(x))`` cannot be
 folded to ``x`` as the ``ptrtoint`` operation may destroy the necessary metadata
 to reconstruct the pointer.
-Additionaly, since there could be out-of-band state, it is also not legal to
+Additionally, since there could be out-of-band state, it is also not legal to
 convert a load/store of a non-integral pointer type to a load/store of an
-integer type with same bitwidth as that may not copy all the state.
-However, it is legal to use appropriately aligned ``llvm.memcpy`` and
-``llvm.memmove`` for copies of non-integral pointers as long as these are not
-converted into integer operations.
+integer type with the same bitwidth, as that may not copy all the state.
+However, it is legal to use appropriately-aligned ``llvm.memcpy`` and
+``llvm.memmove`` for copies of non-integral pointers.
+NOTE: Lowering of ``llvm.memcpy`` containing non-integral pointer types must use
+appropriately-aligned and sized types instead of smaller integer types.
 
 Unlike "unstable" pointers, the bit-wise representation is stable and
-``ptrtoint(x)`` always yields a deterministic values.
-This means optimizer is still permitted to insert new ``ptrtoint`` instructions.
+``ptrtoint(x)`` always yields a deterministic value.
+This means transformation passes are still permitted to insert new ``ptrtoint``
+instructions.
 However, it is important to note that ``ptrtoint`` may not yield the same value
 as storing the pointer via memory and reading it back as an integer, even if the
 bitwidth of the two types matches (since ptrtoint could involve some form of
@@ -12187,6 +12189,8 @@ If ``value`` is smaller than ``ty2`` then a zero extension is done. If
 ``value`` is larger than ``ty2`` then a truncation is done. If they are
 the same size, then nothing is done (*no-op cast*) other than a type
 change.
+For :ref:`non-integral pointers <_nointptrtype>` the ``ptrtoint`` instruction
+may involve additional transformations beyond truncations or extension.
 
 Example:
 """"""""
diff --git a/llvm/include/llvm/IR/DataLayout.h b/llvm/include/llvm/IR/DataLayout.h
index ca185bfec851a8..206abcdbea0a34 100644
--- a/llvm/include/llvm/IR/DataLayout.h
+++ b/llvm/include/llvm/IR/DataLayout.h
@@ -357,8 +357,8 @@ class DataLayout {
   /// instructions operating on pointers of this address space.
   /// TODO: remove this function after migrating to finer-grained properties.
   bool isNonIntegralAddressSpace(unsigned AddrSpace) const {
-    const PointerSpec &PS = getPointerSpec(AddrSpace);
-    return PS.HasNonIntegralRepresentation || PS.HasUnstableRepresentation;
+    return hasUnstableRepresentation(AddrSpace) ||
+           hasNonIntegralRepresentation(AddrSpace);
   }
 
   /// Returns whether this address space has an "unstable" pointer
@@ -390,8 +390,8 @@ class DataLayout {
   /// representations (hasUnstableRepresentation()) unless the pass knows it is
   /// within a critical section that retains the current representation.
   bool shouldAvoidIntToPtr(unsigned AddrSpace) const {
-    const PointerSpec &PS = getPointerSpec(AddrSpace);
-    return PS.HasNonIntegralRepresentation || PS.HasUnstableRepresentation;
+    return hasUnstableRepresentation(AddrSpace) ||
+           hasNonIntegralRepresentation(AddrSpace);
   }
 
   /// Returns whether passes should avoid introducing `ptrtoint` instructions
>From 94ecfa353dcf44087797594a8f77f9653c8b8e4a Mon Sep 17 00:00:00 2001
From: Alex Richardson <alexrichardson at google.com>
Date: Fri, 25 Oct 2024 14:54:59 -0700
Subject: [PATCH 3/3] address more feedback
Created using spr 1.3.6-beta.1
---
 llvm/docs/LangRef.rst                | 16 ++++++----
 llvm/include/llvm/IR/DataLayout.h    |  6 ++--
 llvm/lib/IR/DataLayout.cpp           |  5 +--
 llvm/unittests/IR/DataLayoutTest.cpp | 46 ++++++++++++++++------------
 4 files changed, 43 insertions(+), 30 deletions(-)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 3c3d0e0b4ab8ee..2313527afedd7c 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -709,8 +709,10 @@ Pointers are not represented as just an address, but may instead include
 additional metadata such as bounds information or a temporal identifier.
 Examples include AMDGPU buffer descriptors with a 128-bit fat pointer and a
 32-bit offset, or CHERI capabilities that contain bounds, permissions and an
-out-of-band validity bit. In general, these pointers cannot be re-created
-from just an integer value.
+out-of-band validity bit. In general, valid non-integral pointers cannot be
+created from just an integer value: while ``inttoptr`` yields a deterministic
+bitwise pattern, the resulting value is not guaranteed to be a valid
+dereferenceable pointer.
 
 In most cases pointers with a non-integral representation behave exactly the
 same as an integral pointer, the only difference is that it is not possible to
@@ -3200,9 +3202,11 @@ as follows:
     this set are considered to support most general arithmetic operations
     efficiently.
 ``ni:<address space0>:<address space1>:<address space2>...``
-    This specifies pointer types with the specified address spaces
-    as :ref:`Non-Integral Pointer Type <nointptrtype>` s.  The ``0``
-    address space cannot be specified as non-integral.
+    This marks pointer types with the specified address spaces
+    as :ref:`non-integral and unstable <nointptrtype>`.
+    The ``0`` address space cannot be specified as non-integral.
+    It is only supported for backwards compatibility; the flags of the ``p``
+    specifier should be used instead for new code.
 
 On every specification that takes a ``<abi>:<pref>``, specifying the
 ``<pref>`` alignment is optional. If omitted, the preceding ``:``
@@ -12189,7 +12193,7 @@ If ``value`` is smaller than ``ty2`` then a zero extension is done. If
 ``value`` is larger than ``ty2`` then a truncation is done. If they are
 the same size, then nothing is done (*no-op cast*) other than a type
 change.
-For :ref:`non-integral pointers <_nointptrtype>` the ``ptrtoint`` instruction
+For :ref:`non-integral pointers <nointptrtype>` the ``ptrtoint`` instruction
 may involve additional transformations beyond truncations or extension.
 
 Example:
diff --git a/llvm/include/llvm/IR/DataLayout.h b/llvm/include/llvm/IR/DataLayout.h
index 206abcdbea0a34..af9556feb724f6 100644
--- a/llvm/include/llvm/IR/DataLayout.h
+++ b/llvm/include/llvm/IR/DataLayout.h
@@ -341,9 +341,9 @@ class DataLayout {
   /// rounded up to a whole number of bytes.
   unsigned getIndexSize(unsigned AS) const;
 
-  /// Return the address spaces containing non-integral pointers.  Pointers in
-  /// this address space don't have a well-defined bitwise representation.
-  SmallVector<unsigned, 8> getNonIntegralAddressSpaces() const {
+  /// Return the address spaces with special pointer semantics (such as being
+  /// unstable or non-integral).
+  SmallVector<unsigned, 8> getNonStandardAddressSpaces() const {
     SmallVector<unsigned, 8> AddrSpaces;
     for (const PointerSpec &PS : PointerSpecs) {
       if (PS.HasNonIntegralRepresentation || PS.HasUnstableRepresentation)
diff --git a/llvm/lib/IR/DataLayout.cpp b/llvm/lib/IR/DataLayout.cpp
index 722f7b57d160ee..9de984175228f2 100644
--- a/llvm/lib/IR/DataLayout.cpp
+++ b/llvm/lib/IR/DataLayout.cpp
@@ -209,7 +209,7 @@ constexpr DataLayout::PrimitiveSpec DefaultVectorSpecs[] = {
 // Default pointer type specifications.
 constexpr DataLayout::PointerSpec DefaultPointerSpecs[] = {
     // p0:64:64:64:64
-    {0, 64, Align::Constant<8>(), Align::Constant<8>(), 64, false},
+    {0, 64, Align::Constant<8>(), Align::Constant<8>(), 64, false, false},
 };
 
 DataLayout::DataLayout()
@@ -437,7 +437,8 @@ Error DataLayout::parsePointerSpec(StringRef Spec) {
       return Err;
   }
   if (AddrSpace == 0 && (NonIntegralRepr || UnstableRepr))
-    return createStringError("address space 0 cannot be non-integral");
+    return createStringError(
+        "address space 0 cannot be non-integral or unstable");
 
   // Size. Required, cannot be zero.
   unsigned BitWidth;
diff --git a/llvm/unittests/IR/DataLayoutTest.cpp b/llvm/unittests/IR/DataLayoutTest.cpp
index 056584badcf74a..8b6616ce0fb167 100644
--- a/llvm/unittests/IR/DataLayoutTest.cpp
+++ b/llvm/unittests/IR/DataLayoutTest.cpp
@@ -412,7 +412,7 @@ TEST(DataLayout, ParsePointerSpec) {
                         "pn0:64:64", "pu0:64:64", "pun0:64:64", "pnu0:64:64"})
     EXPECT_THAT_EXPECTED(
         DataLayout::parse(Str),
-        FailedWithMessage("address space 0 cannot be non-integral"));
+        FailedWithMessage("address space 0 cannot be non-integral or unstable"));
 }
 
 TEST(DataLayoutTest, ParseNativeIntegersSpec) {
@@ -569,12 +569,12 @@ TEST(DataLayout, GetPointerPrefAlignment) {
 
 TEST(DataLayout, IsNonIntegralAddressSpace) {
   DataLayout Default;
-  EXPECT_THAT(Default.getNonIntegralAddressSpaces(), ::testing::SizeIs(0));
+  EXPECT_THAT(Default.getNonStandardAddressSpaces(), ::testing::SizeIs(0));
   EXPECT_FALSE(Default.isNonIntegralAddressSpace(0));
   EXPECT_FALSE(Default.isNonIntegralAddressSpace(1));
 
   DataLayout Custom = cantFail(DataLayout::parse("ni:2:16777215"));
-  EXPECT_THAT(Custom.getNonIntegralAddressSpaces(),
+  EXPECT_THAT(Custom.getNonStandardAddressSpaces(),
               ::testing::ElementsAreArray({2U, 16777215U}));
   EXPECT_FALSE(Custom.isNonIntegralAddressSpace(0));
   EXPECT_FALSE(Custom.isNonIntegralAddressSpace(1));
@@ -582,37 +582,45 @@ TEST(DataLayout, IsNonIntegralAddressSpace) {
   EXPECT_TRUE(Custom.isNonIntegralAddressSpace(16777215));
 
   // Pointers can be marked as non-integral using 'pn'
-  DataLayout NonIntegral = cantFail(DataLayout::parse("pn2:64:64:64:32"));
-  EXPECT_TRUE(NonIntegral.isNonIntegralAddressSpace(2));
-  EXPECT_TRUE(NonIntegral.hasNonIntegralRepresentation(2));
-  EXPECT_FALSE(NonIntegral.hasUnstableRepresentation(2));
-  EXPECT_TRUE(NonIntegral.shouldAvoidIntToPtr(2));
-  EXPECT_FALSE(NonIntegral.shouldAvoidPtrToInt(2));
+  Custom = cantFail(DataLayout::parse("pn2:64:64:64:32"));
+  EXPECT_TRUE(Custom.isNonIntegralAddressSpace(2));
+  EXPECT_TRUE(Custom.hasNonIntegralRepresentation(2));
+  EXPECT_FALSE(Custom.hasUnstableRepresentation(2));
+  EXPECT_TRUE(Custom.shouldAvoidIntToPtr(2));
+  EXPECT_FALSE(Custom.shouldAvoidPtrToInt(2));
+  EXPECT_THAT(Custom.getNonStandardAddressSpaces(),
+              ::testing::ElementsAreArray({2U}));
 
   // Pointers can be marked as unstable using 'pu'
-  DataLayout Unstable = cantFail(DataLayout::parse("pu2:64:64:64:32"));
-  EXPECT_TRUE(Unstable.isNonIntegralAddressSpace(2));
-  EXPECT_TRUE(Unstable.hasUnstableRepresentation(2));
-  EXPECT_FALSE(Unstable.hasNonIntegralRepresentation(2));
-  EXPECT_TRUE(Unstable.shouldAvoidPtrToInt(2));
-  EXPECT_TRUE(Unstable.shouldAvoidIntToPtr(2));
+  Custom = cantFail(DataLayout::parse("pu2:64:64:64:32"));
+  EXPECT_TRUE(Custom.isNonIntegralAddressSpace(2));
+  EXPECT_TRUE(Custom.hasUnstableRepresentation(2));
+  EXPECT_FALSE(Custom.hasNonIntegralRepresentation(2));
+  EXPECT_TRUE(Custom.shouldAvoidPtrToInt(2));
+  EXPECT_TRUE(Custom.shouldAvoidIntToPtr(2));
+  EXPECT_THAT(Custom.getNonStandardAddressSpaces(),
+              ::testing::ElementsAreArray({2U}));
 
   // Both properties can also be set using 'pnu'/'pun'
-  for (auto Layout : {"pnu2:64:64:64:32", "pun2:64:64:64:32"}) {
+  for (const auto *Layout : {"pnu2:64:64:64:32", "pun2:64:64:64:32"}) {
     DataLayout DL = cantFail(DataLayout::parse(Layout));
     EXPECT_TRUE(DL.isNonIntegralAddressSpace(2));
     EXPECT_TRUE(DL.hasNonIntegralRepresentation(2));
     EXPECT_TRUE(DL.hasUnstableRepresentation(2));
+    EXPECT_THAT(DL.getNonStandardAddressSpaces(),
+                ::testing::ElementsAreArray({2U}));
   }
 
   // For backwards compatibility, the ni DataLayout part overrides any p[n][u].
-  for (auto Layout : {"ni:2-pn2:64:64:64:32", "ni:2-pnu2:64:64:64:32",
-                      "ni:2-pu2:64:64:64:32", "pn2:64:64:64:32-ni:2",
-                      "pnu2:64:64:64:32-ni:2", "pu2:64:64:64:32-ni:2"}) {
+  for (const auto *Layout : {"ni:2-pn2:64:64:64:32", "ni:2-pnu2:64:64:64:32",
+                             "ni:2-pu2:64:64:64:32", "pn2:64:64:64:32-ni:2",
+                             "pnu2:64:64:64:32-ni:2", "pu2:64:64:64:32-ni:2"}) {
     DataLayout DL = cantFail(DataLayout::parse(Layout));
     EXPECT_TRUE(DL.isNonIntegralAddressSpace(2));
     EXPECT_TRUE(DL.hasNonIntegralRepresentation(2));
     EXPECT_TRUE(DL.hasUnstableRepresentation(2));
+    EXPECT_THAT(DL.getNonStandardAddressSpaces(),
+                ::testing::ElementsAreArray({2U}));
   }
 }
 
    
    
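For readers who want to see how the `p[<flags>]` prefix maps onto the two properties exercised by the unit test in patch 3/3, here is a minimal standalone sketch. It is not LLVM's parser: `PointerFlags`, `parseFlags` and the two free functions are hypothetical names introduced only for illustration, and the rule that only unstable pointers should avoid `ptrtoint` is inferred from the `pn`/`pu` expectations in the test and from the LangRef wording, not copied from `DataLayout.h`.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical stand-in for the two per-address-space properties.
// This is NOT llvm::DataLayout::PointerSpec.
struct PointerFlags {
  bool NonIntegral = false;
  bool Unstable = false;
};

// Reads only the flag characters that directly follow 'p' in a spec such as
// "pnu2:64:64:64:32". Returns std::nullopt if the spec does not start with
// 'p' (for example the legacy "ni:2" form, which this sketch ignores).
std::optional<PointerFlags> parseFlags(std::string_view Spec) {
  if (Spec.empty() || Spec.front() != 'p')
    return std::nullopt;
  PointerFlags Flags;
  for (char C : Spec.substr(1)) {
    if (C == 'n')
      Flags.NonIntegral = true;
    else if (C == 'u')
      Flags.Unstable = true;
    else
      break; // First non-flag character: address space or ':' follows.
  }
  return Flags;
}

// Either property means a pass cannot assume inttoptr yields a valid pointer.
bool shouldAvoidIntToPtr(const PointerFlags &F) {
  return F.Unstable || F.NonIntegral;
}

// Assumption: only an unstable representation makes ptrtoint results
// unreliable; non-integral pointers still give a deterministic value.
bool shouldAvoidPtrToInt(const PointerFlags &F) { return F.Unstable; }
```

The flag order does not matter in this sketch, which mirrors the test treating `pnu` and `pun` identically.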