[llvm] [MC][AArch64] Segregate constant pool caches by size. (PR #86832)

Simon Tatham via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 27 10:05:17 PDT 2024


https://github.com/statham-arm created https://github.com/llvm/llvm-project/pull/86832

If you write a 32- and a 64-bit LDR instruction that both refer to the same constant or symbol using the = syntax:

```
  ldr w0, =something
  ldr x1, =something
```

then the first call to `ConstantPool::addEntry` will insert the constant into its cache of existing entries, and the second one will find the cache entry and reuse it. This results in a 64-bit load from a 32-bit constant, reading nonsense into the other half of the target register.

In this patch I've done the simplest fix: include the size of the constant pool entry as part of the key used to index the cache. So now 32- and 64-bit constant loads will never share a constant pool entry.

There's scope for doing this better, in principle: you could imagine merging the two slots with appropriate overlap, so that the 32-bit load loads the LSW of the 64-bit value. But that's much more complicated: you have to take endianness into account, and maybe also adjust the size of an existing entry. This is the simplest fix that restores correctness.

>From 3de9dbf648f7f235d9deaadf627386775956aa85 Mon Sep 17 00:00:00 2001
From: Simon Tatham <simon.tatham at arm.com>
Date: Wed, 27 Mar 2024 15:39:17 +0000
Subject: [PATCH] [MC][AArch64] Segregate constant pool caches by size.

If you write a 32- and a 64-bit LDR instruction that both refer to the
same constant or symbol using the = syntax:

  ldr w0, =something
  ldr x1, =something

then the first call to `ConstantPool::addEntry` will insert the
constant into its cache of existing entries, and the second one will
find the cache entry and reuse it. This results in a 64-bit load from
a 32-bit constant, reading nonsense into the other half of the target
register.

In this patch I've done the simplest fix: include the size of the
constant pool entry as part of the key used to index the cache. So now
32- and 64-bit constant loads will never share a constant pool entry.

There's scope for doing this better, in principle: you could imagine
merging the two slots with appropriate overlap, so that the 32-bit
load loads the LSW of the 64-bit value. But that's much more
complicated: you have to take endianness into account, and maybe also
adjust the size of an existing entry. This is the simplest fix that
restores correctness.
---
 llvm/include/llvm/MC/ConstantPools.h       |  9 ++++++--
 llvm/lib/MC/ConstantPools.cpp              |  9 ++++----
 llvm/test/MC/AArch64/constant-pool-sizes.s | 25 ++++++++++++++++++++++
 3 files changed, 37 insertions(+), 6 deletions(-)
 create mode 100644 llvm/test/MC/AArch64/constant-pool-sizes.s

diff --git a/llvm/include/llvm/MC/ConstantPools.h b/llvm/include/llvm/MC/ConstantPools.h
index 7eac75362effdc..ff21ccda07a836 100644
--- a/llvm/include/llvm/MC/ConstantPools.h
+++ b/llvm/include/llvm/MC/ConstantPools.h
@@ -43,8 +43,13 @@ struct ConstantPoolEntry {
 class ConstantPool {
   using EntryVecTy = SmallVector<ConstantPoolEntry, 4>;
   EntryVecTy Entries;
-  std::map<int64_t, const MCSymbolRefExpr *> CachedConstantEntries;
-  DenseMap<const MCSymbol *, const MCSymbolRefExpr *> CachedSymbolEntries;
+
+  // Caches of entries that already exist, indexed by their contents
+  // and also the size of the constant.
+  std::map<std::pair<int64_t, unsigned>, const MCSymbolRefExpr *>
+      CachedConstantEntries;
+  DenseMap<std::pair<const MCSymbol *, unsigned>, const MCSymbolRefExpr *>
+      CachedSymbolEntries;
 
 public:
   // Initialize a new empty constant pool
diff --git a/llvm/lib/MC/ConstantPools.cpp b/llvm/lib/MC/ConstantPools.cpp
index f895cc6413d74f..824d2463f30fc5 100644
--- a/llvm/lib/MC/ConstantPools.cpp
+++ b/llvm/lib/MC/ConstantPools.cpp
@@ -43,14 +43,15 @@ const MCExpr *ConstantPool::addEntry(const MCExpr *Value, MCContext &Context,
 
   // Check if there is existing entry for the same constant. If so, reuse it.
   if (C) {
-    auto CItr = CachedConstantEntries.find(C->getValue());
+    auto CItr = CachedConstantEntries.find(std::make_pair(C->getValue(), Size));
     if (CItr != CachedConstantEntries.end())
       return CItr->second;
   }
 
   // Check if there is existing entry for the same symbol. If so, reuse it.
   if (S) {
-    auto SItr = CachedSymbolEntries.find(&(S->getSymbol()));
+    auto SItr =
+        CachedSymbolEntries.find(std::make_pair(&(S->getSymbol()), Size));
     if (SItr != CachedSymbolEntries.end())
       return SItr->second;
   }
@@ -60,9 +61,9 @@ const MCExpr *ConstantPool::addEntry(const MCExpr *Value, MCContext &Context,
   Entries.push_back(ConstantPoolEntry(CPEntryLabel, Value, Size, Loc));
   const auto SymRef = MCSymbolRefExpr::create(CPEntryLabel, Context);
   if (C)
-    CachedConstantEntries[C->getValue()] = SymRef;
+    CachedConstantEntries[std::make_pair(C->getValue(), Size)] = SymRef;
   if (S)
-    CachedSymbolEntries[&(S->getSymbol())] = SymRef;
+    CachedSymbolEntries[std::make_pair(&(S->getSymbol()), Size)] = SymRef;
   return SymRef;
 }
 
diff --git a/llvm/test/MC/AArch64/constant-pool-sizes.s b/llvm/test/MC/AArch64/constant-pool-sizes.s
new file mode 100644
index 00000000000000..279402af025f34
--- /dev/null
+++ b/llvm/test/MC/AArch64/constant-pool-sizes.s
@@ -0,0 +1,25 @@
+// RUN: llvm-mc -triple aarch64-none-linux-gnu %s | FileCheck %s
+
+  ldr w0, =symbol
+  ldr x1, =symbol
+
+  ldr w2, =1234567890
+  ldr x3, =1234567890
+
+// CHECK:             ldr     w0, .Ltmp0
+// CHECK:             ldr     x1, .Ltmp1
+// CHECK:             ldr     w2, .Ltmp2
+// CHECK:             ldr     x3, .Ltmp3
+
+// CHECK:             .p2align        2, 0x0
+// CHECK-NEXT:.Ltmp0:
+// CHECK-NEXT:        .word   symbol
+// CHECK:             .p2align        3, 0x0
+// CHECK-NEXT:.Ltmp1:
+// CHECK-NEXT:        .xword  symbol
+// CHECK:             .p2align        2, 0x0
+// CHECK-NEXT:.Ltmp2:
+// CHECK-NEXT:        .word   1234567890
+// CHECK:             .p2align        3, 0x0
+// CHECK-NEXT:.Ltmp3:
+// CHECK-NEXT:        .xword  1234567890



More information about the llvm-commits mailing list