[clang] [Clang][CodeGen] Do not set inbounds flag for struct GEP with null base pointers (PR #130734)

Yingwei Zheng via cfe-commits cfe-commits at lists.llvm.org
Tue Mar 11 01:37:17 PDT 2025


https://github.com/dtcxzyw created https://github.com/llvm/llvm-project/pull/130734

In the LLVM middle-end we want to fold `gep inbounds null, idx -> null`: https://alive2.llvm.org/ce/z/5ZkPx-
This pattern is common in real-world programs. Generally, it exists in some (actually) unreachable blocks, which is introduced by JumpThreading.

However, some old-style offsetof macros are still widely used in real-world C/C++ code (e.g., hwloc/slurm/luajit). To avoid breaking existing code and inconvenience to downstream users, this patch removes the inbounds flag from the struct gep if the base pointer is null.


>From 02065e86c63ab3fbdefd2ce6e963ffeec96e6a24 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng <dtcxzyw2333 at gmail.com>
Date: Tue, 11 Mar 2025 16:20:08 +0800
Subject: [PATCH] [Clang][CodeGen] Do not set inbounds flag for struct GEP with
 null base pointers

---
 clang/lib/CodeGen/CGBuilder.h                     | 15 ++++++++++-----
 ...nullptr-and-nonzero-offset-in-offsetof-idiom.c |  2 +-
 ...llptr-and-nonzero-offset-in-offsetof-idiom.cpp |  2 +-
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/clang/lib/CodeGen/CGBuilder.h b/clang/lib/CodeGen/CGBuilder.h
index b8036cf6e6a30..11e8818b33397 100644
--- a/clang/lib/CodeGen/CGBuilder.h
+++ b/clang/lib/CodeGen/CGBuilder.h
@@ -223,11 +223,16 @@ class CGBuilderTy : public CGBuilderBaseTy {
     const llvm::StructLayout *Layout = DL.getStructLayout(ElTy);
     auto Offset = CharUnits::fromQuantity(Layout->getElementOffset(Index));
 
-    return Address(CreateStructGEP(Addr.getElementType(), Addr.getBasePointer(),
-                                   Index, Name),
-                   ElTy->getElementType(Index),
-                   Addr.getAlignment().alignmentAtOffset(Offset),
-                   Addr.isKnownNonNull());
+    // Specially, we don't add inbounds flags if the base pointer is null.
+    // This is a workaround for old-style offsetof macros.
+    llvm::GEPNoWrapFlags NWFlags = llvm::GEPNoWrapFlags::noUnsignedWrap();
+    if (!isa<llvm::ConstantPointerNull>(Addr.getBasePointer()))
+      NWFlags |= llvm::GEPNoWrapFlags::inBounds();
+    return Address(
+        CreateConstGEP2_32(Addr.getElementType(), Addr.getBasePointer(), 0,
+                           Index, Name, NWFlags),
+        ElTy->getElementType(Index),
+        Addr.getAlignment().alignmentAtOffset(Offset), Addr.isKnownNonNull());
   }
 
   /// Given
diff --git a/clang/test/CodeGen/catch-nullptr-and-nonzero-offset-in-offsetof-idiom.c b/clang/test/CodeGen/catch-nullptr-and-nonzero-offset-in-offsetof-idiom.c
index 68c0ee3a3a885..a7cfd77766712 100644
--- a/clang/test/CodeGen/catch-nullptr-and-nonzero-offset-in-offsetof-idiom.c
+++ b/clang/test/CodeGen/catch-nullptr-and-nonzero-offset-in-offsetof-idiom.c
@@ -17,7 +17,7 @@ struct S {
 
 // CHECK-LABEL: @get_offset_of_y_naively(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    ret i64 ptrtoint (ptr getelementptr inbounds nuw ([[STRUCT_S:%.*]], ptr null, i32 0, i32 1) to i64)
+// CHECK-NEXT:    ret i64 ptrtoint (ptr getelementptr nuw ([[STRUCT_S:%.*]], ptr null, i32 0, i32 1) to i64)
 //
 uintptr_t get_offset_of_y_naively(void) {
   return ((uintptr_t)(&(((struct S *)0)->y)));
diff --git a/clang/test/CodeGenCXX/catch-nullptr-and-nonzero-offset-in-offsetof-idiom.cpp b/clang/test/CodeGenCXX/catch-nullptr-and-nonzero-offset-in-offsetof-idiom.cpp
index 34d4f4c9e34eb..f00a2c486574c 100644
--- a/clang/test/CodeGenCXX/catch-nullptr-and-nonzero-offset-in-offsetof-idiom.cpp
+++ b/clang/test/CodeGenCXX/catch-nullptr-and-nonzero-offset-in-offsetof-idiom.cpp
@@ -10,7 +10,7 @@ struct S {
 
 // CHECK-LABEL: @_Z23get_offset_of_y_naivelyv(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    ret i64 ptrtoint (ptr getelementptr inbounds nuw ([[STRUCT_S:%.*]], ptr null, i32 0, i32 1) to i64)
+// CHECK-NEXT:    ret i64 ptrtoint (ptr getelementptr nuw ([[STRUCT_S:%.*]], ptr null, i32 0, i32 1) to i64)
 //
 uintptr_t get_offset_of_y_naively() {
   return ((uintptr_t)(&(((S *)nullptr)->y)));



More information about the cfe-commits mailing list