[llvm] [SeparateConstOffsetFromGEP] Avoid miscompiles related to trunc nuw/nsw (PR #154582)

Björn Pettersson via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 20 10:53:52 PDT 2025


https://github.com/bjope updated https://github.com/llvm/llvm-project/pull/154582

>From 1e15421537b281b23d9b6afcb6a4088cdda4ca4a Mon Sep 17 00:00:00 2001
From: Bjorn Pettersson <bjorn.a.pettersson at ericsson.com>
Date: Wed, 20 Aug 2025 19:07:05 +0200
Subject: [PATCH 1/4] [SeparateConstOffsetFromGEP] Add test case with trunc
 nuw/nsw showing miscompile

Pre commit a test case for issue #154116. When redistributing
trunc over add/sub/or we may need to drop poison generating flags
from the trunc.
---
 .../AMDGPU/rebuild-trunc.ll                   | 36 +++++++++++++++++++
 1 file changed, 36 insertions(+)
 create mode 100644 llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/rebuild-trunc.ll

diff --git a/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/rebuild-trunc.ll b/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/rebuild-trunc.ll
new file mode 100644
index 0000000000000..8970bfc488421
--- /dev/null
+++ b/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/rebuild-trunc.ll
@@ -0,0 +1,36 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -passes=separate-const-offset-from-gep -S | FileCheck %s
+
+; FIXME: (add (trunc nuw A), (trunc nuw B)) is more poisonous than
+;        (trunc nuw (add A, B))), so we should for example drop the
+;        nuw on the trunc when doing the rewrite here.
+define ptr @pr154116_nuw(ptr %p, i128 %i) {
+; CHECK-LABEL: define ptr @pr154116_nuw(
+; CHECK-SAME: ptr [[P:%.*]], i128 [[I:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:    [[TMP1:%.*]] = trunc nuw i128 [[I]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr i32, ptr [[P]], i64 [[TMP1]]
+; CHECK-NEXT:    [[ARRAYIDX2:%.*]] = getelementptr i8, ptr [[TMP2]], i64 80
+; CHECK-NEXT:    ret ptr [[ARRAYIDX2]]
+;
+  %idx = add i128 %i, 20
+  %idx.conv = trunc nuw i128 %idx to i64
+  %arrayidx = getelementptr i32, ptr %p, i64 %idx.conv
+  ret ptr %arrayidx
+}
+
+; FIXME: (add (trunc nsw A), (trunc nsw B)) is more poisonous than
+;        (trunc nsw (add A, B))), so we should for example drop the
+;        nsw on the trunc when doing the rewrite here.
+define ptr @pr154116_nsw(ptr %p, i128 %i) {
+; CHECK-LABEL: define ptr @pr154116_nsw(
+; CHECK-SAME: ptr [[P:%.*]], i128 [[I:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:    [[TMP1:%.*]] = trunc nsw i128 [[I]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr i32, ptr [[P]], i64 [[TMP1]]
+; CHECK-NEXT:    [[ARRAYIDX2:%.*]] = getelementptr i8, ptr [[TMP2]], i64 4
+; CHECK-NEXT:    ret ptr [[ARRAYIDX2]]
+;
+  %idx = add i128 %i, 1
+  %idx.conv = trunc nsw i128 %idx to i64
+  %arrayidx = getelementptr i32, ptr %p, i64 %idx.conv
+  ret ptr %arrayidx
+}

>From e83cbc0b75562c78ef97e712286a01acd896dcb4 Mon Sep 17 00:00:00 2001
From: Bjorn Pettersson <bjorn.a.pettersson at ericsson.com>
Date: Wed, 20 Aug 2025 19:26:11 +0200
Subject: [PATCH 2/4] [SeparateConstOffsetFromGEP] Avoid miscompiles related to
 trunc nuw/nsw

Drop poison generating flags on trunc when distributing trunc over
add/sub/or. We need to do this since for example
(add (trunc nuw A), (trunc nuw B)) is more poisonous than
(trunc nuw (add A, B))).

In some situations it is pessimistic to drop the flags. Such as
if the add in the example above also has the nuw flag. For now we
keep it simple and always drop the flags.

Worth mentioning is that we drop the flags when cloning
instructions and rebuilding the chain. This is done after the
"allowsPreservingNUW" checks in ConstantOffsetExtractor::Extract.
So we still take the "trunc nuw" into consideration when determining
if nuw can be preserved in the gep (which should be ok since that
check also require that all the involved binary operations has nuw).

Fixes #154116
---
 llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp  | 7 +++++++
 .../SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll | 2 +-
 .../SeparateConstOffsetFromGEP/AMDGPU/rebuild-trunc.ll     | 4 ++--
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp b/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
index fc96589f83238..d4e0176f7e823 100644
--- a/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
+++ b/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
@@ -642,6 +642,13 @@ Value *ConstantOffsetExtractor::applyExts(Value *V) {
 
     Instruction *Ext = I->clone();
     Ext->setOperand(0, Current);
+    // In ConstantOffsetExtractor::find we do not analyze nuw/nsw for trunc, so
+    // we assume that it is ok to redistibute trunc over add/sub/or. But for
+    // example (add (trunc nuw A), (trunc nuw B)) is more poisonous than (trunc
+    // nuw (add A, B))). To make such redistributions legal we drop all the poison
+    // generating flags from cloned trunc instructions here.
+    if (isa<TruncInst>(Ext))
+      Ext->dropPoisonGeneratingFlags();
     Ext->insertBefore(*IP->getParent(), IP);
     Current = Ext;
   }
diff --git a/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll b/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll
index 2a5b678e91fd8..0d21a99153ea0 100644
--- a/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll
+++ b/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll
@@ -311,7 +311,7 @@ entry:
 define ptr @nuw_inbounds_implies_nuw_inbounds_trunc_nuw(ptr %p, i128 %i) {
 ; CHECK-LABEL: @nuw_inbounds_implies_nuw_inbounds_trunc_nuw(
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[TMP0:%.*]] = trunc nuw i128 [[I:%.*]] to i64
+; CHECK-NEXT:    [[TMP0:%.*]] = trunc i128 [[I:%.*]] to i64
 ; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds nuw i32, ptr [[P:%.*]], i64 [[TMP0]]
 ; CHECK-NEXT:    [[ARRAYIDX2:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP1]], i64 4
 ; CHECK-NEXT:    ret ptr [[ARRAYIDX2]]
diff --git a/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/rebuild-trunc.ll b/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/rebuild-trunc.ll
index 8970bfc488421..88dd84474a6cb 100644
--- a/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/rebuild-trunc.ll
+++ b/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/rebuild-trunc.ll
@@ -7,7 +7,7 @@
 define ptr @pr154116_nuw(ptr %p, i128 %i) {
 ; CHECK-LABEL: define ptr @pr154116_nuw(
 ; CHECK-SAME: ptr [[P:%.*]], i128 [[I:%.*]]) #[[ATTR0:[0-9]+]] {
-; CHECK-NEXT:    [[TMP1:%.*]] = trunc nuw i128 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = trunc i128 [[I]] to i64
 ; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr i32, ptr [[P]], i64 [[TMP1]]
 ; CHECK-NEXT:    [[ARRAYIDX2:%.*]] = getelementptr i8, ptr [[TMP2]], i64 80
 ; CHECK-NEXT:    ret ptr [[ARRAYIDX2]]
@@ -24,7 +24,7 @@ define ptr @pr154116_nuw(ptr %p, i128 %i) {
 define ptr @pr154116_nsw(ptr %p, i128 %i) {
 ; CHECK-LABEL: define ptr @pr154116_nsw(
 ; CHECK-SAME: ptr [[P:%.*]], i128 [[I:%.*]]) #[[ATTR0]] {
-; CHECK-NEXT:    [[TMP1:%.*]] = trunc nsw i128 [[I]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = trunc i128 [[I]] to i64
 ; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr i32, ptr [[P]], i64 [[TMP1]]
 ; CHECK-NEXT:    [[ARRAYIDX2:%.*]] = getelementptr i8, ptr [[TMP2]], i64 4
 ; CHECK-NEXT:    ret ptr [[ARRAYIDX2]]

>From fd626e39189fc0fdde1a51f7bea79639e5aa9fe7 Mon Sep 17 00:00:00 2001
From: Bjorn Pettersson <bjorn.a.pettersson at ericsson.com>
Date: Wed, 20 Aug 2025 19:38:42 +0200
Subject: [PATCH 3/4] Fixup. Remove FIXME from test case.

---
 .../SeparateConstOffsetFromGEP/AMDGPU/rebuild-trunc.ll    | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/rebuild-trunc.ll b/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/rebuild-trunc.ll
index 88dd84474a6cb..a93257f7e564a 100644
--- a/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/rebuild-trunc.ll
+++ b/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/rebuild-trunc.ll
@@ -1,9 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
 ; RUN: opt < %s -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -passes=separate-const-offset-from-gep -S | FileCheck %s
 
-; FIXME: (add (trunc nuw A), (trunc nuw B)) is more poisonous than
-;        (trunc nuw (add A, B))), so we should for example drop the
-;        nuw on the trunc when doing the rewrite here.
+; Verify that we drop "nuw" from trunc.
 define ptr @pr154116_nuw(ptr %p, i128 %i) {
 ; CHECK-LABEL: define ptr @pr154116_nuw(
 ; CHECK-SAME: ptr [[P:%.*]], i128 [[I:%.*]]) #[[ATTR0:[0-9]+]] {
@@ -18,9 +16,7 @@ define ptr @pr154116_nuw(ptr %p, i128 %i) {
   ret ptr %arrayidx
 }
 
-; FIXME: (add (trunc nsw A), (trunc nsw B)) is more poisonous than
-;        (trunc nsw (add A, B))), so we should for example drop the
-;        nsw on the trunc when doing the rewrite here.
+; Verify that we drop "nsw" from trunc.
 define ptr @pr154116_nsw(ptr %p, i128 %i) {
 ; CHECK-LABEL: define ptr @pr154116_nsw(
 ; CHECK-SAME: ptr [[P:%.*]], i128 [[I:%.*]]) #[[ATTR0]] {

>From 7e045601d39f03a110231fa831fb06976b4950d3 Mon Sep 17 00:00:00 2001
From: Bjorn Pettersson <bjorn.a.pettersson at ericsson.com>
Date: Wed, 20 Aug 2025 19:53:23 +0200
Subject: [PATCH 4/4] Fixup formatting in code comment

---
 llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp b/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
index d4e0176f7e823..28885cad53454 100644
--- a/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
+++ b/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
@@ -645,8 +645,8 @@ Value *ConstantOffsetExtractor::applyExts(Value *V) {
     // In ConstantOffsetExtractor::find we do not analyze nuw/nsw for trunc, so
     // we assume that it is ok to redistibute trunc over add/sub/or. But for
     // example (add (trunc nuw A), (trunc nuw B)) is more poisonous than (trunc
-    // nuw (add A, B))). To make such redistributions legal we drop all the poison
-    // generating flags from cloned trunc instructions here.
+    // nuw (add A, B))). To make such redistributions legal we drop all the
+    // poison generating flags from cloned trunc instructions here.
     if (isa<TruncInst>(Ext))
       Ext->dropPoisonGeneratingFlags();
     Ext->insertBefore(*IP->getParent(), IP);



More information about the llvm-commits mailing list