[llvm] 2e3cabe - [SeparateConstOffsetFromGEP] Fix bug handling negative offsets
Tom Stellard via llvm-commits
llvm-commits at lists.llvm.org
Thu May 4 18:46:30 PDT 2023
Author: Tom Stellard
Date: 2023-05-04T18:45:49-07:00
New Revision: 2e3cabe172c6f2eaf1f097ffeff1664b3768223a
URL: https://github.com/llvm/llvm-project/commit/2e3cabe172c6f2eaf1f097ffeff1664b3768223a
DIFF: https://github.com/llvm/llvm-project/commit/2e3cabe172c6f2eaf1f097ffeff1664b3768223a.diff
LOG: [SeparateConstOffsetFromGEP] Fix bug handling negative offsets
Fix bug constants and sub instructions
When finding constants in a chain starting with the RHS operator of
sub instructions, we were negating the constant before zero extending
it, which is incorrect.
Unfortunately, I was unable to find a simple way to implement this
transformation correctly, so for now I just disabled this optimization
for constants that feed into the RHS of a sub.
Resolves #62379
Transformation from alive2.llvm.org:
define i16 @src(i8 %a, i8 %b, i8 %c) {
entry:
%0 = sub nuw nsw i8 %c, %a
%1 = sub nuw nsw i8 %b, %0
%2 = zext i8 %1 to i16
ret i16 %2
}
Before/Bad:
define i16 @tgt(i8 %a, i8 %b, i8 %c) {
entry:
%0 = zext i8 %a to i16
%1 = zext i8 %b to i16
%c_neg = sub i8 0, %c
%c_zext = zext i8 %c_neg to i16
%2 = sub i16 0, %0
%3 = sub i16 %1, %2
%4 = add i16 %3, %c_zext
ret i16 %4
}
Correct:
define i16 @tgt(i8 %a, i8 %b, i8 %c) {
entry:
%0 = zext i8 %a to i16
%1 = zext i8 %b to i16
%c_zext = zext i8 %c to i16
%c_neg = sub i16 0, %c_zext
%2 = sub i16 0, %0
%3 = sub i16 %1, %2
%4 = add i16 %3, %c_neg
ret i16 %4
}
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D149507
Added:
llvm/test/Transforms/SeparateConstOffsetFromGEP/pr62379-zeroext-negative.ll
Modified:
llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
Removed:
################################################################################
diff --git a/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp b/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
index 4f5ae7f0ced97..b303edbb7a163 100644
--- a/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
+++ b/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
@@ -521,6 +521,12 @@ bool ConstantOffsetExtractor::CanTraceInto(bool SignExtended,
!haveNoCommonBitsSet(LHS, RHS, DL, nullptr, BO, DT))
return false;
+ // FIXME: We don't currently support constants from the RHS of subs,
+ // when we are zero-extended, because we need a way to zero-extended
+ // them before they are negated.
+ if (ZeroExtended && !SignExtended && BO->getOpcode() == Instruction::Sub)
+ return false;
+
// In addition, tracing into BO requires that its surrounding s/zext (if
// any) is distributable to both operands.
//
diff --git a/llvm/test/Transforms/SeparateConstOffsetFromGEP/pr62379-zeroext-negative.ll b/llvm/test/Transforms/SeparateConstOffsetFromGEP/pr62379-zeroext-negative.ll
new file mode 100644
index 0000000000000..16c1b52070e3f
--- /dev/null
+++ b/llvm/test/Transforms/SeparateConstOffsetFromGEP/pr62379-zeroext-negative.ll
@@ -0,0 +1,28 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 2
+; RUN: opt -S '-passes=separate-const-offset-from-gep<lower-gep>' < %s | FileCheck %s
+
+; These tests check that this pass does not incorrectly handle negative offsets.
+
+ at c = internal constant [4 x i32] [i32 0, i32 1, i32 2, i32 3]
+
+; FIXME: We could optimize this case, but we don't currently. This test just
+; checks that we don't miscompile it.
+define i32 @sub_positive(i32 %a, i32 %b, ptr %ptr) {
+; CHECK-LABEL: define i32 @sub_positive
+; CHECK-SAME: (i32 [[A:%.*]], i32 [[B:%.*]], ptr [[PTR:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[TMP0:%.*]] = sub nuw nsw i32 15, [[A]]
+; CHECK-NEXT: [[TMP1:%.*]] = sub nuw nsw i32 [[B]], [[TMP0]]
+; CHECK-NEXT: [[TMP2:%.*]] = zext i32 [[TMP1]] to i64
+; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds [4 x i32], ptr [[PTR]], i64 0, i64 [[TMP2]]
+; CHECK-NEXT: [[TMP4:%.*]] = load i32, ptr [[TMP3]], align 4
+; CHECK-NEXT: ret i32 [[TMP4]]
+;
+entry:
+ %0 = sub nuw nsw i32 15, %a
+ %1 = sub nuw nsw i32 %b, %0
+ %2 = zext i32 %1 to i64
+ %3 = getelementptr inbounds [4 x i32], ptr %ptr, i64 0, i64 %2
+ %4 = load i32, ptr %3
+ ret i32 %4
+}
More information about the llvm-commits
mailing list