[llvm] [DAG] Add freeze(assertext(x)) -> assertext(x) folds (PR #94491)
Björn Pettersson via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 5 08:48:56 PDT 2024
https://github.com/bjope created https://github.com/llvm/llvm-project/pull/94491
This is a new attempt to allow pushing freeze through AssertSext/AssertZext in some situations.
An earlier attempt tried to solve this by handling AssertSext/AssertZext in canCreateUndefOrPoison. Simply saying that result of the assert nodes only are poison/undef if the input is poison/undef. That patch was reverted in
https://github.com/llvm/llvm-project/commit/8505c3b15bfc535ff6624e71add4082680745187
What happened was that we could end up doing folds such as
freeze(assertzext(x)) -> assertzext(freeze(x))
which isn't sound for operations that may "create" poison, unless dropping "poision generating flags" from the operation that is pulled out from the freeze. That is needed to guarantee that the result isn't suddenly more poisonnous (the contraints on the result given by the flags might not hold if the input is poison, so they need to be dropped).
The same thing applies to assertSext/assertZext nodes (at least as we say that they may create poison). For a poisoned input the constraints given by the assertSext/assertZext node isn't fulfilled. So we basically need to "drop" the knowledge given by the assert if pulling assett nodes through freeze, or else the result must be treated as still maybe being poison (making the fold invalid).
This patch is limiting the
freeze(assertzext(x)) -> assertzext(freeze(x))
fold to the situation when x is known not to be undef/poison. That means that we actually can drop the freeze, so the new DAG combine in this patch is actually doing
freeze(assertzext(x)) -> assertzext(x)
For the above to be sound the documentation of AssertSext/AssertZext are updated to highlight that it is considered as immediate UB if the input isn't satisfying the asserted condition (unless it is poison).
>From a082115930e2225f501ca542b504613565a4ec29 Mon Sep 17 00:00:00 2001
From: Bjorn Pettersson <bjorn.a.pettersson at ericsson.com>
Date: Wed, 5 Jun 2024 14:55:33 +0200
Subject: [PATCH] [DAG] Add freeze(assertext(x)) -> assertext(x) folds
This is a new attempt to allow pushing freeze through
AssertSext/AssertZext in some situations.
An earlier attempt tried to solve this by handling
AssertSext/AssertZext in canCreateUndefOrPoison. Simply saying that
result of the assert nodes only are poison/undef if the input is
poison/undef. That patch was reverted in
https://github.com/llvm/llvm-project/commit/8505c3b15bfc535ff6624e71add4082680745187
What happened was that we could end up doing folds such as
freeze(assertzext(x)) -> assertzext(freeze(x))
which isn't sound for operations that may "create" poison, unless
dropping "poision generating flags" from the operation that is pulled
out from the freeze. That is needed to guarantee that the result
isn't suddenly more poisonnous (the contraints on the result given by
the flags might not hold if the input is poison, so they need to be
dropped).
The same thing applies to assertSext/assertZext nodes (at least as
we say that they may create poison). For a poisoned input the
constraints given by the assertSext/assertZext node isn't fulfilled.
So we basically need to "drop" the knowledge given by the assert if
pulling assett nodes through freeze, or else the result must be
treated as still maybe being poison (making the fold invalid).
This patch is limiting the
freeze(assertzext(x)) -> assertzext(freeze(x))
fold to the situation when x is known not to be undef/poison. That
means that we actually can drop the freeze, so the new DAG combine
in this patch is actually doing
freeze(assertzext(x)) -> assertzext(x)
For the above to be sound the documentation of AssertSext/AssertZext
are updated to highlight that it is considered as immediate UB if
the input isn't satisfying the asserted condition (unless it is
poison).
---
llvm/include/llvm/CodeGen/ISDOpcodes.h | 5 ++++-
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 14 +++++++++++++
llvm/test/CodeGen/RISCV/float-convert.ll | 20 +++++++++----------
llvm/test/CodeGen/RISCV/rv64zbb.ll | 2 +-
4 files changed, 28 insertions(+), 13 deletions(-)
diff --git a/llvm/include/llvm/CodeGen/ISDOpcodes.h b/llvm/include/llvm/CodeGen/ISDOpcodes.h
index 0f87e062e2da6..af4f39d77894f 100644
--- a/llvm/include/llvm/CodeGen/ISDOpcodes.h
+++ b/llvm/include/llvm/CodeGen/ISDOpcodes.h
@@ -57,7 +57,10 @@ enum NodeType {
/// been extended, and the second is a value type node indicating the width
/// of the extension.
/// NOTE: In case of the source value (or any vector element value) is
- /// poisoned the assertion will not be true for that value.
+ /// poisoned the assertion will not be true for that value and the
+ /// corresponding result value will be poison. If a source value isn't
+ /// satisfying the condition being asserted (while not being poison), then
+ /// this is considered as immediate undefined behavior.
AssertSext,
AssertZext,
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 5148b7258257f..f0e65251b80f1 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -15592,6 +15592,20 @@ SDValue DAGCombiner::visitFREEZE(SDNode *N) {
if (DAG.isGuaranteedNotToBeUndefOrPoison(N0, /*PoisonOnly*/ false))
return N0;
+ // AssertSext/AssertZext nodes are a bit special and need custom
+ // hendling. They are used in contexts when the result is poison only if the
+ // input is poison, but they are still modelled as "canCreateUndefOrPoison" as
+ // the value range for the result is constrained by assumptions that wouldn't
+ // hold if the input is poison (and we can't drop poison-generating flags for
+ // AssertSext/AssertZext in the regular fold below).
+ if (N0.getOpcode() == ISD::AssertSext || N0.getOpcode() == ISD::AssertZext) {
+ // Fold freeze(assertext(x, ...)) -> assertext(x, ...),
+ // iff x is known not to be poison/undef.
+ if (DAG.isGuaranteedNotToBeUndefOrPoison(N0.getOperand(0),
+ /*PoisonOnly*/ false))
+ return N0;
+ }
+
// We currently avoid folding freeze over SRA/SRL, due to the problems seen
// with (freeze (assert ext)) blocking simplifications of SRA/SRL. See for
// example https://reviews.llvm.org/D136529#4120959.
diff --git a/llvm/test/CodeGen/RISCV/float-convert.ll b/llvm/test/CodeGen/RISCV/float-convert.ll
index 7eabd3f5f2273..7ce78a8d2529f 100644
--- a/llvm/test/CodeGen/RISCV/float-convert.ll
+++ b/llvm/test/CodeGen/RISCV/float-convert.ll
@@ -966,26 +966,24 @@ define i64 @fcvt_lu_s_sat(float %a) nounwind {
; RV64I-NEXT: sd ra, 24(sp) # 8-byte Folded Spill
; RV64I-NEXT: sd s0, 16(sp) # 8-byte Folded Spill
; RV64I-NEXT: sd s1, 8(sp) # 8-byte Folded Spill
-; RV64I-NEXT: sd s2, 0(sp) # 8-byte Folded Spill
; RV64I-NEXT: mv s0, a0
-; RV64I-NEXT: lui a1, 391168
-; RV64I-NEXT: addiw a1, a1, -1
-; RV64I-NEXT: call __gtsf2
-; RV64I-NEXT: sgtz a0, a0
-; RV64I-NEXT: neg s1, a0
-; RV64I-NEXT: mv a0, s0
; RV64I-NEXT: li a1, 0
; RV64I-NEXT: call __gesf2
; RV64I-NEXT: slti a0, a0, 0
-; RV64I-NEXT: addi s2, a0, -1
+; RV64I-NEXT: addi s1, a0, -1
; RV64I-NEXT: mv a0, s0
; RV64I-NEXT: call __fixunssfdi
-; RV64I-NEXT: and a0, s2, a0
-; RV64I-NEXT: or a0, s1, a0
+; RV64I-NEXT: and s1, s1, a0
+; RV64I-NEXT: lui a1, 391168
+; RV64I-NEXT: addiw a1, a1, -1
+; RV64I-NEXT: mv a0, s0
+; RV64I-NEXT: call __gtsf2
+; RV64I-NEXT: sgtz a0, a0
+; RV64I-NEXT: neg a0, a0
+; RV64I-NEXT: or a0, a0, s1
; RV64I-NEXT: ld ra, 24(sp) # 8-byte Folded Reload
; RV64I-NEXT: ld s0, 16(sp) # 8-byte Folded Reload
; RV64I-NEXT: ld s1, 8(sp) # 8-byte Folded Reload
-; RV64I-NEXT: ld s2, 0(sp) # 8-byte Folded Reload
; RV64I-NEXT: addi sp, sp, 32
; RV64I-NEXT: ret
start:
diff --git a/llvm/test/CodeGen/RISCV/rv64zbb.ll b/llvm/test/CodeGen/RISCV/rv64zbb.ll
index 4d5ef5db86057..dfbefbe27efc1 100644
--- a/llvm/test/CodeGen/RISCV/rv64zbb.ll
+++ b/llvm/test/CodeGen/RISCV/rv64zbb.ll
@@ -1336,7 +1336,7 @@ define i32 @abs_i32(i32 %x) {
define signext i32 @abs_i32_sext(i32 signext %x) {
; RV64I-LABEL: abs_i32_sext:
; RV64I: # %bb.0:
-; RV64I-NEXT: sraiw a1, a0, 31
+; RV64I-NEXT: srai a1, a0, 31
; RV64I-NEXT: xor a0, a0, a1
; RV64I-NEXT: subw a0, a0, a1
; RV64I-NEXT: ret
More information about the llvm-commits
mailing list