[llvm] [SDAG] Fix deferring constrained function calls (PR #153029)

Serge Pavlov via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 11 07:56:50 PDT 2025


https://github.com/spavloff created https://github.com/llvm/llvm-project/pull/153029

Selection DAG has a more sophisticated execution order representation than the simple sequence used in IR, so building the DAG can take into account specific properties of the nodes to better express possible parallelism. The existing implementation does this for constrained function calls, some of them are considered as independent, which can potentially improve the generated code. However this mechanism incorrectly implies that the calls with exception behavior 'ebIgnore' cannot raise floating-point exception. The purpose of this change is to fix the implementation.

In the current implementation, constrained function calls don't immediately update the DAG root. Instead, the DAG builder collects their output chains and flushes them when the root is required. Constrained function calls cannot be moved across calls of external functions and intrinsics that access floating-point environment, they work as barriers. Between the barriers, constrained function calls can be reordered, they may be considered independent from viewpoint of raising exceptions. For strictfp functions this is possible only if floating-point trapping is disabled.

This change introduces a new restriction - the calls with default exception handling cannot not be moved between strictfp function calls. Otherwise the exceptions raised by such call can disturb the expected exception sequence. It means that constrained function calls with strict exception behavior act as barriers for the calls with non-strict behavior and vice versa. Effectively it means that the entire sequence of constrained calls in IR is split into "strict" and "non-strict" regions, in which restrictions on the order of constrained calls are relaxed, but move from one region to another is not allowed. It agrees with the representation of strictfp code in high-level languages. For example, C/C++ strictfp code correspond to blocks where pragma `STDC FENV_ACCESS ON` is in effect, this restriction should help preserving the intended semantics.

When floating-point exception trapping is enabled, constrained intrinsics with 'ebStrinc' cannot be reordered, their sequence must be identical to the original source order. The current implementation does not distinguish between strictfp modes with trapping and without it. This change make assumption that the trapping is disabled. It is not correct in the general case, but is compatible with the existing implementation, which makes this change NFC.

>From bf659e5ed3c66871e0aa0eca2fcdf422c3c184e5 Mon Sep 17 00:00:00 2001
From: Serge Pavlov <sepavloff at gmail.com>
Date: Sun, 10 Aug 2025 11:26:37 +0700
Subject: [PATCH] [SDAG] Fix deferring constrained function calls

Selection DAG has a more sophisticated execution order representation
than the simple sequence used in IR, so building the DAG can take into
account specific properties of the nodes to better express possible
parallelism. The existing implementation does this for constrained
function calls, some of them are considered as independent, which can
potentially improve the generated code. However this mechanism
incorrectly implies that the calls with exception behavior 'ebIgnore'
cannot raise floating-point exception. The purpose of this change is to
fix the implementation.

In the current implementation, constrained function calls don't
immediately update the DAG root. Instead, the DAG builder collects their
output chains and flushes them when the root is required. Constrained
function calls cannot be moved across calls of external functions and
intrinsics that access floating-point environment, they work as
barriers. Between the barriers, constrained function calls can be
reordered, they may be considered independent from viewpoint of raising
exceptions. For strictfp functions this is possible only if
floating-point trapping is disabled.

This change introduces a new restriction - the calls with default
exception handling cannot not be moved between strictfp function calls.
Otherwise the exceptions raised by such call can disturb the expected
exception sequence. It means that constrained function calls with strict
exception behavior act as barriers for the calls with non-strict
behavior and vice versa. Effectively it means that the entire sequence
of constrained calls in IR is split into "strict" and "non-strict"
regions, in which restrictions on the order of constrained calls are
relaxed, but move from one region to another is not allowed. It agrees
with the representation of strictfp code in high-level languages. For
example, C/C++ strictfp code correspond to blocks where pragma
`STDC FENV_ACCESS ON` is in effect, this restriction should help
preserving the intended semantics.

When floating-point exception trapping is enabled, constrained
intrinsics with 'ebStrinc' cannot be reordered, their sequence must be
identical to the original source order. The current implementation does
not distinguish between strictfp modes with trapping and without it.
This change make assumption that the trapping is disabled. It is not
correct in the general case, but is compatible with the existing
implementation, which makes this change NFC.
---
 .../SelectionDAG/SelectionDAGBuilder.cpp      | 78 ++++++++++++-------
 .../SelectionDAG/SelectionDAGBuilder.h        |  5 ++
 2 files changed, 55 insertions(+), 28 deletions(-)

diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 7aa1fadd10dfc..c2c5d3caad30a 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -8280,6 +8280,54 @@ void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I,
   }
 }
 
+void SelectionDAGBuilder::pushFPOpOutChain(SDValue Result,
+                                           fp::ExceptionBehavior EB) {
+  assert(Result.getNode()->getNumValues() == 2);
+  SDValue OutChain = Result.getValue(1);
+  assert(OutChain.getValueType() == MVT::Other);
+
+  // Instead of updating the root immediately, push the produced chain to the
+  // appropriate list, deferring the update until the root is requested. In this
+  // case, the nodes from the lists are chained using TokenFactor, indicating
+  // that the operations are independent.
+  //
+  // In particular, the root is updated before any call that might access the
+  // floating-point environment, except for constrained intrinsics.
+  switch (EB) {
+  case fp::ExceptionBehavior::ebMayTrap:
+  case fp::ExceptionBehavior::ebIgnore:
+    // Floating-point exceptions produced by such operations are not intended
+    // to be observed, so the sequence of these operations does not need to be
+    // preserved.
+    //
+    // They however must not be mixed with the instructions that have strict
+    // exception behavior. Placing an operation with 'ebIgnore' behavior between
+    // 'ebStrict' operations could distort the observed exception behavior.
+    if (!PendingConstrainedFPStrict.empty()) {
+      assert(PendingConstrainedFP.empty());
+      updateRoot(PendingConstrainedFPStrict);
+    }
+    PendingConstrainedFP.push_back(OutChain);
+    break;
+  case fp::ExceptionBehavior::ebStrict:
+    // Floating-point exception produced by these operations may be observed, so
+    // they must be correctly chained. If trapping on FP exceptions is
+    // disabled, the exceptions can be observed only by functions that read
+    // exception flags, like 'llvm.get_fpenv' or 'fetestexcept'. It means that
+    // the order of operations is not significant between barriers.
+    //
+    // If trapping is enabled, each operation becomes an implicit observation
+    // point, so the operations must be sequenced according their original
+    // source order.
+    if (!PendingConstrainedFP.empty()) {
+      assert(PendingConstrainedFPStrict.empty());
+      updateRoot(PendingConstrainedFP);
+    }
+    PendingConstrainedFPStrict.push_back(OutChain);
+    // TODO: Add support for trapping-enabled scenarios.
+  }
+}
+
 void SelectionDAGBuilder::visitConstrainedFPIntrinsic(
     const ConstrainedFPIntrinsic &FPI) {
   SDLoc sdl = getCurSDLoc();
@@ -8293,32 +8341,6 @@ void SelectionDAGBuilder::visitConstrainedFPIntrinsic(
   for (unsigned I = 0, E = FPI.getNonMetadataArgCount(); I != E; ++I)
     Opers.push_back(getValue(FPI.getArgOperand(I)));
 
-  auto pushOutChain = [this](SDValue Result, fp::ExceptionBehavior EB) {
-    assert(Result.getNode()->getNumValues() == 2);
-
-    // Push node to the appropriate list so that future instructions can be
-    // chained up correctly.
-    SDValue OutChain = Result.getValue(1);
-    switch (EB) {
-    case fp::ExceptionBehavior::ebIgnore:
-      // The only reason why ebIgnore nodes still need to be chained is that
-      // they might depend on the current rounding mode, and therefore must
-      // not be moved across instruction that may change that mode.
-      [[fallthrough]];
-    case fp::ExceptionBehavior::ebMayTrap:
-      // These must not be moved across calls or instructions that may change
-      // floating-point exception masks.
-      PendingConstrainedFP.push_back(OutChain);
-      break;
-    case fp::ExceptionBehavior::ebStrict:
-      // These must not be moved across calls or instructions that may change
-      // floating-point exception masks or read floating-point exception flags.
-      // In addition, they cannot be optimized out even if unused.
-      PendingConstrainedFPStrict.push_back(OutChain);
-      break;
-    }
-  };
-
   const TargetLowering &TLI = DAG.getTargetLoweringInfo();
   EVT VT = TLI.getValueType(DAG.getDataLayout(), FPI.getType());
   SDVTList VTs = DAG.getVTList(VT, MVT::Other);
@@ -8346,7 +8368,7 @@ void SelectionDAGBuilder::visitConstrainedFPIntrinsic(
         !TLI.isFMAFasterThanFMulAndFAdd(DAG.getMachineFunction(), VT)) {
       Opers.pop_back();
       SDValue Mul = DAG.getNode(ISD::STRICT_FMUL, sdl, VTs, Opers, Flags);
-      pushOutChain(Mul, EB);
+      pushFPOpOutChain(Mul, EB);
       Opcode = ISD::STRICT_FADD;
       Opers.clear();
       Opers.push_back(Mul.getValue(1));
@@ -8377,7 +8399,7 @@ void SelectionDAGBuilder::visitConstrainedFPIntrinsic(
   }
 
   SDValue Result = DAG.getNode(Opcode, sdl, VTs, Opers, Flags);
-  pushOutChain(Result, EB);
+  pushFPOpOutChain(Result, EB);
 
   SDValue FPResult = Result.getValue(0);
   setValue(&FPI, FPResult);
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
index e0835e6310357..e1d7804ce12cd 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
@@ -195,6 +195,11 @@ class SelectionDAGBuilder {
   /// Update root to include all chains from the Pending list.
   SDValue updateRoot(SmallVectorImpl<SDValue> &Pending);
 
+  /// Given a node representing a floating-point operation and its specified
+  /// exception behavior, this either updates the root or stores the node in
+  /// a list to be added to chains latter.
+  void pushFPOpOutChain(SDValue Result, fp::ExceptionBehavior EB);
+
   /// A unique monotonically increasing number used to order the SDNodes we
   /// create.
   unsigned SDNodeOrder;



More information about the llvm-commits mailing list