[clang] [llvm] Reimplement constrained 'trunc' using operand bundles (PR #118253)

Mon Dec 16 09:07:36 PST 2024

================
@@ -86,6 +86,43 @@ IRBuilderBase::createCallHelper(Function *Callee, ArrayRef<Value *> Ops,
   return CI;
 }
 
+CallInst *IRBuilderBase::CreateCall(FunctionType *FTy, Value *Callee,
+                                    ArrayRef<Value *> Args,
+                                    ArrayRef<OperandBundleDef> OpBundles,
+                                    const Twine &Name, MDNode *FPMathTag) {
+  ArrayRef<OperandBundleDef> ActualBundlesRef = OpBundles;
+  SmallVector<OperandBundleDef, 2> ActualBundles;
+
+  if (IsFPConstrained) {
+    if (const auto *Func = dyn_cast<Function>(Callee)) {
+      if (Intrinsic::ID ID = Func->getIntrinsicID()) {
+        if (IntrinsicInst::canAccessFPEnvironment(ID)) {
+          bool NeedRound = true, NeedExcept = true;
+          for (const auto &Item : OpBundles) {
+            if (NeedRound && Item.getTag() == "fpe.round")
+              NeedRound = false;
+            else if (NeedExcept && Item.getTag() == "fpe.except")
+              NeedExcept = false;
+            ActualBundles.push_back(Item);
----------------
spavloff wrote:

It depends on how we want to treat the rounding mode bundle. At least two cases are possible.

(1) The rounding mode bundle specifies the floating-point environment. That is, it provides information about the current value of the rounding mode in FPCR. If the optimizer can deduce this value, it may set the appropriate value in all affected instructions. For example, in the following code:
```
call void @llvm.set.rounding(i32 1)
%v = call float @llvm.trunc.f32(float %x)
```
the call to `trunc` can be replaced with:
```
%v = call float @llvm.trunc.f32(float %x) [ "fpe.control"(metadata !"rte") ]
```
The rounding mode in this bundle does not change the meaning of `trunc`, but could be useful in some cases. The two calls:
```
%v = call float @llvm.trunc.f32(float %x) [ "fpe.control"(metadata !"rte") ]
%v = call float @llvm.trunc.f32(float %x) [ "fpe.control"(metadata !"rtz") ]
```
represent the same operation, but on a target where `trunc` is implemented as `round using current mode`, the latter instruction is implemented as one operation, while the former generally requires three operations (`set fpcr`, `nearbyint`, `set fpcr`). This is a hypothetical example, however.
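
To make the cost difference concrete, here is a minimal standalone sketch of that hypothetical lowering. It does not use LLVM APIs; `Rounding` and `lowerTrunc` are names invented for illustration, and the operation strings are the ones from the paragraph above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical rounding modes, named after the bundle strings used above.
enum class Rounding { RTE, RTZ };

// Hypothetical lowering of `trunc` on a target whose only rounding instruction
// is "round using current mode". `Env` is the rounding mode that the bundle
// reports as currently set in FPCR.
std::vector<std::string> lowerTrunc(Rounding Env) {
  if (Env == Rounding::RTZ)
    return {"nearbyint"}; // the environment already rounds toward zero
  // Any other known mode: switch to RTZ, round, then restore the old mode.
  return {"set fpcr", "nearbyint", "set fpcr"};
}
```

With the `"rtz"` bundle the sketch yields one operation; with `"rte"` it yields three.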

It seems the meaning of the current rounding metadata argument in the constrained intrinsics agrees with this model; see the discussion in https://discourse.llvm.org/t/static-rounding-mode-in-ir/80621.

In this scenario it does not make much sense to exclude an unused rounding mode from the allowed bundles. The optimizer can set the bundles in a simple way, without checking whether the instruction uses the rounding mode. We use a similar method in the clang AST, where all relevant nodes have complete `FPOptions`.
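
A rough standalone sketch of what "setting the bundles in a simple way" could mean under model (1). The `Inst` struct and `annotateKnownRounding` are hypothetical and only model the idea; they are not part of the patch or of LLVM.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified instruction. An empty string means "not present".
struct Inst {
  std::string Name;        // e.g. "set_rounding", "trunc", "fadd"
  std::string SetsMode;    // mode installed by a set_rounding call, if any
  std::string RoundBundle; // rounding bundle attached to an FP operation
};

// Model (1): the bundle describes the environment, so every FP operation that
// follows a set_rounding with a known mode receives the same bundle, whether
// or not the operation actually depends on the rounding mode.
void annotateKnownRounding(std::vector<Inst> &Block) {
  std::string Known; // empty until a set_rounding with a known mode is seen
  for (Inst &I : Block) {
    if (!I.SetsMode.empty())
      Known = I.SetsMode;
    else if (!Known.empty() && I.RoundBundle.empty())
      I.RoundBundle = Known;
  }
}
```

Note that `trunc` is annotated exactly like `fadd` here; no per-intrinsic check is needed.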

(2) The rounding mode bundle specifies the rounding mode used for evaluating the instruction. Instructions like `trunc` do not depend on the specified rounding mode, so it does not make sense to use rounding bundles for them.

This viewpoint seems more natural, since rounding is considered a parameter of an operation, similar to arguments. It can also be naturally extended to static FP control modes. Rounding as a parameter produces exactly the same effect whether it is read from FPCR or specified in the instruction. Other FP options, such as denormal behavior, can be handled similarly.
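
Under model (2), a check on allowed bundles might look like the standalone sketch below. The `Op` enum and both functions are hypothetical. The tag names are taken from the diff above (`"fpe.round"`, `"fpe.except"`), and treating the exception bundle as always allowed is an assumption made only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical set of operations, for illustration only.
enum class Op { Trunc, Floor, Ceil, NearbyInt, Rint, FAdd, Sqrt };

// trunc, floor and ceil round in a fixed direction; the others round
// according to the rounding mode.
bool dependsOnRounding(Op O) {
  switch (O) {
  case Op::Trunc:
  case Op::Floor:
  case Op::Ceil:
    return false;
  case Op::NearbyInt:
  case Op::Rint:
  case Op::FAdd:
  case Op::Sqrt:
    return true;
  }
  return true; // conservative default for values outside the enum
}

// Model (2): rounding is a parameter of the operation, so a rounding bundle
// is accepted only on operations that actually use it.
bool acceptsBundle(Op O, const std::string &Tag) {
  if (Tag == "fpe.round")
    return dependsOnRounding(O);
  return Tag == "fpe.except"; // assumption: always allowed in this sketch
}
```

Here `acceptsBundle(Op::Trunc, "fpe.round")` is false, which is the restriction that model (1) would not impose.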

Neither method has a clear-cut advantage, and we need to discuss which approach to take.

https://github.com/llvm/llvm-project/pull/118253