[clang] [llvm] Implement operand bundles for floating-point operations (PR #109798)

Wed Oct 23 10:27:13 PDT 2024

kpneal wrote:

> > This sounds like a rejection of safe-by-default.
> 
> What do you mean by "safe-by-default"? Implementation must be correct, and produce the code that executes in accordance to standards and documentation. Any deviation from them is an error and should be fixed. Does it mean that such implementation would be "safe"?

When I say "safe-by-default" I'm referring to the case where the optimizers don't know about the intrinsics so the optimizers get very conservative and don't optimize. 

It is assumed that all optimizations that actually are done will be in accordance with standards and documentation. 

> > Keep in mind that "unsuitable for practical needs" doesn't mean that everyone agrees with you about what level of performance is unsuitable for one's practical needs. And a compiler that generates good performing code with random FP exceptions inserted is, in my experience, unusable if one runs with traps on.
> 
> Users have different needs and can tune compilation using appropriate options. Those who need strict exception behavior chose suitable option set and should get code that behave as expected but probably has lower performance. And there is a large number of users who need maximum performance but can sacrifice exception strictness. They should use another set of options. Does using different compilation options solves this problem?

Aren't you describing the current state of clang/llvm? Users have different compiler options to choose between FP trap safe and trap-unsafe. Compiled code that is FP trap safe has lower performance. 

The sticky point is the rounding mode. Running with a different rounding mode ideally wouldn't affect performance much at all, but it currently does. But fixing that requires fixing all the optimization passes. 

> > Trading performance for exception safety is a valid choice for some. Trading away all of the performance lost because few optimization passes know how to deal with the constrained intrinsics is still a valid choice for some.
> 
> There is no hope to produce code as performant as in default mode, but with strict exception semantics. So the only way to make a compiler suitable for all users is to provide means to tune it.

Is there a disagreement here? I don't see it. It's true that after we finish teaching all the passes about strictfp (constrained or bundles) we still won't have code that is as optimized if strict exception semantics are selected. This is inevitable since, for example, we can't hoist an FP instruction out of a loop unless we can prove it never traps when traps matter.

> > WRT eliminating the constrained intrinsics completely, I thought that operand bundles could only be attached to function calls and not regular instructions? If I'm wrong, we still have a problem because there are so many uses of the regular FP instructions that we can't be safe-by-default and still use those instructions. We'd need to keep some kind of the constrained intrinsics (or new intrinsics) that give us replacements for the regular FP instructions.
> 
> There is no intention to remove constrained intrinsics (like `experimental.constrained.fadd`), because operand bundles could only be attached to function calls. The constrained intrinsics however need to change their shape in future to use operand bundles as well.

Agreed, we'll need to keep some kind of intrinsics to replace the FP instructions. Since we're going to be replacing the metadata with bundles it might make sense to switch the names of the intrinsics while we're at it. That would also allow for parallel implementations during a switchover.

Changing to use bundles will be more flexible and can include things like the denormal mode. More flexibility sounds like a good change. It's a shame we'll lose "safe-by-default" for libm-style intrinsics, but I don't have an alternative.

> > The problem I've run into with the IR Verifier is that the strictfp mode of the IR Builder needs to be turned on in all cases needed, but currently doesn't because the IR Builder's constructor doesn't know to construct itself in strictfp mode. Now, many instantiations of the IR Builder occur with the constructor able to chase down the function definition and find out if strictfp mode is required. Many, hundreds, don't have that luxury and thus each has to be touched and corrected.
> 
> If IRBuilder does not set strictfp when it should, it usually ends up with assertion violation or verification errors. I guess in this case IR is created by some tool, probably MLIR-based, not by clang? Maybe this is a use case that we do not support enough well now and it would be helpful to investigate it.

No, there are hundreds of instantiations of the IRBuilder strewn all throughout llvm that don't properly set the strictfp mode when they should. Optimization passes that create new instructions (perhaps to replace one or more old instructions) use IRBuilder instances to create those instructions.

All IRBuilder instances need to be corrected before we can push any IR Verifier code that ensures correct use of the strictfp function and call attribute. Plus we still have tests that fail with the not-pushed-yet IR Verifier checks for strictfp, including one clang test. So people think that Verifier code is in the tree when it isn't yet.

I do have a ticket open to change the IRBuilder to check the function definition and set the strictfp mode automatically. But I pooched the branch on my end so I'll need to open a new ticket, hopefully this year. That still leaves hundreds of cases that need to be manually corrected.

https://github.com/llvm/llvm-project/pull/109798