[flang-commits] [flang] [flang] Use saturated intrinsic for floating point conversions (PR #130686)
Asher Mancinelli via flang-commits
flang-commits at lists.llvm.org
Tue Mar 11 12:25:27 PDT 2025
================
@@ -835,10 +835,20 @@ struct ConvertOpConversion : public fir::FIROpConversion<fir::ConvertOp> {
         return mlir::success();
       }
       if (mlir::isa<mlir::IntegerType>(toTy)) {
-        if (toTy.isUnsignedInteger())
-          rewriter.replaceOpWithNewOp<mlir::LLVM::FPToUIOp>(convert, toTy, op0);
-        else
-          rewriter.replaceOpWithNewOp<mlir::LLVM::FPToSIOp>(convert, toTy, op0);
+        // NOTE: We are checking the fir type here because toTy is an LLVM type
+        // which is signless, and we need to use the intrinsic that matches the
+        // sign of the output in fir.
+        if (toFirTy.isUnsignedInteger()) {
+          auto intrinsicName =
+              mlir::StringAttr::get(convert.getContext(), "llvm.fptoui.sat");
+          rewriter.replaceOpWithNewOp<mlir::LLVM::CallIntrinsicOp>(
+              convert, toTy, intrinsicName, op0);
+        } else {
+          auto intrinsicName =
+              mlir::StringAttr::get(convert.getContext(), "llvm.fptosi.sat");
+          rewriter.replaceOpWithNewOp<mlir::LLVM::CallIntrinsicOp>(
+              convert, toTy, intrinsicName, op0);
+        }
----------------
ashermancinelli wrote:
They produce more instructions on x86 when they cannot be const-folded away ([x86 godbolt link, more instructions](https://godbolt.org/z/z6KKf8Yao), [aarch64 godbolt link, both using `fcvtzs`](https://godbolt.org/z/7coacsjPd)), so someone converting reals to integers in a hot loop might see worse performance. However, I was unable to find a difference in the performance tests that I ran. I'll be watching performance numbers after this is merged in case something comes up.
> Would it be possible to use the saturation intrinsic only when necessary?
As long as we want the correct semantics for values only known at runtime, I don't think so. However, especially if performance issues come up, I think it would make sense to use the fptosi/fptoui instructions under some flag, maybe enabled by default above some optimization level. Do you think using the instructions instead of the saturated intrinsics under (for example) `-ffast-math` would be a good compromise if performance issues show up?
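For readers unfamiliar with the saturating conversions, here is a rough C++ sketch of the semantics the LLVM LangRef documents for `llvm.fptosi.sat` and `llvm.fptoui.sat`: NaN converts to 0, and out-of-range inputs clamp to the smallest or largest value of the result type. With the plain `fptosi`/`fptoui` instructions, those same inputs produce poison. The helper names `fptosi_sat_i32`, `fptoui_sat_u32`, and `demo` are hypothetical and are not part of flang or LLVM; the extra compares also hint at why x86 ends up with more instructions.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical model of llvm.fptosi.sat.i32.f64 (illustration only).
inline std::int32_t fptosi_sat_i32(double x) {
  using lim = std::numeric_limits<std::int32_t>;
  if (std::isnan(x))
    return 0; // NaN maps to zero
  if (x <= static_cast<double>(lim::min()))
    return lim::min(); // clamp below the representable range
  if (x >= static_cast<double>(lim::max()))
    return lim::max(); // clamp above the representable range
  return static_cast<std::int32_t>(x); // in range: truncate toward zero
}

// Hypothetical model of llvm.fptoui.sat.i32.f64 (illustration only).
inline std::uint32_t fptoui_sat_u32(double x) {
  using lim = std::numeric_limits<std::uint32_t>;
  if (std::isnan(x))
    return 0; // NaN maps to zero
  if (x <= 0.0)
    return 0; // negative inputs clamp to zero
  if (x >= static_cast<double>(lim::max()))
    return lim::max(); // clamp above the representable range
  return static_cast<std::uint32_t>(x); // in range: truncate toward zero
}

// Usage sketch: inputs that a plain cast leaves undefined get fixed results.
inline void demo() {
  assert(fptosi_sat_i32(1e10) == std::numeric_limits<std::int32_t>::max());
  assert(fptosi_sat_i32(-1e10) == std::numeric_limits<std::int32_t>::min());
  assert(fptosi_sat_i32(std::numeric_limits<double>::quiet_NaN()) == 0);
  assert(fptoui_sat_u32(-5.0) == 0u);
}
```

The signed and unsigned helpers differ only in their bounds, which mirrors why the lowering has to consult the signedness of the fir type rather than the signless LLVM type.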
https://github.com/llvm/llvm-project/pull/130686