[PATCH] D114964: [DAG] Create fptoui.sat from clamped fptoui
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 2 13:26:32 PST 2021
dmgreen added a comment.
In D114964#3167836 <https://reviews.llvm.org/D114964#3167836>, @spatel wrote:
> This seems fine as an extension of the previous patch.
> I haven't been following the progress in this area closely though. What prevents folding these patterns to the saturating intrinsics in IR?
> https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic
An fptoui.sat is more defined than an fptoui + umin. For the fptoui, any out-of-range value produces poison; for the fptoui.sat, out-of-range values are defined to saturate. So the transform isn't reversible, and on some architectures it produces worse code.
More concretely, an `iN fptoui.sat(X)` can be expanded into `fptoui(fmax(fmin(X, (float)(2^N)-1), 0))` (or something else with float compares and int selects if fmin/fmax are not available). Plus it needs to handle NaN for fptosi, making sure it becomes 0.
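As a rough illustration of that expansion for N = 8, here is a minimal C++ sketch. The function names `fptoui_sat_u8`, `fptosi_sat_i8` and `check_examples` are made up for this example and are not LLVM APIs, and the explicit NaN check is just one assumed way of getting the NaN-to-0 behaviour; a real lowering may do it differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper, not an LLVM API: models `i8 fptoui.sat(float)`.
std::uint8_t fptoui_sat_u8(float x) {
    if (std::isnan(x)) return 0;  // NaN is defined to give 0
    // fptoui(fmax(fmin(X, 2^N - 1), 0)) with N = 8
    float clamped = std::fmax(std::fmin(x, 255.0f), 0.0f);
    // clamped is in [0, 255], so the conversion is well-defined
    return static_cast<std::uint8_t>(clamped);
}

// Hypothetical signed counterpart: models `i8 fptosi.sat(float)`.
std::int8_t fptosi_sat_i8(float x) {
    if (std::isnan(x)) return 0;  // NaN must become 0 here as well
    float clamped = std::fmax(std::fmin(x, 127.0f), -128.0f);
    // clamped is in [-128, 127], so the conversion is well-defined
    return static_cast<std::int8_t>(clamped);
}

// Small usage check: out-of-range inputs saturate instead of being undefined.
void check_examples() {
    assert(fptoui_sat_u8(300.0f) == 255);
    assert(fptoui_sat_u8(-5.0f) == 0);
    assert(fptoui_sat_u8(42.9f) == 42);
    assert(fptosi_sat_i8(-1000.0f) == -128);
    assert(fptosi_sat_i8(std::nanf("")) == 0);
}
```

Every input has a defined result here, which is the extra guarantee a plain fptoui followed by umin does not give.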
I would actually really like it to be done in IR. We would need to vectorize these if we can, and they are much simpler to vectorize if we already have the scalar instructions. But it needs to be a costed decision, not an inst-combine canonicalization decision. Any ideas of a good place to make that happen?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D114964/new/
https://reviews.llvm.org/D114964