[PATCH] D37686: [DAG] Consolidating Instruction->SDNode Flags propagation in one class for better code management.
Jatin Bhateja via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 11 09:18:02 PDT 2017
jbhateja added a comment.
In https://reviews.llvm.org/D37686#866521, @spatel wrote:
> There is a functional difference from this improvement (several FP intrinsics should now correctly propagate flags), but I'm not sure if it's visible without fixing something else in DAGCombiner to recognize the flags.
>
> How does this work if an instruction maps to multiple nodes? For example, the FMA intrinsic can map to 2 nodes?
This propagation happens during SelectionDAGBuilder::visit. I scanned through the various instructions, and there is a 1:1 mapping between an instruction and the initial SDNode created for it, e.g.
llvm.fma (intrinsic instruction) -> SDNode(ISD::FMA)
llvm.minnum (intrinsic instruction) -> SDNode(ISD::FMINNUM/FMINNAN)
etc. These initial SDNodes are later lowered/expanded during the following DAG phases.
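To make that concrete, here is a minimal sketch of what copying the fast-math flags from the IR instruction onto the flags attached to that initial SDNode could look like. The helper name is hypothetical, and the getter/setter names follow the FPMathOperator/SDNodeFlags API as I know it, not the exact code in this patch:

  #include "llvm/CodeGen/SelectionDAGNodes.h" // SDNodeFlags
  #include "llvm/IR/Operator.h"               // FPMathOperator, Instruction

  using namespace llvm;

  // Hypothetical helper (not taken from this patch): gather the
  // instruction's fast-math flags so they can be attached to the
  // initial SDNode built for it in SelectionDAGBuilder::visit.
  static SDNodeFlags flagsFromInstruction(const Instruction &I) {
    SDNodeFlags Flags;
    if (const auto *FPOp = dyn_cast<FPMathOperator>(&I)) {
      Flags.setNoNaNs(FPOp->hasNoNaNs());
      Flags.setNoInfs(FPOp->hasNoInfs());
      Flags.setNoSignedZeros(FPOp->hasNoSignedZeros());
      Flags.setAllowReciprocal(FPOp->hasAllowReciprocal());
    }
    return Flags;
  }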
In https://reviews.llvm.org/D37686#866584, @spatel wrote:
> Here's a potential test case that would show a difference from having FMF on a sqrt intrinsic:
>
> define float @fast_recip_sqrt(float %x) {
> %y = call fast float @llvm.sqrt.f32(float %x)
> %z = fdiv fast float 1.0, %y
> ret float %z
> }
> declare float @llvm.sqrt.f32(float) nounwind readonly
>
>
> ...but as I said earlier, we need to fix the DAGCombiner code where this fold is implemented to recognize the flags on the individual nodes. Currently, it just checks the global state:
>
> if (Options.UnsafeFPMath) {
>
>
> On x86 currently, this will use the full-precision sqrtss+divss, but it should be using rsqrtss followed by mulss/addss to refine the estimate.
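For the DAGCombiner side, an illustrative change (not the actual fix) would be to consult the flags on the individual fdiv node in addition to the global option when deciding whether the 1/sqrt(x) fold may use the reciprocal estimate; the exact set of predicates required is for that fix to decide:

  // Illustration only, not the actual DAGCombiner change: check the
  // node-level flags as well as the global unsafe-FP option before
  // building the reciprocal-sqrt estimate.
  if (Options.UnsafeFPMath || N->getFlags().hasAllowReciprocal()) {
    // ... existing rsqrt-estimate path ...
  }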
Ok, we also have another usage of fast-math flags in review https://reviews.llvm.org/D37616. Can you please file a bugzilla for this to track it?
https://reviews.llvm.org/D37686