[PATCH] Allow FMAs in safe math mode in some cases when one operand of the fmul is either exactly 0.0 or exactly 1.0.

Tue Jul 9 14:35:40 PDT 2013

On Tue, Jul 9, 2013 at 2:11 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> Stephen,
>
> Should it be 0, 1 or undef? I doubt that the undef case matters for scalars in practice, but it can have a big effect on vectors (after widening).

Is there undef at the DAG level, actually? I wasn't sure; I haven't
done much SelectionDAG hacking.

>
> +    // FIXME: handle more cases
>
> Out of curiosity, what else did you have in mind? [I think that we should either make it more specific, or remove it]

To be honest, I've only done cases that I cared about for my intended
application, and ones that I was fairly sure I understood well enough
to say that they are safe. If anyone wants to suggest more cases
before it goes in than that's fine.

I can change the comment to not say "FIXME:" and just say that the
list is non-exhaustive, though, if that's more appropriate.

>
> A few thoughts:
>
>  - For vectors, we can check if all elements are either 0 or 1 (or undef), right?
>
>  - We can look through EXTRACT_VECTOR_ELT, EXTRACT_SUBVECTOR, SCALAR_TO_VECTOR, VECTOR_SHUFFLE, SCALAR_TO_VECTOR, BUILD_VECTOR (and perhaps others)
>
>  - Likewise, when checking:
>  +    if (V->getValueType() == MVT::i1)
>    you probably want to check the scalar type (to catch v?i1)

I thought about vectors but it would require something either
recursive or very complicated to do correctly; the former is non-ideal
and the latter is bug-prone. The purpose of this function is similar
to sameNoopInput in llvm/CodeGen/Analysis.cpp, and it was difficult to
try to generalize that fully to vectors.

I ended up making it work on vectors on constrained cases (since the
vector case was tested for when the function was recursive), but it
tools a lot of thinking to convince myself I hadn't made a mistake.

Do you think the vector case will come up enough to make this
worthwhile? The problem is that the vector has to be transparently
defined in the same basic block as the fmul and fadd for it to work,
and most I think most vectors will be created using some kind of
control flow that will necessarily span more than one basic block.

>
> Also, after type legalization, i1 has probably been replaced by something larger (like i32). Nevertheless, depending on what getBooleanContents() returns, the SETCC may be restricted to returning zero or one.

Yes, but I see through truncates and extends to I think that covers a
lot of the legalization cases. I see your point of
getBooleanContents() though, I will add that case.

>
>  - We can probably also look though FP_ROUND, FP_EXTEND, (FCEIL, FTRUNC, FRINT, FNEARBYINT, FFLOOR)

OK, that makes sense. I'll add those too. Thanks for the feedback!

>
>  -Hal
>
> ----- Original Message -----
>> Hi,
>>
>> The attached patch allows FMAs to be formed in DAGCombiner even
>> without unsafe math mode when one operand is known to be either 0.0
>> or
>> 1.0 exactly (this is safe because no rounding occurs in the fmul
>> step,
>> and behavior on all limit case inputs is preserved, as far as I can
>> tell.)
>>
>> This allows formation of FMAs in cases like the following:
>>
>>     extern bool f;
>>
>>     double foo(bool a, double b, double c, double d, double e) {
>>      return (b * double(a) + c) + (d * (1.0 - double(a)) + e);
>>     }
>>
>> The intent of this patch is to address a subset of the issues which
>> required r181216 to be partially reverted (tracked as PR16164)
>> although there are still many other cases affected by that patch
>> which
>> are not resolved.
>>
>> Unfortunately, due to SelectionDAG limitations, this transformation
>> will only be done if the input can be determined to be zero or one
>> with information only in the same basic block, so depending on
>> optimization settings and phase ordering, will fail in cases like the
>> following:
>>
>>     extern bool f;
>>
>>     double foo(bool a, double b, double c, double d, double e) {
>>      if (a)
>>        return b * double(a) + c;
>>      else
>>        return d * (1.0 - double(a)) + e;
>>     }
>>
>> Please let me know if you have any feedback.
>>
>> Thanks,
>> Stephen
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory