[PATCH] Allow FMAs in safe math mode in some cases when one operand of the fmul is either exactly 0.0 or exactly 1.0.

Hal Finkel hfinkel at anl.gov
Tue Jul 9 14:11:01 PDT 2013


Stephen,

Should it be 0, 1 or undef? I doubt that the undef case matters for scalars in practice, but it can have a big effect on vectors (after widening).

+    // FIXME: handle more cases

Out of curiosity, what else did you have in mind? [I think that we should either make the FIXME more specific, or remove it.]

A few thoughts:

 - For vectors, we can check if all elements are either 0 or 1 (or undef), right?

 - We can look through EXTRACT_VECTOR_ELT, EXTRACT_SUBVECTOR, SCALAR_TO_VECTOR, VECTOR_SHUFFLE, BUILD_VECTOR (and perhaps others).

 - Likewise, when checking:
 +    if (V->getValueType() == MVT::i1)
   you probably want to check the scalar type (to catch v?i1)

Also, after type legalization, i1 has probably been replaced by something larger (like i32). Nevertheless, depending on what getBooleanContents() returns, the SETCC may be restricted to returning zero or one.

 - We can probably also look through FP_ROUND, FP_EXTEND, (FCEIL, FTRUNC, FRINT, FNEARBYINT, FFLOOR)
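
As a rough illustration of the vector case above (every element is 0, 1 or undef), here is a toy model in plain C++. It does not use the SelectionDAG API: `Lane`, `isLaneZeroOrOne` and `allLanesZeroOrOne` are hypothetical names made up for this sketch, and an undef lane is modeled with a flag. The comparison uses `==`, so -0.0 is accepted as zero; whether the patch treats -0.0 that way is not stated in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for one lane of a constant FP vector.
// This is an illustration only, not an LLVM data structure.
struct Lane {
  bool IsUndef;  // true models an undef lane
  double Value;  // ignored when IsUndef is true
};

// An undef lane is acceptable because any value may be chosen for it.
// Otherwise the lane must compare equal to 0.0 or 1.0 (NaN is rejected,
// since NaN compares unequal to everything).
bool isLaneZeroOrOne(const Lane &L) {
  if (L.IsUndef)
    return true;
  return L.Value == 0.0 || L.Value == 1.0;
}

// The whole vector qualifies only if every lane qualifies.
bool allLanesZeroOrOne(const std::vector<Lane> &Lanes) {
  return std::all_of(Lanes.begin(), Lanes.end(), isLaneZeroOrOne);
}
```

With this model, a vector such as {0.0, undef, 1.0, 1.0} passes, while a single lane holding 0.5 makes the whole vector fail, which is why the undef handling mostly matters after widening.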

 -Hal

----- Original Message -----
> Hi,
> 
> The attached patch allows FMAs to be formed in DAGCombiner even
> without unsafe math mode when one operand is known to be either 0.0
> or 1.0 exactly (this is safe because no rounding occurs in the fmul
> step, and behavior on all limit case inputs is preserved, as far as
> I can tell).
> 
> This allows formation of FMAs in cases like the following:
> 
>     extern bool f;
> 
>     double foo(bool a, double b, double c, double d, double e) {
>      return (b * double(a) + c) + (d * (1.0 - double(a)) + e);
>     }
> 
> The intent of this patch is to address a subset of the issues which
> required r181216 to be partially reverted (tracked as PR16164),
> although there are still many other cases affected by that patch
> which are not resolved.
> 
> Unfortunately, due to SelectionDAG limitations, this transformation
> will only be done if the input can be determined to be zero or one
> with information only in the same basic block, so, depending on
> optimization settings and phase ordering, it will fail in cases like
> the following:
> 
>     extern bool f;
> 
>     double foo(bool a, double b, double c, double d, double e) {
>      if (a)
>        return b * double(a) + c;
>      else
>        return d * (1.0 - double(a)) + e;
>     }
> 
> Please let me know if you have any feedback.
> 
> Thanks,
> Stephen
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
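
The safety argument in the quoted message (no rounding happens in the fmul step when one operand is exactly 0.0 or 1.0) can be spot-checked outside the compiler. The following is a minimal sketch in standard C++, not part of the patch; `mulThenAdd`, `sameResult` and `fusedMatchesUnfused` are hypothetical helper names. It compares `std::fma` against a separately rounded multiply and add, treating two NaNs as equal and distinguishing +0.0 from -0.0.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// a*b + c with two separate roundings. The volatile temporary keeps the
// compiler from contracting the expression into a fused multiply-add.
double mulThenAdd(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Equality that treats any two NaNs as the same result and that
// distinguishes +0.0 from -0.0.
bool sameResult(double x, double y) {
  if (std::isnan(x) || std::isnan(y))
    return std::isnan(x) && std::isnan(y);
  return x == y && std::signbit(x) == std::signbit(y);
}

// True if the fused and unfused computations agree for these inputs.
bool fusedMatchesUnfused(double a, double b, double c) {
  return sameResult(std::fma(a, b, c), mulThenAdd(a, b, c));
}
```

Calling `fusedMatchesUnfused` with a equal to 0.0 or 1.0 and b, c drawn from values such as signed zeros, infinities, NaN and the largest finite double is one way to probe the limit cases mentioned above. For general multipliers the two forms can differ: with a = 1 + 2^-30, b = 1 - 2^-30 and c = -1, the separately rounded product is 1.0, so the unfused result is 0.0, whereas the fused result keeps the -2^-60 term. That difference is why the transformation is otherwise restricted to unsafe math mode.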

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


