[llvm-dev] Trouble when suppressing a portion of fast-math-transformations

Mon Oct 2 18:03:53 PDT 2017

On 10/02/2017 11:10 AM, Bruce Hoult via llvm-dev wrote:
> Is there anything that means, in particular, "go fast, even if it 
> means not all bits are significant"?
>
> I'm currently working on an LLVM-based compiler for a GPU that is 
> optimised for OpenGL, where 16 bit FP may not be quite accurate enough 
> (or may be in some cases), but 32 bit FP is overkill. A lot of the 
> fast, built-in operations end up with a few junk bits at the end (not 
> add/sub/mul, but divide is available *only* using reciprocal).
>
> When implementing OpenCL, the specs and conformance tests require full 
> IEEE accuracy. In some cases this requires a round of Newton-Raphson 
> to clean up the accuracy, which is a significant though maybe not 
> crippling performance penalty. But in other cases we need to do a lot 
> of range reduction, some polynomial, and then generalise the result 
> again. This can be an order of magnitude or more slower than using the 
> not-quite-accurate-enough built in instruction.

This is what arcp is for (implying that you can use the reciprocal 
estimate and not worry about getting the exact answer). Now there's a 
separate question about how many Newton iterations to use, and we have a 
separate flag for that (-mrecip=...). Check out the implementation of 
TargetLoweringBase::getRecipEstimateSqrtEnabled to see how it's set up in 
the backend. This is, however, per function, so we don't currently have 
per-operation control over this.
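
As a rough illustration of the trade-off (not LLVM's actual lowering), the
sketch below refines a low-precision reciprocal estimate with Newton-Raphson
steps. recip_estimate is a hypothetical stand-in for a hardware estimate
instruction, and the roughly 12-bit precision it models is an assumption
chosen only for this example.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a hardware reciprocal-estimate instruction.
// It computes 1/x and then keeps only about 12 mantissa bits, to mimic
// the "few junk bits at the end". Assumes x is finite and nonzero.
float recip_estimate(float x) {
    float r = 1.0f / x;
    int exp = 0;
    float m = std::frexp(r, &exp);           // r == m * 2^exp, 0.5 <= |m| < 1
    m = std::round(m * 4096.0f) / 4096.0f;   // drop the low mantissa bits
    return std::ldexp(m, exp);
}

// One Newton-Raphson step for f(r) = 1/r - x, i.e. r' = r * (2 - x * r).
float refine(float x, float r) {
    return r * (2.0f - x * r);
}

// Reciprocal of x using the estimate plus `steps` refinement iterations.
float recip(float x, int steps) {
    assert(steps >= 0);
    float r = recip_estimate(x);
    for (int i = 0; i < steps; ++i)
        r = refine(x, r);
    return r;
}
```

Each step roughly doubles the number of correct bits, which is why the
iteration count is worth controlling separately from whether the estimate
may be used at all.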

>
> The OpenCL spec defines a number of compile flags controlling 
> optimizations. Some seem to map well onto the flags already discussed 
> here:
>
> -cl-mad-enable
> -cl-no-signed-zeros
> -cl-finite-math-only
>
> However it looks to me that the following ones don't presently map 
> well to LLVM:
>
> -cl-unsafe-math-optimizations
> Allow optimizations for floating-point arithmetic that (a) assume that 
> arguments and results are valid, (b) may violate IEEE 754 standard and 
> (c) may violate the OpenCL numerical compliance requirements as 
> defined in the SPIR-V OpenCL environment specification for single 
> precision and double precision floating-point, and edge case behavior 
> in the SPIR-V OpenCL environment specification. This option includes 
> the -cl-no-signed-zeros and -cl-mad-enable options.

I think the idea is that this flag, like -funsafe-math-optimizations, 
gets mapped to an appropriate collection of finer-grained flags internally.
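
A minimal sketch of what such a mapping could look like, using the seven
flag names proposed in this thread. The bit positions, the function name
flagsForOption, and most of the option-to-flag choices are assumptions made
for illustration; only the "includes" relations between the OpenCL options
come from the spec text quoted in this message, and none of this is LLVM's
or Clang's actual encoding.

```cpp
#include <cassert>
#include <string>

// Hypothetical bit assignments for the seven proposed flags.
constexpr unsigned Reassoc  = 1u << 0;
constexpr unsigned LibM     = 1u << 1;
constexpr unsigned NNaN     = 1u << 2;
constexpr unsigned NInf     = 1u << 3;
constexpr unsigned NSZ      = 1u << 4;
constexpr unsigned ARcp     = 1u << 5;
constexpr unsigned Contract = 1u << 6;
constexpr unsigned AllFlags = 0x7F;

static_assert((Reassoc | LibM | NNaN | NInf | NSZ | ARcp | Contract) == AllFlags,
              "seven flags fill exactly seven bits");

// Expand an OpenCL option into a set of fine-grained flags.
// Unknown options map to no flags.
unsigned flagsForOption(const std::string &opt) {
    unsigned flags = 0;
    if (opt == "-cl-no-signed-zeros") {
        flags = NSZ;
    } else if (opt == "-cl-mad-enable") {
        flags = Contract;                     // assumption: mad ~ contraction
    } else if (opt == "-cl-finite-math-only") {
        flags = NNaN | NInf;
    } else if (opt == "-cl-unsafe-math-optimizations") {
        // The spec says this includes no-signed-zeros and mad-enable;
        // adding reassoc and arcp is an assumption of this sketch.
        flags = flagsForOption("-cl-no-signed-zeros") |
                flagsForOption("-cl-mad-enable") | Reassoc | ARcp;
    } else if (opt == "-cl-fast-relaxed-math") {
        // The spec says this sets finite-math-only and
        // unsafe-math-optimizations; mapping the relaxed math functions
        // to 'libm' is an assumption of this sketch.
        flags = flagsForOption("-cl-finite-math-only") |
                flagsForOption("-cl-unsafe-math-optimizations") | LibM;
    }
    assert((flags & ~AllFlags) == 0);         // must fit in seven bits
    return flags;
}
```

Under these assumptions -cl-fast-relaxed-math ends up setting every one of
the seven bits, so it behaves like an umbrella on the command line without
needing an umbrella flag in the IR.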

>
> -cl-fast-relaxed-math
> Sets the optimization options -cl-finite-math-only and 
> -cl-unsafe-math-optimizations. This allows optimizations for 
> floating-point arithmetic that may violate the IEEE 754 standard and 
> the OpenCL numerical compliance requirements for single precision and 
> double precision floating-point, as well as floating point edge case 
> behavior. This option also relaxes the precision of commonly used math 
> functions. This option causes the preprocessor macro 
> __FAST_RELAXED_MATH__ to be defined in the OpenCL program. The 
> original and modified values are defined in the SPIR-V OpenCL 
> environment specification
>
> I'd like to emphasise in the latter one: "This option also relaxes the 
> precision of commonly used math functions."

Isn't this the "libm" flag that is proposed in this thread?

  -Hal

>
>
> On Mon, Oct 2, 2017 at 4:45 PM, Ristow, Warren via llvm-dev 
> <llvm-dev at lists.llvm.org> wrote:
>
>     I'm not aware of any additional bits needed.  But putting us right
>     at the edge leaves me uncomfortable.  So an implementation that
>     isn't limited by the 7 bits in SubclassOptionalData seems sensible.
>
>     Thanks,
>
>     -Warren
>
>     From: Sanjay Patel [mailto:spatel at rotateright.com]
>     Sent: Monday, October 2, 2017 12:06 AM
>     To: Ristow, Warren
>     Cc: Hal Finkel; llvm-dev at lists.llvm.org
>     Subject: Re: [llvm-dev] Trouble when suppressing a portion of
>     fast-math-transformations
>
>     Are we confident that we just need those 7 bits to represent all
>     of the relaxed FP states that we need/want to support?
>
>     I'm asking because FMF in IR is currently mapped onto the
>     SubclassOptionalData of Value...and we have exactly 7 bits there. :)
>
>     If we're redoing the definitions, I'm wondering if we can share
>     the struct with the backend's SDNodeFlags, but that already has
>     one extra bit for vector reduction. Should we give up on
>     SubclassOptionalData for FMF? We have a MD_fpmath enum value for
>     metadata, so we could move things over there?
>
>     On Fri, Sep 29, 2017 at 8:16 PM, Ristow, Warren via llvm-dev
>     <llvm-dev at lists.llvm.org> wrote:
>
>     Hi Hal,
>
>     >> 4. To fix this, I think that additional fast-math-flags are likely
>     >> needed in the IR.  Instead of the following set:
>     >>
>     >> 'nnan' + 'ninf' + 'nsz' + 'arcp' + 'contract'
>     >>
>     >> something like this:
>     >>
>     >> 'reassoc' + 'libm' + 'nnan' + 'ninf' + 'nsz' + 'arcp' + 'contract'
>     >>
>     >> would be more useful.  Related to this, the current 'fast' flag which acts
>     >> as an umbrella (enabling 'nnan' + 'ninf' + 'nsz' + 'arcp' + 'contract') may
>     >> not be needed.  A discussion on this point was raised last November on the
>     >> mailing list:
>     >>
>     >> http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html
>     >
>     > I agree. I'm happy to help review the patches. It will be best to have
>     > only the finer-grained flags where there's no "fast" flag that implies
>     > all of the others.
>
>     Thanks for the quick response, and for the willingness to review.  I won't
>     let this languish so long, like the post from last November.
>
>     Happy to hear that you feel it's best not to have the umbrella "fast" flag.
>
>     Thanks again,
>
>     -Warren
>     _______________________________________________
>     LLVM Developers mailing list
>     llvm-dev at lists.llvm.org
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
