[LLVMdev] Representing -ffast-math at the IR level

Sat Apr 14 11:28:53 PDT 2012

The attached patch is a first attempt at representing "-ffast-math" at the IR
level, specifically on individual floating point instructions (fadd, fsub etc).
It is done using metadata.  We already have a "fpmath" metadata type which can
be used to signal that reduced precision is OK for a floating point operation,
e.g.

   %z = fmul float %x, %y, !fpmath !0
   ...
   !0 = metadata !{double 2.5}

indicates that the multiplication can be done in any way that doesn't introduce
more than 2.5 ULPs of error.
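
To make the 2.5 ULP bound concrete, here is a minimal C++ sketch of how one
might measure the error of a float result in ULPs against a higher-precision
reference.  The helper names (float_ulp_at, ulp_error, within_ulps) are made up
for illustration and are not part of LLVM.  The sketch assumes a finite
reference value and may differ in corner cases (for instance right at a power
of two) from whatever precise ULP definition LLVM uses.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Spacing between consecutive floats at the magnitude of `reference`
// (assumption: `reference` is finite and within float's range).
double float_ulp_at(double reference) {
    float r = std::fabs(static_cast<float>(reference));
    float next = std::nextafter(r, std::numeric_limits<float>::infinity());
    return static_cast<double>(next) - static_cast<double>(r);
}

// Error of `computed` relative to the reference, expressed in float ULPs.
double ulp_error(float computed, double reference) {
    double diff = std::fabs(static_cast<double>(computed) - reference);
    return diff / float_ulp_at(reference);
}

// True if `computed` is within `max_ulps` of the reference, e.g. 2.5.
bool within_ulps(float computed, double reference, double max_ulps) {
    return ulp_error(computed, reference) <= max_ulps;
}
```

With a reference of exactly 1.0, a result two representable floats away
satisfies within_ulps(result, 1.0, 2.5), while a result three floats away does
not.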

The first observation is that !fpmath can be extended with additional operands
in the future: operands that say things like whether it is OK to assume that
there are no NaNs and so forth.

This patch doesn't add additional operands though.  It just allows the existing
accuracy operand to be the special keyword "fast" instead of a number:

   %z = fmul float %x, %y, !fpmath !0
   ...
   !0 = metadata !{metadata !"fast"}

This indicates that accuracy loss is acceptable (just how much is unspecified)
for the sake of speed.  Thanks to Chandler for pushing me to do it this way!
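
As an illustration of the kind of accuracy change "fast" is meant to permit:
the patch itself implements no optimizations, but a typical fast-math
transformation is reassociating additions, which can change the result.  The
following is a minimal C++ sketch with hypothetical function names.  It assumes
IEEE single precision floats and that it is itself compiled without fast-math
flags.

```cpp
#include <cassert>

// (a + b) + c, with the intermediate forced to float precision.
float sum_left(float a, float b, float c) {
    volatile float t = a + b;
    return t + c;
}

// a + (b + c): the reassociated form a fast-math optimizer might choose.
float sum_right(float a, float b, float c) {
    volatile float t = b + c;
    return a + t;
}
```

For a = 1e8f, b = -1e8f, c = 1.0f, sum_left returns 1.0f but sum_right returns
0.0f, because -1e8f + 1.0f rounds back to -1e8f (floats near 1e8 are 8 apart).
A strict compiler must not turn one form into the other; an instruction tagged
"fast" is saying that such a rewrite would be acceptable.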

It also creates a simple way of getting and setting this information, the
FPMathOperator class: you can cast appropriate instructions to this class
and then use the querying/mutating methods to get/set the accuracy, whether
2.5 or "fast".  For example, the attached clang patch uses this to set the
OpenCL accuracy of 2.5 ULPs rather than doing it by hand.

In addition it changes IRBuilder so that you can provide an accuracy when
creating floating point operations.  I don't like this so much.  It would
be more efficient to just create the metadata once and then splat it onto
each instruction.  Also, if fpmath gets a bunch more options/operands in
the future then this interface will become more and more awkward.  Opinions
welcome!
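
The two interface styles being compared can be sketched with a toy model.
This is not LLVM's API: ToyFPMath, ToyInst, create_with_accuracy and splat are
all hypothetical names, and the model does not unique metadata nodes the way a
real implementation may.  It only shows the difference between passing an
accuracy on every creation call and building one tag that is then attached to
many instructions.  It needs C++17 for std::optional.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an !fpmath node: a ULP bound, or "fast" when none is set.
struct ToyFPMath {
    std::optional<double> max_ulps;
    bool is_fast() const { return !max_ulps.has_value(); }
};

// Toy stand-in for a floating point instruction with optional metadata.
struct ToyInst {
    std::string opcode;
    std::shared_ptr<const ToyFPMath> fpmath;  // null: no !fpmath attached
};

// Builder style: the accuracy is passed on every creation call, so each call
// builds its own tag in this toy model.
ToyInst create_with_accuracy(std::string opcode, double max_ulps) {
    return ToyInst{std::move(opcode),
                   std::make_shared<const ToyFPMath>(ToyFPMath{max_ulps})};
}

// "Splat" style: build the tag once, then attach it to every instruction.
void splat(std::vector<ToyInst>& insts, std::shared_ptr<const ToyFPMath> tag) {
    for (ToyInst& inst : insts) inst.fpmath = tag;
}
```

In the splat style, adding further fpmath operands later only changes how the
single tag is built, whereas the builder style would need a wider parameter
list on every creation function.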

I didn't actually implement any optimizations that use this yet.

I took a look at the impact on aermod.f90, a reasonably floating point heavy
Fortran benchmark (4% of the human readable IR consists of floating point
operations).  At -O3 (the worst case), the size of the bitcode increases by
0.8%.  No idea if that's acceptable - hopefully it is!

Enjoy!

Duncan.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fastm-llvm.diff
Type: text/x-patch
Size: 14251 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120414/95aa6cb6/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fastm-clang.diff
Type: text/x-patch
Size: 2240 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120414/95aa6cb6/attachment-0001.bin>