[LLVMdev] NEON vector instructions and the fast math IR flags

Mon Jun 10 08:06:44 PDT 2013

On 06/10/2013 01:56 AM, David Tweed wrote:
> | For programs that have mixed precision requirements for floating point
> | operations we probably need to do this according to the fast math flags.
> | Until we get there, a good first step would probably be to provide a
> | global option similar to -enable-no-infs-fp-math that specifies if
> | denormals should be allowed or not. This would allow the user to specify
> | the precision requirements, without the need to alter the feature
> | flags of a specific piece of hardware.
>
> Hi, sorry for coming in late on this. Firstly, I think what you mean is "if denormals should be required to be preserved or not". (Apart from anything else, it's possible to move data between standard CPU, SIMD CPU and GPU, so even if one part of the system flushes denormals to zero when they occur, they can still show up in other parts.) Clearly, requiring that denormals be preserved implies that you can't use NEON instructions, since they are specified not to preserve denormals.

True.
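
To make the denormal point concrete, here is a minimal sketch that simulates a flush-to-zero unit in Python. It only models the single-precision threshold (2**-126) on top of Python's double-precision floats; the helper name flush_to_zero and the test values are assumptions for illustration, not a model of every detail of NEON's behaviour.

```python
import math

# Smallest positive *normal* single-precision value (IEEE 754 binary32).
FLT_MIN_NORMAL = 2.0 ** -126


def flush_to_zero(x: float) -> float:
    """Simulate a flush-to-zero unit: denormal values become a signed zero.

    Hypothetical helper for illustration only.
    """
    if x != 0.0 and abs(x) < FLT_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x


tiny = 2.0 ** -130       # denormal in single precision
scale = 2.0 ** 10

# A unit that preserves denormals scales the value back into the normal range.
preserved = tiny * scale

# A unit that flushes its denormal input loses the value entirely.
flushed = flush_to_zero(tiny) * scale
```

The same input and the same multiplication yield 2**-120 on the preserving path and 0.0 on the flushing path, which is why the choice of unit is visible to the program.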

> Secondly, I think it would be helpful to at least try to map out which "optimizations" are going to be viewed by a per-instruction IR flag just in order to get a clearer idea if the global stuff is the right model. (Amongst other things, I'm interested in DSLs where the likelihood of knowing something about the "ideal requirements" for operations that will be transformed into LLVM IR is higher than for manually written C/Fortran.)

Sorry, I did not get this sentence. Would you mind rephrasing it?

At the moment I am mainly concerned with the code generation aspect.
Optimizations on LLVM-IR can already reason per-instruction about
several floating point precision flags. Doing this during code
generation is apparently difficult, as we would have to decide per
instruction whether we can legally lower it to NEON or not.
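
As a rough illustration of that per-instruction decision, here is a minimal sketch in Python. Everything in it is hypothetical: the FPInstr class, the may_flush_denormals flag (not an existing LLVM fast-math flag), the select_unit helper and the global_allow_flush option are assumptions made for the example, and real instruction selection would be considerably more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FPInstr:
    """Toy stand-in for a floating point IR instruction (hypothetical)."""
    opcode: str
    # Hypothetical per-instruction flag: flushing denormals to zero is
    # acceptable for this operation. An assumption, not an LLVM flag.
    may_flush_denormals: bool = False


def select_unit(instr: FPInstr, global_allow_flush: bool = False) -> str:
    """Pick a unit for one instruction.

    NEON is only legal if denormals need not be preserved, either because
    of a (hypothetical) global option or the per-instruction flag.
    Otherwise fall back to a denormal-preserving unit (VFP here).
    """
    if global_allow_flush or instr.may_flush_denormals:
        return "NEON"
    return "VFP"


block = [FPInstr("fadd", may_flush_denormals=True), FPInstr("fmul")]

# Per-instruction decision: only the flagged instruction may use NEON.
per_instruction = [select_unit(i) for i in block]

# Global option: every instruction may use NEON.
with_global_option = [select_unit(i, global_allow_flush=True) for i in block]
```

With the flags above, the per-instruction variant splits the block between the two units, while the global option sends both instructions to NEON. The split case is the difficult one for code generation, since values may have to move between units.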

Tobi