[llvm-dev] Vectorization with fast-math on irregular ISA sub-sets
Renato Golin via llvm-dev
llvm-dev at lists.llvm.org
Thu Feb 11 01:53:15 PST 2016
On 11 February 2016 at 01:15, Hal Finkel <hfinkel at anl.gov> wrote:
> Rather, the user expects very specific (non-IEEE) behavior.
> I think we have two options here:
> 1. Lower these intrinsics into target-level intrinsics
That's not an option for the reasons you outline (performance), but
also because this would explode the number of intrinsics we have to
deal with, making the IR *very* opaque and hard to deal with.
> 2. Add flags (or something like that) that indicate the alternate non-IEEE semantics that ARM actually provides.
That's my idea, but I want to think about it only when we really need
to. Adding new flags always leads us to hard choices, and backwards
compatibility will be a problem here.
> We'd need to pass the fast-math flags to the cost model so that we'd get costs back that depended on whether or not we could actually use the vector instructions.
Indeed, that's the only way. But I foresee the cost model at least
doubling its complexity for those unfortunate targets. Right now, we
use heuristics to map the costs of casts, shuffles and memory
operations that normally disappear, but once loops can mix NEON, VFP
and scalar code in the same objects, how the back-end will emit those
pseudo-operations will be anyone's guess.
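To make the cost-model point concrete, here is a rough sketch of what a flag-dependent cost query could look like. Everything in it is made up for illustration: FPFlags, TargetInfo, getFPArithCost and the cost numbers are hypothetical and are not LLVM's TargetTransformInfo interface. The only assumption is that the target's SIMD unit is not IEEE-compliant, so vector FP is usable only when the flags permit non-IEEE behaviour; otherwise the vector operation is costed as if it were scalarized.

```cpp
// Hypothetical stand-in for per-instruction fast-math flags
// (not LLVM's FastMathFlags class).
struct FPFlags {
  bool AllowNonIEEE; // true if non-IEEE results (e.g. flush-to-zero) are acceptable
};

// Hypothetical target description with illustrative unit costs.
struct TargetInfo {
  bool SIMDIsIEEECompliant; // false for a SIMD unit with non-IEEE semantics
  unsigned ScalarFPCost;    // cost of one scalar FP operation
  unsigned VectorFPCost;    // cost of one vector FP operation
  unsigned LaneMoveCost;    // cost of one lane extract or insert
};

// Cost of one FP arithmetic operation at vectorization factor VF.
// The flags are part of the query, so the answer reflects whether the
// vector instruction may actually be used.
unsigned getFPArithCost(const TargetInfo &T, unsigned VF, FPFlags F) {
  if (VF == 1)
    return T.ScalarFPCost;
  if (T.SIMDIsIEEECompliant || F.AllowNonIEEE)
    return T.VectorFPCost;
  // Strict semantics on a non-IEEE SIMD unit: assume scalarization,
  // i.e. VF scalar ops plus one extract and one insert per lane.
  return VF * (T.ScalarFPCost + 2 * T.LaneMoveCost);
}
```

With unit costs of 1 and VF = 4, the same operation costs 1 when non-IEEE semantics are allowed and 12 when they are not, which is the kind of gap the vectorizer would need to see before deciding whether a loop is worth vectorizing.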
In that sense, James' suggestion to create a flag for strict IEEE
semantics, locking SIMD FP out of the question entirely, is an easy