[llvm-commits] LLVM patch to support ARM fused multiply add/subtract instructions

Tue Jan 24 02:41:34 PST 2012

On Jan 24, 2012, at 1:10 AM, Anton Korobeynikov wrote:

>> Also, the patch doesn't hook the VMLAs up to "Requires<[UseFPVMLx]>" - is there any reason for this? I know that flag isn't really used but when we do hook VMLAs up to fast-math or disable-excess-fp-precision, it'd be nice to have all implementations orthogonal.
> I think we reached the point where we should have a clean set of
> features. Given the mess we have already....
> 
> So, let's summarize. We have the following set of target features:
> 
> VFPvN {N=2,3,4}
> NEON (do we need NEONv2?)
> UseFPVMLx, flag to enable codegen in excess precision.

UseFPVMLx was originally about the use of instructions like VMLA, which are plain combined multiply-accumulates: the product is rounded before the add, so there is no excess precision. The flag exists because ARM CPUs have had many hazards around these combined instructions, and it is difficult to get a performance win from them outside of hand-tuned code. On Cortex-A9 the situation is arguably different, hence the flag.

There needs to be a new flag for the automatic use of excess precision, if that behavior is even desired.
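
To make the distinction concrete, here is a rough sketch in Python (not ARM code, and not anything from the patch) of how a multiply-accumulate that rounds the product first can differ from a fused one that rounds only once. The function names and the input values are made up for illustration, and the fused result is emulated with exact rational arithmetic rather than a hardware instruction.

```python
from fractions import Fraction

def mul_then_add(a, b, c):
    # Two roundings: a * b is rounded to a double, then the sum is rounded again.
    product = a * b
    return product + c

def fused_mul_add(a, b, c):
    # One rounding: evaluate a * b + c exactly as a rational number,
    # then round once when converting the result back to a float.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

# Illustrative inputs: the exact product is 1 - 2**-60, which rounds to 1.0.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

print(mul_then_add(a, b, c) == 0.0)              # the low bits of the product are lost
print(fused_mul_add(a, b, c) == -(2.0 ** -60))   # the low bits survive the single rounding
```

The two results differ for the same source expression, which is why automatic use of the fused form is a separate decision from whether VMLA-style instructions are a performance win.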

Cameron