[llvm-commits] LLVM patch to support ARM fused multiply add/subtract instructions
Cameron Zwarich
zwarich at apple.com
Tue Jan 24 02:41:34 PST 2012
On Jan 24, 2012, at 1:10 AM, Anton Korobeynikov wrote:
>> Also, the patch doesn't hook the VMLAs up to "Requires<[UseFPVMLx]>" - is there any reason for this? I know that flag isn't really used but when we do hook VMLAs up to fast-math or disable-excess-fp-precision, it'd be nice to have all implementations orthogonal.
> I think we reached the point where we should have a clean set of
> features. Given the mess we have already....
>
> So, let's summarize. We have the following set of target features:
>
> VFPvN {N=2,3,4}
> NEON (do we need NEONv2?)
> UseFPVMLx, flag to enable codegen in excess precision.
UseFPVMLx was originally about using instructions like VMLA, which are plain combined multiply-adds with no excess precision. The flag exists because ARM CPUs have had a lot of hazards around the combined instructions, which makes it difficult to get a performance win from them outside of hand-tuned code. On Cortex-A9 the situation is arguably different, hence the flag.
There needs to be a new flag for the automatic use of excess precision, if that behavior is even desired.
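To make the distinction concrete, here is a minimal sketch in portable C++ rather than ARM assembly. The helper names mul_then_add and fused_mul_add are made up for illustration and do not appear in the patch. The first rounds the product before the add, the way a VMLA-style multiply-accumulate does. The second uses std::fma, which rounds only once and so carries the excess precision a fused instruction would introduce.

```cpp
#include <cmath>

// Hypothetical helper: models a VMLA-style multiply-accumulate, where the
// product is rounded to double before the add (two roundings in total).
double mul_then_add(double a, double b, double c) {
    // volatile forces the rounded product to be materialized, so the
    // compiler cannot contract the two steps into a single fused operation.
    volatile double product = a * b;
    return product + c;
}

// Hypothetical helper: models a fused multiply-add, where the exact product
// feeds the add and only the final result is rounded (one rounding).
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

With a = 1 + 2^-27, b = 1 - 2^-27 and c = -1 under the default round-to-nearest mode, the exact product 1 - 2^-54 rounds to 1.0, so mul_then_add returns 0, while fused_mul_add returns -2^-54. That visible change in results is why automatic use of the fused forms would want its own flag.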
Cameron