[llvm-commits] LLVM patch to support ARM fused multiply add/subtract instructions

Wed Jan 25 05:42:05 PST 2012

Hi Ana,

> In this update:
> - I assumed neon2 does not imply vfpv4, but neon and vfpv4 imply neon2.
> - I kept setting .fpu=neon-vfpv4 code attribute because that is what the
> assembler understands.
Looks ok.

> The additional changes mentioned in the email discussions I think belong to
> a separate patch:
> - Associate VMLA/VMLS with LessPreciseFPMAD flag, and maybe with fast-math
> flag.
They definitely should not be. VMLA/VMLS are not less precise! They give
exactly the same result as two separate ops (a multiply, then an add or
subtract). It is FMA that differs: it has greater precision than usual
because it rounds only once.
And it is FMA that needs to be associated with -ffast-math on VFPv4.

> - VFPv3/VFPv4/NEON/NEON2 associations with FeatureFP16/FeatureD16.
Right. But in a separate patch, please.

> - Support to set -mfpu=neon2 in clang. Do you want this??
We should be compatible with gcc in this matter. What does gcc do here?

-- 
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University