[llvm-commits] LLVM patch to support ARM fused multiply add/subtract instructions

Wed Jan 25 07:07:44 PST 2012

On Wed, 2012-01-25 at 17:42 +0400, Anton Korobeynikov wrote:
> Hi Ana,
> 
> > In this update:
> > - I assumed neon2 does not imply vfpv4, but neon and vfpv4 imply neon2.
> > - I kept setting .fpu=neon-vfpv4 code attribute because that is what the
> > assembler understands.
> Looks ok.
> 
> > The additional changes mentioned in the email discussions I think belong to
> > a separate patch:
> > - Associate VMLA/VMLS with LessPreciseFPMAD flag, and maybe with fast-math
> > flag.
> They should definitely not be. They are not less precise! They are
> "exactly precise" as two separate ops. It's just FMA which has greater
> precision than usual thanks to 1 rounding.
> And it's FMA which needs to be associated with -ffast-math on VFPv2

Just to be clear, are you advocating associating FMA formation with
UnsafeFPMath or with !NoExcessFPPrecision? I think it should be the
latter, as that is what the PPC backend does (and that seems to match
the intent of the TargetOptions API authors). Note, however, that unlike
-ffast-math (-enable-unsafe-fp-math), this will cause the patterns to be
enabled by default.
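
For anyone following the precision argument above, here is a minimal
sketch of why a fused multiply-add can give a different answer from a
separate multiply and add. It is an emulation in Python doubles, not the
ARM or PPC behavior itself: the helper names separate_mul_add and
fused_mul_add are hypothetical, and the fused result is modeled by doing
the arithmetic exactly with fractions.Fraction and rounding once at the
end.

```python
from fractions import Fraction


def separate_mul_add(a, b, c):
    """Multiply, round, then add and round again (two roundings)."""
    product = a * b       # first rounding happens here
    return product + c    # second rounding happens here


def fused_mul_add(a, b, c):
    """Emulate a fused multiply-add: exact a*b + c, then one rounding.

    Assumption: this models the single-rounding semantics in software;
    it does not call any hardware FMA instruction.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)   # the only rounding step


# Inputs chosen so that a*b = 1 - 2**-54 exactly, which is not
# representable as a double and rounds to 1.0 in the separate case.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

separate = separate_mul_add(a, b, c)   # the 2**-54 term is lost
fused = fused_mul_add(a, b, c)         # the 2**-54 term survives

print("separate:", separate)
print("fused:   ", fused)
print("differ:  ", separate != fused)
```

With these inputs the separate form loses the low-order term of the
product before the add, while the single-rounding form keeps it. That
difference in results is why the choice of flag matters here.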

 -Hal

> 
> > - VFPv3/VFPv4/NEON/NEON2 associations with FeatureFP16/FeatureD16.
> Right. But in a separate patch, please.
> 
> > - Support to set -mfpu=neon2 in clang. Do you want this??
> We should be compatible with gcc in this matter. What does it do?
> 

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory