[LLVMdev] LLVM ARM VMLA instruction
suyog sarda
sardask01 at gmail.com
Thu Dec 19 00:36:53 PST 2013
Hi,
One more addition to the observations above:
LLVM:
cortex-a15 + vfpv4-d16 with the ffast-math option but WITHOUT the
ffp-contract=fast option also emits the vfma instruction.
On Thu, Dec 19, 2013 at 1:30 PM, suyog sarda <sardask01 at gmail.com> wrote:
> Hi all,
>
>
> Thanks for the info. A few observations from my side:
>
>
> LLVM :
>
>
> cortex-a8 vfpv3 : no vmla or vfma instruction emitted
>
> cortex-a8 vfpv4 : no vmla or vfma instruction emitted (this combination is
> invalid, though, as cortex-a8 does not have vfpv4)
>
> cortex-a8 vfpv4 with ffp-contract=fast : vfma instruction emitted (this
> seems like a bug to me! If cortex-a8 doesn't come with vfpv4, then the vfma
> instructions generated will be invalid)
>
>
> cortex-a15 vfpv4 : vmla instruction emitted (which is a NEON instruction)
>
> cortex-a15 vfpv4 with ffp-contract=fast : vfma instruction emitted
>
>
> GCC :
>
>
> cortex-a8 vfpv3 : vmla instruction emitted
>
> cortex-a15 vfpv4 : vfma instruction emitted
>
>
> I agree with the point that NEON and VFP instructions shouldn't be used
> interchangeably.
>
>
> However, if gcc emits a vmla (NEON) instruction for cortex-a8, then
> shouldn't LLVM also emit a vmla (NEON) instruction? Can someone please
> clarify this point? The performance gain with the vmla instruction is huge.
> I read somewhere that LLVM prefers precision over performance. Is this
> true, and is that why LLVM is not emitting vmla instructions for cortex-a8?
>
>
>
> On Thu, Dec 19, 2013 at 6:41 AM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote:
>
>> Just to clarify: gcc 4.8.1 generates that fma at -O2; no FP relaxation or
>> other flags specified.
>>
>>
>> On Wed, Dec 18, 2013 at 6:02 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote:
>>
>>> Thanks for the explanation, Tim!
>>>
>>> gcc 4.8.1 *does* generate an fma for your code example for an x86 target
>>> that supports fma. I'd bet that the HW vendors' compilers do the same, but
>>> I don't have any of those installed at the moment to test that theory. So
>>> this is a bug in those compilers? Do you know how they justify it?
>>>
>>> I see section 6.5 "Expressions" in the C standard, and I can see that
>>> 6.5.8 would seem to agree with you assuming that a "floating expression" is
>>> a subset of "expression"...is there any other part of the standard that you
>>> know of that I can reference?
>>>
>>> This is made a little weirder by the fact that gcc and clang have a
>>> 'fast' setting for fp-contract, but the C standard that I'm looking at
>>> states that it is just an "on-off-switch".
>>>
>>>
>>> On Wed, Dec 18, 2013 at 11:17 AM, Tim Northover <t.p.northover at gmail.com> wrote:
>>>
>>>> > http://llvm.org/bugs/show_bug.cgi?id=17188
>>>> > http://llvm.org/bugs/show_bug.cgi?id=17211
>>>>
>>>> Ah, thanks. That makes a lot more sense now.
>>>>
>>>> > Correct - clang is different than gcc, icc, msvc, xlc, etc. on this.
>>>> > Still haven't seen any explanation for how this is better though...
>>>>
>>>> That would be because it follows what C tells us a compiler has to do
>>>> by default but provides overrides in either direction if you know what
>>>> you're doing.
>>>>
>>>> The key point is that LLVM (currently) has no notion of statement
>>>> boundaries, so it would fuse the operations in this function:
>>>>
>>>> float foo(float accum, float lhs, float rhs) {
>>>>   float product = lhs * rhs;
>>>>   return accum + product;
>>>> }
>>>>
>>>> This isn't allowed even under FP_CONTRACT=on (the multiply and add do
>>>> not occur within a single expression), so LLVM can't in good
>>>> conscience enable these optimisations by default.
>>>>
>>>> Cheers.
>>>>
>>>> Tim.
>>>>
>>>
>>>
>>
>>
>
>
> --
> With regards,
> Suyog Sarda
>
--
With regards,
Suyog Sarda