[llvm-commits] [llvm] r85697 - in /llvm/trunk: lib/Target/ARM/ARMInstrNEON.td test/CodeGen/ARM/fmacs.ll test/CodeGen/ARM/fnmacs.ll test/CodeGen/Thumb2/cross-rc-coalescing-2.ll

Jim Grosbach grosbach at apple.com
Wed Nov 4 07:54:58 PST 2009


On Nov 4, 2009, at 1:29 AM, David Conrad wrote:

> On Nov 3, 2009, at 5:07 PM, Jim Grosbach wrote:
>
>>
>> On Nov 2, 2009, at 1:53 AM, David Conrad wrote:
>>
>>> Thus even without modeling the special behaviour of vmla it's always
>>> better to use it: it'll always be at least as fast as a separate
>>> vmul+vadd. This applies to the integer versions as well.
>>
>>
>> Hi David,
>>
>> Unfortunately, this turns out not to be the case. The NEON unit
>> will stall adjacent instructions in the presence of vmla to
>> preserve in-order retirement. If a RAW hazard is present, the stall
>> is 8 (possibly 7) cycles; otherwise it is 4 cycles.
>
> You're correct, sorry for the noise and wrong information.

No worries. Always better to talk it through and make sure things are
right. Thanks for having a look and providing feedback!

Regards,
   Jim


