[LLVMdev] LLVM ARM VMLA instruction
Tim Northover
t.p.northover at gmail.com
Thu Dec 19 01:13:54 PST 2013
> As per Renato's comment above, vmla is a NEON instruction while vfma is a
> VFP instruction. Correct me if I am wrong on this.
My version of the ARM architecture reference manual (v7 A & R) lists
both vmla variants requiring NEON and variants requiring VFP (section
A8.8.337), split in just the way you'd expect: the SIMD variants need
NEON.
> It may seem that the total number of cycles is more or less the same for a
> single vmla and for vmul+vadd. However, when the vmul+vadd combination is used
> instead of vmla, intermediate results are generated which need to be stored in
> memory for future access.
Well, it increases register pressure slightly I suppose, but there's
no need to store anything to memory unless that gets critical.
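
To make that concrete, here is a minimal C sketch of the two shapes under
discussion. The function names are hypothetical, and which instructions a
compiler actually emits for either one (vmul+vadd, vmla, or something else)
depends on the target, optimisation level and flags, so treat the comments as
assumptions rather than guaranteed output.

```c
#include <stddef.h>

/* Separate multiply and add: the product is an intermediate value.
   Assumption: an optimising compiler keeps 'prod' in a register, so the
   cost is one extra live register, not a store to memory. */
float dot_mul_add(const float *a, const float *b, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float prod = a[i] * b[i];  /* intermediate result */
        acc = acc + prod;
    }
    return acc;
}

/* Accumulating form: the same arithmetic written as one expression,
   the pattern a backend could choose to select as a multiply-accumulate. */
float dot_mac(const float *a, const float *b, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += a[i] * b[i];
    return acc;
}
```

Both functions describe the same computation at the source level; the
intermediate only has to go to memory if the register allocator runs out of
registers and spills it.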
> Correct me if I am wrong on this, but my observations to date have shown this.
Perhaps. Actual data is needed, I think, if you seriously want to
change this behaviour in LLVM. The test-suite might be a good place to
start, though it'll give an incomplete picture without the externals
(SPEC & other things).
Of course, if we're just speculating we can carry on.
Cheers.
Tim.