[llvm-commits] LLVM patch to support ARM fused multiply add/subtract instructions

Tue Jan 24 01:33:19 PST 2012

VFPv4 is a superset of VFPv3+fp16, and likewise NEONv2 is a superset of NEONv1+fp16.

"VFPv4 and VFPv4U add both the Half-precision Extension and the fused multiply-add instructions to the features of VFPv3."

For the register set: "VFPv4 can be implemented with either thirty-two or sixteen doubleword registers"
"Where necessary, these implementation options are distinguished using the terms: VFPv4-D32 or VFPv4-D16"

"where the term VFPv4 is used it covers both options".

So, VFPv4 should imply VFPv4-D16, i.e. the smaller register-file variant. There should be a way to optionally enable the 32-register variant.
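
To make the implication concrete, here is a rough sketch of how it could be modelled. The names (Feature, expandFeatures, numDoubleRegs) are made up for illustration and are not LLVM's actual subtarget feature definitions; the only assumption taken from the manual text quoted above is that VFPv4 brings in VFPv3, fp16 and fused multiply-add, while the 32-register file stays a separate opt-in.

```cpp
#include <cstdint>

// Hypothetical feature bits, for illustration only. These are not the
// names or encodings used by the LLVM ARM backend.
enum Feature : std::uint32_t {
  VFP3 = 1u << 0,
  FP16 = 1u << 1,  // half-precision extension
  FMA  = 1u << 2,  // fused multiply-add/subtract
  D32  = 1u << 3,  // 32 doubleword registers; absent means D16
  VFP4 = 1u << 4,
};

// Expand a requested feature set. VFPv4 pulls in VFPv3, fp16 and FMA,
// but deliberately does not pull in D32.
constexpr std::uint32_t expandFeatures(std::uint32_t requested) {
  return (requested & VFP4) ? (requested | VFP3 | FP16 | FMA) : requested;
}

// Number of doubleword registers available for a given feature set.
constexpr unsigned numDoubleRegs(std::uint32_t features) {
  return (features & D32) ? 32u : 16u;
}
```

With this shape, asking for VFP4 alone gives the D16 variant, and VFP4 | D32 gives VFPv4-D32, so the larger register file never gets enabled by accident.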

-----Original Message-----
From: Anton Korobeynikov [mailto:anton at korobeynikov.info] 
Sent: 24 January 2012 09:27
To: James Molloy
Cc: Ana Pazos; rajav at codeaurora.org; llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] LLVM patch to support ARM fused multiply add/subtract instructions

> Both VFPv3 and NEONv1 can have the fp16 extension.
What about VFPv4? Is it possible to have a "limited register" variant
of VFPv4/NEONv2?

-- 
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University