[PATCH] D57148: [X86][Btver2] Improved latency/throughput model for scalar int-to-float conversions.
Andrea Di Biagio via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 24 05:07:30 PST 2019
andreadb created this revision.
andreadb added reviewers: RKSimon, spatel, courbet, mattd, craig.topper.
Herald added a subscriber: gbedwell.
This is a follow-up of D57056 <https://reviews.llvm.org/D57056>.
On Jaguar, we need to account for an additional operand latency of 6cy (caused by bypass delays) in the case of scalar int-to-float conversions.
The latency of (V)CVTSI2S(S|D) should be `f+3`, where `f` is a bypass delay of 6cy (see the AMD fam16h SOG).
This patch marks the input gpr operand as `ReadIntToFpu`, so that we correctly account for that delay. That quantity then has to be subtracted from the opcode latency (which should just be 3cy).
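As a rough illustration of the arithmetic above (not the actual TableGen change), the sketch below models when the result of a conversion becomes available. It assumes the 6cy bypass delay applies only to the gpr input and that the opcode itself takes 3cy; the function and constant names are hypothetical and exist only for this example.

```python
# Hypothetical model of the operand-latency arithmetic described above.
# Assumptions: the bypass delay applies only to the gpr operand, and the
# instruction starts executing once every operand is available.
BYPASS_DELAY_CY = 6    # `f`, the int-to-FPU bypass delay
OPCODE_LATENCY_CY = 3  # opcode latency once `f` has been subtracted


def result_ready_cycle(gpr_ready: int, xmm_ready: int) -> int:
    """Return the cycle at which the conversion result is available.

    gpr_ready: cycle at which the gpr input is produced.
    xmm_ready: cycle at which the xmm input is produced.
    """
    # The gpr value only becomes visible to the FPU after the bypass delay.
    start = max(gpr_ready + BYPASS_DELAY_CY, xmm_ready)
    return start + OPCODE_LATENCY_CY


# A consumer of the gpr producer sees f+3 = 9cy in total.
print(result_ready_cycle(gpr_ready=0, xmm_ready=0))
# If the xmm operand is the late one, only the 3cy opcode latency is added.
print(result_ready_cycle(gpr_ready=0, xmm_ready=10))
```

Splitting the latency this way means a dependency through the xmm register is charged 3cy rather than the full `f+3`.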
I verified that latency/throughput numbers from llvm-mca have improved, and now they better match what is reported by perf on Jaguar. That being said, I still see cases where the IPC as reported by llvm-mca doesn't quite match the IPC from perf.
Example:
vcvtsi2ss %ecx, %xmm0, %xmm0 # Should tend to IPC: 0.33. Perf reports IPC: 0.25 (one cvt every 4cy).
I suspect that local forwarding might be disabled for it; it looks like users have to wait for an extra +1cy. That would explain the 0.25. For now I decided to go with what is in the documents, so we always assume a +3cy latency.
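The IPC figures in the example follow from simple arithmetic, sketched below under the assumption that the only loop-carried dependency is through `%xmm0` and that one conversion is executed per iteration. The helper name is hypothetical.

```python
# Hypothetical helper: steady-state IPC of a loop whose throughput is
# limited by a single loop-carried dependency chain.
def steady_state_ipc(chain_latency_cy: int, instructions_per_iteration: int = 1) -> float:
    """IPC when each iteration must wait chain_latency_cy for the previous one."""
    return instructions_per_iteration / chain_latency_cy


# Documented +3cy latency: one cvt every 3cy.
print(round(steady_state_ipc(3), 2))  # 0.33
# Suspected extra +1cy (no local forwarding): one cvt every 4cy.
print(steady_state_ipc(4))  # 0.25
```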
Latency for the RM variants has changed (it has slightly improved). However, we need another patch to fix the number of opcodes (it should be 1, not 2).
https://reviews.llvm.org/D57148
Files:
lib/Target/X86/X86InstrSSE.td
lib/Target/X86/X86ScheduleBtVer2.td
test/CodeGen/X86/sse-schedule.ll
test/CodeGen/X86/sse2-schedule.ll
test/tools/llvm-mca/X86/BtVer2/int-to-fpu-forwarding-2.s
test/tools/llvm-mca/X86/BtVer2/resources-avx1.s
test/tools/llvm-mca/X86/BtVer2/resources-sse1.s
test/tools/llvm-mca/X86/BtVer2/resources-sse2.s
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D57148.183286.patch
Type: text/x-patch
Size: 13915 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190124/f305bc99/attachment.bin>