[PATCH] D57300: [X86][BdVer2] Transfer delays from the integer to the floating point unit.
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 1 02:56:07 PST 2019
lebedev.ri added a comment.
In D57300#1380363 <https://reviews.llvm.org/D57300#1380363>, @andreadb wrote:
> Thanks for running that experiment. There is clearly an 8-10cy delay.
>
> Out of curiosity, do you get the same latency if the insertion/extraction is at index $1 (i.e. not at index 0)?
I did; the results appear to be consistent across all three index combinations:
$ cat /tmp/snippet.s ; ./bin/llvm-exegesis -mode=latency -snippets-file=/tmp/snippet.s
# LLVM-EXEGESIS-DEFREG EAX 0
# LLVM-EXEGESIS-DEFREG XMM0 0
# LLVM-EXEGESIS-DEFREG XMM1 0
vpinsrb $1, %eax, %xmm0, %xmm1
vpextrb $1, %xmm1, %eax
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-2b8c21.o
---
mode: latency
key:
instructions:
- 'VPINSRBrr XMM1 XMM0 EAX i_0x1'
- 'VPEXTRBrr EAX XMM1 i_0x1'
config: ''
register_initial_values:
- 'EAX=0x0'
- 'XMM0=0x0'
- 'XMM1=0x0'
cpu_name: bdver2
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
- { key: latency, value: 11.0372, per_snippet_value: 22.0744 }
error: ''
info: ''
assembled_snippet: B8000000004883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F04244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F0C244883C410C4E37920C801C4E37914C801C4E37920C801C4E37914C801C4E37920C801C4E37914C801C4E37920C801C4E37914C801C4E37920C801C4E37914C801C4E37920C801C4E37914C801C4E37920C801C4E37914C801C4E37920C801C4E37914C801C3
...
$ cat /tmp/snippet.s ; ./bin/llvm-exegesis -mode=latency -snippets-file=/tmp/snippet.s
# LLVM-EXEGESIS-DEFREG EAX 0
# LLVM-EXEGESIS-DEFREG XMM0 0
# LLVM-EXEGESIS-DEFREG XMM1 0
vpinsrb $0, %eax, %xmm0, %xmm1
vpextrb $1, %xmm1, %eax
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-3b8c6f.o
---
mode: latency
key:
instructions:
- 'VPINSRBrr XMM1 XMM0 EAX i_0x0'
- 'VPEXTRBrr EAX XMM1 i_0x1'
config: ''
register_initial_values:
- 'EAX=0x0'
- 'XMM0=0x0'
- 'XMM1=0x0'
cpu_name: bdver2
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
- { key: latency, value: 11.0304, per_snippet_value: 22.0608 }
error: ''
info: ''
assembled_snippet: B8000000004883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F04244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F0C244883C410C4E37920C800C4E37914C801C4E37920C800C4E37914C801C4E37920C800C4E37914C801C4E37920C800C4E37914C801C4E37920C800C4E37914C801C4E37920C800C4E37914C801C4E37920C800C4E37914C801C4E37920C800C4E37914C801C3
...
$ cat /tmp/snippet.s ; ./bin/llvm-exegesis -mode=latency -snippets-file=/tmp/snippet.s
# LLVM-EXEGESIS-DEFREG EAX 0
# LLVM-EXEGESIS-DEFREG XMM0 0
# LLVM-EXEGESIS-DEFREG XMM1 0
vpinsrb $1, %eax, %xmm0, %xmm1
vpextrb $0, %xmm1, %eax
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-5f6929.o
---
mode: latency
key:
instructions:
- 'VPINSRBrr XMM1 XMM0 EAX i_0x1'
- 'VPEXTRBrr EAX XMM1 i_0x0'
config: ''
register_initial_values:
- 'EAX=0x0'
- 'XMM0=0x0'
- 'XMM1=0x0'
cpu_name: bdver2
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
- { key: latency, value: 11.0333, per_snippet_value: 22.0666 }
error: ''
info: ''
assembled_snippet: B8000000004883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F04244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F0C244883C410C4E37920C801C4E37914C800C4E37920C801C4E37914C800C4E37920C801C4E37914C800C4E37920C801C4E37914C800C4E37920C801C4E37914C800C4E37920C801C4E37914C800C4E37920C801C4E37914C800C4E37920C801C4E37914C800C3
...
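As a rough sanity check on the three runs above, here is a small Python sketch that divides each per_snippet_value by the two instructions in the snippet and reports the spread between runs. The values are copied from the reports; the variable names and the dictionary layout are illustrative only and are not part of llvm-exegesis. It assumes the reported latency is simply the per-snippet figure divided by the instruction count, which is what the numbers suggest.

```python
# Values copied from the three llvm-exegesis reports above.
# The field names and list layout are hypothetical, chosen for illustration.
runs = [
    {"insert_idx": 1, "extract_idx": 1, "latency": 11.0372, "per_snippet": 22.0744},
    {"insert_idx": 0, "extract_idx": 1, "latency": 11.0304, "per_snippet": 22.0608},
    {"insert_idx": 1, "extract_idx": 0, "latency": 11.0333, "per_snippet": 22.0666},
]

# Assumption: each snippet is one vpinsrb followed by one vpextrb.
INSTRUCTIONS_PER_SNIPPET = 2

# Per-instruction latency derived from the per-snippet measurement.
per_instruction = [r["per_snippet"] / INSTRUCTIONS_PER_SNIPPET for r in runs]

# Difference between the slowest and fastest index combination.
spread = max(per_instruction) - min(per_instruction)

for r, lat in zip(runs, per_instruction):
    print(f"vpinsrb ${r['insert_idx']} / vpextrb ${r['extract_idx']}: "
          f"{lat:.4f} cycles per instruction")
print(f"spread across runs: {spread:.4f} cycles")
```

If the spread stays well below one cycle, the immediate index does not appear to affect the measured round-trip latency.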
> That being said. I think this change is good, and it is consistent with the latency value defined for the WriteVecMoveToGpr and WriteVecMoveFromGpr.
I suspect `ReadFpu2Int` will be introduced too?
> So, LGTM
Thank you for the review.
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D57300/new/
https://reviews.llvm.org/D57300