[PATCH] D57300: [X86][BdVer2] Transfer delays from the integer to the floating point unit.

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 30 02:35:10 PST 2019


lebedev.ri added a comment.

In D57300#1375602 <https://reviews.llvm.org/D57300#1375602>, @andreadb wrote:

>         2e:       41 bf 00 00 00 00       mov    $0x0,%r15d
>         34:       c4 c3 41 20 ff 01       vpinsrb $0x1,%r15d,%xmm7,%xmm7
>         3a:       c4 c3 41 20 ff 01       vpinsrb $0x1,%r15d,%xmm7,%xmm7
>   ....
>       ea88:       c4 c3 41 20 ff 01       vpinsrb $0x1,%r15d,%xmm7,%xmm7
>
>
> If there is really a bypass delay, then that code snippet is not going to expose it.
>  The real bottleneck in that code snippet is the dependency on %xmm7. R15 is only set once at the beginning by a zero-move, and then never updated again.
>
> In this case, each cycle the scheduler issues a uOp that moves R15 to the FPU. However, the vpinsrb can only be issued every other cycle due to the dependency on XMM7. That means, in the long run, any bypass delay is going to be hidden by the latency caused by the data dependency on XMM7.
>  Basically, that code snippet is not good to measure those kinds of delays...


Very nice observation.
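
To spell out the argument, here is a toy critical-path model of the quoted snippet. It is only a sketch: the latency and delay constants are hypothetical placeholders, not measured bdver2 values, and `chain_cycles` is a made-up helper. The 2-cycle figure is taken from the "every other cycle" remark in the quote.

```python
def chain_cycles(n_iters, per_iter_latency, one_time_delay=0):
    """Cycles needed by a serial dependency chain of n_iters instructions.

    per_iter_latency: latency every instruction pays on the critical path.
    one_time_delay:   latency exposed only once, before the chain starts.
    """
    return one_time_delay + n_iters * per_iter_latency


# Hypothetical numbers, chosen only to show the shape of the argument.
INSERT_LATENCY = 2   # assumed vpinsrb latency through its XMM operand
BYPASS_DELAY = 10    # assumed integer-to-FP transfer delay
N = 10000            # num_repetitions used by llvm-exegesis

# %r15d is written once and never again, so its transfers to the FPU do not
# depend on the %xmm7 chain. Only the first transfer is exposed; the later
# ones overlap with the chain.
without_bypass = chain_cycles(N, INSERT_LATENCY)
with_bypass = chain_cycles(N, INSERT_LATENCY, one_time_delay=BYPASS_DELAY)

per_inst_without = without_bypass / N
per_inst_with = with_bypass / N
print(per_inst_without, per_inst_with)
```

Under these assumptions the per-instruction figure moves by only `BYPASS_DELAY / N` cycles, which is well below measurement noise. That is why the transfer has to sit on the loop-carried path to become visible.
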
Let's //try// something better.

  $ cat /tmp/snippet.s ; ./bin/llvm-exegesis -mode=latency -snippets-file=/tmp/snippet.s
  # LLVM-EXEGESIS-DEFREG EAX 0
  # LLVM-EXEGESIS-DEFREG XMM0 0
  # LLVM-EXEGESIS-DEFREG XMM1 0
  vpinsrb $0, %eax, %xmm0, %xmm1
  vpextrb $0, %xmm1, %eax
  Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-a71a33.o
  ---
  mode:            latency
  key:             
    instructions:    
      - 'VPINSRBrr XMM1 XMM0 EAX i_0x0'
      - 'VPEXTRBrr EAX XMM1 i_0x0'
    config:          ''
    register_initial_values: 
      - 'EAX=0x0'
      - 'XMM0=0x0'
      - 'XMM1=0x0'
  cpu_name:        bdver2
  llvm_triple:     x86_64-unknown-linux-gnu
  num_repetitions: 10000
  measurements:    
    - { key: latency, value: 11.0282, per_snippet_value: 22.0564 }
  error:           ''
  info:            ''
  assembled_snippet: B8000000004883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F04244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F0C244883C410C4E37920C800C4E37914C800C4E37920C800C4E37914C800C4E37920C800C4E37914C800C4E37920C800C4E37914C800C4E37920C800C4E37914C800C4E37920C800C4E37914C800C4E37920C800C4E37914C800C4E37920C800C4E37914C800C3
  ...
  $ /usr/bin/objdump -d /tmp/snippet-a71a33.o
  
  /tmp/snippet-a71a33.o:     file format elf64-x86-64
  
  
  Disassembly of section .text:
  
  0000000000000000 <foo>:
         0:       b8 00 00 00 00          mov    $0x0,%eax
         5:       48 83 ec 10             sub    $0x10,%rsp
         9:       c7 04 24 00 00 00 00    movl   $0x0,(%rsp)
        10:       c7 44 24 04 00 00 00    movl   $0x0,0x4(%rsp)
        17:       00 
        18:       c7 44 24 08 00 00 00    movl   $0x0,0x8(%rsp)
        1f:       00 
        20:       c7 44 24 0c 00 00 00    movl   $0x0,0xc(%rsp)
        27:       00 
        28:       c5 fa 6f 04 24          vmovdqu (%rsp),%xmm0
        2d:       48 83 c4 10             add    $0x10,%rsp
        31:       48 83 ec 10             sub    $0x10,%rsp
        35:       c7 04 24 00 00 00 00    movl   $0x0,(%rsp)
        3c:       c7 44 24 04 00 00 00    movl   $0x0,0x4(%rsp)
        43:       00 
        44:       c7 44 24 08 00 00 00    movl   $0x0,0x8(%rsp)
        4b:       00 
        4c:       c7 44 24 0c 00 00 00    movl   $0x0,0xc(%rsp)
        53:       00 
        54:       c5 fa 6f 0c 24          vmovdqu (%rsp),%xmm1
        59:       48 83 c4 10             add    $0x10,%rsp
        5d:       c4 e3 79 20 c8 00       vpinsrb $0x0,%eax,%xmm0,%xmm1
        63:       c4 e3 79 14 c8 00       vpextrb $0x0,%xmm1,%eax
  ...
      eab1:       c4 e3 79 20 c8 00       vpinsrb $0x0,%eax,%xmm0,%xmm1
      eab7:       c4 e3 79 14 c8 00       vpextrb $0x0,%xmm1,%eax
      eabd:       c3                      retq   

Though I suppose that still has the dependency on `xmm1`.
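
As a rough sanity check on the report above: the snippet is a closed loop (`eax` -> `xmm1` -> `eax`), so as I read it the per-snippet value is the cost of one full round trip, i.e. the vpinsrb side plus the vpextrb side. The measurement alone cannot split the two. The sketch below just redoes that arithmetic; the "assumed extract latency" values are hypothetical, and `insert_side` is a made-up helper.

```python
# Values copied from the llvm-exegesis report above.
PER_INSTRUCTION = 11.0282
PER_SNIPPET = 22.0564
SNIPPET_LENGTH = 2  # vpinsrb + vpextrb

# One round trip eax -> xmm1 -> eax costs the per-instruction value
# times the number of instructions in the snippet.
round_trip = PER_INSTRUCTION * SNIPPET_LENGTH


def insert_side(round_trip_cycles, assumed_extract_latency):
    """Cycles left for the GPR -> XMM direction, given a guess for XMM -> GPR."""
    return round_trip_cycles - assumed_extract_latency


# Hypothetical guesses for the vpextrb side; none of these is a measured value.
for assumed in (1, 5, 10):
    print(assumed, round(insert_side(round_trip, assumed), 4))
```

To attribute the cycles to one direction, the latency of the other direction would have to come from a separate measurement.
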


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57300/new/

https://reviews.llvm.org/D57300




