[llvm-bugs] [Bug 32439] New: Poor instruction scheduling for salsa20 cypher hot loop

via llvm-bugs llvm-bugs at lists.llvm.org
Mon Mar 27 13:51:23 PDT 2017


            Bug ID: 32439
           Summary: Poor instruction scheduling for salsa20 cypher hot loop
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Common Code Generator Code
          Assignee: unassignedbugs at nondot.org
          Reporter: davide at freebsd.org
                CC: atrick at apple.com, efriedma at codeaurora.org,
                    llvm-bugs at lists.llvm.org, matze at braunis.de,
                    qcolombet at apple.com

Created attachment 18180
  --> https://bugs.llvm.org/attachment.cgi?id=18180&action=edit
dag dump pre-scheduling

This is the salsa20 benchmark from the test-suite (SingleSource).
I'm not sure whether the scheduling model can be improved or whether this is
a general issue with the instruction scheduler heuristics.
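
For readers unfamiliar with the kernel, here is a minimal sketch of the
Salsa20 quarter-round as given in the Salsa20 specification. It is not copied
from the test-suite source, which may be written differently, and the helper
names rotl32 and quarter_round are made up for illustration. Each step is an
add followed by an xor with a rotated value. A left-rotate by 7, 9 or 13 is
the same as a right-rotate by 25, 23 or 19, which is the form that an AArch64
`eor ..., ror #N` uses.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by k bits (0 < k < 32). */
static uint32_t rotl32(uint32_t v, unsigned k) {
    return (v << k) | (v >> (32 - k));
}

/* One Salsa20 quarter-round on four state words (hypothetical helper).
   Every line depends on the result of the line before it, so a single
   quarter-round is a strictly serial add -> eor chain. */
static void quarter_round(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d) {
    *b ^= rotl32(*a + *d, 7);   /* add + eor ..., ror #25 */
    *c ^= rotl32(*b + *a, 9);   /* add + eor ..., ror #23 */
    *d ^= rotl32(*c + *b, 13);  /* add + eor ..., ror #19 */
    *a ^= rotl32(*d + *c, 18);  /* add + eor ..., ror #14 */
}
```

A Salsa20 round applies four such quarter-rounds to disjoint words, so the
only instruction-level parallelism available is across the four chains, not
within one.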

When passing `-O3 -mcpu=cortex-a53 -mtune=cortex-a53`, LLVM generates the
following code for the hot loop (subset of instructions):

  400638:       0b100106        add     w6, w8, w16
  40063c:       0b0d0127        add     w7, w9, w13
  400640:       4ac66400        eor     w0, w0, w6, ror #25
  400644:       0b120166        add     w6, w11, w18
  400648:       4ac76463        eor     w3, w3, w7, ror #25
  40064c:       4ac66442        eor     w2, w2, w6, ror #25
  400650:       0b100006        add     w6, w0, w16
  400654:       0b0d0067        add     w7, w3, w13
  400658:       4ac65d8c        eor     w12, w12, w6, ror #23
  40065c:       0b120046        add     w6, w2, w18
  400660:       4ac75def        eor     w15, w15, w7, ror #23
  400664:       4ac65c84        eor     w4, w4, w6, ror #23
  400668:       0b000186        add     w6, w12, w0
  40066c:       0b0301e7        add     w7, w15, w3
  400670:       4ac64d08        eor     w8, w8, w6, ror #19
  400674:       0b020086        add     w6, w4, w2
  400678:       4ac74d29        eor     w9, w9, w7, ror #19
  40067c:       4ac64d6b        eor     w11, w11, w6, ror #19

while GCC 7 generates:

  400688:       0b020175        add     w21, w11, w2
  40068c:       0b040214        add     w20, w16, w4
  400690:       0b050233        add     w19, w17, w5
  400694:       0b030192        add     w18, w12, w3
  400698:       4ad56508        eor     w8, w8, w21, ror #25
  40069c:       4ad464e7        eor     w7, w7, w20, ror #25
  4006a0:       4ad364c6        eor     w6, w6, w19, ror #25
  4006a4:       4ad26529        eor     w9, w9, w18, ror #25
  4006a8:       0b0b0115        add     w21, w8, w11
  4006ac:       0b1000f4        add     w20, w7, w16
  4006b0:       0b1100d3        add     w19, w6, w17
  4006b4:       0b0c0132        add     w18, w9, w12
  4006b8:       4ad55d4a        eor     w10, w10, w21, ror #23
  4006bc:       4ad45dad        eor     w13, w13, w20, ror #23
  4006c0:       4ad35def        eor     w15, w15, w19, ror #23
  4006c4:       4ad25dce        eor     w14, w14, w18, ror #23

LLVM's schedule results in many more stalls and a ~20% runtime regression
compared with GCC's. The pre-scheduling SelectionDAG for the basic block and
the initial IR are attached.
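
As a rough way to quantify the difference between the two listings above, the
sketch below measures the smallest number of instruction slots between a
register write and the next read of that register. This is only a proxy and
not a model of the Cortex-A53 pipeline or of LLVM's scheduler. The Insn type,
the function name and the table encoding (wN stored as N, the rotated eor
operand treated as a plain source) are assumptions made for illustration.

```c
#include <stddef.h>

/* One instruction: destination and two source registers (wN -> N). */
typedef struct { int dst, src1, src2; } Insn;

/* Smallest distance, in instruction slots, between a write to a register
   and a later read of it inside the listing. Returns 0 if no such pair. */
static size_t min_raw_distance(const Insn *code, size_t n) {
    size_t last_write[32];
    int written[32] = {0};
    size_t best = 0;
    for (size_t i = 0; i < n; i++) {
        int srcs[2] = { code[i].src1, code[i].src2 };
        for (int s = 0; s < 2; s++) {
            int r = srcs[s];
            if (written[r]) {
                size_t d = i - last_write[r];
                if (best == 0 || d < best) best = d;
            }
        }
        /* Record the write after the reads, so "eor w0, w0, w6" does not
           count as depending on itself. */
        written[code[i].dst] = 1;
        last_write[code[i].dst] = i;
    }
    return best;
}

/* The LLVM listing from this report, transcribed as {dst, src1, src2}. */
static const Insn llvm_sched[] = {
    {6, 8, 16},  {7, 9, 13},  {0, 0, 6},   {6, 11, 18}, {3, 3, 7},   {2, 2, 6},
    {6, 0, 16},  {7, 3, 13},  {12, 12, 6}, {6, 2, 18},  {15, 15, 7}, {4, 4, 6},
    {6, 12, 0},  {7, 15, 3},  {8, 8, 6},   {6, 4, 2},   {9, 9, 7},   {11, 11, 6},
};

/* The GCC 7 listing from this report, transcribed the same way. */
static const Insn gcc_sched[] = {
    {21, 11, 2}, {20, 16, 4}, {19, 17, 5}, {18, 12, 3},
    {8, 8, 21},  {7, 7, 20},  {6, 6, 19},  {9, 9, 18},
    {21, 8, 11}, {20, 7, 16}, {19, 6, 17}, {18, 9, 12},
    {10, 10, 21}, {13, 13, 20}, {15, 15, 19}, {14, 14, 18},
};
```

On the listings as quoted, min_raw_distance gives 2 for the LLVM schedule and
4 for the GCC schedule: GCC issues all four adds before the first dependent
eor, while LLVM places some eor instructions only two slots after the add
that feeds them.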
