[llvm-bugs] [Bug 50484] New: [SchedModel][Cortex-A55/Cortex-A53] Unexpected write latency for some ldr instructions

via llvm-bugs llvm-bugs at lists.llvm.org
Wed May 26 07:30:18 PDT 2021


https://bugs.llvm.org/show_bug.cgi?id=50484

            Bug ID: 50484
           Summary: [SchedModel][Cortex-A55/Cortex-A53] Unexpected write
                    latency for some ldr instructions
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: AArch64
          Assignee: unassignedbugs at nondot.org
          Reporter: andrea.dibiagio at gmail.com
                CC: arnaud.degrandmaison at arm.com,
                    llvm-bugs at lists.llvm.org, smithp352 at googlemail.com,
                    Ties.Stuij at arm.com

Found while investigating an issue in the in-order processor simulation
pipeline of MCA.

Example:

> $ cat foo.s
> ldr     w4, [x6], #4


```
llvm-mca -mtriple=aarch64 --mcpu=cortex-a55 -debug-only=llvm-mca foo.s
```

```
                Opcode Name= LDRWpost
                SchedClassID=924
                Resource Mask=0x00000000000020, Reserved=0, #Units=1, cy=1
                Buffer Mask=0x00000000000020
                 Used Units=0x00000000000020
                Used Groups=0x00000000000000
                [Def]    OpIdx=0, Latency=0, WriteResourceID=0
                [Def]    OpIdx=1, Latency=4, WriteResourceID=0
                [Use]    OpIdx=2, UseIndex=0
                MaxLatency=4
                NumMicroOps=2
```

llvm-mca decodes that particular LDR as opcode LDRWpost.
Its first write (OpIdx=0) is reported with a latency of zero.
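
A quick way to look for other opcodes with the same symptom might be to scan the
debug dump for `[Def]` entries with `Latency=0`. The script below is only a
sketch: the regular expression is written against the dump quoted above (the
format may differ between LLVM versions), and the helper name
`zero_latency_defs` is made up for illustration.

```python
import re

# Excerpt of the -debug-only=llvm-mca dump quoted in this report.
DEBUG_OUTPUT = """\
Opcode Name= LDRWpost
SchedClassID=924
[Def]    OpIdx=0, Latency=0, WriteResourceID=0
[Def]    OpIdx=1, Latency=4, WriteResourceID=0
[Use]    OpIdx=2, UseIndex=0
MaxLatency=4
NumMicroOps=2
"""

# Assumed line shape: "[Def]    OpIdx=<n>, Latency=<n>, ..."
DEF_RE = re.compile(r"\[Def\]\s+OpIdx=(\d+), Latency=(\d+)")


def zero_latency_defs(text):
    """Return (opcode, operand index) pairs for every Def with Latency=0."""
    opcode = None
    hits = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Opcode Name="):
            # Remember which opcode the following Def/Use lines belong to.
            opcode = line.split("=", 1)[1].strip()
            continue
        m = DEF_RE.match(line)
        if m and int(m.group(2)) == 0:
            hits.append((opcode, int(m.group(1))))
    return hits


for opcode, op_idx in zero_latency_defs(DEBUG_OUTPUT):
    print(f"{opcode}: Def OpIdx={op_idx} has Latency=0")
```

Feeding it the full dump for a larger input file would list every opcode whose
write has no latency assigned, which may help narrow down which instructions
are affected.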

That zero-latency write confuses the in-order stage and is the root cause of
the issue visible in the following timeline:

```
Timeline view:
                    0123456789
Index     0123456789          012345

[0,0]     DeeE .    .    .    .    .   ldr      w4, [x6], #4
[0,1]     .DeeeE    .    .    .    .   str      w0, [x21]
[1,0]     .    DeeE .    .    .    .   ldr      w4, [x6], #4
[1,1]     .    .DeeeE    .    .    .   str      w0, [x21]
[2,0]     .    .    DeeE .    .    .   ldr      w4, [x6], #4
[2,1]     .    .    .DeeeE    .    .   str      w0, [x21]
[3,0]     .    .    .    DeeE .    .   ldr      w4, [x6], #4
[3,1]     .    .    .    .DeeeE    .   str      w0, [x21]
[4,0]     .    .    .    .    DeeE .   ldr      w4, [x6], #4
[4,1]     .    .    .    .    .DeeeE   str      w0, [x21]
```

Assuming there is no aliasing between loads and stores, each load could start
a couple of cycles earlier. However, MCA artificially delays each load so that
its write does not happen before the preceding store finishes executing.
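
The delay can be read directly off the timeline. The sketch below copies the
rows from the timeline view and extracts, for each instruction, the cycle of
its `D` (dispatched) and `E` (executed) markers. It assumes that the column
offset within the timeline equals the cycle number; the names `ROW_RE` and
`dispatch_and_end` are made up for illustration.

```python
import re

# Rows copied verbatim from the timeline view in this report.
TIMELINE_ROWS = [
    "[0,0]     DeeE .    .    .    .    .   ldr      w4, [x6], #4",
    "[0,1]     .DeeeE    .    .    .    .   str      w0, [x21]",
    "[1,0]     .    DeeE .    .    .    .   ldr      w4, [x6], #4",
    "[1,1]     .    .DeeeE    .    .    .   str      w0, [x21]",
    "[2,0]     .    .    DeeE .    .    .   ldr      w4, [x6], #4",
    "[2,1]     .    .    .DeeeE    .    .   str      w0, [x21]",
    "[3,0]     .    .    .    DeeE .    .   ldr      w4, [x6], #4",
    "[3,1]     .    .    .    .DeeeE    .   str      w0, [x21]",
    "[4,0]     .    .    .    .    DeeE .   ldr      w4, [x6], #4",
    "[4,1]     .    .    .    .    .DeeeE   str      w0, [x21]",
]

# "[iteration,index]" prefix, padding, then the timeline followed by the asm.
ROW_RE = re.compile(r"^\[(\d+),(\d+)\]\s+(.*)$")


def dispatch_and_end(row):
    """Return (iteration, index, dispatch cycle, end cycle) for one row."""
    m = ROW_RE.match(row)
    iteration, index, rest = int(m.group(1)), int(m.group(2)), m.group(3)
    d = rest.index("D")      # column of the dispatch marker
    e = rest.index("E", d)   # column of the "executed" marker
    return iteration, index, d, e


events = [dispatch_and_end(r) for r in TIMELINE_ROWS]
load_dispatch = [d for (_, idx, d, _) in events if idx == 0]
store_end = [e for (_, idx, _, e) in events if idx == 1]

# Compare each store's end cycle with the dispatch cycle of the next load.
for end, nxt in zip(store_end, load_dispatch[1:]):
    print(f"store ends at cycle {end}, next load dispatched at cycle {nxt}")
```

In this timeline every load is dispatched in the same cycle in which the
preceding store reaches `E`, which is consistent with the load being held back
by its zero-latency write.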

Note: the same problem can be seen for cortex-a53.

Are we missing some InstRW?
