[all-commits] [llvm/llvm-project] 5578ec: [MCA] Fixed a bug where loads and stores were some...

Andrea Di Biagio via All-commits all-commits at lists.llvm.org
Tue May 5 02:27:02 PDT 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 5578ec32f9c4fef46adce52a2e3d22bf409b3d2c
      https://github.com/llvm/llvm-project/commit/5578ec32f9c4fef46adce52a2e3d22bf409b3d2c
  Author: Andrea Di Biagio <andrea.dibiagio at sony.com>
  Date:   2020-05-05 (Tue, 05 May 2020)

  Changed paths:
    M llvm/include/llvm/MCA/HardwareUnits/LSUnit.h
    M llvm/lib/MCA/HardwareUnits/LSUnit.cpp
    M llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st1.s
    M llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st2.s
    M llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st3.s
    M llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st4.s
    M llvm/test/tools/llvm-mca/AArch64/Exynos/float-store.s
    M llvm/test/tools/llvm-mca/AArch64/Exynos/store.s
    M llvm/test/tools/llvm-mca/X86/Barcelona/load-store-throughput.s
    M llvm/test/tools/llvm-mca/X86/Barcelona/store-throughput.s
    M llvm/test/tools/llvm-mca/X86/BdVer2/load-store-throughput.s
    M llvm/test/tools/llvm-mca/X86/BdVer2/memcpy-like-test.s
    M llvm/test/tools/llvm-mca/X86/BdVer2/store-throughput.s
    A llvm/test/tools/llvm-mca/X86/BtVer2/independent-load-stores.s
    M llvm/test/tools/llvm-mca/X86/BtVer2/xadd.s
    A llvm/test/tools/llvm-mca/X86/Haswell/independent-load-stores.s
    A llvm/test/tools/llvm-mca/X86/SkylakeClient/independent-load-stores.s
    A llvm/test/tools/llvm-mca/X86/SkylakeServer/independent-load-stores.s

  Log Message:
  -----------
  [MCA] Fixed a bug where loads and stores were sometimes incorrectly marked as dependent. Fixes PR45793.

This fixes a regression introduced by a very old commit 280ac1fd1dc35 (was
llvm-svn 361950).

Commit 280ac1fd1dc35 redesigned the logic in the LSUnit with the goal of
speeding up isReady() queries, and stabilising the LSUnit API (while also making
the load store unit more customisable).

The concept of MemoryGroup (effectively an alias set) was added by that commit
to better describe and track dependencies between memory operations.  However,
that concept was not just used for alias dependencies, but it was also used for
describing memory "order" dependencies (enforced by the memory consistency
model).

Instructions in the same memory group were considered "equivalent", that is,
independent operations that can potentially execute in parallel.  The problem
was that the cost of a dependency (in number of cycles) should have been
different for an "order" dependency.  Instructions in an order dependency only
have to wait until their predecessors are "issued" to an underlying pipeline
(rather than until their predecessors have been fully executed).  For simple
"order" dependencies, this was effectively introducing an artificial delay on
the "issue" of independent loads and stores.
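
To illustrate the difference in cost between the two kinds of dependency, here
is a minimal toy model. It is not the LSUnit code: the names DepKind, Dep, MemOp
and earliestIssue are hypothetical, pipeline resources are assumed unlimited,
and an order-dependent operation is assumed to be able to issue in the same
cycle as its predecessor.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical toy model; none of these names exist in llvm/MCA.
enum class DepKind {
  Order, // successor waits until the predecessor has been issued
  Data   // successor waits until the predecessor has fully executed
};

struct Dep {
  std::size_t Pred; // index of the predecessor (must be smaller than ours)
  DepKind Kind;
};

struct MemOp {
  unsigned Latency;      // cycles from issue to completion
  std::vector<Dep> Deps; // dependencies on earlier operations
};

// Computes the earliest cycle in which each operation can be issued.
// Assumption: there are no resource limits, so only dependencies delay issue.
std::vector<unsigned> earliestIssue(const std::vector<MemOp> &Ops) {
  std::vector<unsigned> Issue(Ops.size(), 0);
  for (std::size_t I = 0; I < Ops.size(); ++I) {
    for (const Dep &D : Ops[I].Deps) {
      unsigned Ready = Issue[D.Pred];
      if (D.Kind == DepKind::Data)
        Ready += Ops[D.Pred].Latency; // pay the predecessor's full latency
      Issue[I] = std::max(Issue[I], Ready);
    }
  }
  return Issue;
}
```

In this model, a chain of three stores with latency 3 gets issue cycles 0, 0, 0
when linked by order dependencies, and 0, 3, 6 when the same links are treated
as data dependencies. The second case corresponds to the artificial delay
described above.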

This patch fixes the issue and adds a new test named 'independent-load-stores.s'
to a bunch of x86 targets. That test contains the reproducer posted by Fabian
Ritter on PR45793.

I had to rerun the update-mca-tests script on several files. To avoid expected
regressions on some Exynos tests, I have added a -noalias=false flag (to match
the old strict behavior on latencies).

Some tests for processor Barcelona are improved/fixed by this change and they
now show better results.  In a few tests we were incorrectly counting the time
spent by instructions in a scheduler queue.  In one case in particular we now
correctly see a store executed out of order.  That test was affected by the same
underlying issue reported as PR45793.

Reviewers: mattd

Differential Revision: https://reviews.llvm.org/D79351
