[all-commits] [llvm/llvm-project] f92198: Rebase: [Facebook] Add clang driver options to tes...

Mon Jul 11 09:32:54 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: f921985a29fc9787b3ed98dbc897146cc3fd91f7
      https://github.com/llvm/llvm-project/commit/f921985a29fc9787b3ed98dbc897146cc3fd91f7
  Author: Amir Ayupov <aaupov at fb.com>
  Date:   2022-07-11 (Mon, 11 Jul 2022)

  Changed paths:
    M clang/include/clang/Driver/Options.td
    M clang/lib/Driver/ToolChains/Gnu.cpp
    M cross-project-tests/lit.cfg.py
    M cross-project-tests/lit.site.cfg.py.in
    M lldb/test/API/lit.cfg.py
    M lldb/test/API/lit.site.cfg.py.in
    M lldb/test/Shell/helper/toolchain.py
    M lldb/test/Shell/lit.site.cfg.py.in
    M llvm/CMakeLists.txt

  Log Message:
  -----------
  Rebase: [Facebook] Add clang driver options to test debug info and BOLT

Summary:
This is an essential piece of infrastructure for continuously testing
debug info with BOLT. Changing a test repo alone is not enough, because
the debuginfo tests themselves must be changed to call BOLT; hence this
diff needs to sit in our open-source repo. When upstreaming to LLVM,
however, it should stay BOLT-only, outside of LLVM: run git diff, check
all folders modified by our commits, and discard this one (leaving it as
an internal diff).

To test BOLT in debuginfo tests, configure it with -DLLVM_TEST_BOLT=ON.
Then run check-lldb and check-debuginfo.

Manual rebase conflict history:
https://phabricator.intern.facebook.com/D29205224
https://phabricator.intern.facebook.com/D29564078
https://phabricator.intern.facebook.com/D33289118
https://phabricator.intern.facebook.com/D34957174

Test Plan:
tested locally
Configured with:
-DLLVM_ENABLE_PROJECTS="clang;lld;lldb;compiler-rt;bolt;debuginfo-tests"
-DLLVM_TEST_BOLT=ON
Ran test suite with:
ninja check-debuginfo
ninja check-lldb

Reviewers: #llvm-bolt

Subscribers: ayermolo, phabricatorlinter

Differential Revision: https://phabricator.intern.facebook.com/D35317341

Tasks: T92898286

  Commit: 6d0528636ae54fba75938a79ae7a98dfcc949f72
      https://github.com/llvm/llvm-project/commit/6d0528636ae54fba75938a79ae7a98dfcc949f72
  Author: Rafael Auler <rafaelauler at fb.com>
  Date:   2022-07-11 (Mon, 11 Jul 2022)

  Changed paths:
    M bolt/lib/Core/BinaryEmitter.cpp
    M llvm/include/llvm/MC/MCFragment.h
    M llvm/include/llvm/MC/MCObjectStreamer.h
    M llvm/include/llvm/MC/MCStreamer.h
    M llvm/lib/MC/MCAssembler.cpp
    M llvm/lib/MC/MCFragment.cpp
    M llvm/lib/MC/MCObjectStreamer.cpp
    M llvm/lib/MC/MCStreamer.cpp
    M llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
    A llvm/test/MC/X86/directive-avoid_end_align.s

  Log Message:
  -----------
  Rebase: [Facebook] [MC] Introduce NeverAlign fragment type

Summary:
Introduce NeverAlign fragment type.

The intended usage of this fragment is to insert it before a pair of
macro-op fusion eligible instructions. NeverAlign fragment ensures that
the next fragment (first instruction in the pair) does not end at a
given alignment boundary by emitting a minimal size nop if necessary.

In effect, it ensures that a pair of macro-fusible instructions is not
split by a given alignment boundary, which is a precondition for
macro-op fusion in modern Intel Cores (64B = cache line size, see Intel
Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode
Pipeline: Macro-Fusion).
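
The padding rule described above can be sketched as follows. This is only
an illustration and not the code added to llvm/lib/MC: the helper name
never_align_padding is hypothetical, the 64-byte default comes from the
cache line size quoted above, and a 1-byte minimal nop (as on x86) is
assumed.

```python
def never_align_padding(start_offset: int, insn_size: int, boundary: int = 64) -> int:
    """Return how many nop bytes to emit before the first instruction of a pair.

    Hypothetical helper for illustration only. Assumes the minimal nop is
    one byte long, as on x86.
    """
    if boundary <= 1:
        raise ValueError("boundary must be greater than 1")
    end_offset = start_offset + insn_size
    # The pair is split by the boundary exactly when the first instruction
    # ends on it, i.e. its end offset is a multiple of the boundary.
    if end_offset % boundary == 0:
        return 1  # one minimal nop shifts the end offset past the boundary
    return 0


# An instruction of 4 bytes starting at offset 60 would end at offset 64.
padding = never_align_padding(60, 4)
```

With one nop emitted, the same instruction starts at offset 61 and ends at
offset 65, so it crosses the boundary instead of ending on it, which the
description above allows.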

This patch introduces functionality used by BOLT when emitting code with
MacroFusion alignment already in place.

The use case is different from BoundaryAlign and instruction bundling:
- BoundaryAlign can be extended to perform the desired alignment for the
first instruction in the macro-op fusion pair (D101817). However, this
approach has higher overhead because BoundaryAlign relies on relaxation
in the general case - see
https://reviews.llvm.org/D97982#2710638.
- Instruction bundling: the intent of the NeverAlign fragment is to prevent
the first instruction in a pair from ending at a given alignment boundary
by inserting at most one minimum-size nop. It's OK if either instruction
crosses the cache line. Padding both instructions using bundles so that
neither crosses the alignment boundary would result in excessive padding.
There's no straightforward way to request instruction bundling to avoid a
given end alignment for the first instruction in the bundle.

LLVM: https://reviews.llvm.org/D97982

Manual rebase conflict history:
https://phabricator.intern.facebook.com/D30142613

Test Plan: sandcastle

Reviewers: #llvm-bolt

Subscribers: phabricatorlinter

Differential Revision: https://phabricator.intern.facebook.com/D31361547

  Commit: 76029cc53e838e6d86b13b0c39152f474fb09263
      https://github.com/llvm/llvm-project/commit/76029cc53e838e6d86b13b0c39152f474fb09263
  Author: Maksim Panchenko <maks at fb.com>
  Date:   2022-07-11 (Mon, 11 Jul 2022)

  Changed paths:
    M bolt/include/bolt/Core/BinarySection.h
    M bolt/include/bolt/Core/Relocation.h
    M bolt/include/bolt/Rewrite/RewriteInstance.h
    M bolt/lib/Core/Relocation.cpp
    M bolt/lib/Rewrite/RewriteInstance.cpp
    R bolt/test/AArch64/Inputs/rels-exe.yaml
    R bolt/test/AArch64/Inputs/rels-so.yaml
    R bolt/test/AArch64/runtime-relocs.test

  Log Message:
  -----------
  Rebase: [Facebook] Revert "[BOLT] Update dynamic relocations from section relocations"

Summary:
This reverts commit 729d29e167a553ee1190c310b6a510db8d8731ac.

Needed as a workaround for T112872562.

Manual rebase conflict history:
https://phabricator.intern.facebook.com/D35230076
https://phabricator.intern.facebook.com/D35681740

Test Plan: sandcastle

Reviewers: #llvm-bolt

Subscribers: spupyrev

Differential Revision: https://phabricator.intern.facebook.com/D37098481

  Commit: 7228371054746fd37a729b7f7f72f4689b68e890
      https://github.com/llvm/llvm-project/commit/7228371054746fd37a729b7f7f72f4689b68e890
  Author: spupyrev <spupyrev at fb.com>
  Date:   2022-07-11 (Mon, 11 Jul 2022)

  Changed paths:
    M bolt/lib/Passes/ExtTSPReorderAlgorithm.cpp

  Log Message:
  -----------
  [BOLT] Do not merge cold and hot chains of basic blocks

There is a post-processing step in ext-tsp block reordering that merges some
blocks into chains. This maintains the original block order in the absence of
profile data and can be beneficial for code size (when fallthroughs are merged).
In the earlier version we could merge hot and cold (zero execution count)
chains, which were later split by SplitFunction.cpp (when split-all-cold=1).
This diff eliminates the redundant merging.
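
A rough sketch of the guard described above, under simplifying assumptions:
it is not the code in ExtTSPReorderAlgorithm.cpp. The names may_merge and
merge_adjacent_chains are hypothetical, a chain is modeled as a pair of
(block indices, execution count), and every pair of adjacent chains with
matching hotness is merged, which is cruder than the real post-processing.

```python
from typing import List, Tuple

# Hypothetical model of a chain: (block indices in original order, execution count).
Chain = Tuple[List[int], int]


def may_merge(count_a: int, count_b: int) -> bool:
    """Allow a merge only when both chains are cold or both are hot."""
    return (count_a == 0) == (count_b == 0)


def merge_adjacent_chains(chains: List[Chain]) -> List[Chain]:
    """Concatenate neighbouring chains, never mixing hot and cold ones."""
    merged: List[Chain] = []
    for blocks, count in chains:
        if merged and may_merge(merged[-1][1], count):
            prev_blocks, prev_count = merged[-1]
            merged[-1] = (prev_blocks + blocks, prev_count + count)
        else:
            # Hotness differs (or this is the first chain): start a new chain.
            merged.append((list(blocks), count))
    return merged


chains = [([0], 10), ([1], 5), ([2], 0), ([3], 0), ([4], 7)]
result = merge_adjacent_chains(chains)
```

In this example the cold blocks 2 and 3 end up in their own chain, so a
later splitting pass has nothing to undo.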

The change is unlikely to affect the performance of a binary in a measurable
way, as it mostly operates on cold basic blocks. However, after this diff the
impact of split-all-cold is almost negligible, so we can avoid the extra
function splitting.

Measuring on the clang binary (negative is good, positive is a regression):
**clang12**
benchmark1:  `0.0253`
benchmark2:  `-0.1843`
benchmark3:  `0.3234`
benchmark4:  `0.0333`

**clang10**
benchmark1:  `-0.2517`
benchmark2:  `-0.3703`
benchmark3:  `-0.1186`
benchmark4:  `-0.3822`

**clang7**
benchmark1:  `0.2526`
benchmark2:  `0.0500`
benchmark3:  `0.3024`
benchmark4:  `-0.0489`

**Overall**: `-0.0671 ± 0.1172` (insignificant)

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D129397

Compare: https://github.com/llvm/llvm-project/compare/c8a28ae214c0...722837105474