[all-commits] [llvm/llvm-project] 866ae6: [AArch64] [BranchRelaxation] Optimize for hot code...

Daniel Hoekwater via All-commits all-commits at lists.llvm.org
Wed Sep 6 13:48:27 PDT 2023


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 866ae69cfa73ce224944c64965d0426637e31517
      https://github.com/llvm/llvm-project/commit/866ae69cfa73ce224944c64965d0426637e31517
  Author: Daniel Hoekwater <hoekwater at google.com>
  Date:   2023-09-06 (Wed, 06 Sep 2023)

  Changed paths:
    M llvm/lib/CodeGen/BranchRelaxation.cpp
    M llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
    M llvm/test/CodeGen/AArch64/branch-relax-b.ll
    M llvm/test/CodeGen/AArch64/branch-relax-cross-section.mir

  Log Message:
  -----------
  [AArch64] [BranchRelaxation] Optimize for hot code size in AArch64 branch relaxation

On AArch64, it is safe to let the linker handle relaxation of
unconditional branches; in most cases, the destination is within range,
and the linker doesn't need to do anything. If the linker does insert
fixup code, it clobbers x16, the intra-procedure-call scratch register
(IP0), so x16 must be free across the branch before linking. If x16
isn't free but some other register is, we can relax the branch either
by spilling x16 or by using the free register for a manually inserted
indirect branch.
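
To make "within range" concrete: the A64 unconditional branch (B) encodes a
signed 26-bit word offset, so it reaches roughly +/-128 MiB from the branch
itself. The snippet below is only an illustrative sketch in Python, not code
from this patch; the function name and constant are hypothetical.

```python
# Illustrative only: models the reach of an AArch64 `B` instruction.
# B carries a signed 26-bit immediate that is scaled by 4 (instruction size).
B_IMM_BITS = 26  # hypothetical constant name, chosen for this sketch


def b_target_in_range(branch_addr: int, target_addr: int) -> bool:
    """Return True if a single B at branch_addr can reach target_addr."""
    offset = target_addr - branch_addr
    limit = 1 << (B_IMM_BITS + 2 - 1)  # 2**27 bytes, i.e. 128 MiB
    # Targets must be 4-byte aligned relative to the branch and fit
    # in the signed range [-2**27, 2**27 - 4].
    return offset % 4 == 0 and -limit <= offset < limit
```

When a check like this fails at link time, the linker is the one that inserts
the fixup code described above, which is why x16 must be free across the
branch.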

This patch builds on D145211. While that patch is for correctness, this
one is for performance of the common case. As noted in
https://reviews.llvm.org/D145211#4537173, we can trust the linker to
relax cross-section unconditional branches across which x16 is
available.

Programs that use machine function splitting care most about hot-code
performance, even at the expense of cold-code performance, so we
prioritize minimizing hot code size.

Here's a breakdown of the cases:

   Hot -> Cold [x16 is free across the branch]
     Do nothing; let the linker relax the branch.

   Cold -> Hot [x16 is free across the branch]
     Do nothing; let the linker relax the branch.

   Hot -> Cold [x16 used across the branch, but there is a free register]
     Spill x16; let the linker relax the branch.

     Spilling requires fewer instructions than manually inserting an
     indirect branch.

   Cold -> Hot [x16 used across the branch, but there is a free register]
     Manually insert an indirect branch.

     Spilling would require adding a restore block in the hot section.

   Hot -> Cold [No free regs]
     Spill x16; let the linker relax the branch.

   Cold -> Hot [No free regs]
     Spill x16 and put the restore block at the end of the hot
     function; let the linker relax the branch.
     Ex:
       [Hot section]
       func.hot:
         ... hot code...
       func.restore:
         ... restore x16 ...
         B func.hot

       [Cold section]
       func.cold:
         ... spill x16 ...
         B func.restore

     Putting the restore block at the end of the function instead of
     just before the destination increases the cost of executing the
     restore, but it avoids putting cold code in the middle of hot code.
     Since the restore is rarely executed, this is a worthwhile
     tradeoff.
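
The six cases above collapse into a small decision function. The following is
a hedged sketch in Python for readers who want the table in executable form;
it is not the code in BranchRelaxation.cpp or AArch64InstrInfo.cpp, and every
name in it (Strategy, choose_strategy, the parameter names) is hypothetical.

```python
from enum import Enum


class Strategy(Enum):
    """Hypothetical labels for the outcomes listed in the breakdown."""
    LINKER_RELAXES = "do nothing; let the linker relax the branch"
    SPILL_X16 = "spill x16; let the linker relax the branch"
    INDIRECT_BRANCH = "manually insert an indirect branch via the free register"
    SPILL_X16_RESTORE_AT_END = (
        "spill x16; put the restore block at the end of the hot function"
    )


def choose_strategy(source_is_cold: bool, x16_free: bool,
                    other_reg_free: bool) -> Strategy:
    """Pick a relaxation strategy for a cross-section unconditional branch."""
    if x16_free:
        # Either direction: the linker may clobber x16 safely.
        return Strategy.LINKER_RELAXES
    if not source_is_cold:
        # Hot -> Cold: spilling needs fewer instructions than an indirect
        # branch, whether or not another register is free.
        return Strategy.SPILL_X16
    if other_reg_free:
        # Cold -> Hot: a spill would need a restore block in the hot section.
        return Strategy.INDIRECT_BRANCH
    # Cold -> Hot with no free register: spill, but keep the restore block
    # out of the middle of the hot code.
    return Strategy.SPILL_X16_RESTORE_AT_END
```

For example, choose_strategy(source_is_cold=True, x16_free=False,
other_reg_free=True) selects the indirect branch, matching the fourth case
in the breakdown.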

Differential Revision: https://reviews.llvm.org/D156767



