[all-commits] [llvm/llvm-project] d97f99: [MachineOutliner][AArch64] NFC: Split MBBs into "o...

Mon Feb 21 15:29:50 PST 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: d97f997eb79d91b2872ac13619f49cb3a7120781
      https://github.com/llvm/llvm-project/commit/d97f997eb79d91b2872ac13619f49cb3a7120781
  Author: Jessica Paquette <jpaquette at apple.com>
  Date:   2022-02-21 (Mon, 21 Feb 2022)

  Changed paths:
    M llvm/include/llvm/CodeGen/TargetInstrInfo.h
    M llvm/lib/CodeGen/MachineOutliner.cpp
    M llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
    M llvm/lib/Target/AArch64/AArch64InstrInfo.h

  Log Message:
  -----------
  [MachineOutliner][AArch64] NFC: Split MBBs into "outlinable ranges"

We found a case in the Swift benchmarks where the MachineOutliner introduces
about a 20% compile time overhead in comparison to building without the
MachineOutliner.

The origin of this slowdown is that the benchmark has long blocks which incur
lots of LRU checks for lots of candidates.

Imagine a case like this:

```
bb:
  i1
  i2
  i3
  ...
  i123456
```

Now imagine that all of the outlining candidates appear early in the block, and
that something like, say, NZCV is defined at the end of the block.

The outliner has to check liveness for certain registers across all candidates,
because outlining from areas where those registers are used is unsafe at call
boundaries.

This is fairly wasteful because in the previously-described case, the outlining
candidates will never appear in an area where those registers are live.

To avoid this, precalculate areas where we will consider outlining from.
Anything outside of these areas is mapped to illegal and not included in the
outlining search space. This allows us to reduce the size of the outliner's
suffix tree as well, giving us a potential memory win.

By precalculating areas, we can also optimize other checks too, like whether
or not LR is live across an outlining candidate.

Doing all of this is about a 16% compile time improvement on the case.

This is likely useful for other targets (e.g. ARM + RISCV) as well, but for now,
this only implements the AArch64 path. The original "is the MBB safe" method
still works as before.