[all-commits] [llvm/llvm-project] bbbe8e: [GlobalISel][Localizer] Allow localization of a sm...

Thu Jan 11 02:57:49 PST 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: bbbe8ecc17797893fae6fc3a16425b6f26670423
      https://github.com/llvm/llvm-project/commit/bbbe8ecc17797893fae6fc3a16425b6f26670423
  Author: Amara Emerson <amara at apple.com>
  Date:   2024-01-11 (Thu, 11 Jan 2024)

  Changed paths:
    M llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h
    M llvm/include/llvm/CodeGen/GlobalISel/Localizer.h
    M llvm/lib/CodeGen/GlobalISel/Localizer.cpp
    M llvm/test/CodeGen/AArch64/GlobalISel/invoke-region.ll
    M llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-hoisted-constants.ll
    M llvm/test/CodeGen/AArch64/GlobalISel/localizer.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/divergence-divergent-i1-phis-no-lane-mask-merging.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.set.inactive.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.memmove.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/localizer.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/udiv.i64.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll

  Log Message:
  -----------
  [GlobalISel][Localizer] Allow localization of a small number of repeated phi uses. (#77566)

We previously had a heuristic that if a value V was used multiple times
in a single PHI, then to avoid potentially rematerializing into many predecessors
we bail out. The phi uses only counted as a single use in the shouldLocalize() hook
because it counted the PHI as a single instruction use, not factoring in it may
have many incoming edges.

It turns out this heuristic is slightly too pessimistic, and allowing a small number
of these uses to be localized can improve code size due to shortening live ranges,
especially if those ranges span a call.

This change results in some improvements in size on CTMark -Os:
```
Program                                       size.__text
                                              before         after           diff
kimwitu++/kc                                  451676.00      451860.00       0.0%
mafft/pairlocalalign                          241460.00      241540.00       0.0%
tramp3d-v4/tramp3d-v4                         389216.00      389208.00      -0.0%
7zip/7zip-benchmark                           587528.00      587464.00      -0.0%
Bullet/bullet                                 457424.00      457348.00      -0.0%
consumer-typeset/consumer-typeset             405472.00      405376.00      -0.0%
SPASS/SPASS                                   410288.00      410120.00      -0.0%
lencod/lencod                                 426396.00      426108.00      -0.1%
ClamAV/clamscan                               380108.00      379756.00      -0.1%
sqlite3/sqlite3                               283664.00      283372.00      -0.1%
                           Geomean difference                               -0.0%
```
I experimented with different variations and thresholds. Using 3 instead
of 2 resulted in a further 0.1% improvement on ClamAV but also regressed
sqlite3 by the same %.