[all-commits] [llvm/llvm-project] bb260e: [CodeGen] Only deduplicate PHIs on critical edges ...

Alexis Engelke via All-commits all-commits at lists.llvm.org
Wed Jul 3 02:19:27 PDT 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: bb260eb87d9bebd93e64051b574fbce0eebbad30
      https://github.com/llvm/llvm-project/commit/bb260eb87d9bebd93e64051b574fbce0eebbad30
  Author: Alexis Engelke <engelke at in.tum.de>
  Date:   2024-07-03 (Wed, 03 Jul 2024)

  Changed paths:
    M llvm/lib/CodeGen/PHIElimination.cpp
    M llvm/test/CodeGen/AMDGPU/branch-folding-implicit-def-subreg.ll
    M llvm/test/CodeGen/X86/bfloat.ll
    M llvm/test/CodeGen/X86/div-rem-pair-recomposition-signed.ll

  Log Message:
  -----------
  [CodeGen] Only deduplicate PHIs on critical edges (#97064)

PHIElim deduplicates identical PHI nodes to reduce the number of copies
inserted. There are two cases:

1. Identical PHI nodes are in different blocks. That's the reason for
   this optimization; this can't be avoided at SSA-level. A necessary
   prerequisite for this is that the predecessors of all basic blocks
   (where such a PHI node could occur) are the same. This implies that
   all (>= 2) predecessors must have multiple successors, i.e. all edges
   into the block are critical edges.

2. Identical PHI nodes are in the same block. CSE can remove these.
   There are a few cases, however, where they still occur regardless:

   - expand-large-div-rem creates PHI nodes with large integers, which
     get lowered into one PHI per MVT. Later, some identical values
     (zeroes) get folded, resulting in identical PHI nodes.
   - peephole-opt occasionally inserts PHIs for the same value.
   - Some pseudo instruction emitters create redundant PHI nodes (e.g.,
     AVR's insertShift), merging the same values more than once.

   In any case, this happens rarely and MachineCSE handles most cases
   anyway, so that PHIElim only gets to see very few of such cases (see
   changed test files).

Currently, all PHI nodes are inserted into a DenseMap that checks
equality not by pointer but by operands. This hash map is pretty
expensive (hashing itself and the hash map), but only really useful in
the first case.

Avoid this expensive hashing most of the time by restricting it to basic
blocks with only critical input edges. This improves performance for
code with many PHI nodes, especially at -O0. (Note that Clang often
doesn't generate PHI nodes and -O0 includes no mem2reg. Other
compilers always generate PHI nodes.)



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list