[all-commits] [llvm/llvm-project] bb260e: [CodeGen] Only deduplicate PHIs on critical edges ...
Alexis Engelke via All-commits
all-commits at lists.llvm.org
Wed Jul 3 02:19:27 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: bb260eb87d9bebd93e64051b574fbce0eebbad30
https://github.com/llvm/llvm-project/commit/bb260eb87d9bebd93e64051b574fbce0eebbad30
Author: Alexis Engelke <engelke at in.tum.de>
Date: 2024-07-03 (Wed, 03 Jul 2024)
Changed paths:
M llvm/lib/CodeGen/PHIElimination.cpp
M llvm/test/CodeGen/AMDGPU/branch-folding-implicit-def-subreg.ll
M llvm/test/CodeGen/X86/bfloat.ll
M llvm/test/CodeGen/X86/div-rem-pair-recomposition-signed.ll
Log Message:
-----------
[CodeGen] Only deduplicate PHIs on critical edges (#97064)

PHIElim deduplicates identical PHI nodes to reduce the number of copies
inserted. There are two cases:

1. Identical PHI nodes are in different blocks. That's the reason for
   this optimization; this can't be avoided at SSA-level. A necessary
   prerequisite for this is that the predecessors of all basic blocks
   (where such a PHI node could occur) are the same. This implies that
   all (>= 2) predecessors must have multiple successors, i.e. all edges
   into the block are critical edges.

2. Identical PHI nodes are in the same block. CSE can remove these.
   There are a few cases, however, where they still occur regardless:

   - expand-large-div-rem creates PHI nodes with large integers, which
     get lowered into one PHI per MVT. Later, some identical values
     (zeroes) get folded, resulting in identical PHI nodes.
   - peephole-opt occasionally inserts PHIs for the same value.
   - Some pseudo instruction emitters create redundant PHI nodes (e.g.,
     AVR's insertShift), merging the same values more than once.

   In any case, this happens rarely and MachineCSE handles most cases
   anyway, so that PHIElim only gets to see very few of such cases (see
   changed test files).
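
The prerequisite in case 1 can be sketched as a small predicate on a toy
CFG. This is only an illustration of the condition described above, not
the code in PHIElimination.cpp: the Block struct, addEdge, and
allIncomingEdgesCritical are hypothetical names, and blocks are plain
indices rather than MachineBasicBlocks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy CFG node (an assumption for illustration; this is not
// LLVM's MachineBasicBlock).
struct Block {
  std::vector<std::size_t> preds; // indices of predecessor blocks
  std::vector<std::size_t> succs; // indices of successor blocks
};

// Records the edge from -> to in both blocks.
void addEdge(std::vector<Block> &cfg, std::size_t from, std::size_t to) {
  cfg[from].succs.push_back(to);
  cfg[to].preds.push_back(from);
}

// Returns true if the block has at least two predecessors and every
// predecessor has at least two successors, i.e. every edge into the
// block is a critical edge.
bool allIncomingEdgesCritical(const std::vector<Block> &cfg,
                              std::size_t block) {
  assert(block < cfg.size() && "block index out of range");
  const Block &b = cfg[block];
  if (b.preds.size() < 2)
    return false; // fewer than two predecessors: no PHIs to deduplicate
  for (std::size_t pred : b.preds)
    if (cfg[pred].succs.size() < 2)
      return false; // this predecessor's only successor is the block itself
  return true;
}
```

In a plain diamond (0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3) the join block 3
fails the check, because blocks 1 and 2 each have a single successor.
If blocks 0 and 1 both branch to blocks 2 and 3, then 2 and 3 both pass,
and those are the blocks that can hold identical PHI nodes.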

Currently, all PHI nodes are inserted into a DenseMap that checks
equality by operands rather than by pointer. This is pretty expensive
(both computing the hash and maintaining the map), but only really
useful in the first case.

Avoid this expensive hashing most of the time by restricting it to basic
blocks whose incoming edges are all critical. This improves performance
for code with many PHI nodes, especially at -O0. (Note that Clang often
doesn't generate PHI nodes, and -O0 does not run mem2reg. Other
compilers always generate PHI nodes.)
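
As a rough sketch of the idea, the following toy model keys a hash map
on a PHI's operands and consults it only when the caller says that all
incoming edges of the block are critical. Everything here is an
assumption made for illustration: Phi, OperandHash, SeenMap, and
loweringRegisterFor are hypothetical names, std::unordered_map stands in
for DenseMap, and registers and blocks are plain integers. The actual
change is in llvm/lib/CodeGen/PHIElimination.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// (value register, predecessor block id) pairs of one PHI.
using Operands = std::vector<std::pair<int, int>>;

// Hypothetical toy PHI node: a destination register plus its operands.
struct Phi {
  int dest;
  Operands incoming;
};

// Hashes a PHI by its operands, not by its address. Walking every
// operand on every lookup is the cost the commit message refers to.
struct OperandHash {
  std::size_t operator()(const Operands &ops) const {
    std::size_t h = 0;
    for (const auto &op : ops) {
      h = h * 31 + std::hash<int>()(op.first);
      h = h * 31 + std::hash<int>()(op.second);
    }
    return h;
  }
};

// Maps operands to the destination register of the first PHI seen with
// exactly those operands.
using SeenMap = std::unordered_map<Operands, int, OperandHash>;

// Returns the register that a PHI with these operands should reuse.
// The map is touched only if all incoming edges are critical; otherwise
// the PHI keeps its own register and nothing is hashed.
int loweringRegisterFor(const Phi &phi, bool allEdgesCritical,
                        SeenMap &seen) {
  if (!allEdgesCritical)
    return phi.dest; // fast path: skip hashing entirely
  auto result = seen.emplace(phi.incoming, phi.dest);
  return result.first->second; // the earlier PHI's register if one exists
}
```

With a shared SeenMap, two PHIs with the same (value, predecessor)
pairs resolve to the first one's register, while a PHI in a block that
fails the critical-edge check never touches the map.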