[PATCH] D73152: [PHIElimination] Compile time optimization for huge functions.
Jonas Paulsson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 21 17:35:09 PST 2020
jonpa created this revision.
jonpa added reviewers: bjope, qcolombet, hfinkel.
Herald added subscribers: JDevlieghere, hiraditya.
Herald added a project: LLVM.
This is a compile-time optimization for PHIElimination (splitting of critical edges), addressing the slowdown reported at https://bugs.llvm.org/show_bug.cgi?id=44249. As discussed there, the way to remedy the slowdowns with huge functions seems to be to pre-compute the live-in registers for each MBB in an efficient way in PHIElimination.cpp and then pass that information along to LiveVariables::addNewBlock().
The test case reported in that bug compiles significantly faster with this patch:
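For illustration, the pre-computation idea could look roughly like the self-contained sketch below. It does not use the LLVM APIs and is not the code from the patch: VarInfoSketch, its fields, and computeLiveInSets are hypothetical names, and std::set stands in for a sparse bit vector. The assumption is that a register is live-in to every block it is live through, and to every block containing a kill other than its defining block.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical, simplified stand-in for per-register liveness information.
struct VarInfoSketch {
  unsigned DefBlock;                 // number of the block holding the definition
  std::vector<unsigned> AliveBlocks; // blocks the register is live through
  std::vector<unsigned> KillBlocks;  // blocks containing a last use
};

// Build the live-in set of every block in a single pass over all registers.
// The result is indexed by block number; each set holds register indices.
std::vector<std::set<unsigned>>
computeLiveInSets(const std::vector<VarInfoSketch> &Vars,
                  std::size_t NumBlocks) {
  std::vector<std::set<unsigned>> LiveIn(NumBlocks);
  for (unsigned Index = 0; Index < Vars.size(); ++Index) {
    const VarInfoSketch &VI = Vars[Index];
    // Live through a block implies live-in to that block.
    for (unsigned B : VI.AliveBlocks) {
      assert(B < NumBlocks && "block number out of range");
      LiveIn[B].insert(Index);
    }
    // Killed in a block other than the defining one implies live-in there.
    for (unsigned B : VI.KillBlocks) {
      assert(B < NumBlocks && "block number out of range");
      if (B != VI.DefBlock)
        LiveIn[B].insert(Index);
    }
  }
  return LiveIn;
}
```

With the sets built once up front, each later query for the live-ins of a block becomes a lookup by block number instead of a scan over all virtual registers.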
TRUNK:
time clang -O3 -march=z10 crash0.i -c -ftime-report |& grep "Eliminate PHI"
36.5362 ( 46.1%) 0.0137 ( 3.2%) 36.5499 ( 45.8%) 36.5584 ( 45.8%) Eliminate PHI nodes for register allocation
real 1m20.967s
user 1m19.779s
sys 0m0.623s
time clang -O3 -target x86_64 crash0.i -c -ftime-report |& grep "Eliminate PHI"
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0028 ( 0.1%) Eliminate PHI nodes for register allocation
real 0m3.851s
user 0m3.604s
sys 0m0.101s
PATCHED:
time clang -O3 -march=z10 crash0.i -c -ftime-report |& grep "Eliminate PHI"
0.0399 ( 0.1%) 0.0001 ( 0.0%) 0.0400 ( 0.1%) 0.0391 ( 0.1%) Eliminate PHI nodes for register allocation
real 0m43.473s
user 0m42.967s
sys 0m0.497s
time clang -O3 -target x86_64 crash0.i -c -ftime-report |& grep "Eliminate PHI"
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0010 ( 0.0%) Eliminate PHI nodes for register allocation
real 0m3.645s
user 0m3.559s
sys 0m0.090s
There is still a big slowdown for SystemZ, but the PHIElimination slowdown is gone. I would guess that this is an improvement on any target where the number of virtual registers and basic blocks is large enough, but I have not tried this.
- Is it OK to trust that MBB numbers don't change, or would it be best to use a map from MachineBasicBlock* to its SparseBitVector?
- I chose to store the index of the virtual register instead of the register number, since that intuitively seems wiser (indices start at 0), but I am curious whether that is actually an improvement (it is just one extra line of code...)
- SparseBitVector proved to be faster than BitVector...
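
To make the index-versus-register-number question concrete, here is a small sketch of the conversion involved. It assumes the usual LLVM convention that virtual register numbers carry a flag in the top bit, so indices start at 0 while register numbers start at 2^31. The names VirtualRegFlagSketch, indexToVirtReg, and virtRegToIndex are hypothetical stand-ins, not the LLVM helpers themselves.

```cpp
#include <cassert>
#include <cstdint>

// Assumed encoding: the top bit marks a register number as virtual.
const std::uint32_t VirtualRegFlagSketch = 1u << 31;

// Map a dense, zero-based index to the corresponding register number.
inline std::uint32_t indexToVirtReg(std::uint32_t Index) {
  assert(!(Index & VirtualRegFlagSketch) && "index too large");
  return Index | VirtualRegFlagSketch;
}

// Map a virtual register number back to its zero-based index.
inline std::uint32_t virtRegToIndex(std::uint32_t Reg) {
  assert((Reg & VirtualRegFlagSketch) && "not a virtual register");
  return Reg & ~VirtualRegFlagSketch;
}
```

Storing indices keeps the set bits clustered near 0, which matters for a dense bit vector and is at worst neutral for a sparse one; the cost is the one conversion per register.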
https://reviews.llvm.org/D73152
Files:
llvm/include/llvm/CodeGen/LiveVariables.h
llvm/include/llvm/CodeGen/MachineBasicBlock.h
llvm/lib/CodeGen/LiveVariables.cpp
llvm/lib/CodeGen/MachineBasicBlock.cpp
llvm/lib/CodeGen/PHIElimination.cpp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D73152.239467.patch
Type: text/x-patch
Size: 7093 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200122/1b9550ed/attachment.bin>