[PATCH] D73152: [PHIElimination] Compile time optimization for huge functions.

Jonas Paulsson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 21 17:35:09 PST 2020


jonpa created this revision.
jonpa added reviewers: bjope, qcolombet, hfinkel.
Herald added subscribers: JDevlieghere, hiraditya.
Herald added a project: LLVM.

This is a compile-time optimization for PHIElimination (splitting of critical edges), for the slowdown reported at https://bugs.llvm.org/show_bug.cgi?id=44249. As discussed there, the way to remedy the slowdowns with huge functions seems to be to pre-compute the live-in registers for each MBB in an efficient way in PHIElimination.cpp and then pass that information along to LiveVariables::addNewBlock().
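
To illustrate the idea of pre-computing live-in sets once instead of rediscovering them for every split edge, here is a minimal standalone sketch. It is not the code from the patch: RegLiveness and computeLiveInSets are hypothetical names, std::set stands in for SparseBitVector, and blocks and registers are plain integers. The sketch assumes SSA form, so a register that is killed in a block other than its defining block must be live-in to that block.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical, simplified per-register liveness record (not an LLVM type).
struct RegLiveness {
  unsigned defBlock;                  // number of the block defining the register
  std::vector<unsigned> aliveBlocks;  // blocks the register is live through
  std::vector<unsigned> killBlocks;   // blocks that contain a kill of the register
};

// Invert per-register liveness into one live-in set per block, indexed by
// block number. std::set is only a stand-in for a sparse bit vector.
std::vector<std::set<unsigned>>
computeLiveInSets(const std::vector<RegLiveness> &regs, std::size_t numBlocks) {
  std::vector<std::set<unsigned>> liveIn(numBlocks);
  for (unsigned idx = 0; idx < regs.size(); ++idx) {
    const RegLiveness &r = regs[idx];
    // Live through a block implies live-in to that block.
    for (unsigned b : r.aliveBlocks) {
      assert(b < numBlocks && "block number out of range");
      liveIn[b].insert(idx);
    }
    // Killed outside the defining block implies live-in to the killing block.
    for (unsigned b : r.killBlocks) {
      assert(b < numBlocks && "block number out of range");
      if (b != r.defBlock)
        liveIn[b].insert(idx);
    }
  }
  return liveIn;
}
```

With such a table built once per function, looking up the live-in registers of a successor block is a single indexed access rather than a scan over all virtual registers.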

The test case reported there compiles significantly faster with this patch:

  TRUNK:
  
  time clang -O3 -march=z10 crash0.i -c -ftime-report |& grep "Eliminate PHI" 
  36.5362 ( 46.1%)   0.0137 (  3.2%)  36.5499 ( 45.8%)  36.5584 ( 45.8%)  Eliminate PHI nodes for register allocation 
  real    1m20.967s
  user    1m19.779s
  sys     0m0.623s 
  
  time clang -O3 -target x86_64 crash0.i -c -ftime-report |& grep "Eliminate PHI"
  0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0028 (  0.1%)  Eliminate PHI nodes for register allocation
  real    0m3.851s
  user    0m3.604s
  sys     0m0.101s 
   
  PATCHED: 
   
  time clang -O3 -march=z10 crash0.i -c -ftime-report |& grep "Eliminate PHI"
  0.0399 (  0.1%)   0.0001 (  0.0%)   0.0400 (  0.1%)   0.0391 (  0.1%)  Eliminate PHI nodes for register allocation
  real    0m43.473s
  user    0m42.967s
  sys     0m0.497s 
    
  time clang -O3 -target x86_64 crash0.i -c -ftime-report |& grep "Eliminate PHI"
  0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)  Eliminate PHI nodes for register allocation
  real    0m3.645s  
  user    0m3.559s 
  sys     0m0.090s  

There is still a big slowdown for SystemZ compared to x86_64, but the PHIElimination slowdown is gone. I would guess that this is an improvement on any target where the numbers of virtual registers and basic blocks are large enough, but I have not tried this.

- Is it OK to trust that MBB numbers don't change, or would it be best to use a map from MachineBasicBlock* to its SparseBitVector?

- I chose to store the index of the virtual register instead of the register number, since that intuitively seems wiser (indices start at 0), but I am curious whether that is actually an improvement (it is just one extra line of code...)

- SparseBitVector proved to be faster than BitVector...
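
For readers unfamiliar with the difference between a virtual register number and its index, the following standalone sketch shows the conversion. It assumes the encoding in which virtual registers are marked by the top bit of a 32-bit register number; the helper names here are hypothetical stand-ins rather than the LLVM API itself.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: a register number with the top bit set denotes a virtual register.
constexpr std::uint32_t kVirtualRegFlag = 1u << 31;

constexpr bool isVirtualReg(std::uint32_t reg) {
  return (reg & kVirtualRegFlag) != 0;
}

// Map a virtual register number to a dense index starting at 0.
inline std::uint32_t virtRegToIndex(std::uint32_t reg) {
  assert(isVirtualReg(reg) && "expected a virtual register");
  return reg & ~kVirtualRegFlag;
}

// Inverse mapping: recover the register number from its index.
constexpr std::uint32_t indexToVirtReg(std::uint32_t index) {
  return index | kVirtualRegFlag;
}
```

Storing indices keeps the set bits near 0, whereas storing raw register numbers would place every bit at or above 2^31. Whether that matters for a sparse representation is exactly the open question in the bullet above.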


https://reviews.llvm.org/D73152

Files:
  llvm/include/llvm/CodeGen/LiveVariables.h
  llvm/include/llvm/CodeGen/MachineBasicBlock.h
  llvm/lib/CodeGen/LiveVariables.cpp
  llvm/lib/CodeGen/MachineBasicBlock.cpp
  llvm/lib/CodeGen/PHIElimination.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D73152.239467.patch
Type: text/x-patch
Size: 7093 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200122/1b9550ed/attachment.bin>