[llvm-bugs] [Bug 37202] New: LCSSA (Loop-Closed SSA Form Pass) (and hence LICM) is extremely slow

Sun Apr 22 20:12:27 PDT 2018

https://bugs.llvm.org/show_bug.cgi?id=37202

            Bug ID: 37202
           Summary: LCSSA (Loop-Closed SSA Form Pass) (and hence LICM) is
                    extremely slow
           Product: libraries
           Version: trunk
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Scalar Optimizations
          Assignee: unassignedbugs at nondot.org
          Reporter: sroland at vmware.com
                CC: llvm-bugs at lists.llvm.org

Created attachment 20208
  --> https://bugs.llvm.org/attachment.cgi?id=20208&action=edit
pathetic nested loop case bitcode

The Loop-Closed SSA Form Pass can be extremely slow.
I'm attaching a bitcode file which takes ~15 minutes to compile (opt -O2),
and nearly all of that time (~96%) is spent in the various lcssa runs.

The bitcode is auto-generated by mesa llvmpipe, and I certainly don't claim the
shader causing it is doing anything useful (it mostly consists of deeply
nested loops). We don't actually use all that many passes in mesa llvmpipe;
we're basically trying -sroa -early-cse -simplifycfg -licm
-reassociate -mem2reg -constprop -instcombine -gvn (note the public mesa code
is a bit different as of now, omitting -early-cse and -licm before
-simplifycfg, but I think this order is better).
But licm pulls in a whole bunch of other passes, lcssa being one of them.
Note that we last really looked at this with llvm 3.3, and before llvm 3.5
licm did not imply most of what it does now (just -loops and
-loop-simplify). Our jit compilation is definitely faster than opt -O2 for
this case, basically because we only run one lcssa pass instead of 6
(although only 2 of those 6 account for most of the time), but it
still takes ~5 minutes to compile.
Licm in llvm 3.3 probably also didn't do as much as it can now, but there
doesn't seem to be a way to get a set of passes which does only roughly what
licm in llvm 3.3 did, otherwise we'd try that. (In general, for our jit needs
in llvmpipe, licm seems to be quite expensive even when the structure of the
shader is a lot more sound than in this case, which is a pity since licm looks
potentially quite useful.)
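
For anyone who wants to compare the two pipelines mentioned above, here is a
minimal sketch that only builds the opt command lines (it does not run them).
It assumes the legacy pass-manager flag syntax of the llvm versions named in
this report; the file name shader.bc and the helper opt_command are
hypothetical, and the pass list is copied from the paragraph above.

```python
# Pass list as quoted in the report (legacy pass-manager flag names).
LLVMPIPE_PASSES = [
    "sroa", "early-cse", "simplifycfg", "licm", "reassociate",
    "mem2reg", "constprop", "instcombine", "gvn",
]


def opt_command(bitcode_path, passes=None):
    """Build an opt argv that prints per-pass timings.

    With passes=None the full -O2 pipeline is requested; otherwise each
    pass name is turned into a legacy-style "-name" flag.
    """
    args = ["opt", "-time-passes", "-disable-output"]
    if passes is None:
        args.append("-O2")
    else:
        args.extend("-" + name for name in passes)
    args.append(bitcode_path)  # hypothetical input file
    return args


# Print both command lines so they can be pasted into a shell.
print(" ".join(opt_command("shader.bc")))
print(" ".join(opt_command("shader.bc", LLVMPIPE_PASSES)))
```

Comparing the lcssa rows of the two -time-passes reports should show how much
of the difference comes from the number of lcssa runs.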

From the description, this looks somewhat similar to bug 31851 to me, but I
tried the bitcode there and the time spent according to -time-passes looked
a lot more sane: loop passes were at the top, but not specifically
lcssa, which only accounted for 3.6%. The bitcode there also has ~16 times
more instructions and compiles ~6 times faster for me, so per instruction the
example I attached is ~100 times worse.
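
The "~100 times worse" figure is just the product of the two ratios above. A
small sketch of that arithmetic, using only the approximate numbers from this
report (the function name is made up for illustration):

```python
# Approximate ratios quoted in the report, comparing the bug 31851
# bitcode against the attached one.
INSTRUCTION_RATIO = 16  # bug 31851 bitcode has ~16x more instructions
SPEED_RATIO = 6         # ...and still compiles ~6x faster


def per_instruction_slowdown(instruction_ratio, speed_ratio):
    """How much more compile time per instruction the attached case needs."""
    return instruction_ratio * speed_ratio


slowdown = per_instruction_slowdown(INSTRUCTION_RATIO, SPEED_RATIO)
print(slowdown)  # 96, i.e. roughly 100x
```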

I've tested this with llvm 5.0, 6.0 and trunk, and it's basically all the same.
Most of the time spent seems to be in various ssa updater functions.
