[LLVMdev] SROA is slow when compiling a large basic block

Chandler Carruth chandlerc at google.com
Wed May 14 20:15:49 PDT 2014


On Wed, May 14, 2014 at 7:02 PM, Akira Hatanaka <ahatanak at gmail.com> wrote:

> If I understand this code correctly, LoadAndStorePromoter::run is called
> once for every promotable alloca and iterates over the whole list to
> determine the order of loads and stores in the basic block that access the
> alloca.
>

Yes, this is a long-standing problem of SROA.


>
> This is the list of ideas I have considered or implemented that can
> possibly solve my problem:
>
>
> 1. In SROA::getAnalysisUsage, always require DominatorTreeWrapperPass.
> This will enable SROA::promoteAllocas to use mem2reg, which is fast because
> it caches the per basic-block ordering of the relevant loads and stores. If
> it's important to avoid always computing the dominator tree, computing it
> conditionally based on whether there is a huge basic block in the function
> is another idea, but I am not sure if that is possible (I don't think this
> is currently supported).
>
>
> This brings down the compilation time (using clang -emit-llvm) from 350s
> to 30s (it still takes about 23s to do GVN). It also might fix PR17855 (the
> program that used to take 65s to compile now takes just 11s):
>
>
> http://llvm.org/bugs/show_bug.cgi?id=17855
>

This is my plan, but before doing it there are a bunch of *huge*
performance improvements we can make in the more common case so that
mem2reg isn't actually slower. Also, we need to be able to preserve
analyses further, which the new pass manager will allow.
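
To make the cost difference concrete, here is a minimal sketch of the two
strategies discussed above: rescanning the whole block once per alloca versus
walking the block once and caching the per-alloca ordering. This is a toy
model only, not LLVM code. ToyInst, ToyBlock, AccessLists and both function
names are hypothetical, and the Visited counter exists purely to show how
much work each approach does.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for an instruction: the id of the alloca it loads from or
// stores to, or -1 if it touches no promotable alloca. Hypothetical type,
// not an LLVM class.
struct ToyInst {
  int AllocaId;
};

using ToyBlock = std::vector<ToyInst>;
// For each alloca id, the positions of its accesses in block order.
using AccessLists = std::vector<std::vector<std::size_t>>;

// Rescans the entire block once per alloca, so the work grows as
// NumAllocas * BlockSize. Visited counts every instruction inspected.
AccessLists orderByRescanning(const ToyBlock &BB, int NumAllocas,
                              std::size_t &Visited) {
  AccessLists Result(NumAllocas);
  for (int A = 0; A < NumAllocas; ++A) {
    for (std::size_t I = 0; I < BB.size(); ++I) {
      ++Visited;
      if (BB[I].AllocaId == A)
        Result[A].push_back(I);
    }
  }
  return Result;
}

// Walks the block a single time and buckets each access by alloca, so the
// ordering for every alloca is computed once and can be reused afterwards.
AccessLists orderWithSinglePass(const ToyBlock &BB, int NumAllocas,
                                std::size_t &Visited) {
  AccessLists Result(NumAllocas);
  for (std::size_t I = 0; I < BB.size(); ++I) {
    ++Visited;
    int A = BB[I].AllocaId;
    if (A >= 0 && A < NumAllocas)
      Result[A].push_back(I);
  }
  return Result;
}
```

Both functions return the same ordering; only the number of instructions
inspected differs. With many allocas in one huge block, the first approach
inspects the block once per alloca, while the second inspects it once in
total.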

Is this a pressing matter for you?