[llvm-commits] PATCH: Fix AddressSanitizer to emit basic blocks in the natural order for the CFG

Tue Jul 10 08:50:12 PDT 2012

On Tue, Jul 10, 2012 at 8:33 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:

>
> On Jul 9, 2012, at 11:34 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
>
> > This will in theory (hehe, see below) make many parts of LLVM work
> > faster where the algorithms are tuned for "normal" CFG patterns in code.
> > Previously ASan would create massive live ranges by inserting basic blocks
> > onto the end of the function.
> >
> > Currently this would cause some minor regressions in performance, but
> > that will be fixed by a subsequent patch that attaches branch weight
> > metadata to the appropriate places in the generated code. These will then
> > allow the block placement pass to re-order the basic blocks into the
> > optimal layout at the very end of codegen.
> >
> > Sadly, this bright and shiny future is not where we are. Applying this
> > patch and re-running the regression test Kostya added to PR13225 (which is
> > actually a repro for PR12652) the codegen time slows down **massively**.
> > The profile also looks quite different:
>
> Strange. I suppose that if you are now inserting a million abort() calls
> in line, the live ranges still get chopped into tiny pieces.
>
> LiveVariables is mostly independent of block layout, and its algorithm is
> inherently quadratic for your test case. That explains the samples you are
> seeing in LiveVariables and SparseBitVector.
>
> I would like to see a backtrace on those
> std::vector<MachineBasicBlock*>::insert samples, but it's probably
> LiveVariables::MarkVirtRegAliveInBlock() again. (It's using insert() to
> append elements to a vector, not to insert in the middle).
>
> FWIW, I am planning to rip out LiveVariables and compute LiveIntervals
> directly from the machine code using LiveRangeCalc. This is already done
> for regunit liveness. This may help the constant factor, but it is still
> the same algorithm.

Hrm... I wonder if it's worth having an adaptive algorithm that bails out
early when the structure of the function is going to send the algorithm
quadratic? Even if it costs some optimization, that would seem better than
grinding codegen to a halt.
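
A minimal sketch of what such a bail-out check could look like, purely to
illustrate the idea. Nothing here is existing LLVM code: FunctionShape,
LivenessMode, chooseLivenessMode and the budget are all hypothetical, and the
cost estimate (blocks times virtual registers) is an assumed worst case, not
a measured one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical summary of a function's size; not an LLVM type.
struct FunctionShape {
  std::size_t NumBlocks;
  std::size_t NumVirtRegs;
};

// Hypothetical modes: run the precise analysis, or fall back to a cheaper,
// more conservative answer.
enum class LivenessMode { Precise, Conservative };

// Assumed worst case: every virtual register gets marked live in every
// block, so the work is roughly NumBlocks * NumVirtRegs. If that estimate
// exceeds Budget, choose the conservative mode instead of running the
// quadratic analysis.
LivenessMode chooseLivenessMode(const FunctionShape &F, std::uint64_t Budget) {
  if (F.NumBlocks == 0)
    return LivenessMode::Precise; // Nothing to analyze, nothing to bail on.
  // Compare by division so the product cannot overflow.
  if (F.NumVirtRegs > Budget / F.NumBlocks)
    return LivenessMode::Conservative;
  return LivenessMode::Precise;
}
```

The budget would have to be tuned against real compile times; the point is
only that the check itself is constant time and runs before any of the
expensive work starts.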

That said, about 40% of what the profile is showing (which may or may not
be that accurate, sadly...) isn't out of LiveVariables, it's in
TwoAddressInstructionPass (and its helper findLocalKills). I'm going to
work on that one and see if I can get that to go away -- it doesn't look
inherently slow at first glance.

-Chandler