[llvm-commits] PATCH: Fix AddressSanitizer to emit basic blocks in the natural order for the CFG

Mon Jul 16 02:03:21 PDT 2012

Submitted as r160254 after double checking with Kostya. =]

Looking into further ideas from Jakob about how to simplify this for the
backend.

On Sun, Jul 15, 2012 at 3:52 PM, Chandler Carruth <chandlerc at gmail.com>wrote:

> Ok, I think this patch is viable to go into the tree. The timings aren't
> great, but they're not substantially worse than the old block ordering now,
> and in some cases they're better:
>
> % for ((i=$((2 ** 10)); i <= $((32 * 2 ** 10)); i*=2)); do ./asan_glob.sh
> $i > x2.c; echo $i; time ./bin/clang -faddress-sanitizer -mllvm
> -asan-max-ins-per-bb=99999999 -O3 -o /dev/null -c x2.c; done
> 1024
> ./bin/clang -faddress-sanitizer -mllvm -asan-max-ins-per-bb=99999999 -O3
> -o    1.11s user 0.03s system 99% cpu 1.139 total
> 2048
> ./bin/clang -faddress-sanitizer -mllvm -asan-max-ins-per-bb=99999999 -O3
> -o    2.05s user 0.04s system 99% cpu 2.090 total
> 4096
> ./bin/clang -faddress-sanitizer -mllvm -asan-max-ins-per-bb=99999999 -O3
> -o    6.40s user 0.10s system 99% cpu 6.512 total
> 8192
> ./bin/clang -faddress-sanitizer -mllvm -asan-max-ins-per-bb=99999999 -O3
> -o    27.39s user 0.37s system 99% cpu 27.815 total
> 16384
> ./bin/clang -faddress-sanitizer -mllvm -asan-max-ins-per-bb=99999999 -O3
> -o    121.61s user 1.50s system 99% cpu 2:03.34 total
> 32768
> ./bin/clang -faddress-sanitizer -mllvm -asan-max-ins-per-bb=99999999 -O3
> -o    794.23s user 15.56s system 93% cpu 14:25.32 total
>
> It's still growing faster than I would like, but frankly, with 32k stores
> to a global array in a single function, there really isn't much more we can
> do here.
>
> I'm working on the patch to add branch weights, but it's incredibly
> simple, so I'm not worried about checking this one in in the interim.
>
>
> On Wed, Jul 11, 2012 at 10:26 AM, Kostya Serebryany <kcc at google.com>wrote:
>
>> The patch looks ok, thanks!
>> If it leads to performance problems, I agree we better fix them by
>> playing with "branch weight metadata".
>> (I won't be able to check the performance until next week).
>>
>> --kcc
>>
>> On Tue, Jul 10, 2012 at 10:34 AM, Chandler Carruth <chandlerc at gmail.com>wrote:
>>
>>> This will in theory (hehe, see below) make many parts of LLVM work
>>> faster where the algorithms are tuned for "normal" CFG patterns in code.
>>> Previously ASan would create massive live ranges by inserting basic blocks
>>> onto the end of the function.
>>>
>>> Currently this would cause some minor regressions in performance, but
>>> that will be fixed by a subsequent patch that attaches branch weight
>>> metadata to the appropriate places in the generated code. These will then
>>> allow the block placement pass to re-order the basic blocks into the
>>> optimal layout at the very end of codegen.
>>>
>>> Sadly, this bright and shiny future is not where we are. Applying this
>>> patch and re-running the regression test Kostya added to PR13225 (which is
>>> actually a repro for PR12652) the codegen time slows down **massively**.
>>> The profile also looks quite different:
>>>
>>>     29.03%      llc  llc                  [.] findLocalKill(unsigned
>>> int, llvm::MachineBasicBlock*, llvm::MachineInstr*,
>>> llvm::MachineRegisterInfo*, llvm::DenseMap<llvm::MachineInstr*, unsigned
>>> int, llvm::DenseMapInfo<llvm::MachineInstr*> >&)
>>>
>>>
>>>
>>>
>>>     15.41%      llc  llc                  [.] void
>>> std::vector<llvm::MachineBasicBlock*,
>>> std::allocator<llvm::MachineBasicBlock*>
>>> >::_M_range_insert<std::reverse_iterator<__gnu_cxx::__normal_iterator<llvm::MachineBasicBlock**,
>>> std::vector<llvm::MachineBasicBlock*,
>>> std::allocator<llvm::MachineBasicBlock*> > > >
>>> >(__gnu_cxx::__normal_iterator<llvm::MachineBasicBlock**,
>>> std::vector<llvm::MachineBasicBlock*,
>>> std::allocator<llvm::MachineBasicBlock*> > >,
>>> std::reverse_iterator<__gnu_cxx::__normal_iterato
>>>      8.05%      llc  llc                  [.]
>>> llvm::LiveVariables::MarkVirtRegAliveInBlock(llvm::LiveVariables::VarInfo&,
>>> llvm::MachineBasicBlock*, llvm::MachineBasicBlock*,
>>> std::vector<llvm::MachineBasicBlock*,
>>> std::allocator<llvm::MachineBasicBlock*> >&)
>>>
>>>
>>>
>>>      7.80%      llc  llc                  [.]
>>> llvm::LiveInterval::addRangeFrom(llvm::LiveRange, llvm::LiveRange*)
>>>
>>>
>>>
>>>
>>>
>>>
>>>      6.76%      llc  llc                  [.]
>>> llvm::LiveIntervals::handleVirtualRegisterDef(llvm::MachineBasicBlock*,
>>> llvm::MachineBasicBlock::bundle_iterator<llvm::MachineInstr,
>>> llvm::ilist_iterator<llvm::MachineInstr> >, llvm::SlotIndex,
>>> llvm::MachineOperand&, unsigned int, llvm::LiveInterval&)
>>>
>>>
>>>
>>>      4.23%      llc  llc                  [.] (anonymous
>>> namespace)::TwoAddressInstructionPass::TryInstructionTransform(llvm::MachineBasicBlock::bundle_iterator<llvm::MachineInstr,
>>> llvm::ilist_iterator<llvm::MachineInstr> >&,
>>> llvm::MachineBasicBlock::bundle_iterator<llvm::MachineInstr,
>>> llvm::ilist_iterator<llvm::MachineInstr> >&,
>>> llvm::ilist_iterator<llvm::MachineBasicBlock>&, unsigned int, unsigned int,
>>> unsigned int, llvm::SmallPtrSet<llvm::MachineInstr*, 8u>&)
>>>
>>>      3.30%      llc  llc                  [.]
>>> llvm::LiveInterval::extendIntervalEndTo(llvm::LiveRange*, llvm::SlotIndex)
>>>
>>>
>>>
>>>
>>>
>>>
>>>      2.89%      llc  llc                  [.]
>>> llvm::SparseBitVector<128u>::test(unsigned int)
>>>
>>>
>>>
>>>
>>>
>>>
>>>      2.68%      llc  llc                  [.]
>>> llvm::SparseBitVector<128u>::FindLowerBound(unsigned int)
>>>
>>>
>>>
>>>
>>>
>>>
>>>      2.17%      llc  llc                  [.]
>>> llvm::SparseBitVector<128u>::set(unsigned int)
>>>
>>>
>>>
>>>
>>>
>>>
>>>      0.80%      llc  llc                  [.]
>>> llvm::LiveVariables::HandleVirtRegUse(unsigned int,
>>> llvm::MachineBasicBlock*, llvm::MachineInstr*)
>>>
>>>
>>>
>>>
>>>
>>>      0.65%      llc  libc-2.13.so         [.] _int_malloc
>>>
>>>
>>>
>>>
>>>
>>>
>>>      0.46%      llc  llc                  [.]
>>> llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char
>>> const*, unsigned int)
>>>
>>>
>>>
>>>
>>>
>>>      0.42%      llc  llc                  [.]
>>> llvm::SparseBitVector<128u>::SparseBitVectorIterator::AdvanceToNextNonZero()
>>>
>>>
>>>
>>>
>>>
>>>
>>>      0.38%      llc  llc                  [.]
>>> llvm::SmallPtrSetImpl::insert_imp(void const*)
>>>
>>>
>>>
>>>
>>>
>>>
>>>      0.37%      llc  llc                  [.] (anonymous
>>> namespace)::DAGCombiner::visit(llvm::SDNode*)
>>>
>>>
>>>
>>>
>>>
>>>      0.37%      llc  llc                  [.]
>>> llvm::SmallVectorImpl<llvm::LiveRange>::insert(llvm::LiveRange*,
>>> llvm::LiveRange const&)
>>>
>>>
>>> So, while I think this is definitely the right way to go eventually (and
>>> if you can go ahead and review, that would be appreciated) clearly we can't
>>> commit this until this is a bit less severe... ;]
>>>
>>> I think making 'join' use my new merge logic may help some of these (the
>>> addRangeFrom in particular), others look largely unrelated.
>>>
>>> -Chandler
>>>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120716/71ec6ed7/attachment.html>