New Solution for memory leak in tblgen

David Blaikie dblaikie at gmail.com
Tue Nov 11 18:22:22 PST 2014


OK, so you're saying all the heap allocated TreePatterns and
TreePatternNodes are currently leaked? And we create some on the stack
(that obviously don't leak)?

And from previous discussions we've seen that TreePatterns and
TreePatternNodes form unbreakable(?) cycles. (are they unbreakable - could
we use weak_ptr or raw pointer in there somewhere to break them safely?
I'll have to go back & look at the prior threads or you could copy/paste
the summary of the cycles you saw? - though the only reason this would be
worth understanding is if we actually destroy things at some point other
than the end - I don't know anything about tablegen or when nodes are
created and destroyed, I'm not sure if they're destroyed at any time other
than "we've finished doing all the work, cleanup now")

If that's all the case, then it seems obvious enough that we should just
have a big list of all the heap allocated ones, that goes away when we're
done.

On Tue, Nov 11, 2014 at 6:07 PM, wuhui1973 <wuhui1973 at 163.com> wrote:

> Sorry David:
>
> I make a mistake in yesterday explanation.
>
> Here is some data I collect during building Tablegen, the data is from x86
> target
>
> llvm[3]: Building X86.td instruction information with tblgen
> NumTPNAllocCur: 120754, NumTPNAllocAccu: 121469 <-- it indicates only
> 121469-120754=715 heap alloc TreePatternNodes get freed during processing
> (~0.5%)
> NumTPAllocCur: 9488, NumTPAllocAccu: 9488              <-- in fact *NO*
> heap alloc TreePatterns get freed during processing
> NumTPNAllocCur: 0, NumTPNDel: 121469
> NumTPAllocCur: 0, NumTPDel: 9488
> TPN dtorNum: 121469, TP dtorNum: 16821                  <-- the diff "
> NumTPAllocAccu"-"TPN dtorNum"=0 indicates that all TreePatternNodes are
> heap allocated
> TPN ctorNum: 121469, TP ctorNum: 16821                  <-- the diff "TP
> ctorNum"-"NumTPDel"=7333 indicates that 7333 TreePatterns are placed into
> stack!
>
> llvm[3]: Building X86.td DAG instruction selector implementation with
> tblgen
> NumTPNAllocCur: 180382, NumTPNAllocAccu: 197494    <-- 17112
> TreePatternNodes (~8%) get freed
> NumTPAllocCur: 9488, NumTPAllocAccu: 9488            <--  *NO* heap alloc
> TreePatterns get freed during processing
> NumTPNAllocCur: 0, NumTPNDel: 197494                   <-- "NumTPNDel" ==
> "TPN ctorNum": all TreePatternNodes all heap allocated
> NumTPAllocCur: 0, NumTPDel: 9488
> TPN dtorNum: 197494, TP dtorNum: 16821                <-- the diff 16821-
> 9488=7333 indicates that 7333 TreePatterns are placed into stack!
> TPN ctorNum: 197494, TP ctorNum: 16821
>
> So more than 43% TreePatterns are stack residue for both processing.
>
> And other target also gives similar data, for instance AMDGPU
>
> llvm[3]: Building AMDGPU.td instruction information with tblgen
> NumTPNAllocCur: 28248, NumTPNAllocAccu: 28348    <-- 100 TreePatternNodes
> (~0.3%) get freed
> NumTPAllocCur: 2422, NumTPAllocAccu: 2422              <-- *NO* heap
> alloc TreePatterns get freed during processing
> NumTPNAllocCur: 0, NumTPNDel: 28348
> NumTPAllocCur: 0, NumTPDel: 2422
> TPN dtorNum: 28348, TP dtorNum: 3792
> TPN ctorNum: 28348, TP ctorNum: 3792                       <-- 1370
> TreePatterns are inside stack (>36%)
> llvm[3]: Building AMDGPU.td DAG instruction selector implementation with
> tblgen
> NumTPNAllocCur: 44076, NumTPNAllocAccu: 47030    <-- 2954 TreePatternNodes
> (~6%) get freed
> NumTPAllocCur: 2422, NumTPAllocAccu: 2422              <-- *NO* heap
> alloc TreePatterns get freed during processing
> NumTPNAllocCur: 0, NumTPNDel: 47030
> NumTPAllocCur: 0, NumTPDel: 2422
> TPN dtorNum: 47030, TP dtorNum: 3792
> TPN ctorNum: 47030, TP ctorNum: 3792                      <-- 1370 TreePatterns
> are inside stack (>36%)
>
> There are several functions in CodeGenDAGPatterns.cpp will allocate
> TreePatternNodes & TreePatterns from heap.
> ParsePatternFragments (5 spots allocate TreePatternNodes)
> ParsePatterns (2 spots allocate TreePatternNodes, 2 spots allocate
> TreePatterns)
> CombineChildVariants (1 spot allocate TreePatternNodes)
> clone (2 spots allocate TreePatternNodes)
>
> So I don't think there is any special construction path.
>
> Regards
> Hui Wu
>
> 在 2014-11-12 06:33:37,"David Blaikie" <dblaikie at gmail.com> 写道:
>
>
>
> On Mon, Nov 10, 2014 at 11:36 PM, wuhui1973 <wuhui1973 at 163.com> wrote:
>
>> Hope I can explain it clear enough :-)
>>
>> > MemReclaimer has two containers to hold pointers of TreePatternNode &
>> TreePattern which are allocated from heap.
>> > Only one global instance of MemReclaimer  is declared -- memReclaimer.
>> > operator new & operator delete are defined for TreePatternNode &
>> TreePattern
>>           new:    push the new allocated object (pointer) into containers
>> in memReclaimer.
>>           delete: remove the pointer from the containers in memReclaimer.
>>
>>           So we can guarantee that objects allocated from heap can
>> always be recorded by memReclaimer.
>>           And no double free may incur, as some places in the source
>> delete some pointers manually (maybe we can remove these lines).
>> > So memReclaimer only keeps the objects that are leaked in current
>> design.
>> > the author of CodeGenDAGPattern  create nodes, but destroy part of
>> them (for TreePattern, the nodes destroyed manually account for more
>> than 30%,
>>
>
> Could you provide stack traces/explanation for these case where the
> TreePatterns are destroyed?
>
>
>>           for some target even over 50%; while for TreePatternNode  only
>> around 1%
>>
>
> And for the TreePatternNodes?
>
> If they're mostly just on special case construction paths, perhaps we
> could special case their removal from the list we could add to
> CodeGenDAGPattern - or, if they're close enough to construction time, we
> might be able to avoid adding them to such a list in the first place?
>
>
>> ) and make the rest leaked deliberately.
>> > dtor of CodeGenDAGPattern now is the place to destroy all not freed
>> objects
>> > move the functionality of MemReclaimer into CodeGenDAGPattern  is
>> definitely feasible I think.
>>
>
>> Regards
>> Hui Wu
>>
>>
>> At 2014-11-11 14:20:25, "David Blaikie" <dblaikie at gmail.com> wrote:
>>
>> Could you describe the high level design here?
>>
>> It looks like there's a static pool
>> (CodeGenDAGPatterns.cpp::memReclaimer) of instances that's used as some
>> kind of last-chance cleanup? Instances of TreePatternNode mostly manage
>> their own lifetime but then if any haven't been destroyed by the time
>> the CodeGenDAGPatterns dtor runs, the remaining elements are destroyed.
>> This is to handle cycles, I take it?
>>
>> It still seems like a bit of a GC-esque workaround to handle this case
>> when there might be something better... but I don't know much about
>> tablegen, perhaps there isn't.
>>
>> How often are nodes destroyed by themselves? Should we just give up
>> owning these anywhere else & move all ownership into CodeGenDAGPatterns and
>> clean them up in CodeGenDAGPatterns' dtor? (just have a vector of
>> unique_ptrs (maybe even a list or deque of nodes directly owned, rather
>> than via unique_ptr), only create nodes, never destroy them, then destroy
>> them all at the end)
>>
>> On Mon, Nov 10, 2014 at 9:58 PM, wuhui1973 <wuhui1973 at 163.com> wrote:
>>
>>> Hi Andrew, David and Hal:
>>>
>>> I have made a new solution for this memory leak issue, which is much
>>> simpler than previous one.
>>>
>>> I have tested it, it works well!
>>>
>>> Please have a look, and appreciate any comment.
>>>
>>> Thanks & Regards
>>>
>>> Hui Wu
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141111/ce33f601/attachment.html>


More information about the llvm-commits mailing list