Solution for memory leak in tblgen

David Blaikie dblaikie at gmail.com
Mon Sep 22 08:03:23 PDT 2014


On Sun, Sep 21, 2014 at 8:03 PM, wuhui1973 <wuhui1973 at 163.com> wrote:

> Hello David, Andrew:
>
> I spent this weekend in designing the solution using shared_ptr, I have
> changed tons of code, but always found some object had been released
> prematurely.
>
> Seem I had mistakenly made two shared_ptr holding the same object, but I
> got lost in the huge code change.
>
> The biggest difference between the self defined "shared_ptr" and
> std::shared_ptr is: for former the reference counter intrudes into the Tree
> Node, while for latter
>
> the reference counter is at different place. So for the self defined
> "shared_ptr", it is safe to use the pointer to Tree Node directly in most
> time, except in case of
>
> pointer self-assignment, for example, in method SimplifyTree. But using
> std::shared_ptr, it needs take care of all pointer assignments. Seems only
> using shared_ptr
>
> in place of all Tree Node pointer is safe, but it will result in huge code
> change and a not-easy implementation. Especially in some method, "this"
> pointer will be passed as
>
> parameter to other function or even as parameter of recursion, which is
> difficult to handle and error-prone if using std::shared_ptr.
>

FWIW, std::shared_ptr does support intrusive reference counting, if it's a
necessary aspect of this scheme (though I prefer to avoid it where possible
- it's not always possible or practical, especially when the goal is to fix
leaks and not to rewrite all the code). You can do this by inheriting from
std::enable_shared_from_this, then calling the inherited
"shared_from_this()" member function to acquire a std::shared_ptr when you
only have a raw pointer that you know some other code already owns.


> That is why I insist on using self-defined "shared_ptr".
>
> As for reclaiming cyclic referred objects, I have below explanation:
>
> + Here we can't just delete the nodes inside the container. Consider this
> cycle:
> +  //         A
> +  //       7|  \       if we directly delete node A, then node C will be
> freed as
> +  //      /     \      no reference now, next B. But at time B returning
> its memory,
> +  //     /      _\|    it finds the pointer to node A and try to free it
> as B is the
> +  //    B <------ C    last referrer, that comes the double delete!
> +  //                   So the correct way is to break the reference link,
> for example,
> +  // set the pointer to A in B to null which is reference counted, then B
> will free A
> +  // as it is the last referrer. In turn C and B will be freed but not
> double delete.
>

I'm still confused by this & will probably have to find some time to look
at the code, but it always boils down to:

If you haev cycles, reference counting is not the right tool.

If you are working around the limitations of reference counting because you
have cycles by defining a cleanup pass - why not just do all the ownership
and destruction there? I'm still not really following, but haven't taken
the time to look into the code in more detail.


> +  //
>
> I have tidied my code, please have a look.
>
> Thanks & Regards
>
> Hui Wu
>
>
> 在 2014-09-19 00:04:00,"David Blaikie" <dblaikie at gmail.com> 写道:
>
>
>
> On Sun, Sep 7, 2014 at 4:48 PM, Andrew Trick <atrick at apple.com> wrote:
>
>>
>> On Sep 2, 2014, at 8:08 PM, wuhui1973 <wuhui1973 at 163.com> wrote:
>>
>> Hi Andrew:
>>
>> Please find the anwser on line.
>>
>>
>>
>> At 2014-09-03 10:54:21, "Andrew Trick" <atrick at apple.com> wrote:
>>
>>
>> On Sep 2, 2014, at 7:22 PM, wuhui1973 <wuhui1973 at 163.com> wrote:
>>
>>
>> Hello Andrew, Hal:
>>
>> I have updated my solution. In these days, I have tried BumpPtrAlloc, but
>> the biggest problem of it is BumpPtrAlloc won't invoke dtor before
>> releasing the memory.
>>
>>
>> What does the dtor do beside release memory?
>>
>> [huiwu] it will invoke dtor of its member. For example, in TreePattern
>> object, its dtor will invoke dtor of following members:
>>
>> Trees (type std::vector<TreePatternNode*>)
>> NamedNodes (type StringMap<SmallVector<TreePatternNode*,1> >)
>> Args (type std::vector<std::string>)
>>
>> So if dtor if TreePattern is not invoked, only just memory occupied by
>> this TreePattern instance is freeed, memory allocated on heap by all these
>> containers leak!
>>
>>
>> I see how it might be hard to allocate all the referenced data using
>> BumpPtrAllocator. That’s unfortunate, because it would be a good way to
>> design this sort of thing.
>>
>> Given that you will continue using normal heap allocation, there is still
>> something I don’t understand about the design. It looks like your adding
>> references to all the nodes to a global container. So if there is one point
>> in the code where no TreePatterns should be alive, such as the point where
>> you would call ReclaimCycle, or destroy the BumpPtrAllocator, why don’t you
>> just explicitly delete all the TreePatterns referenced from your container
>> at that point? There would be no need to reference counting or cycle
>> detection.
>>
>
> Yeah - I'm confused by the complexity and have the same question.
>
>
>>
>> -Andy
>>
>>
>> And I also consider std::shared_ptr, but not use it for following
>> findings:
>>
>> 1> Nodes of TreePatternNode & TreePattern form cycles in reference, I
>> have designed a simple method to recycle them, which can't be used with
>> shared_ptr.
>> 2> After adding the grammar sugar operator-> & operator T(), now most
>> source files using TreePatternHolder & TreePatternNodeHolder now needn't be
>> changed. We can do that because we check reference count on every reference
>> counted objects in every related methods. So any abuse can be found earily
>> as possible.
>>
>>
>> Thanks.
>>
>>  In attachment, there are two kinds of suffix:
>>
>> *.review is for review
>> *.fix is for submit
>>
>>
>> You can run clang-format before your patch, submit that as a separate
>> patch (no need for a separate review), then submit your patch on top of it.
>>
>> -Andy
>>
>>
>>
>> The reason why need these files is because I had run clang-format on the
>> related source files which results in too many unrelated differences
>>
>>
>>
>> At 2014-08-07 11:59:44, "Andrew Trick" <atrick at apple.com> wrote:
>>
>>
>> On Aug 7, 2014, at 2:29 AM, wuhui1973 <wuhui1973 at 163.com> wrote:
>>
>> >> [hui] I am new to tablgen, I hope I can explain it clearly. At some
>> stage in building llvm, tablgen will compile .td file into .inc file (C++
>> source), it will compile .td file one after one till the stage is done. As
>> now the tree nodes are just allocated but not freed, they get lost between
>> handling .td files. So the patch aims to reclaim the memory when
>> finishing one .td file. I think pool allocator can be an option, but as I
>> know, pool allocator is just another sub-project under llvm, has it been
>> built into llvm? And it needs be done by the DSA (data structure analysis)
>> alogrithm?! Is there any example in llvm?
>>
>>
>> Sorry, I wasn’t clear what I was talking about. In LLVM, we have a class
>> called BumpPtrAllocator. You can use that to allocate a set of nodes, then
>> free them all between passes. If that accomplishes our goals for TableGen,
>> then I think it’s the ideal solution.
>>
>> -Andy
>>
>>
>>
>> <DAGISelMatcherGen.cpp.review><TableGen.cpp.fix>
>> <DAGISelMatcherGen.cpp.fix><CodeGenDAGPatterns.cpp.fix>
>> <CodeGenDAGPatterns.h.fix><CodeGenDAGPatterns.h.review>
>> <CodeGenDAGPatterns.cpp.review>
>>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140922/e3713bcd/attachment.html>


More information about the llvm-commits mailing list