Solution for memory leak in tblgen

wuhui1973 wuhui1973 at 163.com
Thu Aug 7 02:29:37 PDT 2014


Hello, Andrew


Thanks for the reply, please find my answer in line.






在 2014-08-07 02:46:12,"Andrew Trick" <atrick at apple.com> 写道:


On Jul 28, 2014, at 1:53 AM, wuhui1973 <wuhui1973 at 163.com> wrote:


Hello Hal, Andy, Nadav, Tom & all:


I have worked out a solution for the memory leak problem in tblgen. This memory leak problem is known and deliberatly left by the author for later fix.


My solution follows the author's suggestion - using reference counter to track the TreePatternNode & TreePattern objects.


Besides, though TreePatternNodes form DAGs, but with TreePattern, they can form cyclic graph, the solution also includes


a method to break the cyclic reference. The solution detail please refer to the attachment.


I have checked the solution with following actions:


1> Check tblgen running log, it looks like this:


     llvm[3]: Building X86.td register info implementation with tblgen
     llvm[3]: Building X86.td instruction information with tblgen
    1. NumTPNAlloc: 84347, NumTPNRel: 20190            <--- TPN: TreePatternNode
    1. NumTPAlloc: 13441, NumTPRel: 9215                  <--- TP: TreePattern
    2. NumTPNAlloc: 84347, NumTPNRel: 83879
    2. NumTPAlloc: 13441, NumTPRel: 13271
    468 TPNs remain, 170 TPs remain                            <--- These are cyclic referred nodes
    Release done
    3. NumTPNAlloc: 84347, NumTPNRel: 84347
    3. NumTPAlloc: 13441, NumTPRel: 13441


    Has confirmed all allocated objects get released during the loadbuild for all supported targets.


2> Compare the generated .inc file with those generated by sainty code. Has confirmed they all the same.


3> run "make check" to verify the generated llvm-programs, gets following result:
It contains one error, but the sainty code also gives the same result! I will raise the bug report later.
FAIL: LLVM :: ExecutionEngine/RuntimeDyld/AArch64/MachO_ARM64_relocations.s (6206 of 11281)
******************** TEST 'LLVM :: ExecutionEngine/RuntimeDyld/AArch64/MachO_ARM64_relocations.s' FAILED ********************
Script:
--
/sources/new/build_llvm/Debug+Asserts/bin/llvm-mc -triple=arm64-apple-ios7.0.0 -code-model=small -relocation-model=pic -filetype=obj -o /sources/new/build_llvm/test/ExecutionEngine/RuntimeDyld/AArch64/Output/foo.o /sources/new/llvm/test/ExecutionEngine/RuntimeDyld/AArch64/MachO_ARM64_relocations.s
sed "s,<filename>,/sources/new/build_llvm/test/ExecutionEngine/RuntimeDyld/AArch64/Output/foo.o,g" /sources/new/llvm/test/ExecutionEngine/RuntimeDyld/AArch64/MachO_ARM64_relocations.s > /sources/new/build_llvm/test/ExecutionEngine/RuntimeDyld/AArch64/Output/foo.s
/sources/new/build_llvm/Debug+Asserts/bin/llvm-rtdyld -triple=arm64-apple-ios7.0.0 -verify -check=/sources/new/build_llvm/test/ExecutionEngine/RuntimeDyld/AArch64/Output/foo.s /sources/new/build_llvm/test/ExecutionEngine/RuntimeDyld/AArch64/Output/foo.o
--
Exit Code: 1


Command Output (stderr):
--
Expression 'decode_operand(adrp2, 1) = (stub_addr(/sources/new/build_llvm/test/ExecutionEngine/RuntimeDyld/AArch64/Output/foo.o, __text, _ptr)[32:12] - adrp2[32:12])' is false: 0xfff00000 != 0x0
/sources/new/build_llvm/Debug+Asserts/bin/llvm-rtdyld: error: some checks in '/sources/new/build_llvm/test/ExecutionEngine/RuntimeDyld/AArch64/Output/foo.s' failed
--
********************
Testing Time: 3155.15s
********************
Failing Tests (1):
    LLVM :: ExecutionEngine/RuntimeDyld/AArch64/MachO_ARM64_relocations.s


  Expected Passes    : 11144
  Expected Failures  : 101
  Unsupported Tests  : 35
  Unexpected Failures: 1


So I think my solution is correct. Could you review it? Or help to forward to someone concerned? Thanks!


Thanks for working on this. I've never looked at this code, so I'd like someone else to review it too. But here are my comments.


At a high-level this looks like a very large diff to accomplish the task at hand.


First, what are we accomplishing? If the goal is to ensure all memory is freed between "passes", then that can be much better done with a pool allocator. Is there really any need to eagerly free tree nodes before the end of a pass?


>> [hui] I am new to tablgen, I hope I can explain it clearly. At some stage in building llvm, tablgen will compile .td file into .inc file (C++ source), it will compile .td file one after one till the stage is done. As now the tree nodes are just allocated but not freed, they get lost between handling .td files. So the patch aims to reclaim the memory when finishing one .td file. I think pool allocator can be an option, but as I know, pool allocator is just another sub-project under llvm, has it been built into llvm? And it needs be done by the DSA (data structure analysis) alogrithm?! Is there any example in llvm?


Another reviewer commented that it looks just like a reimplementation of std::shared_ptr, and I did not completely follow your response. I *think* the only salient difference is that your reference counts are intrusive--they are members of the tree nodes. This allows you to check the RefCnt on any method call. The down side is that you're adding a lot of boilerplate code in the form of asserts and calls to "get()". I won't object to the patch purely for this reason, but you should know that smaller, more self-contained patches are much more likely to be accepted.


>> [hui] I know std::shared_ptr can make the patch simpler and more concise. But I want to enforce the rule that: every tree node should be managed by the holder class. Because there are two kinds of tree nodes: TreePatternNode & TreePattern. Though TreePatternNodes just form DAG among themselves, TreePatternNodes and TreePatterns will form cyclic refence graph, any tree node leakage can make the reclaim algorithm more diffcult to implement and less efficient.


I don't understand why we need derived classes TreePatternNodeHolder and TreePatternHolder. The code looks completely generic to me.


>> [hui] They have different createXXX methods, so I can't generic them. Here I only allow TreePatternNodeHolder and TreePatternHolder to create the tree node, because I want to force the allocated tree node be always managed by these classes, so it is impossible for memory leak by mistake. And more important, if there are some tree nodes leak, it can make the cyclic reference nodes reclaiming method fails to work correctly (that means we need to design a more complicate solution).


This


+  if (!TPContainer.empty())
+  {
+      do


should be 


+  while (!TPContainer.empty())


>> [hui] You are correct! This code fragment is the result of moving from one complex but incorrect implementation to this simple one, when I realize that when every tree node is correcly reference counted I can break the cycle in this simple way.


Why doesn't the copy ctor increment?


+ TreePatternNodeHolder(const TreePatternNodeHolder &rhs):
+      NodeType(rhs.Member) {}


>> [hui] the base class does the increment


Shouldn't you have a move ctor that does not increment?
>> [hui] I hadn't considered that. Is it necessary?


Why are the creation routines members of TreePatternNodeHolder rather than returning a new holder?


+TreePatternNode* TreePatternNodeHolder::createTreePatternNode(Init *val, unsigned NumResults)


>> [hui] This method works like the new operator, but leave the allocated tree node under the hand of the TreePatternNodeHolder instance


What is the purpose of changes like this:


-  if (N->getChild(0)->isLeaf() || N->getChild(0)->getOperator() != Operator)
-    Children.push_back(N->getChild(0));
+  TreePatternNode *child0 = N->getChild(0);
+  TreePatternNode *child1 = N->getChild(1);
+
+  if (child0->isLeaf() || child0->getOperator() != Operator)
+    Children.push_back(child0);


>> [hui] I once swang between let getChild return the holder and let getChild return the tree node. In fact, now below code need not be changed.


I assume you enabled build debug output to your patch for demonstration purposes. Obviously we can't really do that.


Before it gets checked in, your should run clang-format on your code.


>> [hui] how to run clang-format?


typo: remmeber


>> [hui] get it


I wasn't able to review every line. You'll need to find another reviewer.


>> [hui] who will be the Mr. right?


One suggestion to help review: divide the patch into at least two patch sets. One that introduces your new classes and algorithm. A second patch set that renames a bunch of types in the existing code, and adds "getters" if really necessary.


>> [huui] thanks the suggestion


-Andy

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140807/e57c672a/attachment.html>


More information about the llvm-commits mailing list