<html><head><meta http-equiv="Content-Type" content="text/html charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jul 28, 2014, at 1:53 AM, wuhui1973 <<a href="mailto:wuhui1973@163.com" class="">wuhui1973@163.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div style="line-height: 1.7; font-size: 14px; font-family: Arial;" class=""><div class="">Hello Hal, <span style="font-family: arial; white-space: pre-wrap; line-height: 23.32400131225586px;" class="">Andy, Nadav, Tom</span><span style="line-height: 1.7;" class=""> & all:</span></div><div class=""><br class=""></div><div class="">I have worked out a solution for the memory leak problem in tblgen. This memory leak problem is known and deliberatly left by the author for later fix.</div><div class=""><br class=""></div><div class="">My solution follows the author's suggestion - using reference counter to track the TreePatternNode & TreePattern objects.</div><div class=""><br class=""></div><div class="">Besides, though <span style="line-height: 1.7;" class="">TreePatternNodes</span><span style="line-height: 1.7;" class=""> form DAGs, but with TreePattern, they can form cyclic graph, the solution also includes</span></div><div class=""><span style="line-height: 1.7;" class=""><br class=""></span></div><div class=""><span style="line-height: 1.7;" class="">a method to break the cyclic reference. </span><span style="line-height: 1.7;" class="">The solution detail please refer to the attachment.</span></div><div class=""><span style="line-height: 1.7;" class=""><br class=""></span></div><div class="">I have checked the solution with following actions:</div><div class=""><br class=""></div><div class="">1> Check tblgen running log, it looks like this:</div><div class=""><br class=""></div><div class=""> llvm[3]: Building X86.td register info implementation with tblgen</div><div class=""> llvm[3]: Building X86.td instruction information with tblgen</div><div class=""> 1. NumTPNAlloc: 84347, NumTPNRel: 20190 <--- TPN: TreePatternNode</div><div class=""> 1. NumTPAlloc: 13441, NumTPRel: 9215 <---<span style="line-height: 1.7;" class=""> TP: TreePattern</span></div><div class=""> 2. NumTPNAlloc: 84347, NumTPNRel: 83879</div><div class=""> 2. NumTPAlloc: 13441, NumTPRel: 13271</div><div class=""> 468 TPNs remain, 170 TPs remain <--- These are cyclic referred nodes</div><div class=""> Release done</div><div class=""> 3. NumTPNAlloc: 84347, NumTPNRel: 84347</div><div class=""> 3. NumTPAlloc: 13441, NumTPRel: 13441</div><div class=""><br class=""></div><div class=""> Has confirmed all allocated objects get released during the loadbuild for all supported targets.</div><div class=""><br class=""></div><div class="">2> Compare the generated .inc file with those generated by sainty code. Has confirmed they all the same.</div><div class=""><br class=""></div><div class="">3> run "make check" to verify the generated llvm-programs, gets following result:</div><div class=""><b class=""><u class="">It contains one error, but the sainty code also gives the same result! I will raise the bug report later.</u></b></div><div class="">FAIL: LLVM :: ExecutionEngine/RuntimeDyld/AArch64/MachO_ARM64_relocations.s (6206 of 11281)</div><div class="">******************** TEST 'LLVM :: ExecutionEngine/RuntimeDyld/AArch64/MachO_ARM64_relocations.s' FAILED ********************</div><div class="">Script:</div><div class="">--</div><div class="">/sources/new/build_llvm/Debug+Asserts/bin/llvm-mc -triple=arm64-apple-ios7.0.0 -code-model=small -relocation-model=pic -filetype=obj -o /sources/new/build_llvm/test/ExecutionEngine/RuntimeDyld/AArch64/Output/foo.o /sources/new/llvm/test/ExecutionEngine/RuntimeDyld/AArch64/MachO_ARM64_relocations.s</div><div class="">sed "s,<filename>,/sources/new/build_llvm/test/ExecutionEngine/RuntimeDyld/AArch64/Output/foo.o,g" /sources/new/llvm/test/ExecutionEngine/RuntimeDyld/AArch64/MachO_ARM64_relocations.s > /sources/new/build_llvm/test/ExecutionEngine/RuntimeDyld/AArch64/Output/foo.s</div><div class="">/sources/new/build_llvm/Debug+Asserts/bin/llvm-rtdyld -triple=arm64-apple-ios7.0.0 -verify -check=/sources/new/build_llvm/test/ExecutionEngine/RuntimeDyld/AArch64/Output/foo.s /sources/new/build_llvm/test/ExecutionEngine/RuntimeDyld/AArch64/Output/foo.o</div><div class="">--</div><div class="">Exit Code: 1</div><div class=""><br class=""></div><div class="">Command Output (stderr):</div><div class="">--</div><div class="">Expression 'decode_operand(adrp2, 1) = (stub_addr(/sources/new/build_llvm/test/ExecutionEngine/RuntimeDyld/AArch64/Output/foo.o, __text, _ptr)[32:12] - adrp2[32:12])' is false: 0xfff00000 != 0x0</div><div class="">/sources/new/build_llvm/Debug+Asserts/bin/llvm-rtdyld: error: some checks in '/sources/new/build_llvm/test/ExecutionEngine/RuntimeDyld/AArch64/Output/foo.s' failed</div><div class=""><span style="line-height: 1.7;" class="">--</span></div><div class="">********************</div><div class="">Testing Time: 3155.15s</div><div class="">********************</div><div class="">Failing Tests (1):</div><div class=""> LLVM :: ExecutionEngine/RuntimeDyld/AArch64/MachO_ARM64_relocations.s</div><div class=""><br class=""></div><div class=""> Expected Passes : 11144</div><div class=""> Expected Failures : 101</div><div class=""> Unsupported Tests : 35</div><div class=""> Unexpected Failures: 1</div><div class=""><br class=""></div><div class="">So I think my solution is correct. Could you review it? Or help to forward to someone concerned? <span style="line-height: 1.7;" class="">Thanks!</span></div></div></div></blockquote><br class=""></div><div><div>Thanks for working on this. I've never looked at this code, so I'd like someone else to review it too. But here are my comments.</div><div><br class=""></div><div>At a high-level this looks like a very large diff to accomplish the task at hand.</div><div><br class=""></div><div>First, what are we accomplishing? If the goal is to ensure all memory is freed between "passes", then that can be much better done with a pool allocator. Is there really any need to eagerly free tree nodes before the end of a pass?</div><div><br class=""></div><div>Another reviewer commented that it looks just like a reimplementation of std::shared_ptr, and I did not completely follow your response. I *think* the only salient difference is that your reference counts are intrusive--they are members of the tree nodes. This allows you to check the RefCnt on any method call. The down side is that you're adding a lot of boilerplate code in the form of asserts and calls to "get()". I won't object to the patch purely for this reason, but you should know that smaller, more self-contained patches are much more likely to be accepted.</div><div><br class=""></div><div>I don't understand why we need derived classes TreePatternNodeHolder and TreePatternHolder. The code looks completely generic to me.</div><div><br class=""></div><div>This</div><div><br class=""></div><div>+ if (!TPContainer.empty())</div><div>+ {</div><div>+ do</div><div><br class=""></div><div>should be </div><div><br class=""></div><div>+ while (!TPContainer.empty())</div><div><br class=""></div><div>Why doesn't the copy ctor increment?</div><div><br class=""></div><div>+ TreePatternNodeHolder(const TreePatternNodeHolder &rhs):</div><div>+ NodeType(rhs.Member) {}</div><div><br class=""></div><div>Shouldn't you have a move ctor that does not increment?</div><div><br class=""></div><div>Why are the creation routines members of TreePatternNodeHolder rather than returning a new holder?</div><div><br class=""></div><div>+TreePatternNode* TreePatternNodeHolder::createTreePatternNode(Init *val, unsigned NumResults)</div><div><br class=""></div><div>What is the purpose of changes like this:</div><div><br class=""></div><div>- if (N->getChild(0)->isLeaf() || N->getChild(0)->getOperator() != Operator)</div><div>- Children.push_back(N->getChild(0));</div><div>+ TreePatternNode *child0 = N->getChild(0);</div><div>+ TreePatternNode *child1 = N->getChild(1);</div><div>+</div><div>+ if (child0->isLeaf() || child0->getOperator() != Operator)</div><div>+ Children.push_back(child0);</div><div><br class=""></div><div>I assume you enabled build debug output to your patch for demonstration purposes. Obviously we can't really do that.</div><div><br class=""></div><div>Before it gets checked in, your should run clang-format on your code.</div><div><br class=""></div><div>typo: remmeber</div><div><br class=""></div><div>I wasn't able to review every line. You'll need to find another reviewer.</div><div><br class=""></div><div>One suggestion to help review: divide the patch into at least two patch sets. One that introduces your new classes and algorithm. A second patch set that renames a bunch of types in the existing code, and adds "getters" if really necessary.</div><div class=""><br class=""></div><div class="">-Andy</div></div><br class=""></body></html>