[llvm-dev] [GSOC 2018] Implement a single updater class for Dominators

Chijun Sima via llvm-dev llvm-dev at lists.llvm.org
Sat Mar 24 12:56:26 PDT 2018


Hi Kuba,

Thanks for your comments on the PDT update pruning algorithms part in
my proposal. I have modified my proposal according to your feedback.
Please check it out. :D

Regards,
Chijun

2018-03-22 20:38 GMT+08:00 Chijun Sima <simachijun at gmail.com>:
> Hi Kuba,
>
> Thanks for your feedback. I have made some improvements on the
> proposal according to your feedback and clarify something on the time
> availability. I left comments on the questions you asked in the doc.
> Please check it out.
>
> Thanks,
> Chijun
>
> 2018-03-22 10:37 GMT+08:00 Jakub (Kuba) Kuderski <kubakuderski at gmail.com>:
>> Hi Chijun,
>>
>> I left you my feedback in the doc (+ some nitpicks). Overall, it looks very
>> solid and shows you understand the problem well and know what to expect.
>> Your proposed timeline is reasonable and I don't see any major things to
>> improve there. Don't hesitate to reach out if you have any questions related
>> to my comments (either here or in the document).
>>
>> Thanks,
>> Kuba
>>
>> On Wed, Mar 21, 2018 at 4:23 PM, Chijun Sima <simachijun at gmail.com> wrote:
>>>
>>> Hi Kuba,
>>>
>>> Thanks for your clarification on the project in the previous letter.
>>>
>>> I have submitted a proposal draft at the GSoC website, the draft has
>>> been shared with the LLVM organization. I will appreciate it if you
>>> can give me some advice on the proposal. This draft can be viewed by
>>> the organization. (If you do not have access, please mail me, and I
>>> will give you the link.)
>>>
>>> I am looking forward to your reply.
>>>
>>> Regards,
>>> Chijun
>>>
>>> 2018-03-15 1:42 GMT+08:00 Jakub (Kuba) Kuderski <kubakuderski at gmail.com>:
>>> > Hi Chijun,
>>> >
>>> > Great, seems like you did a lot of progress and understand the issues
>>> > quite
>>> > well!
>>> >
>>> >> I have done some early sketch on the API of the new updater class.
>>> >> From my current understanding, to solve the fragmentation problem of
>>> >> the API, the new class first, need to maintain the DomTree and
>>> >> PostDomTree class and deprecate the DefferredDominance class.
>>> >
>>> > There are a couple of possibilities here. First, we can come up with a
>>> > new
>>> > updater class and gradually replace DefferredDominance and and the basic
>>> > incremental update API with it (.applyUpdates()). During this process
>>> > all
>>> > the classes can coexists which should make it easier to make the
>>> > transition
>>> > smooth instead of risking breaking multiple places in LLVM at once. The
>>> > other approach would be to modify DefferredDominance and make the
>>> > transition
>>> > happen in place, which should be also doable. I think that for
>>> > prototyping
>>> > it would be easier to clone DefferedDominance and replace one use of it
>>> > at a
>>> > time with the new updater class.
>>> >
>>> >>  Second, the API should be something
>>> >> looks like DT.update(updateStrategy::lazy,updateKind::delete,A,B) or
>>> >> DT.applyUpdates(updateStrategy::lazy,{{ updateKind::delete,A,B }}).
>>> >
>>> > Having update strategies as enums sounds very reasonable, but I'd rather
>>> > we
>>> > didn't teach the DT to do it. I think the existing interface (and the
>>> > header
>>> > file) is already pretty large and it should make sense to have a
>>> > separate
>>> > class for the updates that the DT and PDT don't know about at all. I
>>> > suspect
>>> > that the only thing that might be required would be to expose some
>>> > additional functions from DT and PDT for update pruning, but I'm not
>>> > sure
>>> > about it at this point.
>>> >
>>> >> Third, the API can auto flush() when a query happens, which can make
>>> >> the code easier to maintain. Forth, the updater class can hold an
>>> >> optional PostDomTree and make its status synchronize with the
>>> >> corresponding DomTree (can use lazy updates until query happens) and
>>> >> use information from the DomTree to prune useless updates.
>>> >
>>> > Sounds good. One issue we might hit and need to be care about is not to
>>> > have
>>> > both the updater class and DT in a single function somewhere in the
>>> > codebase. In such cases it is possible to have some pending updates that
>>> > will eventually change DT, but do .dominates() queries on the outdated
>>> > tree.
>>> > However, this problem exists even today to some extend (with DT and DDT)
>>> > and
>>> > I don't think it's very serious.
>>> >
>>> >> After I read the comments on D41302[1], in which Dave(dmgreen)
>>> >> mentions there is a need to preserve the PostDomTree in the future. I
>>> >> am wondering whether we need to preserve a PostDomTree all the time in
>>> >> the new class?
>>> >
>>> > I think that at some point we realized that preserving PDT everywhere is
>>> > quite challenging with the current data structures in LLVM. The issue is
>>> > that for postdominators you have to access predecessors, which requires
>>> > you
>>> > to track uses instead of just looking up the last instruction within a
>>> > single basic block. What's more, changes anywhere near terminator
>>> > instructions in IR can cause PDT to look completely different every time
>>> > you
>>> > do it. To do it efficiently, you have to be very lazy about updates and
>>> > have
>>> > a good strategy for pruning. This project should be a huge step towards
>>> > it,
>>> > but I think that we have to start with optional PDT anyway.
>>> > Danny and Chandler (cc'd) should remember more about the different
>>> > directions that were considered here.
>>> >
>>> >> Moreover, as mentioned in the open projects page[2],
>>> >> the API should be able to specify which trees are actually being
>>> >> updated (none, only DomTree, only PostDomTree, both), while I doubt
>>> >> that if the updater initially holds both the DomTree and PostDomTree
>>> >> for example, it is inappropriate to update only the PostDomTree, which
>>> >> will just break the synchronization between the two trees (DomTree and
>>> >> PostDomTree). Please correct me if I am wrong. :D
>>> >
>>> > My intention in the proposal was to say that the updater class can hold
>>> > (or
>>> > not) two kinds of domtrees -- DT and PDT, and the updates should affect
>>> > whatever is currently available. I hope this is what you are asking
>>> > about.
>>> > Breaking synchronization between two trees inside the updater class
>>> > sounds
>>> > like something we really want to avoid.
>>> >
>>> > Sincerely,
>>> > Jakub
>>> >
>>> > On Mon, Mar 12, 2018 at 4:02 PM, Chijun Sima <simachijun at gmail.com>
>>> > wrote:
>>> >>
>>> >> Hi Kuba,
>>> >>
>>> >> Thanks for your advice in your previous letter.
>>> >>
>>> >> During last week, I have read the documents on Doxygen and the source
>>> >> code of the DomTreeBase/DomTree/PostDomTree/DeferredDominance class, I
>>> >> believe now I have a much better understanding on the relationship
>>> >> between these classes and how DeferredDominance class performs lazy
>>> >> updates. I have also learnt the current usage and drawbacks of the
>>> >> fragmented API by looking into how several ‘transform util’ functions
>>> >> use these APIs and running the IR tests. Furthermore, I have read the
>>> >> code reviews you mentioned to get the idea of previous approaches on
>>> >> implementing the updater class and so on.
>>> >>
>>> >> I have done some early sketch on the API of the new updater class.
>>> >> From my current understanding, to solve the fragmentation problem of
>>> >> the API, the new class first, need to maintain the DomTree and
>>> >> PostDomTree class and deprecate the DeferredDominance class. This
>>> >> approach can save redundant code on calling the update function on
>>> >> both DomTree and PostDomTree. Second, the API should be something
>>> >> looks like DT.update(updateStrategy::lazy,updateKind::delete,A,B) or
>>> >> DT.applyUpdates(updateStrategy::lazy,{{ updateKind::delete,A,B }}).
>>> >> Third, the API can auto flush() when a query happens, which can make
>>> >> the code easier to maintain. Forth, the updater class can hold an
>>> >> optional PostDomTree and make its status synchronize with the
>>> >> corresponding DomTree (can use lazy updates until query happens) and
>>> >> use information from the DomTree to prune useless updates.
>>> >>
>>> >> After I read the comments on D41302[1], in which Dave(dmgreen)
>>> >> mentions there is a need to preserve the PostDomTree in the future. I
>>> >> am wondering whether we need to preserve a PostDomTree all the time in
>>> >> the new class? Moreover, as mentioned in the open projects page[2],
>>> >> the API should be able to specify which trees are actually being
>>> >> updated (none, only DomTree, only PostDomTree, both), while I doubt
>>> >> that if the updater initially holds both the DomTree and PostDomTree
>>> >> for example, it is inappropriate to update only the PostDomTree, which
>>> >> will just break the synchronization between the two trees (DomTree and
>>> >> PostDomTree). Please correct me if I am wrong. :D
>>> >>
>>> >> I am preparing to submit the proposal to GSoC this week. At the same
>>> >> time, I will try to figure out something about the way to prune
>>> >> redundant PostDomTree updates.
>>> >>
>>> >> I am looking forward to your reply.
>>> >>
>>> >>
>>> >> Regards,
>>> >>
>>> >> Chijun Sima
>>> >>
>>> >>
>>> >> [1] https://reviews.llvm.org/D41302?id=127155#inline-361616
>>> >>
>>> >> [2] https://llvm.org/OpenProjects.html#llvm_domtree_updater
>>> >>
>>> >> 2018-03-02 11:57 GMT+08:00 Jakub (Kuba) Kuderski
>>> >> <kubakuderski at gmail.com>:
>>> >> > Hi Chijun,
>>> >> >
>>> >> > Thanks for your interest in the project.
>>> >> >
>>> >> >> I have gone through most of the LLVM Kaleidoscope tutorial and I
>>> >> >> have
>>> >> >> watched the video of the presentation “Dominator Trees and
>>> >> >> incremental
>>> >> >> updates that transcend time” presented on the 2017 LLVM Developers’
>>> >> >> Meeting. I have also started to understand the algorithm mentioned
>>> >> >> in
>>> >> >> the comments of the code related to the dominator tree.
>>> >> >
>>> >> >
>>> >> > This sounds like a good start, but I don't think there's strong need
>>> >> > to
>>> >> > understand the Depth Based Search algorithm in detail -- grasping the
>>> >> > main
>>> >> > idea behind it should be enough. If you want to have some rough idea
>>> >> > on
>>> >> > how
>>> >> > it works you can try to run domtree tests with
>>> >> > -debug-only=dom-tree-builder.
>>> >> > To run just the IR tests for dominators, you can do:
>>> >> > ./unittests/IR/IRTests
>>> >> > --gtest_filter=DominatorTree*. Additionally, you can also play with
>>> >> > Holder.F->viewCFG() and DT->viewGraph() to see what's going on inside
>>> >> > the
>>> >> > tests.
>>> >> >
>>> >> >> I am wondering if there are some other bugs or materials I can go
>>> >> >> over
>>> >> >> in order to achieve a better understanding on the LLVM’s codebase
>>> >> >> (or
>>> >> >> the project mentioned above)? Which part of the codebase do you
>>> >> >> recommend to view to learn more about the usage and drawbacks of the
>>> >> >> current dominator tree related API? I would appreciate it.
>>> >> >
>>> >> >
>>> >> > I think that the programmer's manual is extremely useful when you
>>> >> > want
>>> >> > to
>>> >> > get more familiar with the code base:
>>> >> > http://llvm.org/docs/ProgrammersManual.html. I would recommend to
>>> >> > focus
>>> >> > on
>>> >> > how to interact with IR, llvm's data structures, and debug utilities.
>>> >> > The most important files related to the projects are probably:
>>> >> > GenericDomTreeConstruction.h, GenericDomTree.h, Dominators.h,
>>> >> > Dominators.cpp.
>>> >> >
>>> >> > If you want to see the incremental API in action, take a look at some
>>> >> > simpler transforms like loop deletion and aggressive dead code
>>> >> > elimination
>>> >> > (ADCE). The problems the project mentions can be seen in various
>>> >> > 'helper'/'utils' functions used by transforms. Take a look at these
>>> >> > code
>>> >> > reviews, especially the functions that can accept DT* and/or DDT*:
>>> >> > https://reviews.llvm.org/D41302
>>> >> > https://reviews.llvm.org/D40146
>>> >> > https://reviews.llvm.org/D42804
>>> >> >
>>> >> > I think that it's worth adding that a better API is only one part the
>>> >> > project; the second, perhaps bigger and more fun, is to figure out
>>> >> > how
>>> >> > to
>>> >> > prune redundant PostDomTree updates when we have a fully-updated
>>> >> > DomTree
>>> >> > available.
>>> >> >
>>> >> > Feel free to ask more questions should you have any -- I and others
>>> >> > will
>>> >> > be
>>> >> > more than happy to clarify.
>>> >> >
>>> >> > Best,
>>> >> > Kuba
>>> >> >
>>> >> > On Thu, Mar 1, 2018 at 2:29 PM, Chijun Sima via llvm-dev
>>> >> > <llvm-dev at lists.llvm.org> wrote:
>>> >> >>
>>> >> >> Hello,
>>> >> >>
>>> >> >> I’m an undergraduate student studying CS in the South China
>>> >> >> University
>>> >> >> of Technology.
>>> >> >>
>>> >> >> I have been using clang compiler and related tools since I started
>>> >> >> studying C++ and I would like to work on LLVM in this year’s GSoC. I
>>> >> >> am interested in “Implement a single updater class for Dominators”.
>>> >> >> [1] I have achieved a bronze medal in the 2017 ACM-ICPC Asia Xian
>>> >> >> Regional Contest [2] (being a member of the team “Charizard”) thus I
>>> >> >> think I have some knowledge on basic tree/graph algorithms and data
>>> >> >> structures.
>>> >> >>
>>> >> >> I have gone through most of the LLVM Kaleidoscope tutorial and I
>>> >> >> have
>>> >> >> watched the video of the presentation “Dominator Trees and
>>> >> >> incremental
>>> >> >> updates that transcend time” presented on the 2017 LLVM Developers’
>>> >> >> Meeting. I have also started to understand the algorithm mentioned
>>> >> >> in
>>> >> >> the comments of the code related to the dominator tree. I have
>>> >> >> created
>>> >> >> a Bugzilla account and I am now working on a small bug related to
>>> >> >> syntax warning (Sema).
>>> >> >>
>>> >> >> I am wondering if there are some other bugs or materials I can go
>>> >> >> over
>>> >> >> in order to achieve a better understanding on the LLVM’s codebase
>>> >> >> (or
>>> >> >> the project mentioned above)? Which part of the codebase do you
>>> >> >> recommend to view to learn more about the usage and drawbacks of the
>>> >> >> current dominator tree related API? I would appreciate it.
>>> >> >>
>>> >> >> Best regards,
>>> >> >>
>>> >> >> Chijun Sima
>>> >> >>
>>> >> >> [1] https://llvm.org/OpenProjects.html#llvm_domtree_updater
>>> >> >> [2]
>>> >> >> https://icpc.baylor.edu/regionals/finder/asia-xian-2017/standings
>>> >> >> _______________________________________________
>>> >> >> LLVM Developers mailing list
>>> >> >> llvm-dev at lists.llvm.org
>>> >> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > --
>>> >> > Jakub Kuderski
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Jakub Kuderski
>>
>>
>>
>>
>> --
>> Jakub Kuderski


More information about the llvm-dev mailing list