[llvm-dev] [GSOC 2018] Implement a single updater class for Dominators

Jakub (Kuba) Kuderski via llvm-dev llvm-dev at lists.llvm.org
Wed Mar 21 19:37:09 PDT 2018


Hi Chijun,

I left you my feedback in the doc (+ some nitpicks). Overall, it looks very
solid and shows you understand the problem well and know what to expect.
Your proposed timeline is reasonable and I don't see any major things to
improve there. Don't hesitate to reach out if you have any questions
related to my comments (either here or in the document).

Thanks,
Kuba

On Wed, Mar 21, 2018 at 4:23 PM, Chijun Sima <simachijun at gmail.com> wrote:

> Hi Kuba,
>
> Thanks for your clarification on the project in the previous letter.
>
> I have submitted a proposal draft at the GSoC website, the draft has
> been shared with the LLVM organization. I will appreciate it if you
> can give me some advice on the proposal. This draft can be viewed by
> the organization. (If you do not have access, please mail me, and I
> will give you the link.)
>
> I am looking forward to your reply.
>
> Regards,
> Chijun
>
> 2018-03-15 1:42 GMT+08:00 Jakub (Kuba) Kuderski <kubakuderski at gmail.com>:
> > Hi Chijun,
> >
> > Great, seems like you did a lot of progress and understand the issues
> quite
> > well!
> >
> >> I have done some early sketch on the API of the new updater class.
> >> From my current understanding, to solve the fragmentation problem of
> >> the API, the new class first, need to maintain the DomTree and
> >> PostDomTree class and deprecate the DefferredDominance class.
> >
> > There are a couple of possibilities here. First, we can come up with a
> new
> > updater class and gradually replace DefferredDominance and and the basic
> > incremental update API with it (.applyUpdates()). During this process all
> > the classes can coexists which should make it easier to make the
> transition
> > smooth instead of risking breaking multiple places in LLVM at once. The
> > other approach would be to modify DefferredDominance and make the
> transition
> > happen in place, which should be also doable. I think that for
> prototyping
> > it would be easier to clone DefferedDominance and replace one use of it
> at a
> > time with the new updater class.
> >
> >>  Second, the API should be something
> >> looks like DT.update(updateStrategy::lazy,updateKind::delete,A,B) or
> >> DT.applyUpdates(updateStrategy::lazy,{{ updateKind::delete,A,B }}).
> >
> > Having update strategies as enums sounds very reasonable, but I'd rather
> we
> > didn't teach the DT to do it. I think the existing interface (and the
> header
> > file) is already pretty large and it should make sense to have a separate
> > class for the updates that the DT and PDT don't know about at all. I
> suspect
> > that the only thing that might be required would be to expose some
> > additional functions from DT and PDT for update pruning, but I'm not sure
> > about it at this point.
> >
> >> Third, the API can auto flush() when a query happens, which can make
> >> the code easier to maintain. Forth, the updater class can hold an
> >> optional PostDomTree and make its status synchronize with the
> >> corresponding DomTree (can use lazy updates until query happens) and
> >> use information from the DomTree to prune useless updates.
> >
> > Sounds good. One issue we might hit and need to be care about is not to
> have
> > both the updater class and DT in a single function somewhere in the
> > codebase. In such cases it is possible to have some pending updates that
> > will eventually change DT, but do .dominates() queries on the outdated
> tree.
> > However, this problem exists even today to some extend (with DT and DDT)
> and
> > I don't think it's very serious.
> >
> >> After I read the comments on D41302[1], in which Dave(dmgreen)
> >> mentions there is a need to preserve the PostDomTree in the future. I
> >> am wondering whether we need to preserve a PostDomTree all the time in
> >> the new class?
> >
> > I think that at some point we realized that preserving PDT everywhere is
> > quite challenging with the current data structures in LLVM. The issue is
> > that for postdominators you have to access predecessors, which requires
> you
> > to track uses instead of just looking up the last instruction within a
> > single basic block. What's more, changes anywhere near terminator
> > instructions in IR can cause PDT to look completely different every time
> you
> > do it. To do it efficiently, you have to be very lazy about updates and
> have
> > a good strategy for pruning. This project should be a huge step towards
> it,
> > but I think that we have to start with optional PDT anyway.
> > Danny and Chandler (cc'd) should remember more about the different
> > directions that were considered here.
> >
> >> Moreover, as mentioned in the open projects page[2],
> >> the API should be able to specify which trees are actually being
> >> updated (none, only DomTree, only PostDomTree, both), while I doubt
> >> that if the updater initially holds both the DomTree and PostDomTree
> >> for example, it is inappropriate to update only the PostDomTree, which
> >> will just break the synchronization between the two trees (DomTree and
> >> PostDomTree). Please correct me if I am wrong. :D
> >
> > My intention in the proposal was to say that the updater class can hold
> (or
> > not) two kinds of domtrees -- DT and PDT, and the updates should affect
> > whatever is currently available. I hope this is what you are asking
> about.
> > Breaking synchronization between two trees inside the updater class
> sounds
> > like something we really want to avoid.
> >
> > Sincerely,
> > Jakub
> >
> > On Mon, Mar 12, 2018 at 4:02 PM, Chijun Sima <simachijun at gmail.com>
> wrote:
> >>
> >> Hi Kuba,
> >>
> >> Thanks for your advice in your previous letter.
> >>
> >> During last week, I have read the documents on Doxygen and the source
> >> code of the DomTreeBase/DomTree/PostDomTree/DeferredDominance class, I
> >> believe now I have a much better understanding on the relationship
> >> between these classes and how DeferredDominance class performs lazy
> >> updates. I have also learnt the current usage and drawbacks of the
> >> fragmented API by looking into how several ‘transform util’ functions
> >> use these APIs and running the IR tests. Furthermore, I have read the
> >> code reviews you mentioned to get the idea of previous approaches on
> >> implementing the updater class and so on.
> >>
> >> I have done some early sketch on the API of the new updater class.
> >> From my current understanding, to solve the fragmentation problem of
> >> the API, the new class first, need to maintain the DomTree and
> >> PostDomTree class and deprecate the DeferredDominance class. This
> >> approach can save redundant code on calling the update function on
> >> both DomTree and PostDomTree. Second, the API should be something
> >> looks like DT.update(updateStrategy::lazy,updateKind::delete,A,B) or
> >> DT.applyUpdates(updateStrategy::lazy,{{ updateKind::delete,A,B }}).
> >> Third, the API can auto flush() when a query happens, which can make
> >> the code easier to maintain. Forth, the updater class can hold an
> >> optional PostDomTree and make its status synchronize with the
> >> corresponding DomTree (can use lazy updates until query happens) and
> >> use information from the DomTree to prune useless updates.
> >>
> >> After I read the comments on D41302[1], in which Dave(dmgreen)
> >> mentions there is a need to preserve the PostDomTree in the future. I
> >> am wondering whether we need to preserve a PostDomTree all the time in
> >> the new class? Moreover, as mentioned in the open projects page[2],
> >> the API should be able to specify which trees are actually being
> >> updated (none, only DomTree, only PostDomTree, both), while I doubt
> >> that if the updater initially holds both the DomTree and PostDomTree
> >> for example, it is inappropriate to update only the PostDomTree, which
> >> will just break the synchronization between the two trees (DomTree and
> >> PostDomTree). Please correct me if I am wrong. :D
> >>
> >> I am preparing to submit the proposal to GSoC this week. At the same
> >> time, I will try to figure out something about the way to prune
> >> redundant PostDomTree updates.
> >>
> >> I am looking forward to your reply.
> >>
> >>
> >> Regards,
> >>
> >> Chijun Sima
> >>
> >>
> >> [1] https://reviews.llvm.org/D41302?id=127155#inline-361616
> >>
> >> [2] https://llvm.org/OpenProjects.html#llvm_domtree_updater
> >>
> >> 2018-03-02 11:57 GMT+08:00 Jakub (Kuba) Kuderski <
> kubakuderski at gmail.com>:
> >> > Hi Chijun,
> >> >
> >> > Thanks for your interest in the project.
> >> >
> >> >> I have gone through most of the LLVM Kaleidoscope tutorial and I have
> >> >> watched the video of the presentation “Dominator Trees and
> incremental
> >> >> updates that transcend time” presented on the 2017 LLVM Developers’
> >> >> Meeting. I have also started to understand the algorithm mentioned in
> >> >> the comments of the code related to the dominator tree.
> >> >
> >> >
> >> > This sounds like a good start, but I don't think there's strong need
> to
> >> > understand the Depth Based Search algorithm in detail -- grasping the
> >> > main
> >> > idea behind it should be enough. If you want to have some rough idea
> on
> >> > how
> >> > it works you can try to run domtree tests with
> >> > -debug-only=dom-tree-builder.
> >> > To run just the IR tests for dominators, you can do:
> >> > ./unittests/IR/IRTests
> >> > --gtest_filter=DominatorTree*. Additionally, you can also play with
> >> > Holder.F->viewCFG() and DT->viewGraph() to see what's going on inside
> >> > the
> >> > tests.
> >> >
> >> >> I am wondering if there are some other bugs or materials I can go
> over
> >> >> in order to achieve a better understanding on the LLVM’s codebase (or
> >> >> the project mentioned above)? Which part of the codebase do you
> >> >> recommend to view to learn more about the usage and drawbacks of the
> >> >> current dominator tree related API? I would appreciate it.
> >> >
> >> >
> >> > I think that the programmer's manual is extremely useful when you want
> >> > to
> >> > get more familiar with the code base:
> >> > http://llvm.org/docs/ProgrammersManual.html. I would recommend to
> focus
> >> > on
> >> > how to interact with IR, llvm's data structures, and debug utilities.
> >> > The most important files related to the projects are probably:
> >> > GenericDomTreeConstruction.h, GenericDomTree.h, Dominators.h,
> >> > Dominators.cpp.
> >> >
> >> > If you want to see the incremental API in action, take a look at some
> >> > simpler transforms like loop deletion and aggressive dead code
> >> > elimination
> >> > (ADCE). The problems the project mentions can be seen in various
> >> > 'helper'/'utils' functions used by transforms. Take a look at these
> code
> >> > reviews, especially the functions that can accept DT* and/or DDT*:
> >> > https://reviews.llvm.org/D41302
> >> > https://reviews.llvm.org/D40146
> >> > https://reviews.llvm.org/D42804
> >> >
> >> > I think that it's worth adding that a better API is only one part the
> >> > project; the second, perhaps bigger and more fun, is to figure out how
> >> > to
> >> > prune redundant PostDomTree updates when we have a fully-updated
> DomTree
> >> > available.
> >> >
> >> > Feel free to ask more questions should you have any -- I and others
> will
> >> > be
> >> > more than happy to clarify.
> >> >
> >> > Best,
> >> > Kuba
> >> >
> >> > On Thu, Mar 1, 2018 at 2:29 PM, Chijun Sima via llvm-dev
> >> > <llvm-dev at lists.llvm.org> wrote:
> >> >>
> >> >> Hello,
> >> >>
> >> >> I’m an undergraduate student studying CS in the South China
> University
> >> >> of Technology.
> >> >>
> >> >> I have been using clang compiler and related tools since I started
> >> >> studying C++ and I would like to work on LLVM in this year’s GSoC. I
> >> >> am interested in “Implement a single updater class for Dominators”.
> >> >> [1] I have achieved a bronze medal in the 2017 ACM-ICPC Asia Xian
> >> >> Regional Contest [2] (being a member of the team “Charizard”) thus I
> >> >> think I have some knowledge on basic tree/graph algorithms and data
> >> >> structures.
> >> >>
> >> >> I have gone through most of the LLVM Kaleidoscope tutorial and I have
> >> >> watched the video of the presentation “Dominator Trees and
> incremental
> >> >> updates that transcend time” presented on the 2017 LLVM Developers’
> >> >> Meeting. I have also started to understand the algorithm mentioned in
> >> >> the comments of the code related to the dominator tree. I have
> created
> >> >> a Bugzilla account and I am now working on a small bug related to
> >> >> syntax warning (Sema).
> >> >>
> >> >> I am wondering if there are some other bugs or materials I can go
> over
> >> >> in order to achieve a better understanding on the LLVM’s codebase (or
> >> >> the project mentioned above)? Which part of the codebase do you
> >> >> recommend to view to learn more about the usage and drawbacks of the
> >> >> current dominator tree related API? I would appreciate it.
> >> >>
> >> >> Best regards,
> >> >>
> >> >> Chijun Sima
> >> >>
> >> >> [1] https://llvm.org/OpenProjects.html#llvm_domtree_updater
> >> >> [2] https://icpc.baylor.edu/regionals/finder/asia-xian-
> 2017/standings
> >> >> _______________________________________________
> >> >> LLVM Developers mailing list
> >> >> llvm-dev at lists.llvm.org
> >> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > Jakub Kuderski
> >
> >
> >
> >
> > --
> > Jakub Kuderski
>



-- 
Jakub Kuderski
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180321/e2ddaf24/attachment.html>


More information about the llvm-dev mailing list