[llvm-dev] [GSOC] "Project: Improve inter-procedural analyses and optimisations"

Wed Mar 18 16:35:03 PDT 2020

On 03/14, Fahad Nayyar wrote:
> Dear Stefanos,
> 
> Thanks for such a quick response! And thanks for answering my questions!
> 
> > Starting off, understanding the theory of data-flow analysis can help.
> 
> I know some standard lattice-based fixpoint data-flow analyses, such as
> reaching definitions and live variable analysis. I took a course on
> “Program Analysis” at my college.
> 
> > The deduction of these attributes is inter-connected, which is the whole
> > point of the Attributor.
> 
> Thanks for explaining this part with the example!
> 
> > I'd suggest that you try to run the Attributor and follow a specific
> > attribute's updates and see what it tries to deduce. That is, see its
> > updateImpl().
> 
> Thanks for suggesting this! I will try to do this and get back to you.
> 
> > At this point, LLVM is focused on heavy inlining, which, while very
> > useful, loses a lot of the interprocedural information.
> 
> I see. It would be great if we could come up with specific examples
> where these deduced attributes improve existing inter- and intra-procedural
> optimization passes. I am very interested in exploring this potential of
> the Attributor, so I will try to include such examples in my GSoC proposal.

Most of the benefit often comes from liveness, and we are now improving how
the other attributes are used during liveness deduction. One example,
which is part of the value simplification improvements (see below), is
the following:

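```
#include <stdlib.h>

static int foo(int *A) {
  if (A == NULL) /* dead once A is known to be non-null */
    abort();
  return *A;
}
int bar(int *A) {
  *A = 1; /* storing through A implies A is non-null */
  return foo(A);
}
```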

Since `bar` stores through `A` before the call, we know `A` is not null at
the call site, and therefore the call to abort is dead.
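
To make this concrete, here is a hand-written sketch (not actual Attributor
output) of what the example could look like once the dead check is removed.
The names `foo_simplified` and `bar_simplified` are hypothetical and exist
only so both versions can be compared side by side:

```c
#include <stdlib.h>

/* Original callee: the null check guards a call to abort(). */
static int foo(int *A) {
  if (A == NULL)
    abort();
  return *A;
}

/* Hypothetical simplified callee: the check is gone because every
 * call site is assumed to pass a non-null pointer. */
static int foo_simplified(int *A) {
  return *A;
}

int bar(int *A) {
  *A = 1; /* the store implies A is non-null at the call below */
  return foo(A);
}

int bar_simplified(int *A) {
  *A = 1;
  return foo_simplified(A);
}
```

For any valid pointer both versions of `bar` return the same value; the
simplified one just no longer carries the unreachable call to abort.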

> > Liveness is certainly something that we're currently trying to improve
> > and I don't think we'll ever stop.
> 
> It would be great if you can share some of the ongoing issues or discussion
> regarding improving Liveness information deduction using Attributor.

We are working on using undefined behavior (UB) to improve liveness. We
are also working to improve value simplification (partially for the same
reason). The latter has an old version online, which I will replace with
smaller changes very soon (D68934). The former has seen quite some
discussion in D71974.
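
As a rough illustration of how a UB fact can feed liveness, the toy model
below propagates three boolean facts to a fixpoint. It is a simplified
sketch and not how the Attributor is implemented: all names (`Fact`,
`State`, `update`, `run_to_fixpoint`) are hypothetical, facts only ever go
from unknown to known, and seeding `PTR_IS_NULL` by hand stands in for
value simplification.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy facts; none of these names are part of LLVM. */
enum Fact { PTR_IS_NULL, LOAD_IS_UB, BLOCK_IS_DEAD, NUM_FACTS };

typedef struct {
  bool known[NUM_FACTS];
} State;

/* One update step for fact F. It may query other facts and returns
 * true if the state of F changed. */
static bool update(State *S, enum Fact F) {
  bool before = S->known[F];
  switch (F) {
  case PTR_IS_NULL:
    /* Seeded by the caller; stands in for value simplification. */
    break;
  case LOAD_IS_UB:
    /* A load through a pointer known to be null is UB. */
    if (S->known[PTR_IS_NULL])
      S->known[F] = true;
    break;
  case BLOCK_IS_DEAD:
    /* A block that must execute UB can be treated as dead. */
    if (S->known[LOAD_IS_UB])
      S->known[F] = true;
    break;
  default:
    break;
  }
  return S->known[F] != before;
}

/* Re-run all updates until nothing changes or max_iters rounds have
 * been done. Returns the number of rounds executed. */
static int run_to_fixpoint(State *S, int max_iters) {
  int rounds = 0;
  bool changed = true;
  while (changed && rounds < max_iters) {
    changed = false;
    /* Visit consumers before producers so that information needs
     * more than one round to propagate. */
    for (int F = NUM_FACTS - 1; F >= 0; --F)
      if (update(S, (enum Fact)F))
        changed = true;
    ++rounds;
  }
  return rounds;
}
```

Seeding `PTR_IS_NULL` and calling `run_to_fixpoint(&S, 10)` ends with
`BLOCK_IS_DEAD` known; the `max_iters` cap only guards against updates
that never settle.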

> Thanks and regards
> 
> Fahad Nayyar
> 
> 
> 
> 
> 
> On Sat, Mar 14, 2020 at 2:44 AM Stefanos Baziotis <
> stefanos.baziotis at gmail.com> wrote:
> 
> > Hi Fahad,
> >
> > We're all happy to see you being interested in LLVM! More so in the
> > Attributor! I'm a relatively new contributor so I
> > think I can help. Please note that the Attributor, apart from Johannes
> > (who is CC'd), has at least two other great
> > contributors, Hideto and Stefan (whom I also CC'd). They were among the
> > initial creators.
> >
> > In the rest of this post I'll try to help you familiarize yourself with
> > the Attributor and maybe answer your questions.
> > Johannes can then give you specific things to do to get started.
> >
> > Starting off, understanding the theory of data-flow analysis can help. I'd
> > say don't get too hung up on it; you just
> > have to understand the idea of fix-point analyses.
> >
> > I don't know how much you know about the Attributor, so I'll skip a long
> > (or too basic) description because you might already know
> > a lot of things. You can of course ask any specific questions you want.
> > A summary is:
> > The Attributor tries to deduce attributes in different points of an LLVM
> > IR program (you can see that in the video).
> > The deduction of these attributes is inter-connected, which is the whole
> > point of the Attributor. The attributes
> > "ask" one another for information. For example, one attribute tries to see
> > if a load loads from a null pointer.
> > But the pointer operand might be non-constant (like %v in LLVM IR). Well,
> > another attribute, whose job is to do value simplification
> > (i.e. constant folding / propagation etc.), might have folded that (%v)
> > into the constant null. So, the former can ask the latter.
> > These connections give the Attributor both its power and its complexity.
> >
> > The attributes have a state, that changes. When the state stops changing,
> > it has reached a fixpoint, at which point
> > the deduction of it stops. From the initialization of the attribute until
> > a fixpoint is reached, the state changes
> > in updates (called updateImpl() in the source code). This is where
> > attributes try to deduce new things, ask one another
> > and eventually try to reach a fixpoint.
> >
> > Finally, a fixpoint can be enforced, because if the state for some reason
> > never stops changing, the deduction would run forever.
> > Note however that attributes should be programmed in a way that a fixpoint
> > can always be reached
> > (this is where theory might help a little).
> >
> > I'd suggest that you try to run the Attributor and follow a specific
> > attribute's updates and see what it tries to deduce.
> > That is, see its updateImpl(). With a couple of prints you can get a good
> > idea of what it does and what info it
> > gets from other attributes (and when it stops). You can of course ask us
> > if you're interested in a specific one, if
> > there's something you don't understand etc.
> >
> > Now, to (try to) answer your questions; hopefully other people can help too.
> > > How Attributor can help for standard inter-procedural and
> > > intra-procedural analysis passes of LLVM. I’ve seen the tutorial [4]. I
> > > would like to discuss ways of improving other optimization passes similarly
> > > (or some examples which have already been implemented).
> >
> > The Attributor AFAIK is self-contained. It's not in "production" yet and
> > so it's not connected with other passes. At this point, LLVM is focused on
> > heavy inlining, which, while very useful, loses a lot of the
> > interprocedural information.
> > Note that there are other transforms that do Inter-Procedural Optimization
> > (https://github.com/llvm/llvm-project/tree/master/llvm/lib/Transforms/IPO),
> > but they don't follow the idea of the Attributor,
> > although they might follow a fix-point analysis.
> >
> > > Improve dynamic memory related capabilities of Attributor. For example
> > > Improve HeapToStackConversions. Maybe such deductions can help safety
> > > (dis)provers. For example, can we improve the use-after-free bug detection
> > > using some attributes?
> > Stefan should know more about H2S. Regarding use-after-free, I don't
> > think there are currently any plans for it directly, but I assume there could be.
> >
> > > Improve Liveness related capabilities of Attributor. Again I want to
> > > consider whether some attribute deduction can help liveness (dis)provers.
> > > For example NoReturn, WillReturn can be improved. I am sure these 2
> > > attributes do not cover all the cases as it is an undecidable problem. But
> > > I was wondering whether there is room for improvement in their deduction
> > > mechanism.
> > Liveness is certainly something that we're currently trying to
> > improve and I don't think we'll ever stop. Most of the attributes interact
> > with the deadness attribute (AAIsDead) both for asking it info and
> > providing it info (e.g. the undefined-behavior attribute hopefully will at
> > some point be able to tell AAIsDead that a block is dead because it
> > contains UB).
> >
> > > Is there any attribute that tells whether a function has
> > > side-effects (does it always give the same output for the same input? Or
> > > does it affect some global variable directly or indirectly?)?
> > No AFAIK, although you might be interested in this:
> > https://reviews.llvm.org/D74691#1887983
> >
> > I hope this was helpful! Don't hesitate to ask any questions.
> >
> > Kind regards,
> > Stefanos Baziotis
> >
> > On Fri, Mar 13, 2020 at 10:25 PM, Fahad Nayyar via llvm-dev <
> > llvm-dev at lists.llvm.org> wrote:
> >
> >> Hi all,
> >>
> >> My name is Fahad Nayyar. I am an undergraduate student from India.
> >>
> >> I am interested in participating in GSoC under the project “Improve
> >> inter-procedural analyses and optimizations”.
> >>
> >> I have been using LLVM for the past 8 months. I have written various
> >> intra-procedural analyses in LLVM as FunctionPasses for my course projects
> >> and research projects, but I have not contributed to the LLVM community yet.
> >> I am very excited to contribute to LLVM!
> >>
> >> I am not too familiar with the inter-procedural analysis infrastructure
> >> of LLVM. I have written small toy inter-procedural dataflow analyses (like
> >> taint analysis, reaching definitions, etc.) for Java programs using the Soot
> >> tool [5]. I am familiar with the theory of inter-procedural analysis
> >> (I’ve read some chapters of [1], [2] and [3] for this).
> >>
> >> I am trying to understand LLVM’s Attributor framework. I am
> >> interested in these aspects:
> >>
> >>    1. How Attributor can help for standard inter-procedural and
> >>       intra-procedural analysis passes of LLVM. I’ve seen the tutorial [4].
> >>       I would like to discuss ways of improving other optimization passes
> >>       similarly (or some examples which have already been implemented).
> >>
> >>    2. Improve dynamic memory related capabilities of Attributor. For
> >>       example Improve HeapToStackConversions. Maybe such deductions can
> >>       help safety (dis)provers. For example, can we improve the use-after-free
> >>       bug detection using some attributes?
> >>
> >>    3. Improve Liveness related capabilities of Attributor. Again I want to
> >>       consider whether some attribute deduction can help liveness (dis)provers.
> >>       For example NoReturn, WillReturn can be improved. I am sure these 2
> >>       attributes do not cover all the cases as it is an undecidable problem. But
> >>       I was wondering whether there is room for improvement in their deduction
> >>       mechanism.
> >>
> >>    4. Can we optimize the attribute deduction algorithm to reduce compile
> >>       time?
> >>
> >>    5. Is there any attribute that tells whether a function has side-effects
> >>       (does it always give the same output for the same input? Or does it affect
> >>       some global variable directly or indirectly?)?
> >>
> >>
> >> It would be great if Johannes could give me some TODOs to work on before I
> >> submit my proposal. Also, please tell me any specific IPO improvement goals
> >> you have in mind for this project. I would be most interested in memory-related
> >> attributes, liveness deductions from attributes and measurably better IPO
> >> using attribute deduction.
> >>
> >> Thanks and Regards.
> >>
> >> References:
> >>
> >> [1] Principles of Program Analysis.
> >> <https://www.springer.com/gp/book/9783540654100>
> >>
> >> [2] Data Flow Analysis: Theory and Practice.
> >> <https://dl.acm.org/doi/book/10.5555/1592955>
> >>
> >> [3] Static Program Analysis. <https://cs.au.dk/~amoeller/spa/spa.pdf>
> >>
> >> [4] 2019 LLVM Developers’ Meeting: J. Doerfert “The Attributor: A
> >> Versatile Inter-procedural Fixpoint.."
> >> <https://www.youtube.com/watch?v=HVvvCSSLiTw>
> >> [5] Soot - A Java optimization framework <https://github.com/Sable/soot>
> >>
> >>
> >

-- 

Johannes Doerfert
Researcher

Argonne National Laboratory
Lemont, IL 60439, USA

jdoerfert at anl.gov