[llvm-dev] [GSOC] "Project: Improve inter-procedural analyses and optimisations"

Sun Mar 15 15:12:08 PDT 2020

Dear Stefan and Stefanos,

Thanks for your suggestions!

> I'd suggest that you try to run the Attributor and follow a specific
attribute's updates and see what it tries to deduce. That is, see its
updateImpl(). With a couple of prints you can get a good idea of what it
does and what info it gets from other attributes (and when it stops).

I tried to do this for the NoUnwind attribute. I printed getState(), i
sAssumedNoUnwind(), isKnownNoUnwind() in updateImpl method of classes
AANoUnwind and AANoUnwindCallSite. I run the tests in nounwind.ll
(/llvm/test/Transforms/Attributor/nounwind.ll).
I used this command to run the test: “opt -attributor
-attributor-disable=false nounwind.ll -S &> nounwind_out.ll”. But After
seeing the output I was not able to understand how the attribute is
changing for the tests. Its status was almost constant every time updateimp was
called. Please tell me what other things should I try to print to better
observe how the NoUnwind attribute is changing over the iterations of fix
point analysis. Also please verify whether I am using the correct command
to run the tests.

> Also, probably this will be a very interesting panel discussion for you:
https://www.youtube.com/watch?v=cC2cspQgSxM

Thanks for suggesting this! I watched the video and now I understand the
pros and cons of inlining. But I still think that It would take me a while
before I can come up with a very good example demonstrating the use of of
of the Attribues in some IPO pass. I know how in [1] Johaanes explained the
use of MaxObjSize and Dereferenceable in the AliasAnalysis. But I would be
happy if I could come up with some even better example.

> You are somewhat right. However, H2S is not about 'use-after-free' bug
detection, but rather its prevention. We already do this, see example.
<https://godbolt.org/z/HgrC7H>

Thanks for sharing the example. Just for clarification, was this example
demonstrating the point that we can automatically correct use-after-free
bugs using attributes? If yes, then I didn’t understand how and which
attribute helped in this correction? Also is it not wrong to change the IR
as in this example? Replacing %1 = tail call noalias i8* @malloc(i64 4) ;
tail call void @no_sync_func(i8* %1) with  %1 = alloca i8, i64 4 solved the
use-after-free bug, but doesn’t it also change the semantic of the program?

> In the meantime you could look at some TODOs in the Attributor itself and
try those you see fit.

I looked up some of the TODOs. I found AAReachability a very interesting
attribute. I can start with TODO at line no. 2605  // TODO: Return the
number of reachable queries. I can work towards this TODO. But I first want
your advice on whether it looks doable for me. I can see that
implementation of AAReachability attribute is not complete yet. I can try
to learn more about it from D70233 <https://reviews.llvm.org/D70233>and
D71617 <https://reviews.llvm.org/D71617>.

I am trying to get more familiar with Attributor’s code. Since the code is
very large right now, I thought to refer to some of the very initial
patches of attributor (D59918, <https://reviews.llvm.org/D59918> D60012
<https://reviews.llvm.org/D60012>, D63379 <https://reviews.llvm.org/D63379>).
I believe that by looking at these three I can get a better idea of the
framework as a whole. Please suggest if this is a good idea or not. Also
please suggest any other way by which I can improve my understanding of the
code.

I can see that Johanned have put up some issues for GSOC aspirants. I think
that [2] <https://github.com/llvm/llvm-project/issues/179> ([Attributor]
Cleanup and upstream `Attribute::MaxObjectSize`) will be a very good issue
for me, It seems doable and I can get familiar with the whole process of
writing a patch for an issue. How should I indicate to the community that I
have started working towards this issue (should I comment on the issue page
on github?)? I can try to work on AAReachability TODO after solving this
issue.

Thanks and Regards

References

[1] https://youtu.be/HVvvCSSLiTw
[2] https://github.com/llvm/llvm-project/issues/179

On Sat, Mar 14, 2020 at 4:12 PM Stefan Stipanovic <stefomeister at gmail.com>
wrote:

> Hi Fahad,
>
>
>> > Improve dynamic memory related capabilities of Attributor. For example
>> Improve HeapToStackConversions. Maybe such deductions can help safety
>> (dis)provers. For example, can we improve the use-after-free bug
>> detection using some attributes?
>> Stefan should know more about H2S. Regarding the use-after-free, I don't
>> think there's currently any plans for it directly, but they can be I assume.
>
>
> You are somewhat right. However, H2S is not about 'use-after-free' bug
> detection, but rather its prevention. We already do this, see example
> <https://godbolt.org/z/HgrC7H>.
>
> In the rest of this post I'll try to help you familiarize yourself with
>> the Attributor and maybe answer your questions.
>> Johannes can then give you specific things to do to get started.
>
>
> In the meantime you could look at some TODOs in the Attributor itself and
> try those you see fit.
>
> If you have any questions, don't hesitate to ask.
>
> -stefan
>
> On Fri, Mar 13, 2020 at 10:14 PM Stefanos Baziotis <
> stefanos.baziotis at gmail.com> wrote:
>
>> Hi Fahad,
>>
>> We're all happy to see you being interested in LLVM! More so in the
>> Attributor! I'm a relatively new contributor so I
>> think I can help. Please note that the Attributor, apart from Johannes
>> (who CC'd), has at least another 2 great
>> contributors, Hideto and Stefan (who I also CC'd). They were among the
>> initial creators.
>>
>> In the rest of this post I'll try to help you familiarize yourself with
>> the Attributor and maybe answer your questions.
>> Johannes can then give you specific things to do to get started.
>>
>> Starting off, understanding the theory of data-flow analysis can help.
>> I'd say don't get too hang up on it, you just
>> have to understand the idea of fix-point analyses.
>>
>> I don't how much you know about the Attributor, so I'll defer a too long
>> (or too beginner) description because you might already know
>> a lot of things. You can of course any specific questions you want:
>> A summary is:
>> The Attributor tries to deduce attributes in different points of an LLVM
>> IR program (you can see that in the video).
>> The deduction of these attributes is inter-connected, which is the whole
>> point of the Attributor. The attributes
>> "ask" one another for information. For example, one attribute tries to
>> see if a load loads from null pointer.
>> But the pointer operand might be non-constant (like %v in LLVM IR). Well,
>> another attribute, whose job is to do value simplification
>> (i.e. constant folding / propagation etc.) might have folded that (%v)
>> into the constant null. So, the former can ask him.
>> These connections give the power and the complexity.
>>
>> The attributes have a state, that changes. When the state stops changing,
>> it has reached a fixpoint, at which point
>> the deduction of it stops. From the initialization of the attribute until
>> a fixpoint is reached, the state changes
>> in updates (called updateImpl() in the source code). This is where
>> attributes try to deduce new things, ask one another
>> and eventually try to reach a fixpoint.
>>
>> Finally, a fixpoint can be enforced. Because if we for some reason never
>> stop changing, it would run forever.
>> Note however that attributes should be programmed in a way that fixpoint
>> should be able to be reached
>> (This is where theory might help a little).
>>
>> I'd suggest that you try to run the Attributor and follow a specific
>> attribute's updates and see what it tries to deduce.
>> That is, see its updateImpl(). With a couple of prints you can get a good
>> idea of what it does and what info it
>> gets from other attributes (and when it stops). You can of course ask us
>> if you're interested in a specific one, if
>> there's something you don't understand etc.
>>
>> Now, to (try to) answer your questions and hopefully other people can
>> help.
>> > How Attributor can help for standard inter-procedural and
>> intra-procedural analysis passes of LLVm. I’ve seen the tutorial [4]. I
>> would like to discuss ways of improving other optimization passes similarly
>> (or some examples which have already been implemented).
>>
>> The Attributor AFAIK is self-contained. It's not in "production" yet and
>> so it's not connected with other passes. At this point, LLVM is focused on
>> heavy inlining, which while very useful, you'll lose a lot of the
>> interprocedural information.
>> Note that there are other transforms that do Inter-Procedural
>> Optimization (
>> https://github.com/llvm/llvm-project/tree/master/llvm/lib/Transforms/IPO)
>> but they don't follow the idea of the Attributor.
>> But they might follow a fix-point analysis.
>>
>> > Improve dynamic memory related capabilities of Attributor. For example
>> Improve HeapToStackConversions. Maybe such deductions can help safety
>> (dis)provers. For example, can we improve the use-after-free bug
>> detection using some attributes?
>> Stefan should know more about H2S. Regarding the use-after-free, I don't
>> think there's currently any plans for it directly, but they can be I assume.
>>
>> > Improve Liveness related capabilities of Attributor. Again I want to
>> consider whether some attribute deduction can help liveness (dis)provers.
>> For example NoReturn, WillReturn can be improved. I am sure these 2
>> attributes do not cover all the cases as it is an undecidable problem. But
>> I was wondering whether there is room for improvement in their deduction
>> mechanism. Liveness is certainly something that we're currently trying to
>> improve and I don't think we'll ever stop. Most of the attributes interact
>> with the deadness attribute (AAIsDead) both for asking it info and
>> providing it info (i.e. the undefined-behavior attribute hopefully will at
>> some point be able to tell AAIsDead that a block is dead because it
>> contains UB). > Is there any attribute that tells whether a function has
>> side-effects (does it always gives the same output for the same input?
>> Or does it affect some global variable directly or indirectly?)? No AFAIK,
>> although you might be interested in this:
>> https://reviews.llvm.org/D74691#1887983
>>
>> I hope this was helpful! Don't hesitate to ask any questions.
>>
>> Kind regards,
>> Stefanos Baziotis
>>
>> Στις Παρ, 13 Μαρ 2020 στις 10:25 μ.μ., ο/η Fahad Nayyar via llvm-dev <
>> llvm-dev at lists.llvm.org> έγραψε:
>>
>>> Hi all,
>>>
>>> My name is Fahad Nayyar. I am an undergraduate student from India.
>>>
>>> I am interested to participate in GSOC under the project “Improve
>>> inter-procedural analyses and optimizations”.
>>>
>>> I have been using LLVM for the past 8 months. I have written various
>>> intra-procedural analysis in LLVM as FunctionPass for my course projects
>>> and research projects. But I’ve not contributed to the LLVM community yet.
>>> I am very excited to contribute to LLVM!
>>>
>>> I am not too familiar with the inter-procedural analysis infrastructure
>>> of LLVM. I have written small toy inter-procedural dataflow analysis (like
>>> taint analysis, reaching definitions, etc) for JAVA programs using SOOT
>>> tool *[5].* I am familiar with the theory of inter-procedural analysis
>>> (I’ve read some chapters of  [1],  [2] and [3] for this).
>>>
>>> I am trying to understand the LLVM’s Attributor framework. I am
>>> interested in these 3 aspects:
>>>
>>>    1.
>>>
>>>    How Attributor can help for standard inter-procedural and
>>>    intra-procedural analysis passes of LLVm. I’ve seen the tutorial [4].
>>>    I would like to discuss ways of improving other optimization passes
>>>    similarly (or some examples which have already been implemented).
>>>    2.
>>>
>>>    Improve dynamic memory related capabilities of Attributor. For
>>>    example Improve HeapToStackConversions. Maybe such deductions can
>>>    help safety (dis)provers. For example, can we improve the use-after-free
>>>    bug detection using some attributes?
>>>    3.
>>>
>>>    Improve Liveness related capabilities of Attributor. Again I want to
>>>    consider whether some attribute deduction can help liveness (dis)provers.
>>>    For example NoReturn, WillReturn can be improved. I am sure these 2
>>>    attributes do not cover all the cases as it is an undecidable problem. But
>>>    I was wondering whether there is room for improvement in their deduction
>>>    mechanism.
>>>    4.
>>>
>>>    Can we optimize the attribute deduction algorithm to reduce compile
>>>    time?
>>>    5.
>>>
>>>    Is there any attribute that tells whether a function has side-effects
>>>    (does it always gives the same output for the same input? Or does it affect
>>>    some global variable directly or indirectly?)?
>>>
>>>
>>> It would be great if Johannes can provide me some TODOs before
>>> submitting my proposal. Also please tell some specific IPO improvement
>>> goals which you have in mind for this project. I would be most interested
>>> in memory-related attributes, liveness deductions from attributes and
>>> measurable better IPO using attribute deduction.
>>>
>>> Thanks and Regards.
>>>
>>> References:
>>>
>>> [1] Principles of Program Analysis.
>>> <https://www.springer.com/gp/book/9783540654100>
>>>
>>> [2] Data Flow Analysis: Theory and Practice.
>>> <https://dl.acm.org/doi/book/10.5555/1592955>
>>>
>>> [3] Static Program Analysis. <https://cs.au.dk/~amoeller/spa/spa.pdf>
>>>
>>> [4] 2019 LLVM Developers’ Meeting: J. Doerfert “The Attributor: A
>>> Versatile Inter-procedural Fixpoint.."
>>> <https://www.youtube.com/watch?v=HVvvCSSLiTw>
>>> [5] Soot - A Java optimization framework <https://github.com/Sable/soot>
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200316/5eedd25b/attachment.html>