[LLVMdev] [GSoC] Applying for GSoC 2015

Mingxing Zhang james0zan at gmail.com
Sun Mar 8 06:42:30 PDT 2015


Got it.
I'll try to find the applications for field-sensitivity (and
interprocedural, etc) AA analysis.
And I'll do some preliminary evaluation on the tracing/slicing part for the
bloat detection.

Thanks!

On 8 March 2015 at 21:34, John Criswell <jtcriswel at gmail.com> wrote:

>  On 3/8/15 8:56 AM, Mingxing Zhang wrote:
>
>  Hello John,
>
> According to the FAQ, I can submit two proposals although at most one of
> them can be accepted.
> Thus I will prepare a proposal for each of the two projects.
>
>
> Correct.  Only one proposal will be accepted.
>
>
>  And, after reading the code of cfl-aa and several related papers, I've
> listed four milestones for the AA project:
>
> 1) In order to use the fast algorithm described in PLDI'13 [1], cfl-aa
> makes a simplification on the CFL defined in POPL'08 [2], which will lead
> to a reduction on precision (I've confirmed this observation with the
> author).
> Thus a quantitative measurement on how much is the reduction is needed.
>
> 2) In cfl-aa, different fields of a same struct and the whole array are
> represented by a single node.
> This is the reason of the problem 2, 4 listed in
> http://llvm.org/OpenProjects.html.
> We should split these large nodes.
>
>
> I think the real question is whether the loss of precision matters, and if
> so, to which uses of alias analysis.  SAFECode, for example, wants field
> information to determine type safety (so that it can optimize away
> type-safe loads and stores), so field sensitivity matters.  Perhaps field
> sensitivity doesn't matter for other applications (e.g., optimization).
> There's no point in improving precision if it doesn't help the analyses
> that people care about most.
>
> As part of your project, I think you should state the uses of alias
> analysis/points-to analysis that you're aiming to improve and understand
> whether your proposed improvements will help that use.  I would also
> recommend picking a use that matters to a significant portion of the LLVM
> community.
>
>
> 3) Handling special global variables, such as errno.
>
> 4) It seems that the current version of cfl-aa is an intraprocedural
> analysis.
> If the time is enough, I think we may extend it to an interprocedural
> analysis.
> The algorithm described in [3] can be applied to scaling it.
>
> As for the bloat-detection project, the final result should be a tool that
> is verified by known bugs and a set of newly detected bugs.
>
>
> For the bloat detection tool, I would like to be convinced that dynamic
> tracing will be, or can be, sufficiently efficient to be practical.  I hate
> to ask, but I think you need to run an experiment with Giri to show that
> dynamic slicing is going to be practical for the executions that you expect
> to analyze.  Either that, or you need to explain how you can use something
> more efficient than dynamic slicing (note that dynamic slicing and dynamic
> tracing are not the same, so be sure you're correctly stating which one you
> need).
>
>
> Do you have any suggestions on these objectives?
>
>
> In your proposal, be sure to include a set of milestones and how long you
> think you will need to achieve those milestones.  I may have said that
> before, but it's worth repeating.
>
> Regards,
>
> John Criswell
>
>
>
> Thanks!
>
>
>  [1] Fast Algorithms for Dyck-CFL-Reachability with Applications to Alias
> Analysis. PLDI'13
> [2] Demand-Driven Alias Analysis for C. POPL'08
> [3] Demand-Driven Context-Sensitive Alias Analysis for Java. ISSTA'11
>
>
> On 5 March 2015 at 09:58, Mingxing Zhang <james0zan at gmail.com> wrote:
>
>>  Wow, that is cool!
>>  I'll check about it.
>>
>>  Thank you!
>>
>> On 4 March 2015 at 21:57, John Criswell <jtcriswel at gmail.com> wrote:
>>
>>>  On 3/4/15 2:18 AM, Mingxing Zhang wrote:
>>>
>>> Hello John,
>>>
>>> Thank you for your advices and congratulations~
>>>
>>> I'll read the code of cfl-aa and Giri first and make the decision of
>>> which project to pursue.
>>> The choice will be reported to this thread once I made the determination
>>> (hopefully within this week).
>>>
>>>
>>>  You should check for yourself, but I don't think anything prevents you
>>> from submitting two proposals.  If you have time to write two strong
>>> proposals, I see no problem with that.
>>>
>>> Just make sure that any proposal you write is strong: it provides a
>>> concrete explanation of what you want to do, some justification for why it
>>> would benefit the community (short or long term), and why you're the person
>>> qualified to do it.  Proposals should also include a set of milestones and
>>> expected dates for completing those milestones.
>>>
>>> Regards,
>>>
>>> John Criswell
>>>
>>>
>>>
>>> Thanks!
>>>
>>>
>>> On 3 March 2015 at 23:12, John Criswell <jtcriswel at gmail.com> wrote:
>>>
>>>>  Dear Mingxing,
>>>>
>>>> I think both projects are interesting and useful.
>>>>
>>>> Points-to analysis is something that is needed by research users of
>>>> LLVM, but to the best of my knowledge, no solid implementation currently
>>>> exists (although the cfl-aa work being done at Google may provide us with
>>>> something; you should check into it before writing a proposal).  My
>>>> interest is in a points-to analysis that is robust and is useful to both
>>>> research and industry users of LLVM.  A points-to analysis proposal must
>>>> indicate how it will help both of these subsets of the LLVM community, and
>>>> it must argue why current efforts do not meet the requirements of both
>>>> subsets of the community.
>>>>
>>>> The runtime bloat tool also looks interesting, and your approach (at
>>>> least to me) is interesting.  One question in my mind, though, is whether
>>>> dynamic slicing is going to work well.  Swarup Sahoo and I built a dynamic
>>>> slicer for LLVM named Giri, and we found the tracing required for dynamic
>>>> slicing to be slow.  For our purposes, the overhead was okay as we only
>>>> needed to record execution until a crash (which happened quickly).  In your
>>>> bloat tool, the program will probably run for awhile, creating a long trace
>>>> record.  You should take a look at the Giri code, use it to trace some
>>>> programs, and see if the overheads are going to be tolerable.  If they are
>>>> not, then your first task would be to optimize Giri for your bloat tool.
>>>>
>>>> You should also be more specific about which LLVM instructions will be
>>>> traced.  For example, I wouldn't record the outputs of every LLVM
>>>> instruction; I might only record the outputs of loads and stores or the end
>>>> of a def-use chain.
>>>>
>>>> I'd be interested in mentoring either project.
>>>>
>>>> BTW, it looks like your FSE paper won an award.  Congrats.
>>>>
>>>> Regards,
>>>>
>>>> John Criswell
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 3/3/15 2:30 AM, Mingxing Zhang wrote:
>>>>
>>>>  Hi all,
>>>>
>>>> As a Ph.D. student majored in Software Reliability, I have used LLVM in
>>>> many of my projects, such as the Anticipating Invariant (
>>>> http://james0zan.github.io/AI.html) and some other undergoing ones.
>>>> Thus, it would be a great pleasure for me if I could take this
>>>> opportunity to contribute to this awesome project.
>>>>
>>>> After reading the idea list (http://llvm.org/OpenProjects.html), I was
>>>> most interested in the idea of improving the "Pointer and Alias Analysis"
>>>> passes.
>>>> Could you please give me some more tips or advices on how to get
>>>> started on working on the application?
>>>>
>>>> Simultaneously, I also have another idea about using LLVM to detect
>>>> runtime bloat, just like the ThreadSanitizer tool for data races.
>>>> If there is anyone here who would like to mentor this project, could
>>>> you please find some time to review the more detailed proposal on gist
>>>> <https://gist.github.com/james0zan/d03737c60b10d0d11d34> and give me
>>>> some feedbacks?
>>>>
>>>> P.S.
>>>>   I do prefer the bloat detection tool, but I'm not sure about whether
>>>> it is suitable for GSoC.
>>>>   Thus I will apply for the Alias Analysis one if it is not suitable.
>>>>
>>>> Thanks!
>>>>
>>>> --
>>>>   Mingxing Zhang
>>>>
>>>>  Tel.: +86-10-62797143
>>>>  Web: http://james0zan.github.io/
>>>>  Addr: Room 3-122, FIT Building, Tsinghua University, Beijing 100084,
>>>> China
>>>>
>>>>
>>>>  _______________________________________________
>>>> LLVM Developers mailing listLLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.eduhttp://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>>  /
>>>>
>>>> --
>>>> John Criswell
>>>> Assistant Professor
>>>> Department of Computer Science, University of Rochesterhttp://www.cs.rochester.edu/u/criswell
>>>>
>>>>
>>>
>>>
>>> --
>>>   Mingxing Zhang
>>>
>>>  Tel.: +86-10-62797143
>>>  Web: http://james0zan.github.io/
>>>  Addr: Room 3-122, FIT Building, Tsinghua University, Beijing 100084,
>>> China
>>>
>>>
>>>
>>> --
>>> John Criswell
>>> Assistant Professor
>>> Department of Computer Science, University of Rochesterhttp://www.cs.rochester.edu/u/criswell
>>>
>>>
>>
>>
>> --
>>   Mingxing Zhang
>>
>>  Tel.: +86-10-62797143
>>  Web: http://james0zan.github.io/
>>  Addr: Room 3-122, FIT Building, Tsinghua University, Beijing 100084,
>> China
>>
>
>
>
> --
>   Mingxing Zhang
>
>  Tel.: +86-10-62797143
>  Web: http://james0zan.github.io/
>  Addr: Room 3-122, FIT Building, Tsinghua University, Beijing 100084,
> China
>
>
>
> --
> John Criswell
> Assistant Professor
> Department of Computer Science, University of Rochesterhttp://www.cs.rochester.edu/u/criswell
>
>


-- 
Mingxing Zhang

Tel.: +86-10-62797143
Web: http://james0zan.github.io/
Addr: Room 3-122, FIT Building, Tsinghua University, Beijing 100084, China
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150308/ca7c068e/attachment.html>


More information about the llvm-dev mailing list