<div dir="ltr">CFLAA already has some basic interprocedural analysis built in (see: tryInterproceduralAnalysis; it basically boils down to CFLAA grabbing the StratifiedSets for each function call and unifying sets based off of that instead of unifying everything ever). The only real changes I had in mind for it were:<div><br><div>- Adding context sensitivity (which kind of requires adding context sensitivity to CFLAA first)</div><div>- Making it less terrible for {mutually,indirectly,} recursive functions (While we're building the sets for some function A(), the sets for A() aren't accessible, so any call to A() from a function called by A() looks opaque).</div><div><br></div><div>If you want to take a stab at making IPA better/adding context sensitivity, you're more than welcome to do so. I'm happy to work on other things in the meantime :)</div><div><br></div><div>George</div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Mar 15, 2015 at 8:50 AM, Mingxing Zhang <span dir="ltr"><<a href="mailto:james0zan@gmail.com" target="_blank">james0zan@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hello Daniel,<br><br>Thank you for your comments and sorry for my mistakes, I'll revise them.<br>And I'll for sure read the paper you mentioned and survey the recent researches before deciding the implementation technique.<br><br>To George:<br>May I know the exact plan of your attempt for making cfl-aa interprocedural?<br>I do think that this is the most valuable part of my proposal, but that makes no sense to do it twice.<br><br>Maybe I can work on the porting of the flow-sensitive method proposed by Prof. Ben Hardekopf at <a href="http://www.cs.ucsb.edu/~benh/research/papers/hardekopf11flow.pdf" target="_blank">CGO11</a>.<br>It is declared in his homepage that the published source code "is written for a pre-release version of LLVM 2.5 and does not work in current versions of LLVM"<br><br>Thanks!<br><br><br></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On 15 March 2015 at 08:31, Daniel Berlin <span dir="ltr"><<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">A few notes:<br>1. "But

these standard LLVM AA passes either take a large

amount of time (Anderson Analysis at cubic time

and large memory requirements)"<div><br></div><div>Neither of these is correct. Andersen's is not cubic in practice, or large memory in practice, when implemented properly.  GCC uses it by default as the points-to implementation, and it's never even close to the top of the profile.</div><div><br></div><div>It takes about 1 second to do a million lines of code.</div><div>And one can do better (gcc's impl is no longer state of the art).</div><div><br></div><div>2. The approach to field sensitivity you mention is not going to work very well, given how much casting occurs (everything is casted). I would suggest using the approach in <a href="http://dl.acm.org/citation.cfm?id=996835" target="_blank">http://dl.acm.org/citation.cfm?id=996835</a></div><div><br></div><div>3. George, cc'd, is planning on implementing both context sensitive and context-insensitive interprocedural analysis in cfl-aa the next month or two. </div><div><br></div><div>4. Using a BDD cloning approach for CFL-AA doesn't make much sense, the whole point of CFL is not having to generate explicit points-to sets if you don't want to. Plus, followup papers and researchers have *never* been able to reproduce the scalability of Whaley's work.</div><div><br></div><div>Not to mention it's done on Java. Java is a language where doing things like field-sensitivity always increase precision, which is not true for C.</div><div><br></div><div>If you really want to attempt this, I would suggest using one of the demand driven context-sensitive approaches that will be easy to fit in CFL.</div><div><div><div><br></div><div class="gmail_quote">On Sat, Mar 14, 2015 at 5:57 AM Mingxing Zhang <<a href="mailto:james0zan@gmail.com" target="_blank">james0zan@gmail.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hello John,<br><br>I've finished the first version of my proposal on enhancing alias analysis.<br>The proposal can be downloaded at <a href="http://james0zan.github.io/resource/GSoC15-Proposal-AA.pdf" target="_blank">http://james0zan.github.io/resource/GSoC15-Proposal-AA.pdf</a>.<br>I hope I've successfully justified the necessity and benefits of this project.<br>If possible, please find some time to review it and give me some more feedbacks.<br><br>Thank you very much!<br><br>P.S. I'm working on the other proposal, a couple of days is needed.<br><br><br></div><div class="gmail_extra"><br><div class="gmail_quote">On 8 March 2015 at 21:42, Mingxing Zhang <span dir="ltr"><<a href="mailto:james0zan@gmail.com" target="_blank">james0zan@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div>Got it.<br></div>I'll try to find the applications for field-sensitivity (and interprocedural, etc) AA analysis.<br></div>And I'll do some preliminary evaluation on the tracing/slicing part for the bloat detection.<br><br></div>Thanks!<br></div><div><div><div class="gmail_extra"><br><div class="gmail_quote">On 8 March 2015 at 21:34, John Criswell <span dir="ltr"><<a href="mailto:jtcriswel@gmail.com" target="_blank">jtcriswel@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div text="#000000" bgcolor="#FFFFFF"><span>

    <div>On 3/8/15 8:56 AM, Mingxing Zhang

      wrote:<br>

    </div>

    </span><blockquote type="cite">

      <div dir="ltr">

        <div>Hello John,<br>

          <br><span>

          According to the FAQ, I can submit two proposals although at

          most one of them can be accepted.<br>

          Thus I will prepare a proposal for each of the two projects.<br>

        </span></div>

      </div>

    </blockquote>

    <br>

    Correct.  Only one proposal will be accepted.<span><br>

    <br>

    <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div>And, after reading the code of cfl-aa and several related

          papers, I've listed four milestones for the AA project: <br>

          <br>

          1) In order to use the fast algorithm described in PLDI'13

          [1], cfl-aa makes a simplification on the CFL defined in

          POPL'08 [2], which will lead to a reduction on precision (I've

          confirmed this observation with the author).<br>

          Thus a quantitative measurement on how much is the reduction

          is needed.<br>

          <br>

          2) In cfl-aa, different fields of a same struct and the whole

          array are represented by a single node.<br>

          This is the reason of the problem 2, 4 listed in <a href="http://llvm.org/OpenProjects.html" target="_blank">http://llvm.org/OpenProjects.html</a>.<br>

          We should split these large nodes.<br>

        </div>

      </div>

    </blockquote>

    <br></span>

    I think the real question is whether the loss of precision matters,

    and if so, to which uses of alias analysis.  SAFECode, for example,

    wants field information to determine type safety (so that it can

    optimize away type-safe loads and stores), so field sensitivity

    matters.  Perhaps field sensitivity doesn't matter for other

    applications (e.g., optimization).  There's no point in improving

    precision if it doesn't help the analyses that people care about

    most.<br>

    <br>

    As part of your project, I think you should state the uses of alias

    analysis/points-to analysis that you're aiming to improve and

    understand whether your proposed improvements will help that use.  I

    would also recommend picking a use that matters to a significant

    portion of the LLVM community.<span><br>

    <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div><br>

          3) Handling special global variables, such as errno.<br>

          <br>

          4) It seems that the current version of cfl-aa is an

          intraprocedural analysis.<br>

          If the time is enough, I think we may extend it to an

          interprocedural analysis.<br>

          The algorithm described in [3] can be applied to scaling it.<br>

          <br>

          As for the bloat-detection project, the final result should be

          a tool that is verified by known bugs and a set of newly

          detected bugs.<br>

        </div>

      </div>

    </blockquote>

    <br></span>

    For the bloat detection tool, I would like to be convinced that

    dynamic tracing will be, or can be, sufficiently efficient to be

    practical.  I hate to ask, but I think you need to run an experiment

    with Giri to show that dynamic slicing is going to be practical for

    the executions that you expect to analyze.  Either that, or you need

    to explain how you can use something more efficient than dynamic

    slicing (note that dynamic slicing and dynamic tracing are not the

    same, so be sure you're correctly stating which one you need).<span><br>

    <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div><br>

          Do you have any suggestions on these objectives?<br>

        </div>

      </div>

    </blockquote>

    <br></span>

    In your proposal, be sure to include a set of milestones and how

    long you think you will need to achieve those milestones.  I may

    have said that before, but it's worth repeating.<br>

    <br>

    Regards,<br>

    <br>

    John Criswell<div><div><br>

    <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div><br>

          Thanks!<br>

          <br>

          <br>

        </div>

        [1] Fast Algorithms for Dyck-CFL-Reachability with Applications

        to Alias Analysis. PLDI'13<br>

        <div>[2] Demand-Driven Alias Analysis for C. POPL'08<br>

          [3] Demand-Driven Context-Sensitive Alias Analysis for Java.

          ISSTA'11<br>

          <br>

        </div>

      </div>

      <div class="gmail_extra"><br>

        <div class="gmail_quote">On 5 March 2015 at 09:58, Mingxing

          Zhang <span dir="ltr"><<a href="mailto:james0zan@gmail.com" target="_blank">james0zan@gmail.com</a>></span>

          wrote:<br>

          <blockquote class="gmail_quote">

            <div dir="ltr">

              <div>

                <div>Wow, that is cool!<br>

                </div>

                I'll check about it.<br>

                <br>

              </div>

              Thank you!<br>

            </div>

            <div>

              <div>

                <div class="gmail_extra"><br>

                  <div class="gmail_quote">On 4 March 2015 at 21:57,

                    John Criswell <span dir="ltr"><<a href="mailto:jtcriswel@gmail.com" target="_blank">jtcriswel@gmail.com</a>></span>

                    wrote:<br>

                    <blockquote class="gmail_quote">

                      <div><span>

                          <div>On 3/4/15 2:18 AM, Mingxing Zhang wrote:<br>

                          </div>

                          <blockquote type="cite">

                            <div dir="ltr">Hello John,<br>

                              <br>

                              Thank you for your advices and

                              congratulations~<br>

                              <br>

                              I'll read the code of cfl-aa and Giri

                              first and make the decision of which

                              project to pursue.<br>

                              The choice will be reported to this thread

                              once I made the determination (hopefully

                              within this week).<br>

                            </div>

                          </blockquote>

                          <br>

                        </span> You should check for yourself, but I

                        don't think anything prevents you from

                        submitting two proposals.  If you have time to

                        write two strong proposals, I see no problem

                        with that.<br>

                        <br>

                        Just make sure that any proposal you write is

                        strong: it provides a concrete explanation of

                        what you want to do, some justification for why

                        it would benefit the community (short or long

                        term), and why you're the person qualified to do

                        it.  Proposals should also include a set of

                        milestones and expected dates for completing

                        those milestones.<br>

                        <br>

                        Regards,<br>

                        <br>

                        John Criswell

                        <div>

                          <div><br>

                            <br>

                            <blockquote type="cite">

                              <div dir="ltr"><br>

                                Thanks!<br>

                                <br>

                              </div>

                              <div class="gmail_extra"><br>

                                <div class="gmail_quote">On 3 March 2015

                                  at 23:12, John Criswell <span dir="ltr"><<a href="mailto:jtcriswel@gmail.com" target="_blank">jtcriswel@gmail.com</a>></span>

                                  wrote:<br>

                                  <blockquote class="gmail_quote">

                                    <div>

                                      <div>Dear Mingxing,<br>

                                        <br>

                                        I think both projects are

                                        interesting and useful.<br>

                                        <br>

                                        Points-to analysis is something

                                        that is needed by research users

                                        of LLVM, but to the best of my

                                        knowledge, no solid

                                        implementation currently exists

                                        (although the cfl-aa work being

                                        done at Google may provide us

                                        with something; you should check

                                        into it before writing a

                                        proposal).  My interest is in a

                                        points-to analysis that is

                                        robust and is useful to both

                                        research and industry users of

                                        LLVM.  A points-to analysis

                                        proposal must indicate how it

                                        will help both of these subsets

                                        of the LLVM community, and it

                                        must argue why current efforts

                                        do not meet the requirements of

                                        both subsets of the community.<br>

                                        <br>

                                        The runtime bloat tool also

                                        looks interesting, and your

                                        approach (at least to me) is

                                        interesting.  One question in my

                                        mind, though, is whether dynamic

                                        slicing is going to work well. 

                                        Swarup Sahoo and I built a

                                        dynamic slicer for LLVM named

                                        Giri, and we found the tracing

                                        required for dynamic slicing to

                                        be slow.  For our purposes, the

                                        overhead was okay as we only

                                        needed to record execution until

                                        a crash (which happened

                                        quickly).  In your bloat tool,

                                        the program will probably run

                                        for awhile, creating a long

                                        trace record.  You should take a

                                        look at the Giri code, use it to

                                        trace some programs, and see if

                                        the overheads are going to be

                                        tolerable.  If they are not,

                                        then your first task would be to

                                        optimize Giri for your bloat

                                        tool.<br>

                                        <br>

                                        You should also be more specific

                                        about which LLVM instructions

                                        will be traced.  For example, I

                                        wouldn't record the outputs of

                                        every LLVM instruction; I might

                                        only record the outputs of loads

                                        and stores or the end of a

                                        def-use chain.<br>

                                        <br>

                                        I'd be interested in mentoring

                                        either project.<br>

                                        <br>

                                        BTW, it looks like your FSE

                                        paper won an award.  Congrats.<br>

                                        <br>

                                        Regards,<br>

                                        <br>

                                        John Criswell

                                        <div>

                                          <div><br>

                                            <br>

                                            <br>

                                            <br>

                                            <br>

                                            <br>

                                            <br>

                                            On 3/3/15 2:30 AM, Mingxing

                                            Zhang wrote:<br>

                                          </div>

                                        </div>

                                      </div>

                                      <blockquote type="cite">

                                        <div>

                                          <div>

                                            <div dir="ltr">Hi all,<br>

                                              <br>

                                              As a Ph.D. student majored

                                              in Software Reliability, I

                                              have used LLVM in many of

                                              my projects, such as the

                                              Anticipating Invariant (<a href="http://james0zan.github.io/AI.html" target="_blank">http://james0zan.github.io/AI.html</a>)

                                              and some other undergoing

                                              ones.<br>

                                              Thus, it would be a great

                                              pleasure for me if I could

                                              take this opportunity to

                                              contribute to this awesome

                                              project.<br>

                                              <br>

                                              After reading the idea

                                              list (<a href="http://llvm.org/OpenProjects.html" target="_blank">http://llvm.org/OpenProjects.html</a>),

                                              I was most interested in

                                              the idea of improving the

                                              "Pointer and Alias

                                              Analysis" passes.<br>

                                              Could you please give me

                                              some more tips or advices

                                              on how to get started on

                                              working on the

                                              application?<br>

                                              <br>

                                              Simultaneously, I also

                                              have another idea about

                                              using LLVM to detect

                                              runtime bloat, just like

                                              the ThreadSanitizer tool

                                              for data races.<br>

                                              If there is anyone here

                                              who would like to mentor

                                              this project, could you

                                              please find some time to

                                              review the <a href="https://gist.github.com/james0zan/d03737c60b10d0d11d34" target="_blank">more

                                                detailed proposal on

                                                gist</a> and give me

                                              some feedbacks?<br>

                                              <br>

                                              P.S. <br>

                                                I do prefer the bloat

                                              detection tool, but I'm

                                              not sure about whether it

                                              is suitable for GSoC.<br>

                                                Thus I will apply for

                                              the Alias Analysis one if

                                              it is not suitable.<br>

                                              <br>

                                              Thanks!<br>

                                              <br>

                                              -- <br>

                                              <div>

                                                <div dir="ltr">

                                                  <div>

                                                    <div dir="ltr">

                                                      <div>Mingxing

                                                        Zhang

                                                        <div><br>

                                                        </div>

                                                        <div>Tel.: <a href="tel:%2B86-10-62797143" value="+861062797143" target="_blank">+86-10-62797143</a><br>

                                                        </div>

                                                        <div>Web: <a href="http://james0zan.github.io/" target="_blank">http://james0zan.github.io/</a><br>

                                                        </div>

                                                        <div>Addr: Room

                                                          3-122, FIT

                                                          Building,

                                                          Tsinghua

                                                          University,

                                                          Beijing

                                                          100084, China</div>

                                                      </div>

                                                    </div>

                                                  </div>

                                                </div>

                                              </div>

                                            </div>

                                            <br>

                                            <fieldset></fieldset>

                                            <br>

                                          </div>

                                        </div>

                                        <pre>_______________________________________________

LLVM Developers mailing list

<a href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a>

<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a>

</pre>

                                      </blockquote>

                                      /<span><br>

                                        <br>

                                        <pre cols="72">-- 

John Criswell

Assistant Professor

Department of Computer Science, University of Rochester

<a href="http://www.cs.rochester.edu/u/criswell" target="_blank">http://www.cs.rochester.edu/u/criswell</a></pre>

                                      </span></div>

                                  </blockquote>

                                </div>

                                <br>

                                <br>

                                <br>

                                -- <br>

                                <div>

                                  <div dir="ltr">

                                    <div>

                                      <div dir="ltr">

                                        <div>Mingxing Zhang

                                          <div><br>

                                          </div>

                                          <div>Tel.: <a href="tel:%2B86-10-62797143" value="+861062797143" target="_blank">+86-10-62797143</a><br>

                                          </div>

                                          <div>Web: <a href="http://james0zan.github.io/" target="_blank">http://james0zan.github.io/</a><br>

                                          </div>

                                          <div>Addr: Room 3-122, FIT

                                            Building, Tsinghua

                                            University, Beijing 100084,

                                            China</div>

                                        </div>

                                      </div>

                                    </div>

                                  </div>

                                </div>

                              </div>

                            </blockquote>

                            <br>

                            <br>

                            <pre cols="72">-- 

John Criswell

Assistant Professor

Department of Computer Science, University of Rochester

<a href="http://www.cs.rochester.edu/u/criswell" target="_blank">http://www.cs.rochester.edu/u/criswell</a></pre>

                          </div>

                        </div>

                      </div>

                    </blockquote>

                  </div>

                  <br>

                  <br>

                  <br>

                  -- <br>

                  <div>

                    <div dir="ltr">

                      <div>

                        <div dir="ltr">

                          <div>Mingxing Zhang

                            <div><br>

                            </div>

                            <div>Tel.: <a href="tel:%2B86-10-62797143" value="+861062797143" target="_blank">+86-10-62797143</a><br>

                            </div>

                            <div>Web: <a href="http://james0zan.github.io/" target="_blank">http://james0zan.github.io/</a><br>

                            </div>

                            <div>Addr: Room 3-122, FIT Building,

                              Tsinghua University, Beijing 100084, China</div>

                          </div>

                        </div>

                      </div>

                    </div>

                  </div>

                </div>

              </div>

            </div>

          </blockquote>

        </div>

        <br>

        <br>

        <br>

        -- <br>

        <div>

          <div dir="ltr">

            <div>

              <div dir="ltr">

                <div>Mingxing Zhang

                  <div><br>

                  </div>

                  <div>Tel.: <a href="tel:%2B86-10-62797143" value="+861062797143" target="_blank">+86-10-62797143</a><br>

                  </div>

                  <div>Web: <a href="http://james0zan.github.io/" target="_blank">http://james0zan.github.io/</a><br>

                  </div>

                  <div>Addr: Room 3-122, FIT Building, Tsinghua

                    University, Beijing 100084, China</div>

                </div>

              </div>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    <br>

    <br>

    <pre cols="72">-- 

John Criswell

Assistant Professor

Department of Computer Science, University of Rochester

<a href="http://www.cs.rochester.edu/u/criswell" target="_blank">http://www.cs.rochester.edu/u/criswell</a></pre>

  </div></div></div>

</blockquote></div><br><br clear="all"><br>-- <br><div><div dir="ltr"><div><div dir="ltr"><div>Mingxing Zhang<div><br></div><div>Tel.: <a href="tel:%2B86-10-62797143" value="+861062797143" target="_blank">+86-10-62797143</a><br></div><div>Web: <a href="http://james0zan.github.io/" target="_blank">http://james0zan.github.io/</a><br></div><div>Addr: Room 3-122, FIT Building, Tsinghua University, Beijing 100084, China</div></div></div></div></div></div>

</div>

</div></div></blockquote></div><br><br clear="all"><br>-- <br><div><div dir="ltr"><div><div dir="ltr"><div>Mingxing Zhang<div><br></div><div>Tel.: <a href="tel:%2B86-10-62797143" value="+861062797143" target="_blank">+86-10-62797143</a><br></div><div>Web: <a href="http://james0zan.github.io/" target="_blank">http://james0zan.github.io/</a><br></div><div>Addr: Room 3-122, FIT Building, Tsinghua University, Beijing 100084, China</div></div></div></div></div></div>

</div>

______________________________<u></u>_________________<br>

LLVM Developers mailing list<br>

<a href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>

<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/<u></u>mailman/listinfo/llvmdev</a><br>

</blockquote></div></div></div></div>

</blockquote></div><br><br clear="all"><br>-- <br><div><div dir="ltr"><div><div dir="ltr"><div>Mingxing Zhang<div><br></div><div>Tel.: <a href="tel:%2B86-10-62797143" value="+861062797143" target="_blank">+86-10-62797143</a><br></div><div>Web: <a href="http://james0zan.github.io/" target="_blank">http://james0zan.github.io/</a><br></div><div>Addr: Room 3-122, FIT Building, Tsinghua University, Beijing 100084, China</div></div></div></div></div></div>

</div>

</div></div></blockquote></div><br></div>