<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">On 12/23/15 2:09 AM, Russell Wallace

      wrote:<br>

    </div>

    <blockquote

cite="mid:CAH+nB+yBu-dCHfU=hg7G41Dupzui1RxpNn6Dcq4YDHhDs-2LjA@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">On Tue, Dec 22, 2015 at 10:55 AM,

            John Criswell <span dir="ltr">&lt;<a moz-do-not-send="true"

                href="mailto:jtcriswel@gmail.com" target="_blank">jtcriswel@gmail.com</a>&gt;</span>

            wrote:<br>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px

              0.8ex;border-left:1px solid

              rgb(204,204,204);padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000"><span class=""></span>You

                could conservatively assume that any function that has

                its address taken has a pointer to it that escapes into

                memory or external code.  </div>

            </blockquote>

            <div><br>

              Right, that's what I'm doing to start with.<br>

               </div>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px

              0.8ex;border-left:1px solid

              rgb(204,204,204);padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000">To make things a

                little more accurate, you could scan the uses of any

                function for which hasAddressTaken() returns true and

                see if any of its uses escapes its function or escapes

                into memory or external code.  I believe

                hasAddressTaken() returns true if the function is

                subjected to a cast instruction, and functions are often

                cast if they are used in a call that uses a different

                signature than the function's declared signature.<br>

              </div>

            </blockquote>

            <div><br>

              I'll look into that. It seems reasonable to guess that the

              major confounding factor in many C++ programs will be

              references from virtual function tables; there should be

              some way to optimize those specifically. <br>

            </div>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px

              0.8ex;border-left:1px solid

              rgb(204,204,204);padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000"> <br>

                To get anything more accurate, you'll need to use alias

                analysis or points-to analysis.  DSA tracks function

                pointers in the heap and can tell you whether the

                function is called from external code.  However, DSA's

                accuracy currently suffers if it is run after LLVM's

                optimizations, and the code needs some serious TLC.<br>

              </div>

            </blockquote>

            <div><br>

              DSA presumably stands for data structure analysis. TLC =

              tender loving care? Why does DSA become less accurate if

              run after optimization?<br>

            </div>

          </div>

          <br>

        </div>

      </div>

    </blockquote>

    <br>

    DSA was built when LLVM's optimizations preserved the type

    information on GEP and other instructions (DSA existed before LLVM

    was open-source).  It therefore relies on LLVM's type information

    to aid its type inference, which gives it field sensitivity, which

    in turn improves its precision.  Over time, LLVM's optimizations

    have come to rewrite typed indexing into simple byte-level indexing

    (as opposed to array-of-structure indexing).  DSA hasn't been

    updated to handle that well, which is why its precision is better

    pre-optimization than post-optimization.<br>

    <br>
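    As a rough illustration of the two indexing styles, here is a
    minimal C++ sketch.  It is an analogy only: it is not LLVM IR and
    not DSA code, and the Handler type, the Callback alias, and the
    three function names are hypothetical, made up for this example.
    Both address computations yield the same pointer, but only the
    first keeps the element and field structure that a type-based
    analysis could use for field sensitivity.<br>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical types, invented for this illustration.
using Callback = void (*)();

struct Handler {
    std::int32_t id;
    Callback callback;
};

// Array-of-structure indexing: the expression names the element and the
// field, so the result is visibly the address of a function-pointer field.
Callback *typedFieldAddress(Handler *table, std::size_t i) {
    return &table[i].callback;
}

// Byte-level indexing: the same address computed as base + byte offset.
// The element and field structure no longer appears in the expression.
Callback *byteFieldAddress(Handler *table, std::size_t i) {
    unsigned char *base = reinterpret_cast<unsigned char *>(table);
    unsigned char *raw =
        base + i * sizeof(Handler) + offsetof(Handler, callback);
    return reinterpret_cast<Callback *>(raw);
}

// Returns true when both forms yield the same address for element i.
bool formsAgree(std::size_t i) {
    Handler table[4] = {};
    assert(i < 4 && "index must stay inside the demo table");
    return static_cast<void *>(typedFieldAddress(table, i)) ==
           static_cast<void *>(byteFieldAddress(table, i));
}
```

    Given only the second form, an analysis sees a base pointer plus a
    byte offset and has to recover the field boundaries on its own.<br>
    <br>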

    Just out of curiosity, what are you trying to do?  I need call-graph

    analysis for C/C++ code with function pointers, so I'm writing an

    NSF proposal to seek funding for that work (among other enhancements

    to my SVA infrastructure).  If such an analysis would be useful to

    you (or to other LLVM community members), it would help me to know

    that.<br>

    <br>

    Regards,<br>

    <br>

    John Criswell<br>

    <br>

    <br>

    <pre class="moz-signature" cols="72">-- 

John Criswell

Assistant Professor

Department of Computer Science, University of Rochester

<a class="moz-txt-link-freetext" href="http://www.cs.rochester.edu/u/criswell">http://www.cs.rochester.edu/u/criswell</a></pre>

  </body>

</html>