<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">On 12/23/15 12:55 PM, Russell Wallace

      wrote:<br>

    </div>

    <blockquote

cite="mid:CAH+nB+w6eBuYT=g7RG_46vBg4fCkU59U95XDavVjOCPR48tQ5A@mail.gmail.com"

      type="cite">

      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">On Wed, Dec 23, 2015 at 5:35 PM, John

            Criswell <span dir="ltr">&lt;<a moz-do-not-send="true"
                href="mailto:jtcriswel@gmail.com" target="_blank">jtcriswel@gmail.com</a>&gt;</span>

            wrote:<br>

            <blockquote class="gmail_quote">

              <div>DSA was built when LLVM's optimizations maintained

                the type information on GEP and other instructions (DSA

                existed before LLVM was open-source).  As such, it uses

                LLVM's type information to aid in its type-inference

                which, in turn, gives it field sensitivity which, in

                turn, improves its accuracy.  Over time, LLVM

                optimizations have come to modify the type information

                so that it is just simple byte-level indexing (as

                opposed to array-of-structure indexing).  DSA hasn't

                been updated to handle that well.  That is why its

                precision is better pre-optimization than

                post-optimization.<br>

              </div>

            </blockquote>

            <div><br>

            </div>

            <div>Ah! I don't suppose you could point to some examples of

              this? E.g. a simple test program such that one could

              eyeball the intermediate code before and after

              optimization? <br>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    <br>

    Off the top of my head, no, I don't have an example, but I suspect
    any program that indexes into an array inside a for loop will
    do.<br>
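    <br>
    As an untested sketch of the kind of program I mean (the struct
    and function names are made up for illustration), something like
    the following indexes into an array of structures inside a for
    loop, so both the array index and the field offset should show up
    in the address computation:<br>

```c
#include <stddef.h>

/* Hypothetical record type: two fields, so that the field offset
   is visible in the address computation. */
struct point {
    int x;
    int y;
};

/* Sums the y field of every element. The access pts[i].y is the
   array-of-structure indexing to look for in the IR. */
long sum_y(const struct point *pts, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += pts[i].y;
    return total;
}
```

    Assuming the file is saved as example.c (a placeholder name),
    compiling it once with <tt>clang -O0 -S -emit-llvm example.c -o before.ll</tt>
    and once with <tt>clang -O2 -S -emit-llvm example.c -o after.ll</tt>
    should let you compare how the address of pts[i].y is computed in
    each version.  I have not checked what any particular LLVM release
    emits for this, so treat it as a starting point.<br>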

    <br>

    <blockquote

cite="mid:CAH+nB+w6eBuYT=g7RG_46vBg4fCkU59U95XDavVjOCPR48tQ5A@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <blockquote class="gmail_quote">

              <div> <br>

                Just out of curiosity, what are you trying to do?  I

                need call graph analysis for C/C++ code with function

                pointers, and so I'm writing an NSF proposal to seek

                funding to do that (among other enhancements to my SVA

                infrastructure).  If it's something that would be useful

                to you (or other LLVM community members), it would be

                useful for me to know that.<br>

              </div>

            </blockquote>

          </div>

          <br>

        </div>

        <div class="gmail_extra">SVA?<br>

        </div>

      </div>

    </blockquote>

    <br>

    Sorry.  SVA is Secure Virtual Architecture.  It's my LLVM-based

    infrastructure for controlling operating system kernel behavior via

    compiler instrumentation and hardware configuration.  I've used it

    to build a system that protects applications from a compromised

    operating system kernel as well as to enforce memory safety and

    control-flow integrity on operating system kernel code.<br>

    <br>

    I need DSA for doing things like:<br>

    <br>

    1) Creating an accurate call graph for kernel code to enforce better

    control-flow integrity and to test our future infrastructure for

    measuring the efficacy of defenses against code reuse attacks.<br>

    <br>

    2) Analyzing the memory accesses of kernel modules to see if they

    modify kernel data structures that they should not modify (e.g., to

    find rootkits that modify the process list).<br>

    <br>

    3) Optimizing the run-time checks that protect kernel data
    structures from other kernel components (useful for a
    number of things).<br>

    <br>

    In short, strong points-to and call graph analysis enable some

    interesting research projects.<br>

    <br>

    <blockquote

cite="mid:CAH+nB+w6eBuYT=g7RG_46vBg4fCkU59U95XDavVjOCPR48tQ5A@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra"><br>

          I'm trying to write a superoptimizer that can optimize code

          based on a high-level understanding of what it's actually

          doing, so yes, call graph analysis that can deal with function

          pointers does seem likely to be one of the things that will be

          needed.<br>

        </div>

      </div>

    </blockquote>

    <br>

    Nice.<br>

    <br>

    One thing you might want to investigate is whether building a call
    graph analysis on top of the TBAA metadata would work.  If TBAA works
    for lots of programs (I hear that some non-conformant programs cause
    it problems), then using it as a springboard for analysis may be
    effective, since TBAA is already well maintained in the LLVM source
    tree.<br>
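    <br>
    To make the function pointer problem concrete, here is a small
    sketch of the kind of code such an analysis has to handle.  All of
    the names are hypothetical; the pattern is just an operations table
    in the style that kernel code tends to use:<br>

```c
#include <stddef.h>

/* Hypothetical operations table with two function pointer fields
   of the same type. */
struct file_ops {
    int (*open)(int id);
    int (*close)(int id);
};

static int disk_open(int id)  { return id + 1; }
static int disk_close(int id) { return id - 1; }
static int net_open(int id)   { return id * 2; }
static int net_close(int id)  { return -id; }

static const struct file_ops disk_ops = { disk_open, disk_close };
static const struct file_ops net_ops  = { net_open,  net_close  };

/* Selects a table at run time, so the callee below is not known
   statically. */
const struct file_ops *pick_ops(int use_net)
{
    return use_net ? &net_ops : &disk_ops;
}

/* Indirect call through the open field. A precise call graph would
   list disk_open and net_open as possible targets here, but neither
   of the close functions. */
int do_open(const struct file_ops *ops, int id)
{
    return ops->open(id);
}
```

    A field-sensitive analysis can tell the open field apart from the
    close field even though both have the same function pointer type.
    Whether the TBAA metadata carries enough information to make that
    distinction is the part that would need investigating.<br>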

    <br>

    Regards,<br>

    <br>

    John Criswell<br>

    <br>

    <pre class="moz-signature" cols="72">-- 

John Criswell

Assistant Professor

Department of Computer Science, University of Rochester

<a class="moz-txt-link-freetext" href="http://www.cs.rochester.edu/u/criswell">http://www.cs.rochester.edu/u/criswell</a></pre>

  </body>

</html>