<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 12/23/15 12:55 PM, Russell Wallace
wrote:<br>
</div>
<blockquote
cite="mid:CAH+nB+w6eBuYT=g7RG_46vBg4fCkU59U95XDavVjOCPR48tQ5A@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">On Wed, Dec 23, 2015 at 5:35 PM, John
Criswell <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:jtcriswel@gmail.com" target="_blank">jtcriswel@gmail.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote">
<div>DSA was built when LLVM's optimizations maintained
the type information on GEP and other instructions (DSA
existed before LLVM was open-source). As such, it uses
LLVM's type information to aid in its type-inference
which, in turn, gives it field sensitivity which, in
turn, improves its accuracy. Over time, LLVM
optimizations have come to modify the type information
so that it is just simple byte-level indexing (as
opposed to array-of-structure indexing). DSA hasn't
been updated to handle that well. That is why its
precision is better pre-optimization than
post-optimization.<br>
</div>
</blockquote>
<div><br>
</div>
<div>Ah! I don't suppose you could point to some examples of
this? E.g. a simple test program such that one could
eyeball the intermediate code before and after
optimization? <br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
Off the top of my head, no, I don't have an example, but I suspect
any program that indexes into an array inside a for loop will
do.<br>
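<br>
Something along these lines might serve as a starting point. This is
an untested sketch: the struct and function names are made up for
illustration, and nothing here has been checked against DSA itself. It
only shows the array-of-structures access inside a for loop:<br>

```c
/* Hypothetical test program: the names are illustrative only and do
   not come from DSA or LLVM. */
struct point {
    int x;
    int y;
};

/* Sum the y field of every element. Each iteration indexes into an
   array of structures and then selects a field, which is the kind of
   access that field-sensitive analysis relies on. */
int sum_y(const struct point *pts, unsigned n)
{
    int total = 0;
    for (unsigned i = 0; i != n; ++i)
        total += pts[i].y;
    return total;
}
```

Compiling this with <code>clang -S -emit-llvm</code> at
<code>-O0</code> and again at <code>-O2</code> should produce two
files whose indexing instructions can be compared by eye. What the
optimized version looks like will depend on the LLVM version.<br>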
<br>
<blockquote
cite="mid:CAH+nB+w6eBuYT=g7RG_46vBg4fCkU59U95XDavVjOCPR48tQ5A@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote">
<div> <br>
Just out of curiosity, what are you trying to do? I
need call graph analysis for C/C++ code with function
pointers, and so I'm writing an NSF proposal to seek
funding to do that (among other enhancements to my SVA
infrastructure). If it's something that would be useful
to you (or other LLVM community members), it would be
useful for me to know that.<br>
</div>
</blockquote>
</div>
<br>
</div>
<div class="gmail_extra">SVA?<br>
</div>
</div>
</blockquote>
<br>
Sorry. SVA is Secure Virtual Architecture. It's my LLVM-based
infrastructure for controlling operating system kernel behavior via
compiler instrumentation and hardware configuration. I've used it
to build a system that protects applications from a compromised
operating system kernel as well as to enforce memory safety and
control-flow integrity on operating system kernel code.<br>
<br>
I need DSA for doing things like:<br>
<br>
1) Creating an accurate call graph for kernel code to enforce better
control-flow integrity and to test our future infrastructure for
measuring the efficacy of defenses against code reuse attacks.<br>
<br>
2) Analyzing the memory accesses of kernel modules to see if they
modify kernel data structures that they should not modify (e.g., to
find rootkits that modify the process list).<br>
<br>
3) Optimizing the run-time checks that protect kernel data
structures from other kernel components (useful for a
number of things).<br>
<br>
In short, strong points-to and call graph analysis enable some
interesting research projects.<br>
<br>
<blockquote
cite="mid:CAH+nB+w6eBuYT=g7RG_46vBg4fCkU59U95XDavVjOCPR48tQ5A@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra"><br>
I'm trying to write a superoptimizer that can optimize code
based on a high-level understanding of what it's actually
doing, so yes, call graph analysis that can deal with function
pointers does seem likely to be one of the things that will be
needed.<br>
</div>
</div>
</blockquote>
<br>
Nice.<br>
<br>
One thing you might want to investigate is whether building a call
graph analysis on top of the TBAA metadata would work. If TBAA works
for lots of programs (I hear some non-conformant programs cause it
problems), then using it as a springboard for analysis may be
effective (as TBAA is already well maintained in the LLVM source
tree).<br>
<br>
Regards,<br>
<br>
John Criswell<br>
<br>
<pre class="moz-signature" cols="72">--
John Criswell
Assistant Professor
Department of Computer Science, University of Rochester
<a class="moz-txt-link-freetext" href="http://www.cs.rochester.edu/u/criswell">http://www.cs.rochester.edu/u/criswell</a></pre>
</body>
</html>