<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hi all,<br>
<br>
I have two questions about alias analysis; see below for the
background on why I am asking these questions. First, the questions:<br>
<ol>
<li>Is there an up-to-date list somewhere of all the alias
analyses in LLVM, and what their capabilities are? I've been
told that the AliasAnalysis page in the official documentation
(<a class="moz-txt-link-freetext" href="http://llvm.org/docs/AliasAnalysis.html">http://llvm.org/docs/AliasAnalysis.html</a>) is severely out of
date. I see in particular that it makes no mention of e.g. TBAA,
which shows up in "opt -help".</li>
<li>Is it possible to override the default alias analysis
implementation when running the "llc" tool (or from clang)? I am
working from within a MachineFunctionPass, so (unfortunately) I
cannot use "opt", which would provide easy command-line options
to do this. If there is no command-line ability to specify a
different alias analysis, how could I do so in code? Presently,
I am simply using getAnalysis&lt;AliasAnalysis&gt;() to get the
default alias analysis, which is BasicAA.</li>
</ol>
Now for the background:<br>
<br>
I am writing an analysis pass at the machine-IR level
(MachineFunctionPass) for security research. (Before anyone asks,
yes, we've considered doing this in regular IR, and it isn't
sufficient for our needs. :-) We are studying code-reuse attacks and
need to examine the actual machine instructions available to an
attacker in the final compiled binary.)<br>
<br>
I'm currently exploring the use of IR-level alias analysis
information in my MachineFunctionPass. Specifically, I'm using
AliasSetTracker on the Value* pointers returned by
MachineMemOperand::getValue() (similar to what
MachineInstr::mayAlias() does, though we're not using that interface
because we're interested in alias sets).<br>
<br>
So far, I've been using BasicAliasAnalysis's results, since they're
the default that I get when simply using
getAnalysis&lt;AliasAnalysis&gt;(). However, I'd like to try some
more advanced alias analyses. Specifically, I'm looking for
something that provides better field-sensitivity within structs on
the stack. BasicAA is field-sensitive in general, but I've observed
that it has trouble distinguishing between (for instance) scalar
and array fields within the same struct when the array is accessed
through a variable index. A simple motivating example of this is:<br>
<br>
<tt>struct mystruct {</tt><tt><br>
</tt><tt> short s;</tt><tt><br>
</tt><tt> int i;</tt><tt><br>
</tt><tt> int arr[5];</tt><tt><br>
</tt><tt>};</tt><tt><br>
</tt><tt><br>
</tt><tt>struct mystruct str;</tt><tt><br>
</tt><tt>str.s = (short) rand();</tt><tt><br>
</tt><tt>str.i = rand();</tt><tt><br>
</tt><tt>for (int idx = 0; idx &lt; 5; ++idx)</tt><tt> {</tt><tt><br>
</tt><tt> str.arr[idx] = rand();</tt><tt><br>
</tt><tt>}</tt><tt><br>
</tt><tt><br>
</tt><tt>printf("%hd %d %d %d %d %d %d", str.s, str.i, str.arr[0],
str.arr[1], str.arr[2], str.arr[3], str.arr[4]);</tt><br>
<br>
For the above code, BasicAA knows that str.s, str.i, and
str.arr[0-4] are all pairwise NoAlias, but it says that they are all
PartialAlias with str.arr[idx]. Clearly, this is because idx is a
variable with unknown value, and it doesn't fall into one of the
"special cases" that BasicAA can recognize as NoAlias. (For
instance, if I write str.arr[(unsigned)idx] instead, it can figure
out that it is NoAlias with str.s, since the explicit zero-extension
lets BasicAA know that the index <b>*must*</b> be non-negative, and
they are far enough apart that they can't alias. Oddly, it still
thinks str.i is PartialAlias...I haven't figured this one out but
would guess it somehow slips through the cracks of BasicAA's special
cases.)<br>
<br>
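To make the byte-range reasoning in the example above concrete, here is
a rough sketch in plain C. It is not LLVM code and not how BasicAA is
actually implemented; the helper names (byte_range, field_s_range,
field_i_range, arr_elem_range, arr_may_overlap, check_example) are made
up for illustration. It computes which bytes of the struct each field
occupies and checks whether str.arr[idx] can overlap str.s or str.i for
a given range of idx:<br>
<pre>
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same layout as the struct in the example above. */
struct mystruct {
    short s;
    int i;
    int arr[5];
};

/* Half-open byte range [begin, end), measured from the start of the struct. */
struct byte_range {
    long begin;
    long end;
};

/* Bytes occupied by str.s (hypothetical helper). */
struct byte_range field_s_range(void)
{
    long begin = (long)offsetof(struct mystruct, s);
    struct byte_range r = { begin, begin + (long)sizeof(short) };
    return r;
}

/* Bytes occupied by str.i (hypothetical helper). */
struct byte_range field_i_range(void)
{
    long begin = (long)offsetof(struct mystruct, i);
    struct byte_range r = { begin, begin + (long)sizeof(int) };
    return r;
}

/* Bytes that str.arr[idx] would occupy. This is pure arithmetic; nothing
   is dereferenced, so idx may be out of bounds or negative here. */
struct byte_range arr_elem_range(long idx)
{
    long begin = (long)offsetof(struct mystruct, arr)
                 + idx * (long)sizeof(int);
    struct byte_range r = { begin, begin + (long)sizeof(int) };
    return r;
}

/* Two half-open ranges overlap iff each one starts before the other ends. */
bool ranges_overlap(struct byte_range a, struct byte_range b)
{
    return a.begin < b.end && b.begin < a.end;
}

/* True if some idx in [lo, hi] makes str.arr[idx] overlap the given field. */
bool arr_may_overlap(struct byte_range field, long lo, long hi)
{
    for (long idx = lo; idx <= hi; ++idx) {
        if (ranges_overlap(arr_elem_range(idx), field)) {
            return true;
        }
    }
    return false;
}

/* In-bounds indices (0..4) can never touch the scalar fields. */
void check_example(void)
{
    assert(!arr_may_overlap(field_s_range(), 0, 4));
    assert(!arr_may_overlap(field_i_range(), 0, 4));
}
```
</pre>
With idx restricted to 0..4, the array element never overlaps str.s or
str.i. If idx is allowed to be -1, arr_may_overlap(field_i_range(), -1, 4)
returns true, assuming the usual layout in which i sits immediately
before arr. That is the kind of case an analysis has to allow for when it
knows nothing about the sign or range of idx.<br>
<br>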
Having an array alongside scalars in a struct is fairly common, and
I'd like to not have to sacrifice field sensitivity when it occurs.
My thinking, therefore, is that I need a smarter AA that knows
things about types, e.g., that it's undefined behavior to access a
scalar struct field through an out-of-bounds pointer to an array
field in the same struct. This sounds like the sort of thing TBAA
("Type-Based Alias Analysis") might be able to do - hence my
questions above.<br>
<br>
Thanks in advance for your time and assistance!<br>
<br>
Sincerely,<br>
Ethan Johnson<br>
<pre class="moz-signature" cols="72">--
Ethan J. Johnson
Computer Science PhD student, Systems group, University of Rochester
<a class="moz-txt-link-abbreviated" href="mailto:ejohns48@cs.rochester.edu">ejohns48@cs.rochester.edu</a>
<a class="moz-txt-link-abbreviated" href="mailto:ethanjohnson@acm.org">ethanjohnson@acm.org</a>
PGP public key available from public directory or on request</pre>
</body>
</html>