[LLVMdev] Re: Alias Analysis Design & Implementation and LLVM

Mon Nov 10 16:44:02 PST 2003

Chris and everyone else,

Below I summarize my understanding of what llvm does when converting to 
SSA and a clarification on why backward dataflow analyses can not be 
performed on "just" SSA.

>> Scalar variables still have a stack location associated with them,
>> don't they?
>
> No, they don't.  All scalar values in LLVM represent "virtual 
> registers":
> you cannot take their address, and they do not live in memory.  In 
> fact,
> LLVM does not have an "address of" operator.  Stack memory is all
> explicitly allocated with the 'alloca' instruction.  See Section 2.3 of
> this paper for more discussion on this topic:
>
> http://llvm.cs.uiuc.edu/pubs/2003-09-30-LifelongOptimizationTR.html
>
>>> This gives us some very nice properties: for example the C/C++
>>> front-end does not have an "SSA construction pass", instead all
>>> automatic variables live on the stack (in memory) and are accessed 
>>> via
>>> loads & stores.  As such, no values are live across basic blocks, and
>>> there are no PHI nodes in the C frontend output (making the C 
>>> frontend
>>> easier to implement).

I made various test programs, ran them through the llvm C frontend, and 
then just called the mem2reg pass, which raises some memory references 
to virtual registers (what people typically think of as scalars) to 
generate an SSA representation.  Any virtual register use satisfies the 
SSA requirements that only one definition reaches that use.

llvmgcc test6.c -o test6.ll -S
llvm-as < test6.ll | opt -mem2reg -simplifycfg | llvm-dis > 
test6_mem2reg.ll

In a sense llvm is doing a simple type of alias analysis when they do
a mem2reg pass using no other analyses.  It appears that the
following memory references are assumed to possibly be aliased and
therefore are not treated as scalar variables (ie. they are always 
loaded
and stored to memory) in the resulting SSA:
- global variables
- fields in a struct allocated with malloc and fields in local structs
- a dereference of any parameter pointer, *p
- an array reference whether the array is malloced or local
- if a local variable has its address taken as parameter to a function,
it could be aliased and therefore is not treated as a scalar
     NOTE: p = &A;
           d = *p + 10;
     is treated with some implicit copy propagation.
     quote from Chris,
     "Ok, there are certain 'optimizations' that you cannot turn off in 
LLVM.
      A trivial example of this is copy-propagation.  LLVM does not 
_even have_
      a copy instruction, so the 'q = &A' instruction is just deleted, 
and any
      references to q use &A instead.  This means that your *q turns 
into a
      direct assignment to A."
Anyone from the LLVM group should correct me if any of the above
observations are incorrect or incomplete.

I think this has the following consequences for the OpenAnalysis 
project:
- Instead of having the most naive alias analysis as default, an alias
analysis that assumes any local variables that do not have their
address taken in any way (var ref in parameter list, addressOf operator,
etc) should be considered to not alias anyone other variable references.
- To write even this most simple alias analysis in OA we must define
an interface to the SourceIR (ROSE/Sage or Open64/Whirl or LLVM) that
allows us to query the necessary information.

>>> Because of this, there was no need to implement a dataflow framework,
>>> though doing so would not be hard.  However, I'm not aware of any
>>> analysis which requires traditional dataflow, or that cannot be
>>> implemented more efficiently in SSA form.  If you are aware of one,
>>> please let me know.
>>>  :)
>>
>> Backward dataflow analyses, like live variable analysis, cannot be
>> performed with SSA [Choi91, JohnsonPingali93].
>
> I've implemented live variable analysis in LLVM, source here:
>
> http://llvm.cs.uiuc.edu/doxygen/LiveVariables_8h-source.html
> http://llvm.cs.uiuc.edu/doxygen/LiveVariables_8cpp-source.html
>
> It uses a very straight-forward and efficient algorithm, described in
> Appel's book.

Ok, my previously sent example was bogus. =)  Upon further 
consideration, I realized that all of the papers that said backward 
dataflow analysis can not be done with SSA meant "with only SSA".  In 
other words, an algorithm that uses both the CFG and SSA (like the live 
var analysis in LLVM) can do backward dataflow problems.  Attached to 
this email is another example of  the backward flow problem live 
variable analysis.  This time the SSA graph is shown as well as the 
CFG.  The example shows that constant propagation (a forward analysis) 
can be performed by only traversing the SSA edges, but that live var 
analysis requires the CFG as well.

Cheers,
Michelle

-------------- next part --------------
A non-text attachment was scrubbed...
Name: SSA-woCFG-backward.pdf
Type: application/pdf
Size: 36160 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20031110/05bf4f9b/attachment.pdf>
-------------- next part --------------

=======================================
  Michelle Mills Strout, Ph.D.
  Argonne National Laboratory, MCS Division
  9700 S. Cass Avenue, Bldg 221, Rm D243
  Argonne, IL  60439-4844

  http://www.mcs.anl.gov/~mstrout
  mstrout at mcs.anl.gov
=======================================