# [LLVMdev] Dependence Analysis [was: Flow-Sensitive AA]

Wojciech Matyjewicz wmatyjewicz at fastmail.fm
Thu Aug 21 14:21:33 PDT 2008

```> Great, thanks!  How much work do you think it will take to bring it up
> to date with the LLVM IR, except *ignoring* first-class structs and
> arrays for now?  I believe llvm-gcc does not generate those in most
> cases and we can do a lot without supporting those.  What else is
> missing relative to the current LLVM IR?

To my surprise, it took almost no time (only some minor changes
unrelated to the IR were necessary). Furthermore, it seems that lack of
support for first-class structs and arrays is not a matter of
correctness but a matter of precision. I think the analysis should give
correct answers even for bitcode that uses recently introduced
instructions. I have attached a version that can be built against the
SVN version.

The code contains many FIXME notes. In general, they indicate
possibilities of efficiency or precision improvements rather than bugs.
However, there is one issue I have ignored - possibility of overflow in
the index expression. Suppose, we have such a loop:
for (i8 i = 0; i != 200; ++i) {
A[2 * i + 5] = ...
... = A[2 * i + 3]
}
If both index expressions are evaluated in 8-bit arithmetic,
then the dependence equation should be solved in modular arithmetic:
2 * i + 5 == 2 * (i + delta) + 3 (mod 256)
So the dependence distance is delta == 1 or delta == -127 what gives
two direction vectors: (<), (>). My version will produce only the first
one. Taking into account overflow issue during dependence analysis seems
very difficult to me. Does anybody have some experience in this area?
Fortunately, programmers generally do not write programs in such a
nasty way.

And now a small explaination of debug output for mandel.c (included in
the attachment):

[ ... ]

Testing DMA dependence between:
DMA to %tmp95 of size 1 (m)
DMA to %tmp95 of size 1 (m)
// Testing for output dependence between pict[py * pict_width + px]
// assignments in different iterations. DMA stands for Direct Memory
// Access that means an LLVM instruction that directly accesses memory
// (the opposite is call site).

[ ... ]

CommonNestSize: 2
// Number of loops that are common for both instructions
// (In this example the first and second instructions are the same -
// we are testing for self-dependence)

LBounds1: (1049, 1679)
// Bounds for all loops containing the first instruction

LBounds2: (1049, 1679)
// Bounds for all loop containing the second instruction

ALengths: (0)
// Dimensions of accessed array, 0 = unknown

Indices1: (1680*i_0 + 1*i_1 + 0)
// Affine formula for index expression in the first instruction
// i_0 means induction variable for loop at level 0 (outermost),
// i_1 means IV for loop at level 1, etc...

Indices2: (1680*i_0 + 1*i_1 + 0)
// Affine formula for index expression in the first instruction

Group: (0)
Pos 0:
Banerjee test:
U1: (1049, 1679); U2: (1049, 1679)
A1: (1680, 1), B1: 0
A2: (1680, 1), B2: 0
Trying (*,*): (L = -1763999, M = 0, R = 1763999)1
Trying (<,*): (L = -1763999, M = 0, R = -1)0
Trying (=,*): (L = -1679, M = 0, R = 1679)1
Trying (=,<): (L = -1679, M = 0, R = -1)0
Trying (=,=): (L = 0, M = 0, R = 0)1
Trying (=,>): (L = 1, M = 0, R = 1679)0
Trying (>,*): (L = 1, M = 0, R = 1763999)0
possible dependence; direction vectors : { (=,=) }
Delta test: possible dependence; direction vectors : { (=,=) }
// Dependence tester internals

Array dependence tester: dependence; direction vectors: { (=,=) }