[PATCH] D21536: [CFLAA] Include externally-visible memory aliasing information in function summaries

Jia Chen via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 22 10:27:24 PDT 2016


grievejia marked an inline comment as done.

================
Comment at: lib/Analysis/CFLAliasAnalysis.cpp:755
@@ +754,3 @@
+  // the same StratifiedIndex.
+  DenseMap<StratifiedIndex, std::vector<InterfaceValue>> InterfaceMap;
+  // Insert the given parameter/return value as well as the values below it into
----------------
george.burgess.iv wrote:
> grievejia wrote:
> > george.burgess.iv wrote:
> > > This still seems n^2-ish in some cases, which I believe ultimately makes this algorithm n^3 time, and n^2 storage. Imagine we have:
> > > 
> > > ```
> > > void foo(int **a, int ***b, int ****c) {
> > >   *c = b;
> > >   *b = a;
> > >   int *d = *a;
> > > }
> > > ```
> > > 
> > > That would give us one "chain" of
> > > 
> > > ```
> > > a
> > > ^
> > > b
> > > ^
> > > c
> > > ^
> > > d
> > > ```
> > > 
> > > When scanning these, it looks like we'll add 4 entries for `a`, 3 entries for `b`, and 2 for `c`. It also looks like we may end up with 6 entries in cases like:
> > > 
> > > ```
> > > void foo(int ***a, int ***b) {
> > >   int **ad = *a;
> > >   int *add = *ad;
> > >   int **bd = *b;
> > >   int *bdd = *bd;
> > > }
> > > ```
> > > 
> > > When I think we shouldn't have more than 2 (or 4, depending on how we want to handle attributes).
> > > 
> > > Assuming I'm not missing something, I think we can improve on this. AIUI, given two stratifiedset chains:
> > > 
> > > ```
> > > a
> > > ^
> > > b  e
> > > ^  ^
> > > c  f
> > > ^  ^
> > > d  g
> > >    ^
> > >    h
> > > ```
> > > 
> > > Adding an assignment edge between b and e (or c and f, or d and g) produces the same result as adding assignment edges between all of the aforementioned pairs.
> > > 
> > > If this is correct, we can knock down the time+space complexity a bit, I think. The rough idea I have for how to do this is:
> > > 
> > > - Build a map, M, of StratifiedIndex -> vector<Value *>, where each vector is the list of Args/Return values that are in that StratifiedSet
> > > - For each element, E, in the above map:
> > >   - Add entries to `RetParamRelations` that represent assignment between `E.second[0]` and `E.second[1..E.second.size()]` (**)
> > >   - For each set index S below `E.first`:
> > >     - If we don't find a mapping in M for S, try the next lower set.
> > >     - Otherwise, given said mapping, M', for a set N levels below E, add an entry that represents N levels of dereference between `E.second[0]` and `M'.second[0]` in `RetParamRelations`.
> > > 
> > > (** If you want, you can probably do this when building the map, and just make the map a StratifiedIndex -> Value* map. I realized this after writing the above, and am lazy :) )
> > > 
> > > With this, it looks like we'll end up with a linear number of entries (worst-case; best case, we'll have 0), and we'll end up walking N*M StratifiedSets total (N = number of args/returns, M = max(chain length))
> > Well, I'm not convinced that the current algorithm has an n^2 complexity. I think the complexity should be O(m*n), where m is the max chain length, even without your enhancement. You are right that m can sometimes be greater or equal to n, which essentially makes it O(n^2). However, normally I wouldn't expect m to be greater than 3 in most cases (at least I myself rarely write functions that handle pointers of >3 depth). So I would say m can almost be treated as a constant term.
> > 
> > I do agree that the algorithm can be improved in the way you described. If I understand correctly, you are suggesting that the InterfaceMap population algorithm should be executed in a "breadth-first" manner rather than a "depth-first" manner, since the former allows us to detect redundant entries earlier hence we can just shortcut exit if redundancies were found.
> > 
> > I'll try to change the codes as you suggested. Thanks for the comment!
> > at least I myself rarely write functions that handle pointers of >3 depth
> 
> FWIW, `RetVals[I]->getType()` goes through 3 levels of indirection, since RetVals is a reference to a vector of pointers. :)
> 
> > However, normally I wouldn't expect m to be greater than 3 in most cases
> 
> I'd buy that both m and n are generally going to stay < 8 for the vast majority of real-world code, yeah.
> 
> > the InterfaceMap population algorithm should be executed in a "breadth-first" manner rather than a "depth-first"
> 
> Sounds correct to me.
Now that I think about it, I'm not so sure if skipping shortcut exit is a safe option. My concern is a case like this:
```
a
^
b  < d
^
c
```
, where two parameters both belongs to set b, yet their dereferences belongs to set c and d, resp. If this is possible (which I assume highly likely since otherwise the analysis becomes indistinguishable from Steensgard), then "A and B aliases implies *A and *B also aliases" is going to be a false statement. 


http://reviews.llvm.org/D21536





More information about the llvm-commits mailing list