[PATCH] D21536: [CFLAA] Include externally-visible memory aliasing information in function summaries

George Burgess IV via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 21 17:03:44 PDT 2016


george.burgess.iv added inline comments.

================
Comment at: lib/Analysis/CFLAliasAnalysis.cpp:755
@@ +754,3 @@
+  // the same StratifiedIndex.
+  DenseMap<StratifiedIndex, std::vector<InterfaceValue>> InterfaceMap;
+  // Insert the given parameter/return value as well as the values below it into
----------------
grievejia wrote:
> george.burgess.iv wrote:
> > This still seems n^2-ish in some cases, which I believe ultimately makes this algorithm n^3 time, and n^2 storage. Imagine we have:
> > 
> > ```
> > void foo(int **a, int ***b, int ****c) {
> >   *c = b;
> >   *b = a;
> >   int *d = *a;
> > }
> > ```
> > 
> > That would give us one "chain" of
> > 
> > ```
> > a
> > ^
> > b
> > ^
> > c
> > ^
> > d
> > ```
> > 
> > When scanning these, it looks like we'll add 4 entries for `a`, 3 entries for `b`, and 2 for `c`. It also looks like we may end up with 6 entries in cases like:
> > 
> > ```
> > void foo(int ***a, int ***b) {
> >   int **ad = *a;
> >   int *add = *ad;
> >   int **bd = *b;
> >   int *bdd = *bd;
> > }
> > ```
> > 
> > When I think we shouldn't have more than 2 (or 4, depending on how we want to handle attributes).
> > 
> > Assuming I'm not missing something, I think we can improve on this. AIUI, given two stratifiedset chains:
> > 
> > ```
> > a
> > ^
> > b  e
> > ^  ^
> > c  f
> > ^  ^
> > d  g
> >    ^
> >    h
> > ```
> > 
> > Adding an assignment edge between b and e (or c and f, or d and g) produces the same result as adding assignment edges between all of the aforementioned pairs.
> > 
> > If this is correct, we can knock down the time+space complexity a bit, I think. The rough idea I have for how to do this is:
> > 
> > - Build a map, M, of StratifiedIndex -> vector<Value *>, where each vector is the list of Args/Return values that are in that StratifiedSet
> > - For each element, E, in the above map:
> >   - Add entries to `RetParamRelations` that represent assignment between `E.second[0]` and `E.second[1..E.second.size()]` (**)
> >   - For each set index S below `E.first`:
> >     - If we don't find a mapping in M for S, try the next lower set.
> >     - Otherwise, given said mapping, M', for a set N levels below E, add an entry that represents N levels of dereference between `E.second[0]` and `M'.second[0]` in `RetParamRelations`.
> > 
> > (** If you want, you can probably do this when building the map, and just make the map a StratifiedIndex -> Value* map. I realized this after writing the above, and am lazy :) )
> > 
> > With this, it looks like we'll end up with a linear number of entries (worst-case; best case, we'll have 0), and we'll end up walking N*M StratifiedSets total (N = number of args/returns, M = max(chain length))
> Well, I'm not convinced that the current algorithm has an n^2 complexity. I think the complexity should be O(m*n), where m is the max chain length, even without your enhancement. You are right that m can sometimes be greater or equal to n, which essentially makes it O(n^2). However, normally I wouldn't expect m to be greater than 3 in most cases (at least I myself rarely write functions that handle pointers of >3 depth). So I would say m can almost be treated as a constant term.
> 
> I do agree that the algorithm can be improved in the way you described. If I understand correctly, you are suggesting that the InterfaceMap population algorithm should be executed in a "breadth-first" manner rather than a "depth-first" manner, since the former allows us to detect redundant entries earlier hence we can just shortcut exit if redundancies were found.
> 
> I'll try to change the codes as you suggested. Thanks for the comment!
> at least I myself rarely write functions that handle pointers of >3 depth

FWIW, `RetVals[I]->getType()` goes through 3 levels of indirection, since RetVals is a reference to a vector of pointers. :)

> However, normally I wouldn't expect m to be greater than 3 in most cases

I'd buy that both m and n are generally going to stay < 8 for the vast majority of real-world code, yeah.

> the InterfaceMap population algorithm should be executed in a "breadth-first" manner rather than a "depth-first"

Sounds correct to me.

================
Comment at: lib/Analysis/CFLAliasAnalysis.cpp:758
@@ +757,3 @@
+  // InterfaceMap
+  auto AddToInterfaceMap = [this, &InterfaceMap](unsigned InterfaceIndex,
+                                                 StratifiedIndex SetIndex) {
----------------
grievejia wrote:
> george.burgess.iv wrote:
> > I don't care either way, but if you want to just use `[&]` for captures in the future, you're welcome to. :)
> Won't [&] unnecessarily capture things we don't need in the closure? 
I thought that initially, too -- the answer is apparently "nope". Only things that you actually use in the lambda need to be captured; if you like to read standardese, 5.1.2p12 (and bits around there) may be interesting to you.

================
Comment at: lib/Analysis/CFLAliasAnalysis.cpp:792
@@ +791,3 @@
+  // RetParamRelations
+  for (const auto &mapping : InterfaceMap) {
+    auto &Interfaces = mapping.second;
----------------
grievejia wrote:
> grievejia wrote:
> > george.burgess.iv wrote:
> > > george.burgess.iv wrote:
> > > > Nit: Mapping
> > > It looks like there are cases where this won't add the correct StratifiedAttrs to sets. Consider:
> > > 
> > > ```
> > > int **g;
> > > void foo(int **a) {
> > >   *a = *g;
> > > }
> > > 
> > > void bar() {
> > >   int *p, *p2;
> > >   *g = p2;
> > > 
> > >   int **a = &p;
> > >   foo(a);
> > >   // p and p2 now alias
> > > }
> > > ```
> > > 
> > > Because there's only arg and no return values when analyzing `foo`, `Interfaces.size()` will never be > 1, so we'll end up with no `RetParamRelations`. We need to somehow apply the attributes from the set containing `a` in `foo` to the set containing `a` in `bar`, though.
> > That's the reason why there are still two xfail test cases, and that's the reason why I mentioned in the description section that more work need to be done here :)
> > 
> > I think almost all StratifiedAttr in the callee's graph is useless to the caller, except for AttrUnknown. The summary should also include a list of InterfaceValues that must be tagged with AttrUnknown. However, we currently tag all nodes below formal parameters "AttrUnknown", so doing what I suggested before would unnecessarily tag too many "AttrUnknown"s in the caller. I think we may need to introduce another StratifiedAttr here just to distinguish between "things below parameters, which are known by the caller" and "things that are truly unknown, such as globals or inttoptr". 
> > 
> > Anyway, whatever we do, that's going to be the story for the next patch :)
> After a second thought, maybe AttrEscaped is useful to the caller as well...
AttrGlobal would probably be useful, too. As would AttrUnknown under any sets tagged with AttrGlobal. :)


http://reviews.llvm.org/D21536





More information about the llvm-commits mailing list