[PATCH] D21536: [CFLAA] Include externally-visible memory aliasing information in function summaries

Jia Chen via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 21 16:04:57 PDT 2016


grievejia added inline comments.

================
Comment at: lib/Analysis/CFLAliasAnalysis.cpp:755
@@ +754,3 @@
+  // the same StratifiedIndex.
+  DenseMap<StratifiedIndex, std::vector<InterfaceValue>> InterfaceMap;
+  // Insert the given parameter/return value as well as the values below it into
----------------
george.burgess.iv wrote:
> This still seems n^2-ish in some cases, which I believe ultimately makes this algorithm n^3 time, and n^2 storage. Imagine we have:
> 
> ```
> void foo(int **a, int ***b, int ****c) {
>   *c = b;
>   *b = a;
>   int *d = *a;
> }
> ```
> 
> That would give us one "chain" of
> 
> ```
> a
> ^
> b
> ^
> c
> ^
> d
> ```
> 
> When scanning these, it looks like we'll add 4 entries for `a`, 3 entries for `b`, and 2 for `c`. It also looks like we may end up with 6 entries in cases like:
> 
> ```
> void foo(int ***a, int ***b) {
>   int **ad = *a;
>   int *add = *ad;
>   int **bd = *b;
>   int *bdd = *bd;
> }
> ```
> 
> When I think we shouldn't have more than 2 (or 4, depending on how we want to handle attributes).
> 
> Assuming I'm not missing something, I think we can improve on this. AIUI, given two stratifiedset chains:
> 
> ```
> a
> ^
> b  e
> ^  ^
> c  f
> ^  ^
> d  g
>    ^
>    h
> ```
> 
> Adding an assignment edge between b and e (or c and f, or d and g) produces the same result as adding assignment edges between all of the aforementioned pairs.
> 
> If this is correct, we can knock down the time+space complexity a bit, I think. The rough idea I have for how to do this is:
> 
> - Build a map, M, of StratifiedIndex -> vector<Value *>, where each vector is the list of Args/Return values that are in that StratifiedSet
> - For each element, E, in the above map:
>   - Add entries to `RetParamRelations` that represent assignment between `E.second[0]` and `E.second[1..E.second.size()]` (**)
>   - For each set index S below `E.first`:
>     - If we don't find a mapping in M for S, try the next lower set.
>     - Otherwise, given said mapping, M', for a set N levels below E, add an entry that represents N levels of dereference between `E.second[0]` and `M'.second[0]` in `RetParamRelations`.
> 
> (** If you want, you can probably do this when building the map, and just make the map a StratifiedIndex -> Value* map. I realized this after writing the above, and am lazy :) )
> 
> With this, it looks like we'll end up with a linear number of entries (worst-case; best case, we'll have 0), and we'll end up walking N*M StratifiedSets total (N = number of args/returns, M = max(chain length))
Well, I'm not convinced that the current algorithm has an n^2 complexity. I think the complexity should be O(m*n), where m is the max chain length, even without your enhancement. You are right that m can sometimes be greater or equal to n, which essentially makes it O(n^2). However, normally I wouldn't expect m to be greater than 3 in most cases (at least I myself rarely write functions that handle pointers of >3 depth). So I would say m can almost be treated as a constant term.

I do agree that the algorithm can be improved in the way you described. If I understand correctly, you are suggesting that the InterfaceMap population algorithm should be executed in a "breadth-first" manner rather than a "depth-first" manner, since the former allows us to detect redundant entries earlier hence we can just shortcut exit if redundancies were found.

I'll try to change the codes as you suggested. Thanks for the comment!

================
Comment at: lib/Analysis/CFLAliasAnalysis.cpp:758
@@ +757,3 @@
+  // InterfaceMap
+  auto AddToInterfaceMap = [this, &InterfaceMap](unsigned InterfaceIndex,
+                                                 StratifiedIndex SetIndex) {
----------------
george.burgess.iv wrote:
> I don't care either way, but if you want to just use `[&]` for captures in the future, you're welcome to. :)
Won't [&] unnecessarily capture things we don't need in the closure? 

================
Comment at: lib/Analysis/CFLAliasAnalysis.cpp:792
@@ +791,3 @@
+  // RetParamRelations
+  for (const auto &mapping : InterfaceMap) {
+    auto &Interfaces = mapping.second;
----------------
george.burgess.iv wrote:
> george.burgess.iv wrote:
> > Nit: Mapping
> It looks like there are cases where this won't add the correct StratifiedAttrs to sets. Consider:
> 
> ```
> int **g;
> void foo(int **a) {
>   *a = *g;
> }
> 
> void bar() {
>   int *p, *p2;
>   *g = p2;
> 
>   int **a = &p;
>   foo(a);
>   // p and p2 now alias
> }
> ```
> 
> Because there's only arg and no return values when analyzing `foo`, `Interfaces.size()` will never be > 1, so we'll end up with no `RetParamRelations`. We need to somehow apply the attributes from the set containing `a` in `foo` to the set containing `a` in `bar`, though.
That's the reason why there are still two xfail test cases, and that's the reason why I mentioned in the description section that more work need to be done here :)

I think almost all StratifiedAttr in the callee's graph is useless to the caller, except for AttrUnknown. The summary should also include a list of InterfaceValues that must be tagged with AttrUnknown. However, we currently tag all nodes below formal parameters "AttrUnknown", so doing what I suggested before would unnecessarily tag too many "AttrUnknown"s in the caller. I think we may need to introduce another StratifiedAttr here just to distinguish between "things below parameters, which are known by the caller" and "things that are truly unknown, such as globals or inttoptr". 

Anyway, whatever we do, that's going to be the story for the next patch :)


http://reviews.llvm.org/D21536





More information about the llvm-commits mailing list