[PATCH] D19549: Fold compares irrespective of whether allocation can be elided

Sanjoy Das via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 28 14:38:05 PDT 2016


sanjoy added inline comments.

================
Comment at: lib/Analysis/InstructionSimplify.cpp:2100
@@ +2099,3 @@
+  // compare dominates the pointer escape
+  if (MI && !PointerMayBeCaptured(MI, false, false))
+    return ConstantInt::get(GetCompareTy(LHS), !CmpInst::isTrueWhenEqual(Pred));
----------------
anna wrote:
> anna wrote:
> > sanjoy wrote:
> > > I think you need to pass `true` for both `ReturnCaptures` and `StoreCaptures`.  Can you also add a test case that would've caught this?
> > I had `ReturnCaptures` as `false` because of this case:
> > compare dominates the `ret` statement which returns the pointer. We can fold the compare since pointer has not escaped at the time of compare. 
> > 
> > In case compare does not dominate the `ret` statement, if we fold the `cmp` and some optimizations later on reordere the statements such that the `ret` no longer returns the right value, we have incorrect functionality in the IR. 
> > However, the test case should be such that the ret value is different, but the only statement causing the pointer to be captured should be the `ret`. If there were other statements causing pointer to be captured, we would have returned `false` for `PointerMayBeCaptured()`
> > 
> > TC I wrote for the case where `cmp` for `%m` does not dominate the `ret` 
> > 
> > ```
> > define i8* @compare_ret_escape(i8* %c) {
> > 
> >   %m = call i8* @malloc(i64 4)
> >   %n = call i8* @malloc(i64 4)
> >   %cmp = icmp eq i8* %n, %c
> >   br i1 %cmp, label %retst, label %chk
> > 
> > retst:
> >   ret i8* %m
> > 
> > chk:
> >   %bc = bitcast i8* %m to i32*
> >   %lgp = load i32*, i32** @gp, align 8, !nonnull !0
> >   %cmp2 = icmp eq i32* %bc, %lgp
> >   br i1 %cmp2, label %retst,  label %chk2
> > 
> > chk2:
> >   ret i8* %n
> > }
> > ```
> > Right now the `%cmp2` folds to `false`, and it looks right.  Is there a case where we really need `ReturnCaptures` to be `true`?
> > 
> > 
> Also, just to add to this, the third case is `cmp` post dominates the `ret`. In all cases, the important point is that `ret` is a terminator instruction that exits the function, so there is no way to go from the `ret` to `cmp`. 
You're right in that the optimizer needs to preserve single threaded
data dependencies.  But I think there can be problems with
cross-thread data dependencies:

```
define i8* @f() {
entry:
  %ptr = call i8* @malloc()
  %global = load atomic i8*, i8** @global_var
  %cmp = icmp ne i8* %global, %ptr
  if (%cmp)
    *%global = 42;
  else
    *%global = 50;
  ret i8* %ptr
}

define void @g() {
  %val2 = call i8* @f()
  store atomic i8* %val2, i8** @global_var_2  ;; noalias with @global_var
  ret void
}
```

opt folds `%cmp` in `@f` to `true`, and then inlines `@f` into `@g`:

```
define void @g() {
entry:
  %ptr = call i8* @malloc()
  %global = load atomic i8*, i8** @global_var
  *%global = 42;
  store atomic i8* %ptr, i8** @global_var_2  ;; noalias with @global_var
  ret void
}
```

and re-orders:

```
define void @g() {
entry:
  %ptr = call i8* @malloc()
  store atomic i8* %ptr, i8** @global_var_2  ;; noalias with @global_var
  %global = load atomic i8*, i8** @global_var
  *%global = 42;
  ret void
}
```

The `atomic` markers on the loads and stores mean that these memory
accesses are visible on other threads (without `atomic`, sharing
memory across threads is undefined behavior), but it does not imply
any specific ordering.

Let's say we also have a thread doing this:

```
define void @another_thread() {
  %val = load atomic i8*, i8** @global_var_2
  store atomic i8* %val, i8** @global_var_2
  ret void
}
```

The original program would only ever write `50` to the value stored in
`@global_var` if the value stored in `@global_var` was from the
`@malloc`, so you could later do this safely:

```
%val = load i8*, i8** @global_var
if (*%val != 50)
  // no need to free %val
```

If another thread reads (atomically) from `@global_var_2` and writes
that value to `@global_var` then in the transformed program the above
invariant is no longer true, since you could have this trace:

```
define void @g() {
entry:
  %ptr = call i8* @malloc()
  store atomic i8* %ptr, i8** @global_var_2  ;; noalias with @global_var

              %val = load atomic i8*, i8** @global_var_2
              store atomic i8* %val, i8** @global_var_2

  %global = load atomic i8*, i8** @global_var
  *%global = 42;
  ret void
}
```

and write `42` to a malloc'ed pointer stored in `@global_var` (illegal
state according to the original program).


Now I think we can get around this by restricting the load instruction
to be non-atomic, but for now I'd say it is safer to just bail out if
we see stores or returns.


Repository:
  rL LLVM

http://reviews.llvm.org/D19549





More information about the llvm-commits mailing list