[llvm-commits] [PATCH] basicaa - fix miscompilation of load from nocapture attribute (PR14045)

Richard Osborne richard at xmos.com
Fri Nov 2 13:32:30 PDT 2012


On 2 Nov 2012, at 18:45, Dmitri Gribenko <gribozavr at gmail.com> wrote:
> 
> Thank you, I did follow the original discussion.  My question was
> about the validity of such an IR by itself: as you explained it, it
> depends on *implementation* of 'test3'.  Wouldn't it be better to
> disallow to pass 'nocapture' pointers to functions that don't give
> such a guarantee?

Here's an example that doesn't pass a 'nocapture' pointer to another function but still shows the problem:

1  define i32 @f(i32* noalias nocapture %p, i1 %cond) nounwind {
2  entry:
3         %q = alloca i32*
4         store i32* %p, i32**%q
5         %r = load i32** %q
6         store i32 1, i32* %p
7         store i32 0, i32* %r
8         %0 = load i32* %p
9         store i32 0, i32* %r
10        %1 = load i32* %r
11        %2 = add i32 %0, %1
12        ret i32 %2
13 }

No copy of %p is made which outlives the function. %r aliases %q but %r is based on %q it is still valid to mark %p with noalias (at least as far as my understanding goes). When complied and run this should return 0. However if I run opt test.ll -basicaa -dse -S I get the following output

define i32 @f(i32* noalias nocapture %p, i1 %cond) nounwind {
entry:
  %q = alloca i32*
  store i32* %p, i32** %q
  %r = load i32** %q
  store i32 1, i32* %p
  %0 = load i32* %p
  store i32 0, i32* %r
  %1 = load i32* %r
  %2 = add i32 %0, %1
  ret i32 %2
}

The store on line 7 has been deleted and f now returns 1.

The reason I ran into this issue is that in our compiler we have a transform where we take regions inside a function and outline them into new functions in order to run them in parallel. Based on analysis of the original function before the transform we know it doesn't capture its pointer argument. The transform doesn't change the behaviour of the function from the point of view of the caller so we don't want to drop the nocapture attribute. However the if the region we are outlining uses the pointer argument then the transform must create an alloca for a state structure, copy the argument to that state structure and pass the state structure another function that starts the threads.

If there was a rule that a no capture pointer couldn't be passed as an argument to a function without the nocapture argument then dropping the nocapture attribute from the function containing the parallel regions wouldn't be enough. The pointer might also be marked nocaputre in the caller and in the caller's callers and so on. We would have to walk back up the call tree removing nocapture until we got back to the original object.



More information about the llvm-commits mailing list