[llvm-dev] Confusions around nocapture and sret

David Chisnall via llvm-dev llvm-dev at lists.llvm.org
Mon Feb 22 01:50:52 PST 2021


On 21/02/2021 18:39, Johannes Doerfert via llvm-dev wrote:
> I strongly suggest to emit nocapture with sret in the frontend
> instead.

I don't think that is actually feasible.  For example, consider this C++ 
file:

```c++
#include <set>

struct Example;
std::set<Example*> live_examples;
struct Example {
	Example()
	{
		live_examples.insert(this);
	}
	~Example()
	{
		// Erase by key: unlike erase(find(this)), this is safe even
		// if `this` was never registered (e.g. a copied object).
		live_examples.erase(this);
	}
};

Example somefn()
{
	Example e;
	return e;
}
```

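In this example, copy elision means that `somefn` constructs `e` directly 
in the space provided for it by the caller.  (For a named local like `e` 
this is NRVO, which the standard permits rather than guarantees, though 
compilers such as Clang typically perform it.)  The constructor then 
captures that address.  In the generated IR, the pointer to that space 
has the `sret` attribute but is definitely not `nocapture`.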
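
A minimal sketch that makes the capture observable at run time.  It 
returns a prvalue, for which C++17 guarantees elision, so the result does 
not depend on whether the compiler performs the optional named-return-value 
optimisation.  `Tracked`, `live_tracked`, `make_tracked` and 
`caller_object_is_registered` are hypothetical names introduced for 
illustration, and the remark about the `sret` slot assumes a typical ABI 
that returns such a class indirectly:

```cpp
#include <cassert>
#include <set>

struct Tracked;
std::set<const Tracked*> live_tracked;  // hypothetical registry of live objects

struct Tracked {
	Tracked() { live_tracked.insert(this); }  // the constructor captures `this`
	~Tracked() { live_tracked.erase(this); }  // erase by key; no-op if absent
};

// Returns a prvalue: since C++17 the object is constructed directly in
// the caller's storage (assumed to be the sret slot at the IR level).
Tracked make_tracked()
{
	return Tracked();
}

// True if the address recorded by the constructor is the address of the
// caller's own variable, i.e. the return-slot pointer ended up in the set.
bool caller_object_is_registered()
{
	bool registered;
	{
		Tracked t = make_tracked();
		registered = live_tracked.count(&t) == 1;
	}
	// The destructor has run, so nothing should remain registered.
	assert(live_tracked.empty());
	return registered;
}
```

Calling `caller_object_is_registered()` from `main` should return true when 
built as C++17 or later, which is the sense in which the return slot is 
captured.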

You can also trigger this in C, though in the C case it is undefined 
behaviour.  Consider this example:

```c
struct Foo
{
	int a[5];
};

int x(struct Foo *);

struct Foo f(void)
{
	struct Foo foo;
	x(&foo);
	return foo;
}
```

The source-language semantics guarantee that no pointers to `foo` 
outlive the invocation of `f`, which implies that `x` must not capture 
the argument.  The optimisers take advantage of the fact that it would 
be UB to compare the address of `foo` with any other allocation after 
`f` returns, so we end up generating this IR after optimisation, with 
the copy elided:

```
; Function Attrs: nounwind uwtable
define dso_local void @f(%struct.Foo* noalias sret(%struct.Foo) align 4 %0) local_unnamed_addr #0 {
   %2 = tail call i32 @x(%struct.Foo* %0) #2
   ret void
}
```

Nothing in the IR says that `x`'s argument is nocapture.  Whether this 
is permitted depends on what we want nocapture to mean.  There are two 
possible interpretations:

  - The callee does not capture the argument.  If the callee does capture 
the argument, then the IR is ill-formed and we have a compiler bug.
  - The caller is free to assume that the callee does not capture the 
argument.  If the callee does capture the argument, then it is UB.
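
As a source-level illustration of what 'capture' means in the two bullets 
above, here is a small sketch.  The names (`saved`, `capturing`, 
`non_capturing` and the two helper functions) are hypothetical, and the 
mapping to the IR attribute is informal: the sketch does not show what a 
front end or optimiser would actually infer.

```cpp
#include <cassert>

const int *saved = nullptr;  // hypothetical global that outlives any call

// Captures: a copy of the pointer is still reachable after the call returns.
void capturing(const int *p)
{
	saved = p;
}

// Does not capture: the pointer is only dereferenced during the call.
int non_capturing(const int *p)
{
	return *p + 1;
}

// True if `capturing` left the address of the caller's local in `saved`.
bool capturing_keeps_address()
{
	int local = 41;
	saved = nullptr;
	capturing(&local);
	bool kept = (saved == &local);
	saved = nullptr;  // do not leave a dangling pointer behind
	return kept;
}

// True if `non_capturing` left the address of the caller's local in `saved`.
bool non_capturing_keeps_address()
{
	int local = 41;
	saved = nullptr;
	int result = non_capturing(&local);
	assert(result == 42);
	return saved == &local;
}
```

Under the first interpretation, a `nocapture` marking on the parameter of 
something like `capturing` would be a compiler bug; under the second, the 
program itself would have UB.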

The former allows the absence of nocapture to be interpreted as 'we 
can't statically prove that the argument is not captured'.  This is very 
useful for memory-safety work, because it allows us to trust `nocapture` 
as a security property: we can omit any further analysis.

The latter allows optimisations to be more aggressive but will sometimes 
generate more surprising code for users and may break some security 
properties if security-related transforms depend on this information.

My personal bias is towards the former: we would like to be able to use 
`nocapture` in stack temporal safety work as a strong guarantee.  Under 
that interpretation, the front end could not insert it, because later 
transforms may introduce a capture.  Alternatively, the module verifier 
would need to be updated to ensure that a nocapture argument is not passed 
to any other function except via a nocapture argument.
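
The verifier rule suggested above can be sketched over a toy model.  This 
is not LLVM's API: `ToyFunction`, `ToyCall`, 
`nocapture_args_stay_nocapture` and `f_calling_x_passes` are hypothetical 
names, and the model only covers parameters passed directly to direct 
calls.  A real check would presumably also have to consider stores, 
returns, casts and indirect calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model of a module; not LLVM's data structures.
struct ToyFunction;

struct ToyCall {
	const ToyFunction *callee = nullptr;
	// For each call operand: index of the caller parameter passed
	// directly, or -1 if the operand is not a caller parameter.
	std::vector<int> operands;
};

struct ToyFunction {
	std::vector<bool> param_nocapture;  // one flag per parameter
	std::vector<ToyCall> calls;         // direct calls in the body
};

// Returns true if every nocapture parameter of `f` is only ever passed
// to callee parameters that are themselves marked nocapture.
bool nocapture_args_stay_nocapture(const ToyFunction &f)
{
	for (const ToyCall &call : f.calls) {
		assert(call.callee != nullptr);
		const std::vector<bool> &callee_flags = call.callee->param_nocapture;
		for (std::size_t i = 0; i < call.operands.size(); ++i) {
			int src = call.operands[i];
			if (src < 0 || !f.param_nocapture[static_cast<std::size_t>(src)])
				continue;  // not a nocapture parameter of the caller
			if (i >= callee_flags.size() || !callee_flags[i])
				return false;  // passed through a slot that may capture
		}
	}
	return true;
}

// Models `f(p)` whose body is the single call `x(p)`, as in the IR above.
bool f_calling_x_passes(bool f_param_nocapture, bool x_param_nocapture)
{
	ToyFunction x;
	x.param_nocapture.push_back(x_param_nocapture);

	ToyCall call;
	call.callee = &x;
	call.operands.push_back(0);  // pass f's parameter 0 as x's operand 0

	ToyFunction f;
	f.param_nocapture.push_back(f_param_nocapture);
	f.calls.push_back(call);
	return nocapture_args_stay_nocapture(f);
}
```

In this model, `f_calling_x_passes(true, false)` is rejected: if the front 
end had marked `f`'s `sret` parameter `nocapture`, passing it to an 
unannotated `x` would fail the check.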

David



