[cfe-dev] Clang generates calls to llvm.memcpy with overlapping arguments, but LangRef requires the arguments to not overlap

Tue Aug 25 15:06:58 PDT 2020

+ llvm-dev

On 25 Aug 2020, at 13:53, Florian Hahn wrote:
> Hi,
>
> It appears that Clang generates calls to `llvm.memcpy` with 
> potentially overlapping arguments in some cases.
>
> For the snippet below
>
> struct S
> {
>   char s[25];
> };
>
> struct S *p;
>
> void test2() {
>  ...
>   foo (&b, 1);
>   b = a;
>   b = *p;
> ...
> }
>
>
> Clang uses `llvm.memcpy` to copy the struct:
>
>   call void @foo(%struct.S* %2, i32 1)
>   %7 = bitcast %struct.S* %2 to i8*
>   %8 = bitcast %struct.S* %1 to i8*
>   call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %7, i8* align 1 %8, i64 25, i1 false)
>   %9 = load %struct.S*, %struct.S** @p, align 8
>   %10 = bitcast %struct.S* %2 to i8*
>   %11 = bitcast %struct.S* %9 to i8*
>   call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %10, i8* align 1 %11, i64 25, i1 false)
>
>
> In the C example, `foo` could set `p = &b`, and then `b = *p` would 
> just copy the contents of `b` into `b`. This means that the 
> arguments to the second `llvm.memcpy` call may overlap, which does not 
> seem to be allowed according to the current version of the LangRef 
> (https://llvm.org/docs/LangRef.html#llvm-memcpy-intrinsic). This is 
> problematic because the no-overlap guarantee is used in 
> BasicAliasAnalysis, for example 
> (https://github.com/llvm/llvm-project/blob/master/llvm/lib/Analysis/BasicAliasAnalysis.cpp#L982).
>
> The full, build-able example can be found here: 
> https://godbolt.org/z/PY1vKq
>
> I might be missing something, but it appears that Clang should not 
> create calls to `llvm.memcpy` unless it can guarantee the arguments 
> cannot overlap. I am not sure what the best alternative to 
> `llvm.memcpy` would be in case the arguments overlap.

C allows overlapping assignments only when the overlap is exact, i.e. 
the addresses are exactly the same.  I agree that just emitting memcpy 
isn’t strictly legal, because neither C’s `memcpy` nor `llvm.memcpy` 
allows even exact overlap.  On the other hand, Clang would really like 
to avoid emitting 
extra control flow here, because exact overlap is uncommon enough that 
emitting a no-op memcpy (if it were indeed a no-op) is definitely the 
right trade-off in practice; and LLVM probably won’t reliably remove 
branches around a memcpy call if we add them.  And it’s very hard to 
imagine a memcpy implementation that actually wouldn’t work with exact 
overlap (maybe something that tried to just remap pages?).
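
To make the exact-overlap case concrete, here is a minimal sketch that fills in the elided parts of the quoted snippet.  The body of `foo` and the initialization of `a` and `b` are assumptions made for illustration; they are not taken from the original example.  The assignment `b = *p` is valid C here because the source and destination are exactly the same object.

```c
#include <string.h>

struct S {
  char s[25];
};

struct S *p;

/* Hypothetical body for foo: it makes the global p point at its argument. */
void foo(struct S *arg, int n) {
  (void)n;
  p = arg;
}

/* Returns 1 if the exact-overlap assignment left b with the expected value. */
int test2(void) {
  struct S a, b;
  memset(&a, 'a', sizeof a);
  memset(&b, 'b', sizeof b);

  foo(&b, 1); /* after this call, p == &b */
  b = a;      /* a and b are distinct objects: no overlap */
  b = *p;     /* source and destination are the same object: exact overlap */

  return p == &b && memcmp(&b, &a, sizeof b) == 0;
}
```

In this sketch the second assignment is the one for which Clang emits an `llvm.memcpy` whose two pointer arguments are equal at run time.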

I think we have four options:

1. We can relax `llvm.memcpy` to allow exact overlap.  Practically, this 
would depend on memcpy being a no-op on exact overlap; we definitely 
wouldn’t want *LLVM* to have to start inserting control flow around 
memcpy calls.
2. We can make Clang emit assignments with `llvm.memmove`.  This would 
make assignment work even with non-exact overlap, which is unnecessary, 
though I’m not sure the cost is that high.
3. We can make Clang emit an explicit check and control flow around 
memcpy when there might be overlap.  Clang IRGen should already maintain 
the right information to avoid doing this when e.g. initializing a 
variable.
4. We can add a new intrinsic — or some sort of decoration of the 
existing one — that does a memcpy but allows exact overlap.
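
As a rough source-level analogue of options 2 and 3, the two helpers below show what each lowering would amount to if written by hand in C.  This is only a sketch: the names `assign_memmove` and `assign_checked` are hypothetical, and the code is not what Clang IRGen emits today.

```c
#include <string.h>

struct S {
  char s[25];
};

/* Option 2 analogue: memmove is defined for any overlap, including exact. */
void assign_memmove(struct S *dst, const struct S *src) {
  memmove(dst, src, sizeof *dst);
}

/* Option 3 analogue: skip the copy when the addresses are identical, so
   memcpy is only ever called on distinct objects. */
void assign_checked(struct S *dst, const struct S *src) {
  if (dst != src)
    memcpy(dst, src, sizeof *dst);
}
```

The first form pays for overlap handling that assignment never needs; the second adds the branch that option 3 would introduce around every potentially overlapping copy.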

John.