[llvm-dev] Clang generates calls to llvm.memcpy with overlapping arguments, but LangRef requires the arguments to not overlap
John McCall via llvm-dev
llvm-dev at lists.llvm.org
Tue Aug 25 15:06:58 PDT 2020
+ llvm-dev
On 25 Aug 2020, at 13:53, Florian Hahn wrote:
> Hi,
>
> It appears that Clang generates calls to `llvm.memcpy` with
> potentially overlapping arguments in some cases.
>
> For the snippet below
>
> struct S
> {
>     char s[25];
> };
>
> struct S *p;
>
> void test2() {
>     ...
>     foo (&b, 1);
>     b = a;
>     b = *p;
>     ...
> }
>
>
> Clang uses `llvm.memcpy` to copy the struct:
>
>   call void @foo(%struct.S* %2, i32 1)
>   %7 = bitcast %struct.S* %2 to i8*
>   %8 = bitcast %struct.S* %1 to i8*
>   call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %7, i8* align 1 %8, i64 25, i1 false)
>   %9 = load %struct.S*, %struct.S** @p, align 8
>   %10 = bitcast %struct.S* %2 to i8*
>   %11 = bitcast %struct.S* %9 to i8*
>   call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %10, i8* align 1 %11, i64 25, i1 false)
>
>
> In the C example, `foo` could set `p = &b` and then `b = *p` would
> just copy the contents from `b` into `b`. This means that the
> arguments to the second `llvm.memcpy` call may overlap, which does not
> seem to be allowed according to the current version of the LangRef
> (https://llvm.org/docs/LangRef.html#llvm-memcpy-intrinsic). This is
> problematic, because the fact is used in BasicAliasAnalysis for
> example
> (https://github.com/llvm/llvm-project/blob/master/llvm/lib/Analysis/BasicAliasAnalysis.cpp#L982).
>
> The full, build-able example can be found here:
> https://godbolt.org/z/PY1vKq
>
> I might be missing something, but it appears that Clang should not
> create a call to `llvm.memcpy` unless it can guarantee the arguments
> cannot overlap. I am not sure what the best alternative to
> `llvm.memcpy` would be in case the arguments overlap.
C allows overlapping assignments only when the overlap is exact, i.e.
the addresses are exactly the same. I agree that just emitting memcpy
isn’t strictly legal, because neither C’s `memcpy` nor `llvm.memcpy`
allows even exact overlap. On the other hand, Clang would really like to
avoid emitting extra control flow here: exact overlap is uncommon enough
that emitting a no-op memcpy (if it were indeed a no-op) is definitely
the right trade-off in practice, and LLVM probably won’t reliably remove
branches around a memcpy call if we add them. It’s also very hard to
imagine a memcpy implementation that actually wouldn’t work with exact
overlap (maybe something that tried to just remap pages?).
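To make the exact-overlap case concrete, here is a minimal sketch in C of
the scenario from the original message. The body of `foo` and the helper
`exact_overlap_demo` are my own assumptions for illustration; the original
only shows the call site.

```c
#include <string.h>

struct S {
    char s[25];
};

struct S *p;

/* Hypothetical body for foo: the original message only shows the call. */
void foo(struct S *b, int unused) {
    (void)unused;
    p = b;  /* p now aliases the caller's b */
}

/* Returns 1 if p aliases b and b is unchanged by the self-assignment. */
int exact_overlap_demo(void) {
    struct S b;
    struct S before;

    memset(b.s, 'x', sizeof b.s);
    before = b;

    foo(&b, 1);
    b = *p;  /* exact overlap: source and destination are the same object */

    return p == &b && memcmp(b.s, before.s, sizeof b.s) == 0;
}
```

The assignment `b = *p` is valid C here because the two objects overlap
exactly, so `exact_overlap_demo()` should return 1; the question is only
whether lowering it to `llvm.memcpy` is valid.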
I think we have four options:
1. We can relax `llvm.memcpy` to allow exact overlap. Practically, this
would depend on memcpy being a no-op on exact overlap; we definitely
wouldn’t want *LLVM* to have to start inserting control flow around
memcpy calls.
2. We can make Clang emit assignments with `llvm.memmove`. This would
make assignment work even with non-exact overlap, which is more than we
need, though I’m not sure the cost is that high.
3. We can make Clang emit an explicit check and control flow around
memcpy when there might be overlap. Clang IRGen should already maintain
the right information to avoid doing this when e.g. initializing a
variable.
4. We can add a new intrinsic — or some sort of decoration of the
existing one — that does a memcpy but allows exact overlap.
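As a rough source-level analogy for options 2 and 3 (a sketch only; Clang
would do this in IRGen, not in C, and the function names below are
hypothetical):

```c
#include <string.h>

struct S {
    char s[25];
};

/* Option 2 analogue: memmove tolerates any overlap, including exact. */
void assign_with_memmove(struct S *dst, const struct S *src) {
    memmove(dst, src, sizeof *dst);
}

/* Option 3 analogue: branch around memcpy when the addresses are equal,
 * so memcpy never sees overlapping arguments for a valid C assignment. */
void assign_with_check(struct S *dst, const struct S *src) {
    if (dst != src)
        memcpy(dst, src, sizeof *dst);
}
```

The branch in `assign_with_check` is the extra control flow that option 3
would add around every struct assignment that might overlap.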
John.