[llvm-dev] @llvm.memcpy not honoring volatile?

James Y Knight via llvm-dev llvm-dev at lists.llvm.org
Wed Jun 12 21:38:05 PDT 2019


On Tue, Jun 11, 2019 at 12:08 PM JF Bastien via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> I think we want option 2.: keep volatile memcpy, and implement it as
> touching each byte exactly once. That’s unlikely to be particularly useful
> for every direct-to-hardware use, but it behaves intuitively enough that I
> think it’s desirable.
>

As Eli pointed out, that precludes lowering a volatile memcpy into a call
to the memcpy library function. The usual "memcpy" library function may
well use the same overlapping-memory trick, and there is no
"volatile_memcpy" libc function which would guarantee not touching bytes
multiple times. Perhaps it's okay to just always emit an inline loop
instead of falling back to a memcpy call.

But, possibly option 3 would be better. Maybe it's better to force
people/compiler-frontends to emit the raw load/store operations, so that
it's more clear exactly what semantics are desired.

The fundamental issue to me is that for reasonable usages of volatile, the
operand size and number of memory instructions generated for a given
operation actually *matters*. Certainly, this is a somewhat
unfortunate situation, since the C standard explicitly doesn't forbid
implementing any volatile access with smaller memory operations. (Which,
among other issues, allows tearing, as your wg21 doc nicely points out.)
Nevertheless, it _is_ an important property -- required by POSIX for
accesses of a volatile sig_atomic_t, even -- and is a property which
LLVM/Clang does provide when dealing with volatile accesses of
target-appropriate sizes and alignments.

But, what does that mean for volatile memcpy? What size should it use?
Always a byte-by-byte copy? May it do larger-sized reads/writes as well?
*Must* it do so? Does it have to read/write the data in order? Or can it do
so in reverse order? Can it use a CPU's block-copy instructions (e.g. rep
movsb on x86), which may sometimes cause effectively-arbitrarily-sized
memory-ops, in arbitrary order, in hardware?

If we're going to keep volatile memcpy support, possibly those other
questions ought to be answered too?

I dunno...