[cfe-dev] (not) initializing assembly outputs with -ftrivial-auto-var-init

Thu Mar 28 06:05:36 PDT 2019

On Thu, Mar 28, 2019 at 7:42 AM Alexander Potapenko <glider at google.com>
wrote:

> On Thu, Mar 28, 2019 at 10:55 AM Dmitry Vyukov <dvyukov at google.com> wrote:
> >
> > On Wed, Mar 27, 2019 at 1:43 PM Dmitry Vyukov <dvyukov at google.com> wrote:
> > >
> > > On Tue, Mar 26, 2019 at 11:30 PM James Y Knight <jyknight at google.com> wrote:
> > > >
> > > > The entirety of the named object is replaced. If you want to modify
> > > > an object, instead of entirely replacing it, you use "+m".
> > > >
> > > > None of this is anything new or innovative -- GCC has had these
> > > > semantics -- and been optimizing based on them -- for ages.
> > > >
> > > > E.g., here, all elements of the array are replaced, so the
> > > > initialization goes away, and the return needs to explicitly add all 4
> > > > values written by the inline-asm.
> > > > int out[4] = {1,2,3,4};
> > > > asm("whatever" : "=m"(out));
> > > > return out[0] + out[1] + out[2] + out[3];
> > > >
> > > > Here, only out[1] is touched by the inline asm. The other values are
> > > > not modified, so all of the initialization can disappear, and the
> > > > generated code can simply return 8 + out[1].
> > > > int out[4] = {1,2,3,4};
> > > > asm("whatever" : "=m"(out[1]));
> > > > return out[0] + out[1] + out[2] + out[3];
> > >
> > > Thanks!
> > >
> > > What exactly do you mean by a named object? out[1] does not refer to
> > > any named object, right? Or *(a+i).
> > >
> > > How should a memset-like function that writes an unknown number of
> > > elements be described?
> > >
> > > What if we need to pass a pointer to the beginning of an array, and the
> > > asm block writes to the i-th (unknown) element of the array?
> > >
> > > What if we pass a pointer to int but actually write 6 or 32 bytes at
> > > that address?
> >
> >
> > I am asking because I am seeing all of these cases in the kernel code.
> > So I am trying to understand (1) whether it has formal semantics and what
> > they are, (2) whether it's precisely analyzable, and (3) whether developers
> > are aware of and respect these semantics, or whether we need to allocate
> > another year for fixing incorrect code if we go this route.
>
> James, what do you think about this particular case
> (https://godbolt.org/z/Vl2bst)?
>
> =======================
> void clear_bit(long nr, volatile unsigned long *addr) {
>   asm volatile("lock; btr %1,%0"
>     : "+m"(*(volatile long *)addr)
>     : "Ir" (nr));
> }
> unsigned long foo() {
>   unsigned long addr[2] = {1,2};
>   clear_bit(65, addr);
>   return addr[0] + addr[1];
> }
> =======================
>
> The declaration of clear_bit() is taken from the Linux kernel
> (https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/bitops.h#L111).
> It appears to be incorrect, however: GCC assumes that only addr[0] can
> be overwritten by the inline assembly, whereas the call actually
> touches addr[1].
> Is this the expected behavior?
>

Yes, this is the expected behavior of the above code. Which is to say: yes,
this code is broken.
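
To make the mismatch concrete, here is a minimal sketch of the arithmetic
involved. With a register bit offset and a memory destination, btr addresses
the word at (bit offset / word size) relative to the operand, while the
"+m" operand above only tells the compiler about word 0. The helper names
word_of_bit and bit_is_covered are hypothetical, made up for illustration, and
the sketch assumes a non-negative bit number.

```c
#include <limits.h>
#include <stddef.h>

/* Bits in one unsigned long (64 on x86-64 Linux). */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: index of the array element that holds bit nr. */
static size_t word_of_bit(unsigned long nr) {
    return nr / BITS_PER_WORD;
}

/* Hypothetical helper: nonzero if bit nr lies inside the single word that
   an operand such as "+m"(*(volatile long *)addr) describes. */
static int bit_is_covered(unsigned long nr) {
    return word_of_bit(nr) == 0;
}
```

With a 64-bit long, bit 65 lands in element 1, which the operand does not
cover, so the compiler is entitled to assume addr[1] is unchanged.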

> We can fix the situation by adding the "memory" clobber to the asm()
> directive, but maybe there's a more elegant way to tell the compiler
> we're potentially touching any byte of the array?

There is no "any byte of the array" here, because you passed it a value of
type "long". If you passed the asm a value of array type, it would treat it
as touching any byte of the array. But, in this case, there's really no way
of knowing the intended size of the data pointed to by the pointer, so
that's not workable. Given that you're passing a pointer to unsized data, I
would write this instead as simply taking the address and using a memory
clobber.

That is:
  asm volatile("lock; btr %1,(%0)"
    :
    : "r"(addr), "Ir" (nr)
    : "memory");
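
For reference, a sketch of the two options discussed above as complete
functions. It assumes GCC or Clang; the asm paths assume x86-64, and the
non-x86 branch is only a stand-in so the sketch compiles elsewhere. The
function names and the fixed length of 2 are assumptions for illustration, not
kernel code. In the first variant the pointer arrives in a register, so the
template dereferences it as (%0). The second variant shows what "a value of
array type" can look like when the size is known.

```c
#include <limits.h>

#define WORD_BITS ((long)(sizeof(unsigned long) * CHAR_BIT))

/* Variant 1: pass the address as a plain register input and add a "memory"
   clobber, so the compiler assumes any memory may be read or written. */
static void clear_bit_clobber(long nr, volatile unsigned long *addr) {
#if defined(__x86_64__)
    __asm__ volatile("lock; btrq %1,(%0)"
                     : /* no outputs */
                     : "r"(addr), "r"(nr)
                     : "memory", "cc");
#else
    /* Stand-in for other targets, using the GCC/Clang atomic builtin. */
    __atomic_fetch_and(&addr[nr / WORD_BITS], ~(1UL << (nr % WORD_BITS)),
                       __ATOMIC_SEQ_CST);
#endif
}

/* Variant 2: when the size is known, make the operand an lvalue of array
   type, so the whole array counts as read and written by the asm. */
static void clear_bit_pair(long nr, unsigned long (*pair)[2]) {
#if defined(__x86_64__)
    __asm__ volatile("lock; btrq %1,%0"
                     : "+m"(*pair)
                     : "r"(nr)
                     : "cc");
#else
    __atomic_fetch_and(&(*pair)[nr / WORD_BITS], ~(1UL << (nr % WORD_BITS)),
                       __ATOMIC_SEQ_CST);
#endif
}
```

Called on an array initialized to {1, 2} with nr set to one more than the
width of a long (65 on x86-64), either variant clears bit 1 of the second
element, and the compiler has to reload both elements afterwards.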