[LLVMdev] Specify the volatile access behaviour of the memcpy, memmove and memset intrinsics
Chandler Carruth
chandlerc at google.com
Tue Jan 29 11:15:00 PST 2013
LGTM as well.
I'm particularly happy to specify that there is no guarantee about the
width of loads and/or stores resulting, as I think that's important for
frontends to not depend on (or to request a change if they need to depend
on it).
Thanks Andy far the additional clarification, that helps pin down things
even more.
On Tue, Jan 29, 2013 at 10:30 AM, Andrew Trick <atrick at apple.com> wrote:
> I can't think of a better way to do this, so I think it's ok.
>
> I also submitted a complementary patch on llvm-commits clarifying
> volatile semantics.
>
> -Andy
>
> On Jan 28, 2013, at 8:54 AM, Arnaud A. de Grandmaison <
> arnaud.allarddegrandmaison at parrot.com> wrote:
>
> > Hi All,
> >
> > In the language reference manual, the access behavior of the memcpy,
> > memmove and memset intrinsics is not well defined with respect to the
> > volatile flag. The LRM even states that "it is unwise to depend on it".
> > This forces optimization passes to be conservatively correct and prevent
> > optimizations.
> >
> > A very simple example of this is :
> >
> > $ cat test.c
> >
> > #include <stdint.h>
> >
> > struct R {
> > uint16_t a;
> > uint16_t b;
> > };
> >
> > volatile struct R * const addr = (volatile struct R *) 416;
> >
> > void test(uint16_t a)
> > {
> > struct R r = { a, 1 };
> > *addr = r;
> > }
> >
> > $ clang -O2 -o - -emit-llvm -S -c test.c
> > ; ModuleID = 'test.c'
> > target datalayout =
> "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
> > target triple = "x86_64-unknown-linux-gnu"
> >
> > %struct.R = type { i16, i16 }
> >
> > @addr = constant %struct.R* inttoptr (i64 416 to %struct.R*), align 8
> >
> > define void @test(i16 zeroext %a) nounwind uwtable {
> > %r.sroa.0 = alloca i16, align 2
> > %r.sroa.1 = alloca i16, align 2
> > store i16 %a, i16* %r.sroa.0, align 2
> > store i16 1, i16* %r.sroa.1, align 2
> > %r.sroa.0.0.load3 = load volatile i16* %r.sroa.0, align 2
> > store volatile i16 %r.sroa.0.0.load3, i16* inttoptr (i64 416 to i16*),
> align 32
> > %r.sroa.1.0.load2 = load volatile i16* %r.sroa.1, align 2
> > store volatile i16 %r.sroa.1.0.load2, i16* inttoptr (i64 418 to i16*),
> align 2
> > ret void
> > }
> >
> > In the generated ir, the loads are marked as volatile. In this case,
> > this is due to the new SROA being conservatively correct. We should
> > however expect the test function to look like this much simpler one
> > (which was actually the case with llvm <= 3.1) :
> >
> > define void @test(i16 zeroext %a) nounwind uwtable {
> > store volatile i16 %a, i16* inttoptr (i64 416 to i16*), align 2
> > store volatile i16 1, i16* inttoptr (i64 418 to i16*), align 2
> > ret void
> > }
> >
> >
> > I propose to specify the volatile access behavior for those intrinsics :
> > instead of one flag, they should have 2 independant volatile flags, one
> > for destination accesses, the second for the source accesses.
> >
> > If there is general agreement, I plan to proceed with the following
> steps :
> > 1. Specify the access behavior (this patch).
> > 2. Auto-upgrade the existing memcpy, memmove and memset intrinsics into
> > the more precise form by replicating the single volatile flag and rework
> > the MemIntrinsic hierarchy to provide (is|set)SrcVolatile(),
> > (is|set)DestVolatile() and implement (set|is)Volatile in terms of the
> > former 2 methods. This will conservatively preserve semantics. No
> > functional change so far. Commit 1 & 2.
> > 3. Audit all uses of isVolatile() / setVolatile() and move them to the
> > more precise form. From this point, more aggressive / precise
> > optimizations can happen. Commit 3.
> > 4. Teach clang to use the new form.
> > 5. Optionally remove the old interface form,as there should be no
> > in-tree users left. This would however be an API change breaking
> > external code.
> >
> > Cheers,
> >
> > --
> > Arnaud de Grandmaison
> >
> >
> <0001-Specify-the-access-behaviour-of-the-memcpy-memmove-a.patch>_______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130129/618d6535/attachment.html>
More information about the llvm-dev
mailing list