[LLVMdev] Specify the volatile access behaviour of the memcpy, memmove and memset intrinsics
Andrew Trick
atrick at apple.com
Tue Jan 29 10:30:19 PST 2013
I can't think of a better way to do this, so I think it's ok.
I also submitted a complementary patch on llvm-commits clarifying volatile semantics.
-Andy
On Jan 28, 2013, at 8:54 AM, Arnaud A. de Grandmaison <arnaud.allarddegrandmaison at parrot.com> wrote:
> Hi All,
>
> In the language reference manual, the access behavior of the memcpy,
> memmove and memset intrinsics is not well defined with respect to the
> volatile flag. The LRM even states that "it is unwise to depend on it".
> This forces optimization passes to be conservatively correct and prevents
> optimizations.
>
> A very simple example of this is :
>
> $ cat test.c
>
> #include <stdint.h>
>
> struct R {
> uint16_t a;
> uint16_t b;
> };
>
> volatile struct R * const addr = (volatile struct R *) 416;
>
> void test(uint16_t a)
> {
> struct R r = { a, 1 };
> *addr = r;
> }
>
> $ clang -O2 -o - -emit-llvm -S -c test.c
> ; ModuleID = 'test.c'
> target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
> target triple = "x86_64-unknown-linux-gnu"
>
> %struct.R = type { i16, i16 }
>
> @addr = constant %struct.R* inttoptr (i64 416 to %struct.R*), align 8
>
> define void @test(i16 zeroext %a) nounwind uwtable {
> %r.sroa.0 = alloca i16, align 2
> %r.sroa.1 = alloca i16, align 2
> store i16 %a, i16* %r.sroa.0, align 2
> store i16 1, i16* %r.sroa.1, align 2
> %r.sroa.0.0.load3 = load volatile i16* %r.sroa.0, align 2
> store volatile i16 %r.sroa.0.0.load3, i16* inttoptr (i64 416 to i16*), align 32
> %r.sroa.1.0.load2 = load volatile i16* %r.sroa.1, align 2
> store volatile i16 %r.sroa.1.0.load2, i16* inttoptr (i64 418 to i16*), align 2
> ret void
> }
>
> In the generated IR, the loads are marked as volatile. In this case,
> this is due to the new SROA being conservatively correct. We should
> however expect the test function to look like this much simpler one
> (which was actually the case with llvm <= 3.1) :
>
> define void @test(i16 zeroext %a) nounwind uwtable {
> store volatile i16 %a, i16* inttoptr (i64 416 to i16*), align 2
> store volatile i16 1, i16* inttoptr (i64 418 to i16*), align 2
> ret void
> }
>
>
> I propose to specify the volatile access behavior for those intrinsics :
> instead of one flag, they should have 2 independent volatile flags, one
> for destination accesses, the second for the source accesses.
>
> If there is general agreement, I plan to proceed with the following steps :
> 1. Specify the access behavior (this patch).
> 2. Auto-upgrade the existing memcpy, memmove and memset intrinsics into
> the more precise form by replicating the single volatile flag and rework
> the MemIntrinsic hierarchy to provide (is|set)SrcVolatile(),
> (is|set)DestVolatile() and implement (set|is)Volatile in terms of the
> former 2 methods. This will conservatively preserve semantics. No
> functional change so far. Commit 1 & 2.
> 3. Audit all uses of isVolatile() / setVolatile() and move them to the
> more precise form. From this point, more aggressive / precise
> optimizations can happen. Commit 3.
> 4. Teach clang to use the new form.
> 5. Optionally remove the old interface form, as there should be no
> in-tree users left. This would however be an API change breaking
> external code.
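
The flag handling described in step 2 can be sketched roughly as below. This is only a minimal standalone model, not the actual MemIntrinsic code: the struct MemTransferModel and the helper upgradeFromSingleFlag are hypothetical names, and treating the old isVolatile() as "source or destination is volatile" is an assumption about what "conservatively preserve semantics" would mean.

```cpp
#include <cassert>

// Hypothetical stand-in for a memcpy/memmove call carrying two volatile
// flags. This is NOT LLVM's MemIntrinsic class, only an illustration.
struct MemTransferModel {
    bool SrcVolatile = false;   // accesses to the source are volatile
    bool DestVolatile = false;  // accesses to the destination are volatile

    // The more precise interface proposed in step 2.
    bool isSrcVolatile() const { return SrcVolatile; }
    bool isDestVolatile() const { return DestVolatile; }
    void setSrcVolatile(bool V) { SrcVolatile = V; }
    void setDestVolatile(bool V) { DestVolatile = V; }

    // The old single-flag interface, expressed through the two flags.
    // Assumption: "volatile" means either side is volatile (conservative).
    bool isVolatile() const { return SrcVolatile || DestVolatile; }
    void setVolatile(bool V) {
        SrcVolatile = V;
        DestVolatile = V;
    }
};

// Auto-upgrade of an old intrinsic: replicate the single flag onto both
// sides, so that nothing changes for existing IR.
inline MemTransferModel upgradeFromSingleFlag(bool OldIsVolatile) {
    MemTransferModel M;
    M.setVolatile(OldIsVolatile);
    return M;
}
```

Under this model, the copy in test.c above would have only the destination flag set once a frontend emits the precise form, so a pass splitting the copy could keep the stores volatile while treating the loads from the local as ordinary loads.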
>
> Cheers,
>
> --
> Arnaud de Grandmaison
>
> <0001-Specify-the-access-behaviour-of-the-memcpy-memmove-a.patch>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev