[LLVMdev] Specify the volatile access behaviour of the memcpy, memmove and memset intrinsics

Tue Jan 29 10:30:19 PST 2013

I can't think of a better way to do this, so I think it's ok.

I also submitted a complementary patch on llvm-commits clarifying volatile semantics.

-Andy

On Jan 28, 2013, at 8:54 AM, Arnaud A. de Grandmaison <arnaud.allarddegrandmaison at parrot.com> wrote:

> Hi All,
> 
> In the language reference manual, the access behavior of the memcpy,
> memmove and memset intrinsics is not well defined with respect to the
> volatile flag. The LRM even states that "it is unwise to depend on it".
> This forces optimization passes to be conservatively correct, which
> prevents optimizations.
> 
> A very simple example of this is:
> 
> $ cat test.c 
> 
> #include <stdint.h>
> 
> struct R {
>  uint16_t a;
>  uint16_t b;
> };
> 
> volatile struct R * const addr = (volatile struct R *) 416;
> 
> void test(uint16_t a)
> {
>  struct R r = { a, 1 };
>  *addr = r;
> }
> 
> $ clang -O2 -o - -emit-llvm -S -c test.c 
> ; ModuleID = 'test.c'
> target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
> target triple = "x86_64-unknown-linux-gnu"
> 
> %struct.R = type { i16, i16 }
> 
> @addr = constant %struct.R* inttoptr (i64 416 to %struct.R*), align 8
> 
> define void @test(i16 zeroext %a) nounwind uwtable {
>  %r.sroa.0 = alloca i16, align 2
>  %r.sroa.1 = alloca i16, align 2
>  store i16 %a, i16* %r.sroa.0, align 2
>  store i16 1, i16* %r.sroa.1, align 2
>  %r.sroa.0.0.load3 = load volatile i16* %r.sroa.0, align 2
>  store volatile i16 %r.sroa.0.0.load3, i16* inttoptr (i64 416 to i16*), align 32
>  %r.sroa.1.0.load2 = load volatile i16* %r.sroa.1, align 2
>  store volatile i16 %r.sroa.1.0.load2, i16* inttoptr (i64 418 to i16*), align 2
>  ret void
> }
> 
> In the generated IR, the loads are marked as volatile. In this case,
> this is due to the new SROA being conservatively correct. We should
> however expect the test function to look like this much simpler one
> (which was actually the case with LLVM <= 3.1):
> 
> define void @test(i16 zeroext %a) nounwind uwtable {
>  store volatile i16 %a, i16* inttoptr (i64 416 to i16*), align 2
>  store volatile i16 1, i16* inttoptr (i64 418 to i16*), align 2
>  ret void
> }
> 
> 
> I propose to specify the volatile access behavior for those intrinsics:
> instead of one flag, they should have 2 independent volatile flags, one
> for destination accesses, the second for the source accesses.
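To illustrate why separate flags matter, here is a minimal C++ sketch of the struct copy from test.c written out by hand, assuming only the destination is volatile. The helper names copy_to_volatile and make_and_copy are hypothetical and exist only for this illustration; nothing here is LLVM code.

```cpp
#include <cstdint>

struct R {
    std::uint16_t a;
    std::uint16_t b;
};

// Only the destination is volatile: each store through dst has to be
// performed as written, while the reads from the ordinary object src
// carry no volatile constraint and may be folded by the compiler.
inline void copy_to_volatile(volatile R *dst, const R &src) {
    dst->a = src.a;
    dst->b = src.b;
}

// Hypothetical helper mirroring test(): build a local R and copy it
// into a volatile object, then read the result back for inspection.
inline R make_and_copy(std::uint16_t a) {
    R r = {a, 1};
    volatile R target = {0, 0};
    copy_to_volatile(&target, r);
    R out = {target.a, target.b};
    return out;
}
```

With a single volatile flag on the intrinsic, the reads from the local would have to be treated as volatile too, which is what the SROA output above shows.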
> 
> If there is general agreement, I plan to proceed with the following steps:
> 1. Specify the access behavior (this patch).
> 2. Auto-upgrade the existing memcpy, memmove and memset intrinsics into
> the more precise form by replicating the single volatile flag and rework
> the MemIntrinsic hierarchy to provide (is|set)SrcVolatile(),
> (is|set)DestVolatile() and implement (set|is)Volatile in terms of the
> former 2 methods. This will conservatively preserve semantics. No
> functional change so far. Commit 1 & 2.
> 3. Audit all uses of isVolatile() / setVolatile() and move them to the
> more precise form. From this point, more aggressive / precise
> optimizations can happen. Commit 3.
> 4. Teach clang to use the new form.
> 5. Optionally remove the old interface form, as there should be no
> in-tree users left. This would however be an API change breaking
> external code.
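Step 2 above could be sketched roughly as follows. This is a standalone C++ model and not LLVM's MemIntrinsic class: MemTransferSketch and upgradeFromSingleFlag are hypothetical names, and defining isVolatile() as "either side is volatile" is an assumption about what "conservatively preserve semantics" would mean for existing callers.

```cpp
// Hypothetical stand-in for a memory-transfer intrinsic with two flags.
class MemTransferSketch {
    bool SrcVolatile = false;
    bool DestVolatile = false;

public:
    bool isSrcVolatile() const { return SrcVolatile; }
    bool isDestVolatile() const { return DestVolatile; }
    void setSrcVolatile(bool V) { SrcVolatile = V; }
    void setDestVolatile(bool V) { DestVolatile = V; }

    // Legacy interface expressed through the two precise accessors.
    // Assumption: a caller that only knows the old query must see
    // "volatile" as soon as either side is volatile.
    bool isVolatile() const { return isSrcVolatile() || isDestVolatile(); }
    void setVolatile(bool V) {
        setSrcVolatile(V);
        setDestVolatile(V);
    }
};

// Auto-upgrade: replicate the single old flag onto both new flags,
// so the upgraded intrinsic is at least as constrained as before.
inline MemTransferSketch upgradeFromSingleFlag(bool OldVolatile) {
    MemTransferSketch MI;
    MI.setVolatile(OldVolatile);
    return MI;
}
```

Passes audited in step 3 would then query isSrcVolatile() or isDestVolatile() directly, while unaudited callers keep the conservative isVolatile() answer.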
> 
> Cheers,
> 
> -- 
> Arnaud de Grandmaison
> 
> <0001-Specify-the-access-behaviour-of-the-memcpy-memmove-a.patch>