[LLVMdev] Aliasing of volatile and non-volatile

Krzysztof Parzyszek kparzysz at codeaurora.org
Wed Sep 4 14:33:36 PDT 2013


A customer reported a performance problem, which I eventually tracked
down to the following situation:

Consider this program:

int foo(int *p, volatile int *q, int n) {
   int i, s = 0;
   for (i = 0; i < n; ++i)
     s += *p + *q;
   return s;
}


LLVM's alias analysis reports that *p and *q may alias, even though *p is
non-volatile whereas *q is volatile.  I don't have the exact section
from the C standard, but if I remember correctly, accessing a volatile
object through a non-volatile lvalue results in undefined behavior.  This
would suggest that volatile and non-volatile accesses could automatically
be considered not to alias, even when TBAA cannot prove it.

LLVM's code (on x86) at -O2 looks like this:

         .text
         .globl  foo
         .align  16, 0x90
         .type   foo,@function
foo:                                    # @foo
         .cfi_startproc
# BB#0:                                 # %entry
         xorl    %eax, %eax
         testl   %edx, %edx
         jle     .LBB0_2
         .align  16, 0x90
.LBB0_1:                                # %for.body
         addl    (%rdi), %eax
         addl    (%rsi), %eax
         decl    %edx
         jne     .LBB0_1
.LBB0_2:                                # %for.end
         ret
.Ltmp0:
         .size   foo, .Ltmp0-foo
         .cfi_endproc
         .section        ".note.GNU-stack","",@progbits


For comparison, GCC has only one load in the loop:

         .text
         .p2align 4,,15
.globl foo
         .type   foo, @function
foo:
.LFB0:
         .cfi_startproc
         xorl    %eax, %eax
         testl   %edx, %edx
         jle     .L3
         movl    (%rdi), %r8d
         xorl    %ecx, %ecx
         .p2align 4,,10
         .p2align 3
.L4:
         movl    (%rsi), %edi
         addl    $1, %ecx
         addl    %r8d, %edi
         addl    %edi, %eax
         cmpl    %edx, %ecx
         jne     .L4
.L3:
         rep
         ret
         .cfi_endproc
.LFE0:
         .size   foo, .-foo
         .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
         .section        .note.GNU-stack,"",@progbits


The specifics in our case were that the AliasSetTracker indicated 
mod/ref for all pointers in the loop, where one of them was used in a 
volatile load, and all others were used in non-volatile loads.  As a 
result, the loop had more loads than necessary.
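
For illustration, here is a hand-written sketch of the form the loop
takes once the non-volatile load is hoisted, which is roughly what the
GCC output above does.  The name foo_hoisted is hypothetical and is not
taken from any compiler source.  It assumes that nothing in the loop
stores to memory, so *p cannot change between iterations, while the
volatile load of *q must still be performed once per iteration.

```c
/* Hypothetical hand-hoisted variant of foo, for illustration only. */
int foo_hoisted(int *p, volatile int *q, int n) {
   int i, s = 0, pv;
   if (n <= 0)
     return 0;            /* as in foo: no loads at all when n <= 0 */
   pv = *p;               /* single non-volatile load, before the loop */
   for (i = 0; i < n; ++i)
     s += pv + *q;        /* the volatile load stays inside the loop */
   return s;
}
```

With int a = 3 and volatile int b = 4, foo_hoisted(&a, &b, 5) returns
the same value as foo(&a, &b, 5), namely 35.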

Any thoughts?  Has there been any consideration of this type of
situation?

-K


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
hosted by The Linux Foundation
