[LLVMdev] Aliasing of volatile and non-volatile

Sat Sep 7 01:23:15 PDT 2013

Are you sure this is an alias problem?
What is happening is that LLVM leaves the code looking like this:
int foo(int *p, volatile int *q, int n) {
  int i, s = 0;
  for (i = 0; i < n; ++i)
    s += *p + *q;
  return s;
}

but GCC is changing the code to look like this:

int foo(int *p, volatile int *q, int n) {
  int i, s = 0;
  int t;
  t = *p;
  for (i = 0; i < n; ++i)
    s += t + *q;
  return s;
}

GCC is hoisting the load of *p out of the loop: *p is loop-invariant, and
a memory access is more expensive than a register access.
What is preventing LLVM from doing the same?

It could have gone further still and pulled the *p contribution out of the
loop entirely (guarded so that n <= 0 still returns 0, and assuming
*p * n does not overflow):

int foo(int *p, volatile int *q, int n) {
  int i, s = 0;
  if (n > 0)
    s = *p * n;
  for (i = 0; i < n; ++i)
    s += *q;
  return s;
}
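
As a rough sanity check, here is a small sketch that compares the three
forms. The names foo_orig, foo_hoisted, foo_split and count_mismatches are
made up for this example, and the n > 0 guards are my addition so that the
rewritten forms neither read *p nor change the result when the loop would
not run. It also passes the same non-volatile int as both p and q, which is
valid C, to show that hoisting the load stays correct when the two pointers
do alias, since nothing in the loop stores to memory. Inputs are kept small
so that *p * n cannot overflow.

```c
#include <stddef.h>

/* Original form: both loads stay inside the loop. */
int foo_orig(int *p, volatile int *q, int n) {
  int i, s = 0;
  for (i = 0; i < n; ++i)
    s += *p + *q;
  return s;
}

/* Hoisted form: *p is loaded once, and only if the loop will run. */
int foo_hoisted(int *p, volatile int *q, int n) {
  int i, s = 0;
  if (n > 0) {
    int t = *p;
    for (i = 0; i < n; ++i)
      s += t + *q;
  }
  return s;
}

/* Split form: the *p contribution is computed as one multiply. */
int foo_split(int *p, volatile int *q, int n) {
  int i, s = 0;
  if (n > 0)
    s = *p * n;
  for (i = 0; i < n; ++i)
    s += *q;  /* the volatile load is still performed n times */
  return s;
}

/* Returns the number of test cases in which the variants disagree. */
int count_mismatches(void) {
  int x = 3, y = 4;
  int counts[] = {-2, 0, 1, 5};
  int mismatches = 0;
  size_t k;

  for (k = 0; k < sizeof counts / sizeof counts[0]; ++k) {
    int n = counts[k];

    /* p and q point at distinct objects. */
    int a = foo_orig(&x, &y, n);
    if (foo_hoisted(&x, &y, n) != a || foo_split(&x, &y, n) != a)
      ++mismatches;

    /* p and q alias the same non-volatile int. */
    int b = foo_orig(&x, &x, n);
    if (foo_hoisted(&x, &x, n) != b || foo_split(&x, &x, n) != b)
      ++mismatches;
  }
  return mismatches;
}
```

Calling count_mismatches() from a main() should give 0. This only checks
the arithmetic; it says nothing about what either compiler actually emits.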

On 4 September 2013 22:33, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:

> A customer has reported a performance problem, which I have eventually
> tracked down to the following situation:
>
> Consider this program:
>
> int foo(int *p, volatile int *q, int n) {
>   int i, s = 0;
>   for (i = 0; i < n; ++i)
>     s += *p + *q;
>   return s;
> }
>
>
> LLVM's analysis indicates that *p and *q can alias, even though *p is
> non-volatile whereas *q is volatile.  I don't have the exact section from
> the C standard, but if I remember correctly, accessing a volatile object
> through a non-volatile lvalue results in undefined behavior.  This would
> suggest that volatiles and non-volatiles may be considered not to alias
> automatically, even if TBAA would not be able to prove it.
>
> LLVM's code (on x86) at -O2 looks like this:
>
>         .text
>         .globl  foo
>         .align  16, 0x90
>         .type   foo,@function
> foo:                                    # @foo
>         .cfi_startproc
> # BB#0:                                 # %entry
>         xorl    %eax, %eax
>         testl   %edx, %edx
>         jle     .LBB0_2
>         .align  16, 0x90
> .LBB0_1:                                # %for.body
>         addl    (%rdi), %eax
>         addl    (%rsi), %eax
>         decl    %edx
>         jne     .LBB0_1
> .LBB0_2:                                # %for.end
>         ret
> .Ltmp0:
>         .size   foo, .Ltmp0-foo
>         .cfi_endproc
>         .section        ".note.GNU-stack","",@progbits
>
>
> For comparison, GCC has only one load in the loop:
>
>         .text
>         .p2align 4,,15
> .globl foo
>         .type   foo, @function
> foo:
> .LFB0:
>         .cfi_startproc
>         xorl    %eax, %eax
>         testl   %edx, %edx
>         jle     .L3
>         movl    (%rdi), %r8d
>         xorl    %ecx, %ecx
>         .p2align 4,,10
>         .p2align 3
> .L4:
>         movl    (%rsi), %edi
>         addl    $1, %ecx
>         addl    %r8d, %edi
>         addl    %edi, %eax
>         cmpl    %edx, %ecx
>         jne     .L4
> .L3:
>         rep
>         ret
>         .cfi_endproc
> .LFE0:
>         .size   foo, .-foo
>         .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
>         .section        .note.GNU-stack,"",@progbits
>
>
> The specifics in our case were that the AliasSetTracker indicated mod/ref
> for all pointers in the loop, where one of them was used in a volatile
> load, and all others were used in non-volatile loads.  As a result, the
> loop had more loads than necessary.
>
> Any thoughts?  Has there been any consideration of this type of
> situation?
>
> -K
>
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
> by The Linux Foundation