<div dir="ltr"><div>This might not be the workaround you want because it is only available in C, but you can use restrict to allow such optimizations.</div><div><br></div><div><a href="https://godbolt.org/z/2gQ26f">https://godbolt.org/z/2gQ26f</a></div><div><br></div><div>Alex<br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Aug 8, 2019 at 11:50 AM Michael Kruse via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
<br>
A char* such as scscx is a universal pointer: it may point to any object,<br>
including the pointer variable itself. That is, scscx might point to itself:<br>
<br>
scscx = (char*)&scscx;<br>
<br>
such that<br>
<br>
scscx[0] = ...<br>
<br>
changes the address scscx points to. A pointer of type int*, in contrast, is<br>
only allowed to point to integers in memory; it is not a universal<br>
pointer. In particular, when storing through it the compiler can assume that<br>
the store does not alias an object of a different type, such as the pointer scscx itself.<br>
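<br>
To make this concrete, here is a minimal C sketch. Only pp and scscx come from<br>
the code under discussion; tst_reload, tst_local and store_can_change_pointer<br>
are hypothetical names chosen for illustration. The last function assumes a<br>
typical platform where flipping one bit of a pointer's first byte yields a<br>
different pointer value that can still be compared.<br>

```c
/* Globals as in the code under discussion. */
char pp[3];
char *scscx = pp;

/* Original pattern: every store goes through a char lvalue, which may
   alias any object, including scscx itself. The compiler therefore has
   to assume scscx may have changed and reloads it before each store. */
int tst_reload(char i, char j, char k)
{
    scscx[0] = i;
    scscx[1] = j;
    scscx[2] = k;
    return 0;
}

/* Workaround sketch: read the global pointer once into a local.
   The address of p is never taken, so no store through p can modify p,
   and the pointer can stay in a register. */
int tst_local(char i, char j, char k)
{
    char *p = scscx;
    p[0] = i;
    p[1] = j;
    p[2] = k;
    return 0;
}

/* Demonstrates the self-aliasing case: returns 1 if a store through
   scscx changed the value of scscx itself. */
int store_can_change_pointer(void)
{
    char *saved = scscx;
    int changed;

    scscx = (char *)&scscx;      /* char* may point at any object */
    char *before = scscx;
    scscx[0] ^= 1;               /* flips one bit of the pointer's own first byte */
    changed = (scscx != before); /* scscx must be re-read here */

    scscx = saved;               /* restore the original pointer */
    return changed;
}
```

Note that tst_local behaves like tst_reload only when scscx does not point into<br>
itself; that is exactly the assumption the compiler may not make on its own.<br>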
<br>
For more details, see e.g. Wikipedia [1] or Stack Overflow [2].<br>
<br>
[1] <a href="https://en.wikipedia.org/wiki/Pointer_aliasing#Aliasing_and_re-ordering" rel="noreferrer" target="_blank">https://en.wikipedia.org/wiki/Pointer_aliasing#Aliasing_and_re-ordering</a><br>
[2] <a href="https://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule" rel="noreferrer" target="_blank">https://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule</a><br>
<br>
Michael<br>
<br>
<br>
On Thu, Aug 8, 2019 at 10:19 AM Joan Lluch via llvm-dev<br>
<<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br>
><br>
> I found something that I don't quite understand when compiling a common piece of code with the -Os flag.<br>
> I found it while testing my own backend, but then I dug deeper and found that at least x86 is affected as well. This is the code in question:<br>
><br>
> char pp[3];<br>
> char *scscx = pp;<br>
> int tst( char i, char j, char k )<br>
> {<br>
> scscx[0] = i;<br>
> scscx[1] = j;<br>
> scscx[2] = k;<br>
> return 0;<br>
> }<br>
><br>
> The above gets compiled for the x86 architecture like this:<br>
><br>
> ; Function Attrs: nofree norecurse nounwind optsize uwtable<br>
> define i32 @tst(i8 signext %i, i8 signext %j, i8 signext %k) local_unnamed_addr #1 {<br>
> entry:<br>
> %0 = load i8*, i8** @scscx, align 8, !tbaa !11<br>
> store i8 %i, i8* %0, align 1, !tbaa !13<br>
> %1 = load i8*, i8** @scscx, align 8, !tbaa !11<br>
> %arrayidx1 = getelementptr inbounds i8, i8* %1, i64 1<br>
> store i8 %j, i8* %arrayidx1, align 1, !tbaa !13<br>
> %2 = load i8*, i8** @scscx, align 8, !tbaa !11<br>
> %arrayidx2 = getelementptr inbounds i8, i8* %2, i64 2<br>
> store i8 %k, i8* %arrayidx2, align 1, !tbaa !13<br>
> ret i32 0<br>
> }<br>
><br>
> According to this, the variable ‘scscx’ is loaded three times even though it is never modified. The resulting assembly code is this:<br>
><br>
> .globl _tst<br>
> _tst:<br>
> .cfi_startproc<br>
> pushl %ebp<br>
> .cfi_def_cfa_offset 8<br>
> .cfi_offset %ebp, -8<br>
> movl %esp, %ebp<br>
> .cfi_def_cfa_register %ebp<br>
> pushl %esi<br>
> .cfi_offset %esi, -12<br>
> movb 16(%ebp), %al<br>
> movb 12(%ebp), %cl<br>
> movb 8(%ebp), %dl<br>
> movl _scscx, %esi<br>
> movb %dl, (%esi)<br>
> movl _scscx, %edx<br>
> movb %cl, 1(%edx)<br>
> movl _scscx, %ecx<br>
> movb %al, 2(%ecx)<br>
> xorl %eax, %eax<br>
> popl %esi<br>
> popl %ebp<br>
> retl<br>
> .cfi_endproc<br>
><br>
> .comm _pp,3,0<br>
> .section __DATA,__data<br>
> .globl _scscx<br>
> .p2align 3<br>
> _scscx:<br>
> .long _pp<br>
><br>
><br>
> Again, _scscx is loaded three times instead of being kept in a register, which is suboptimal.<br>
><br>
><br>
> Now, if I replace the original code with this:<br>
><br>
> int pp[3];<br>
> int *scscx = pp;<br>
> int tst( int i, int j, int k )<br>
> {<br>
> scscx[0] = i;<br>
> scscx[1] = j;<br>
> scscx[2] = k;<br>
> return 0;<br>
> }<br>
><br>
> I get the following:<br>
><br>
><br>
> ; Function Attrs: nofree norecurse nounwind optsize uwtable<br>
> define i32 @tst(i32 %i, i32 %j, i32 %k) local_unnamed_addr #1 {<br>
> entry:<br>
> %0 = load i32*, i32** @scscx, align 8, !tbaa !11<br>
> store i32 %i, i32* %0, align 4, !tbaa !13<br>
> %arrayidx1 = getelementptr inbounds i32, i32* %0, i64 1<br>
> store i32 %j, i32* %arrayidx1, align 4, !tbaa !13<br>
> %arrayidx2 = getelementptr inbounds i32, i32* %0, i64 2<br>
> store i32 %k, i32* %arrayidx2, align 4, !tbaa !13<br>
> ret i32 0<br>
> }<br>
><br>
><br>
> .globl _tst<br>
> _tst:<br>
> .cfi_startproc<br>
> pushl %ebp<br>
> .cfi_def_cfa_offset 8<br>
> .cfi_offset %ebp, -8<br>
> movl %esp, %ebp<br>
> .cfi_def_cfa_register %ebp<br>
> pushl %esi<br>
> .cfi_offset %esi, -12<br>
> movl 16(%ebp), %eax<br>
> movl 12(%ebp), %ecx<br>
> movl 8(%ebp), %edx<br>
> movl _scscx, %esi<br>
> movl %edx, (%esi)<br>
> movl %ecx, 4(%esi)<br>
> movl %eax, 8(%esi)<br>
> xorl %eax, %eax<br>
> popl %esi<br>
> popl %ebp<br>
> retl<br>
> .cfi_endproc<br>
><br>
> .comm _pp,12,2<br>
> .section __DATA,__data<br>
> .globl _scscx<br>
> .p2align 3<br>
> _scscx:<br>
> .long _pp<br>
><br>
><br>
> In this case the compiler loads _scscx into a register once and reuses its value instead of loading the variable multiple times. This results in cleaner and more efficient code, especially when compared with the first case.<br>
><br>
> I would like to understand why this happens, and whether there’s a way (or workaround) to improve it?<br>
><br>
> Should I file a bug report for that?<br>
><br>
> Thanks.<br>
><br>
> Joan<br>
><br>
> _______________________________________________<br>
> LLVM Developers mailing list<br>
> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
> <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
</blockquote></div>