[LLVMdev] Another missed optimization opportunity?
Dan Gohman
dan433584 at gmail.com
Wed Apr 24 13:31:19 PDT 2013
The semantic reason is that the optimizer is required to assume that the
i32 stores could be storing to the storage of myarray. LLVM IR does not
permit optimizers to optimize based on the nominal types of memory objects
or memory accesses.
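To make the aliasing concern concrete, here is a minimal sketch in C of a toy untyped memory in which word indices stand in for addresses. Everything in it (PTR_CELL, MEM_WORDS, the function names) is hypothetical and only models the situation; it is not how LLVM represents memory. When the stored "pointer" refers back to its own cell, three separate increments and one combined increment leave memory in different states, which is why the combined form is not a safe rewrite in general.

```c
#include <stdint.h>
#include <string.h>

/* Toy model (an assumption for illustration): memory is an array of
   32-bit words, and a "pointer" is simply a word index stored in memory. */
enum { PTR_CELL = 5, MEM_WORDS = 16 };

/* Mirrors the original IR: reload the pointer before every increment. */
void update_three_times(int32_t mem[MEM_WORDS]) {
    for (int i = 0; i < 3; i++) {
        int32_t base = mem[PTR_CELL];  /* load the pointer */
        mem[base + 5] += 1;            /* load element 5, add 1, store */
    }
}

/* The hoped-for rewrite: one pointer load, one increment by 3. */
void update_combined(int32_t mem[MEM_WORDS]) {
    int32_t base = mem[PTR_CELL];
    mem[base + 5] += 3;
}

/* Returns 1 if both versions leave memory in the same state. */
int same_result(int32_t initial_ptr) {
    int32_t a[MEM_WORDS] = {0};
    int32_t b[MEM_WORDS] = {0};
    a[PTR_CELL] = initial_ptr;
    b[PTR_CELL] = initial_ptr;
    update_three_times(a);
    update_combined(b);
    return memcmp(a, b, sizeof a) == 0;
}
```

With an initial pointer of 8 the two versions agree, since element 5 lives at word 13 and never overlaps the pointer cell. With an initial pointer of 0, element 5 is the pointer cell itself, so the first store changes where the later increments land and the two versions disagree.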
This gets optimized in C, because the C compiler adds special TBAA metadata
annotations to the loads and stores which say that the stores of "int" do
not interfere with the loads of "pointer". It also gets optimized if
myarray is const, because the optimizer knows that const memory is not
modified by stores. It also gets optimized if myarray is an actual array,
because then the address of the array is constant, rather than being a
value loaded from memory.
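As a rough sketch, the three situations described above correspond to C declarations like the following. All names are hypothetical, and the globals are defined here rather than declared extern only so that the example is self-contained. Whether a given compiler actually merges the increments depends on its version and flags; the code only shows the source-level difference between the cases.

```c
/* Hypothetical backing storage so the example is self-contained. */
static int backing_a[8];
static int backing_b[8];

/* Case 1: a mutable global pointer, as in the original question.
   The pointer must be loaded from memory before each access. */
int *ptr_mutable = backing_a;

/* Case 2: a const global pointer. The pointer object itself may not
   be modified, so stores through it cannot change it. */
int *const ptr_const = backing_b;

/* Case 3: an actual array. Its address is a constant, not a value
   loaded from memory. */
int real_array[8];

void bump_mutable(void) {
    ptr_mutable[5]++;
    ptr_mutable[5]++;
    ptr_mutable[5]++;
}

void bump_const(void) {
    ptr_const[5]++;
    ptr_const[5]++;
    ptr_const[5]++;
}

void bump_array(void) {
    real_array[5]++;
    real_array[5]++;
    real_array[5]++;
}
```

All three functions increment element 5 by a total of 3; they differ only in what the optimizer is allowed to assume about the base address between the stores.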
Dan
On Wed, Apr 24, 2013 at 10:40 AM, Scott Pakin <pakin at lanl.gov> wrote:
> I was surprised to find that some bitcode I'm generating isn't getting
> optimized. Here, I'm doing the equivalent of "myarray[5]++" (on an
> "extern int *myarray"), repeated three times:
>
> @myarray = external global i32*
>
> define void @update_array() #0 {
> %1 = load i32** @myarray, align 8
> %2 = getelementptr inbounds i32* %1, i64 5
> %3 = load i32* %2, align 4
> %4 = add nsw i32 %3, 1
> store i32 %4, i32* %2, align 4
> %5 = load i32** @myarray, align 8
> %6 = getelementptr inbounds i32* %5, i64 5
> %7 = load i32* %6, align 4
> %8 = add nsw i32 %7, 1
> store i32 %8, i32* %6, align 4
> %9 = load i32** @myarray, align 8
> %10 = getelementptr inbounds i32* %9, i64 5
> %11 = load i32* %10, align 4
> %12 = add nsw i32 %11, 1
> store i32 %12, i32* %10, align 4
> ret void
> }
>
> Running "opt -std-compile-opts" or even "opt -O3" doesn't seem to
> change the bitcode any. I had expected the three increments by 1 to
> be collapsed into a single increment by 3:
>
> @myarray = external global i32*
>
> define void @update_array() #0 {
> %1 = load i32** @myarray, align 8
> %2 = getelementptr inbounds i32* %1, i64 5
> %3 = load i32* %2, align 4
> %4 = add nsw i32 %3, 3
> store i32 %4, i32* %2, align 4
> ret void
> }
>
> Even the (x86-64) code generator doesn't do any last-minute
> optimizations:
>
> movq myarray(%rip), %rax
> incl 20(%rax)
> movq myarray(%rip), %rax
> incl 20(%rax)
> movq myarray(%rip), %rax
> incl 20(%rax)
>
> This is with LLVM revision 180116.
>
> Is there some semantic reason that the increments aren't allowed to be
> combined, or is this a missed optimization opportunity in LLVM?
>
> Thanks,
> -- Scott