[llvm-commits] [llvm] r142530 - in /llvm/trunk: lib/Target/ARM/ARMISelLowering.cpp test/CodeGen/ARM/2011-10-18-DisableMovtSize.ll

Evan Cheng evan.cheng at apple.com
Tue Oct 25 18:08:37 PDT 2011


Hi James,

Sorry I missed this patch earlier; I have some problems with it.

1. Note that -Os for LLVM, especially for Darwin, means "reduce code size without hurting performance." Did you measure the impact of this patch on the performance of the generated code? Our tests have shown movw / movt to be faster than ldr. This is especially a concern for Cortex-A9, which has more limited address-generation bandwidth. Also note that for PIC codegen, at least on Darwin, loading a global would require ldr + add + ldr. That's especially unfriendly:
        ldr.n   r1, LCPI1_0
LPC1_0:
        add     r1, pc
        ldr     r1, [r1]
...
LCPI1_0:
        .long   _foo

vs.
        movw    r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4))
        movt    r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4))
LPC0_0:
        add     r0, pc
        ldr     r0, [r0]

2. Where does the code-size saving come from? We are comparing a movw + movt pair against an ldr plus a constant-pool entry.
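For concreteness, here is a rough byte count per materialized global. This is a sketch under assumed typical encodings (4-byte movw/movt in both ARM and Thumb2, a 2-byte Thumb ldr.n); it is not measured compiler output, and pool sharing and alignment will change the real numbers:

```python
# Back-of-the-envelope size comparison for materializing one global address.
# The byte counts below are assumptions about the common encodings.

def movw_movt_bytes():
    # movw and movt are each 4 bytes in both ARM and Thumb2 encodings.
    return 4 + 4

def literal_pool_bytes(thumb):
    # ldr is 4 bytes in ARM mode; a short pc-relative ldr.n can be 2 bytes
    # in Thumb. Each reference also needs a 4-byte pool entry, though that
    # entry could in principle be shared across references.
    ldr = 2 if thumb else 4
    pool_entry = 4
    return ldr + pool_entry

# ARM mode: 8 bytes either way, so no obvious per-reference saving.
print(movw_movt_bytes(), literal_pool_bytes(thumb=False))
# Thumb mode: 8 vs. 6 bytes, assuming the short ldr.n encoding is reachable.
print(movw_movt_bytes(), literal_pool_bytes(thumb=True))
```

On this accounting, the per-reference win only appears in Thumb mode, and only when the pool entry is shared or the short ldr encoding applies.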

3. There are good reasons to avoid constant pools. Constant pools in the text section make the assembly hard to read, and they can end up introducing jumps around the pool. That hurts performance, compile time, and sometimes code size.
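A crude illustration of where those extra jumps come from: pc-relative literal loads have a limited range, so long functions need pool "islands" emitted mid-function, each branched over. The range and sizes below are assumptions for illustration, not exact architectural limits for every encoding:

```python
# Why in-text constant pools can force extra branches: a pc-relative ldr can
# only reach a pool within a limited range, so a long function needs pool
# islands inside the body, each jumped over by an unconditional branch.
# ARM_LDR_RANGE and BRANCH_BYTES are illustrative assumptions.

ARM_LDR_RANGE = 4096   # assumed max pc-relative offset for an ARM-mode ldr
BRANCH_BYTES = 4       # a 'b' instruction around each mid-function island

def pool_islands_needed(function_bytes, range_bytes=ARM_LDR_RANGE):
    # Crude estimate: one island (and one branch over it) per full range
    # exceeded by the function body.
    if function_bytes <= range_bytes:
        return 0
    return (function_bytes - 1) // range_bytes

# A short function needs no island; a 10 KB body needs branches over pools.
print(pool_islands_needed(1024))    # 0
print(pool_islands_needed(10240))   # 2
```

Each island costs a branch plus alignment padding, which is overhead that movw/movt sequences never pay.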

4. Darwin is especially sensitive to this because of Objective-C code. SPEC is interesting, but it's not the only benchmark.

Please check with me before making changes that can impact Darwin codegen in the future. We can continue the discussion, but I'm going to revert the change for Darwin.

Thanks.

Evan


On Oct 19, 2011, at 7:11 AM, James Molloy wrote:

> Author: jamesm
> Date: Wed Oct 19 09:11:07 2011
> New Revision: 142530
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=142530&view=rev
> Log:
> Use literal pool loads instead of MOVW/MOVT for materializing global addresses when optimizing for size.
> 
> On spec/gcc, this gave a codesize improvement of ~1.9% for ARM mode and ~4.9% for Thumb(2) mode. This is
> codesize including literal pools.
> 
> The pools themselves doubled in size for ARM mode and quintupled for Thumb mode, suggesting that there
> is still redundancy in LLVM's use of constant pools that could be reduced by sharing entries.
> 
> Fixes PR11087.
> 
> 
> Added:
>    llvm/trunk/test/CodeGen/ARM/2011-10-18-DisableMovtSize.ll   (with props)
> Modified:
>    llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
> 
> Modified: llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp?rev=142530&r1=142529&r2=142530&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp (original)
> +++ llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp Wed Oct 19 09:11:07 2011
> @@ -2103,8 +2103,10 @@
>   }
> 
>   // If we have T2 ops, we can materialize the address directly via movt/movw
> -  // pair. This is always cheaper.
> -  if (Subtarget->useMovt()) {
> +  // pair. This is always cheaper in terms of performance, but uses at least 2
> +  // extra bytes.
> +  if (Subtarget->useMovt() &&
> +      !DAG.getMachineFunction().getFunction()->hasFnAttr(Attribute::OptimizeForSize)) {
>     ++NumMovwMovt;
>     // FIXME: Once remat is capable of dealing with instructions with register
>     // operands, expand this into two nodes.
> @@ -2129,7 +2131,8 @@
>   ARMFunctionInfo *AFI = MF.getInfo<ARMFunctionInfo>();
> 
>   // FIXME: Enable this for static codegen when tool issues are fixed.
> -  if (Subtarget->useMovt() && RelocM != Reloc::Static) {
> +  if (Subtarget->useMovt() && RelocM != Reloc::Static &&
> +      !DAG.getMachineFunction().getFunction()->hasFnAttr(Attribute::OptimizeForSize)) {
>     ++NumMovwMovt;
>     // FIXME: Once remat is capable of dealing with instructions with register
>     // operands, expand this into two nodes.
> 
> Added: llvm/trunk/test/CodeGen/ARM/2011-10-18-DisableMovtSize.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/2011-10-18-DisableMovtSize.ll?rev=142530&view=auto
> ==============================================================================
> --- llvm/trunk/test/CodeGen/ARM/2011-10-18-DisableMovtSize.ll (added)
> +++ llvm/trunk/test/CodeGen/ARM/2011-10-18-DisableMovtSize.ll Wed Oct 19 09:11:07 2011
> @@ -0,0 +1,27 @@
> +; RUN: llc < %s -mtriple=armv7-apple-darwin  | FileCheck %s
> +; RUN: llc < %s -mtriple=armv7-unknown-linux-eabi | FileCheck %s
> +
> +; Check that when optimizing for size, a literal pool load is used
> +; instead of the (potentially faster) movw/movt pair when loading
> +; a large constant.
> +
> + at x = global i32* inttoptr (i32 305419888 to i32*), align 4
> +
> +define i32 @f() optsize {
> +  ; CHECK: f:
> +  ; CHECK: ldr  r{{.}}, {{.?}}LCPI{{.}}_{{.}}
> +  ; CHECK: ldr  r{{.}}, [{{(pc, )?}}r{{.}}]
> +  ; CHECK: ldr  r{{.}}, [r{{.}}]
> +  %1 = load i32** @x, align 4
> +  %2 = load i32* %1
> +  ret i32 %2
> +}
> +
> +define i32 @g() {
> +  ; CHECK: g:
> +  ; CHECK: movw
> +  ; CHECK: movt
> +  %1 = load i32** @x, align 4
> +  %2 = load i32* %1
> +  ret i32 %2
> +}
> 
> Propchange: llvm/trunk/test/CodeGen/ARM/2011-10-18-DisableMovtSize.ll
> ------------------------------------------------------------------------------
>    svn:eol-style = native
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits




More information about the llvm-commits mailing list