[PATCH] Fix a bug in the Thumb1 ARM Load/Store optimizer which resulted in the base register reset instruction being incorrectly placed.

Moritz Roth moritz.roth at arm.com
Tue Jun 10 08:36:01 PDT 2014


Hi rengolin,

Hi Renato, all,

This patch fixes http://llvm.org/bugs/show_bug.cgi?id=19972 - a Thumb-1 specific codegen bug in the ARM Load/Store optimization pass.

Previously, the basic block was searched for future uses of the base register, and if necessary any writeback to the base register was reset using a SUB instruction (e.g. before calling a function) just before such a use. However, this step happened *before* the merged LDM/STM instruction was built. So if there was (e.g.) a function call directly after the not-yet-formed LDM/STM, the pass would first insert a SUB instruction to reset the base register, and then (at the same location, incorrectly) insert the LDM/STM itself.

Example (taken from Bugzilla):
  void bar(int*, int, int);

  void foo(int *A) {
    int x = A[0];
    int y = A[1];
    bar(A, x, y);
  }

This currently compiles to:
$ clang -target arm  -Os -mthumb  -mcpu=arm926ej-s -c  t.c -S -o -
  @ BB#0:                                 @ %entry
          push    {r7, lr}
          add     r7, sp, #0
          subs    r0, #8
          ldm     r0!, {r1, r2}
          bl      bar
          pop     {r7, pc}

With the patch applied, it gives the much more reasonable
  @ BB#0:
	push	{r7, lr}
	add	r7, sp, #0
	ldm	r0!, {r1, r2}
	subs	r0, #8
	bl	bar
	pop	{r7, pc}

(Note that we do need the subs as the ldm writes back to the base register).

The patch moves the code that emits the merged instruction forward so this happens before the call that updates base register uses and potentially inserts the SUB. I've also added a test case derived from the example above to make sure that it now works properly.

Cheers
Moritz

http://reviews.llvm.org/D4084

Files:
  lib/Target/ARM/ARMLoadStoreOptimizer.cpp
  test/CodeGen/Thumb/2014-06-10-thumb1-ldst-opt-bug.ll

Index: lib/Target/ARM/ARMLoadStoreOptimizer.cpp
===================================================================
--- lib/Target/ARM/ARMLoadStoreOptimizer.cpp
+++ lib/Target/ARM/ARMLoadStoreOptimizer.cpp
@@ -505,7 +505,7 @@
 
   // Exception: If the base register is in the input reglist, Thumb1 LDM is
   // non-writeback. Check for this.
-  if (Opcode == ARM::tLDRi && isThumb1)
+  if (Opcode == ARM::tLDMIA && isThumb1)
     for (unsigned I = 0; I < NumRegs; ++I)
       if (Base == Regs[I].first) {
         Writeback = false;
@@ -519,17 +519,17 @@
       // Update tLDMIA with writeback if necessary.
       Opcode = ARM::tLDMIA_UPD;
 
-    // The base isn't dead after a merged instruction with writeback. Update
-    // future uses of the base with the added offset (if possible), or reset
-    // the base register as necessary.
-    if (!BaseKill)
-      UpdateBaseRegUses(MBB, MBBI, dl, Base, NumRegs, Pred, PredReg);
-
     MIB = BuildMI(MBB, MBBI, dl, TII->get(Opcode));
 
     // Thumb1: we might need to set base writeback when building the MI.
     MIB.addReg(Base, getDefRegState(true))
        .addReg(Base, getKillRegState(BaseKill));
+
+    // The base isn't dead after a merged instruction with writeback. Update
+    // future uses of the base with the added offset (if possible), or reset
+    // the base register as necessary.
+    if (!BaseKill)
+      UpdateBaseRegUses(MBB, MBBI, dl, Base, NumRegs, Pred, PredReg);
   } else {
     // No writeback, simply build the MachineInstr.
     MIB = BuildMI(MBB, MBBI, dl, TII->get(Opcode));
Index: test/CodeGen/Thumb/2014-06-10-thumb1-ldst-opt-bug.ll
===================================================================
--- /dev/null
+++ test/CodeGen/Thumb/2014-06-10-thumb1-ldst-opt-bug.ll
@@ -0,0 +1,17 @@
+; RUN: llc < %s -mtriple=thumbv6m-eabi -o - | FileCheck %s
+
+define void @foo(i32* %A) #0 {
+entry:
+; CHECK-LABEL: foo:
+; CHECK: push {r7, lr}
+; CHECK: ldm [[REG0:r[0-9]]]!,
+; CHECK-NEXT: subs [[REG0]]
+; CHECK-NEXT: bl
+  %0 = load i32* %A, align 4
+  %arrayidx1 = getelementptr inbounds i32* %A, i32 1
+  %1 = load i32* %arrayidx1, align 4
+  tail call void @bar(i32* %A, i32 %0, i32 %1) #2
+  ret void
+}
+
+declare void @bar(i32*, i32, i32) #1
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D4084.10278.patch
Type: text/x-patch
Size: 2232 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140610/5afc50bd/attachment.bin>


More information about the llvm-commits mailing list