[PATCH] D52834: [ARM] Account for implicit IT when calculating inline asm size

Peter Smith via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 3 09:40:34 PDT 2018


peter.smith created this revision.
peter.smith added reviewers: rengolin, t.p.northover, eli.friedman.
Herald added a reviewer: javed.absar.
Herald added subscribers: chrib, kristof.beyls.

When deciding if it is safe to optimize a conditional branch to a CBZ or CBNZ the offsets of the BasicBlocks from the start of the function are estimated. For inline assembly the generic getInlineAsmLength() function is used to get a worst case estimate of the inline assembly by multiplying the number of instructions by the max instruction size of 4 bytes. This unfortunately doesn't take into account the generation of Thumb implicit IT instructions. In edge cases such as when all the instructions in the block are 4-bytes in size and there is an implicit IT then the size is underestimated. This can cause an out of range CBZ or CBNZ to be generated.

      

The patch takes a conservative approach and assumes that every instruction in the inline assembly block may have an implicit IT.

      

Fixes pr31805 https://bugs.llvm.org/show_bug.cgi?id=31805 which seems to be intermittently causing the Linux Kernel to fail to build on Arm.

Possible alternatives:

- Be less conservative and a small constant, say 4-bytes to account for any overspill from implicit IT creation, this could still fail but would probably work well in practice.
- The Arm MaxInstLength could be increased to 6, but this may not be right for Arm as there is some code in the ConstantIslands pass that assumes 4-byte alignment of Arm instructions.
- Reimplement getInlineAsmLength() or some similar function to estimate how many implicit IT instructions are needed. Doing this accurately may require a lot of the functionality of an assembly parser though.


https://reviews.llvm.org/D52834

Files:
  lib/Target/ARM/ARMComputeBlockSize.cpp
  test/CodeGen/ARM/cbz-implicit-it-range.ll


Index: test/CodeGen/ARM/cbz-implicit-it-range.ll
===================================================================
--- /dev/null
+++ test/CodeGen/ARM/cbz-implicit-it-range.ll
@@ -0,0 +1,47 @@
+;RUN: llc -O2 -mtriple=thumbv7a-linux-gnueabihf -arm-implicit-it=always %s -o - | FileCheck %s
+;RUN llc -O2 -mtriple=thumbv7a-linux-gnueabihf -no-integrated-as %s -o -
+
+; Check that we do not produce a CBZ instruction to jump over the inline
+; assembly as the distance is too far if the implicit IT instructions are
+; added.
+
+define void @f0(i32 %p1, i32 %p2, i32 %p3) nounwind {
+entry:
+  %cmp = icmp eq i32 %p1, 0
+  br i1 %cmp, label %if.else, label %if.then
+
+if.then:
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  tail call void asm sideeffect "movseq r0, #0\0A", ""()
+  br label %if.end
+
+if.else:
+  tail call void asm sideeffect "nop\0A", ""()
+  br label %if.end
+
+if.end:
+  ret void
+}
+; CHECK-LABEL: f0:
+; CHECK: beq .LBB0_{{[0-9]+}}
+
Index: lib/Target/ARM/ARMComputeBlockSize.cpp
===================================================================
--- lib/Target/ARM/ARMComputeBlockSize.cpp
+++ lib/Target/ARM/ARMComputeBlockSize.cpp
@@ -51,14 +51,21 @@
   BBI.PostAlign = 0;
 
   for (MachineInstr &I : *MBB) {
-    BBI.Size += TII->getInstSizeInBytes(I);
+    unsigned InstSize = TII->getInstSizeInBytes(I);
     // For inline asm, getInstSizeInBytes returns a conservative estimate.
     // The actual size may be smaller, but still a multiple of the instr size.
-    if (I.isInlineAsm())
+    if (I.isInlineAsm()) {
       BBI.Unalign = isThumb ? 1 : 2;
+      // Account for worst case of an extra implicit IT instruction
+      // for each 4-byte Thumb instruction.
+      if (isThumb)
+        InstSize += InstSize / 2;
+    }
     // Also consider instructions that may be shrunk later.
     else if (isThumb && mayOptimizeThumb2Instruction(&I))
       BBI.Unalign = 1;
+
+    BBI.Size += InstSize;
   }
 
   // tBR_JTr contains a .align 2 directive.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D52834.168125.patch
Type: text/x-patch
Size: 3157 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20181003/0a53bbb5/attachment.bin>


More information about the llvm-commits mailing list