[PATCH] D52834: [ARM] Account for implicit IT when calculating inline asm size
Peter Smith via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 3 09:40:34 PDT 2018
peter.smith created this revision.
peter.smith added reviewers: rengolin, t.p.northover, eli.friedman.
Herald added a reviewer: javed.absar.
Herald added subscribers: chrib, kristof.beyls.
When deciding if it is safe to optimize a conditional branch to a CBZ or CBNZ the offsets of the BasicBlocks from the start of the function are estimated. For inline assembly the generic getInlineAsmLength() function is used to get a worst case estimate of the inline assembly by multiplying the number of instructions by the max instruction size of 4 bytes. This unfortunately doesn't take into account the generation of Thumb implicit IT instructions. In edge cases such as when all the instructions in the block are 4-bytes in size and there is an implicit IT then the size is underestimated. This can cause an out of range CBZ or CBNZ to be generated.
The patch takes a conservative approach and assumes that every instruction in the inline assembly block may have an implicit IT.
Fixes pr31805 https://bugs.llvm.org/show_bug.cgi?id=31805 which seems to be intermittently causing the Linux Kernel to fail to build on Arm.
Possible alternatives:
- Be less conservative and a small constant, say 4-bytes to account for any overspill from implicit IT creation, this could still fail but would probably work well in practice.
- The Arm MaxInstLength could be increased to 6, but this may not be right for Arm as there is some code in the ConstantIslands pass that assumes 4-byte alignment of Arm instructions.
- Reimplement getInlineAsmLength() or some similar function to estimate how many implicit IT instructions are needed. Doing this accurately may require a lot of the functionality of an assembly parser though.
https://reviews.llvm.org/D52834
Files:
lib/Target/ARM/ARMComputeBlockSize.cpp
test/CodeGen/ARM/cbz-implicit-it-range.ll
Index: test/CodeGen/ARM/cbz-implicit-it-range.ll
===================================================================
--- /dev/null
+++ test/CodeGen/ARM/cbz-implicit-it-range.ll
@@ -0,0 +1,47 @@
+;RUN: llc -O2 -mtriple=thumbv7a-linux-gnueabihf -arm-implicit-it=always %s -o - | FileCheck %s
+;RUN llc -O2 -mtriple=thumbv7a-linux-gnueabihf -no-integrated-as %s -o -
+
+; Check that we do not produce a CBZ instruction to jump over the inline
+; assembly as the distance is too far if the implicit IT instructions are
+; added.
+
+define void @f0(i32 %p1, i32 %p2, i32 %p3) nounwind {
+entry:
+ %cmp = icmp eq i32 %p1, 0
+ br i1 %cmp, label %if.else, label %if.then
+
+if.then:
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ tail call void asm sideeffect "movseq r0, #0\0A", ""()
+ br label %if.end
+
+if.else:
+ tail call void asm sideeffect "nop\0A", ""()
+ br label %if.end
+
+if.end:
+ ret void
+}
+; CHECK-LABEL: f0:
+; CHECK: beq .LBB0_{{[0-9]+}}
+
Index: lib/Target/ARM/ARMComputeBlockSize.cpp
===================================================================
--- lib/Target/ARM/ARMComputeBlockSize.cpp
+++ lib/Target/ARM/ARMComputeBlockSize.cpp
@@ -51,14 +51,21 @@
BBI.PostAlign = 0;
for (MachineInstr &I : *MBB) {
- BBI.Size += TII->getInstSizeInBytes(I);
+ unsigned InstSize = TII->getInstSizeInBytes(I);
// For inline asm, getInstSizeInBytes returns a conservative estimate.
// The actual size may be smaller, but still a multiple of the instr size.
- if (I.isInlineAsm())
+ if (I.isInlineAsm()) {
BBI.Unalign = isThumb ? 1 : 2;
+ // Account for worst case of an extra implicit IT instruction
+ // for each 4-byte Thumb instruction.
+ if (isThumb)
+ InstSize += InstSize / 2;
+ }
// Also consider instructions that may be shrunk later.
else if (isThumb && mayOptimizeThumb2Instruction(&I))
BBI.Unalign = 1;
+
+ BBI.Size += InstSize;
}
// tBR_JTr contains a .align 2 directive.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D52834.168125.patch
Type: text/x-patch
Size: 3157 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20181003/0a53bbb5/attachment.bin>
More information about the llvm-commits
mailing list