[PATCH][X86] Trying to fix PR20243 - leaf frame pointer bug on X86, ELF, PIC

Dario Domizioli dario.domizioli at gmail.com
Wed Nov 5 06:57:05 PST 2014


Hello llvm-dev,

I am trying to fix PR20243.
The main issue seems to be that the compiler is using a pseudo-instruction
to express a TLS global access (in PIC mode on X86 ELF) and that masks the
fact that a library call has to be performed. The pseudo-instruction is
only lowered in the AsmPrinter, too late for any code that checks whether
the function containing the instruction is a leaf function. This affects
the decision whether to eliminate the frame pointer in the "eliminate frame
pointer only for leaf functions" case.

I have prepared a patch, and my intention was to mark the TLS access
pseudo-instructions with the "isCall" flag, so that they would be
recognized as calls.
However I ran into a problem: the (I believe recently introduced) "FP
Stackify" pass tries to interpret the pseudo-instruction as a proper call
and it tries to get the return registers, but then it asserts because it
gets utterly confused by the pseudo-instruction.
So, to try to solve this, I had to add the "isPseudo" flag to the
instruction, and modify the "FP Stackify" pass to ignore call instructions
that are also pseudo (a one-line change).

This problem suggests to me however that this is probably not the right way
to approach the original problem. It's not a good sign if I'm breaking
things in the process of fixing other stuff.

I'm attaching my patch. I would gladly accept any advice on the matter. Am
I attacking the issue from the right direction? Is the "isCall" flag the
right way to express what the pseudo-instruction does? Or is the usage of a
pseudo-instruction itself wrong, and should it be lowered in ISel instead
of the AsmPrinter?

Cheers,
    Dario Domizioli
    SN Systems - Sony Computer Entertainment Group
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141105/0170ab63/attachment.html>
-------------- next part --------------
Index: lib/Target/X86/X86FloatingPoint.cpp
===================================================================
--- lib/Target/X86/X86FloatingPoint.cpp	(revision 221343)
+++ lib/Target/X86/X86FloatingPoint.cpp	(working copy)
@@ -383,7 +383,7 @@
         X86::RFP80RegClass.contains(MI->getOperand(0).getReg()))
       FPInstClass = X86II::SpecialFP;
 
-    if (MI->isCall())
+    if (MI->isCall() && !MI->isPseudo())
       FPInstClass = X86II::SpecialFP;
 
     if (FPInstClass == X86II::NotFP)
Index: lib/Target/X86/X86InstrCompiler.td
===================================================================
--- lib/Target/X86/X86InstrCompiler.td	(revision 221343)
+++ lib/Target/X86/X86InstrCompiler.td	(working copy)
@@ -414,6 +414,8 @@
             MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
             XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7,
             XMM8, XMM9, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS],
+    isCall = 1,
+    isPseudo = 1,
     Uses = [ESP] in {
 def TLS_addr32 : I<0, Pseudo, (outs), (ins i32mem:$sym),
                   "# TLS_addr32",
@@ -434,6 +436,8 @@
             MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
             XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7,
             XMM8, XMM9, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS],
+    isCall = 1,
+    isPseudo = 1,
     Uses = [RSP] in {
 def TLS_addr64 : I<0, Pseudo, (outs), (ins i64mem:$sym),
                    "# TLS_addr64",
Index: test/CodeGen/X86/tls-addr-non-leaf-function.ll
===================================================================
--- test/CodeGen/X86/tls-addr-non-leaf-function.ll	(revision 0)
+++ test/CodeGen/X86/tls-addr-non-leaf-function.ll	(working copy)
@@ -0,0 +1,42 @@
+; RUN: llc < %s -relocation-model=pic -O2 -disable-fp-elim -o - | FileCheck %s
+; RUN: llc < %s -relocation-model=pic -O2 -o - | FileCheck %s
+
+; This test runs twice with different options regarding the frame pointer:
+; first the elimination is disabled, then it is enabled. The disabled case is
+; the "control group".
+; The function 'foo' below is marked with the "no-frame-pointer-elim-non-leaf"
+; attribute which dictates that the frame pointer should not be eliminated
+; unless the function is a leaf (i.e. it doesn't call any other function).
+; Now, 'foo' is not a leaf function, because it performs a TLS access which on
+; X86 ELF in PIC mode is expanded as a library call. This call was not spotted
+; by the code that checks for calls within a function, because the call is
+; represented with a pseudo-instruction which didn't have the "isCall" flag.
+; Such pseudo-instruction is expanded too late (after register allocation) and
+; therefore if it's not recognized as a call it will affect the decision on
+; whether to eliminate the frame pointer.
+; With the fix, the "isCall" flag for the pseudo-instruction is set, so 'foo'
+; appears to be a non-leaf function, and the difference in the options does not
+; affect codegen: both versions will have a frame pointer.
+
+; Test that there's some frame pointer usage in 'foo'...
+; CHECK: _Z3fooP10SomeStruct
+; CHECK: pushq %rbp
+; CHECK: movq %rsp, %rbp
+; ... and the TLS library call is also present.
+; CHECK: leaq _ZL2ts at TLSLD(%rip), %rdi
+; CHECK: callq __tls_get_addr at PLT
+
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+%struct.SomeStruct = type { i32 }
+
+ at _ZL2ts = internal thread_local global %struct.SomeStruct zeroinitializer, align 4
+
+define %struct.SomeStruct* @_Z3fooP10SomeStruct(%struct.SomeStruct* readnone %t) "no-frame-pointer-elim-non-leaf" {
+entry:
+  %cmp = icmp eq %struct.SomeStruct* %t, null
+  %_ZL2ts.t = select i1 %cmp, %struct.SomeStruct* @_ZL2ts, %struct.SomeStruct* %t
+  ret %struct.SomeStruct* %_ZL2ts.t
+}
+


More information about the llvm-commits mailing list