[PATCH][X86] Trying to fix PR20243 - leaf frame pointer bug on X86, ELF, PIC
Dario Domizioli
dario.domizioli at gmail.com
Tue Nov 11 10:37:14 PST 2014
Hi Reid, Rafael.
Thanks for the help and the review!
I have committed revision 221695 with the simplified test that Rafael
suggested.
Cheers,
Dario Domizioli
SN Systems - Sony Computer Entertainment Group
On 11 November 2014 17:52, Rafael Espíndola <rafael.espindola at gmail.com>
wrote:
> Yes. It is a valid bug fix. It would still be nice to cleanup the x86
> implementation, but that is independent.
>
> My only request for the patch is to reduce the testcase a bit. For
> example, you don't need the struct.SomeStruct. You should be able to
> test the difference with only
>
> @x = thread_local global i32 0
> define i32 @f() "no-frame-pointer-elim-non-leaf" {
> %a = load i32* @x, align 4
> ret i32 %a
> }
>
> no?
>
> LGTM with that.
>
>
> On 11 November 2014 12:19, Reid Kleckner <rnk at google.com> wrote:
> > Works for me, I think that definitely fixes a bug in existing code.
> Rafael,
> > do you think this approach is OK?
> >
> > On Tue, Nov 11, 2014 at 7:49 AM, Dario Domizioli <
> dario.domizioli at gmail.com>
> > wrote:
> >>
> >> Good news! (maybe)
> >>
> >> I was focusing too much on the instruction definition and not enough on
> >> looking at how the instruction was getting created...
> >> While I inspected the ISel lowering code creating the TLS access pseudo
> >> (in X86ISelLowering.cpp) to find out how to do something similar to what
> >> AArch64 does, I found a comment in the GetTLSADDR() function basically
> >> saying that the code there was responsible for informing the
> >> MachineFrameInfo that the function had calls... however it was only
> setting
> >> the "adjustsStack" flag and not the "hasCalls" flag.
> >>
> >> This means that I don't strictly need to change the definition of the
> TLS
> >> access pseudo, and I can fix the PR with just a one-line change in
> >> X86ISelLowering.cpp.
> >>
> >> Do you think it's acceptable?
> >> I attach the new patch. I also changed the comment in the test to
> explain
> >> the situation.
> >>
> >> Cheers,
> >> Dario Domizioli
> >> SN Systems - Sony Computer Entertainment Group
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> On 11 November 2014 12:16, Dario Domizioli <dario.domizioli at gmail.com>
> >> wrote:
> >>>
> >>> Thanks Rafael!
> >>>
> >>> I'll have a look at the AArch64 version (and the PPC patch) and I'll
> try
> >>> to implement something similar.
> >>> I'll post a new patch soon.
> >>>
> >>> Cheers,
> >>> Dario Domizioli
> >>> SN Systems - Sony Computer Entertainment Group
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On 11 November 2014 05:45, Rafael Espíndola <
> rafael.espindola at gmail.com>
> >>> wrote:
> >>>>
> >>>> For a related issue, see http://reviews.llvm.org/D6209
> >>>>
> >>>> On 10 November 2014 16:02, Rafael Espíndola <
> rafael.espindola at gmail.com>
> >>>> wrote:
> >>>> > On 10 November 2014 10:48, Dario Domizioli <
> dario.domizioli at gmail.com>
> >>>> > wrote:
> >>>> >> Hi Rafael.
> >>>> >>
> >>>> >> I have been looking at this for a while, but I cannot seem to find
> a
> >>>> >> way of
> >>>> >> preserving an LLVM intrinsic after ISel and expanding it later.
> >>>> >> Basically, if I add an LLVM intrinsic for TLS access in
> >>>> >> IR/IntrinsicsX86.td,
> >>>> >> then the backend expects me to provide an instruction selection
> rule
> >>>> >> for
> >>>> >> lowering it to MachineInstructions; however as you said we cannot
> do
> >>>> >> it at
> >>>> >> ISel stage because we must ensure the structure is preserved so
> that
> >>>> >> the
> >>>> >> linker can pattern-match it.
> >>>> >>
> >>>> >> With "intrinsic", do you instead mean an actual function that we
> >>>> >> manufacture
> >>>> >> on the fly in Clang (or provide a declaration for in a header), and
> >>>> >> then
> >>>> >> eliminate / patch up in MC later on?
> >>>> >> That might work but it seems a bit convoluted to me. Also... if we
> >>>> >> manufacture the function, we end up with a declaration in the IR
> that
> >>>> >> doesn't correspond to anything in the source; if we instead have
> the
> >>>> >> function in a header, doesn't that require the user to include the
> >>>> >> header?
> >>>> >> Is that what you're suggesting, or am I getting confused?
> >>>> >
> >>>> > Sorry, you were right. This has to be an instruction, it is too late
> >>>> > to have an intrinsic.
> >>>> >
> >>>> > Your change to add a isCall is probably also correct. The problem
> with
> >>>> > TLS_addr64 is that it has not been upgraded to use the new call
> >>>> > representation with register masks.
> >>>> >
> >>>> > Probably the best example of how this should work is what AArch64
> >>>> > does:
> >>>> >
> >>>> >
> >>>> >
> -----------------------------------------------------------------------
> >>>> > let isCall = 1, Defs = [LR] in
> >>>> > def TLSDESC_BLR
> >>>> > : Pseudo<(outs), (ins GPR64:$dest, i64imm:$sym),
> >>>> > [(AArch64tlsdesc_call GPR64:$dest,
> >>>> > tglobaltlsaddr:$sym)]>;
> >>>> >
> >>>> >
> ------------------------------------------------------------------------
> >>>> >
> >>>> > so it has isCall, but not the explicit Defs list. It is created with
> >>>> >
> >>>> > -------------------------------------
> >>>> > const uint32_t *Mask = ARI->getTLSCallPreservedMask();
> >>>> > ....
> >>>> >
> >>>> > SmallVector<SDValue, 6> Ops;
> >>>> > Ops.push_back(Chain);
> >>>> > Ops.push_back(Func);
> >>>> > Ops.push_back(SymAddr);
> >>>> > Ops.push_back(DAG.getRegister(AArch64::X0, PtrVT));
> >>>> > Ops.push_back(DAG.getRegisterMask(Mask));
> >>>> > Ops.push_back(Glue);
> >>>> >
> >>>> > SDVTList NodeTys = DAG.getVTList(MVT::Other, MVT::Glue);
> >>>> > Chain = DAG.getNode(AArch64ISD::TLSDESC_CALL, DL, NodeTys, Ops);
> >>>> > --------------------------------------
> >>>> >
> >>>> > With this X86FloatingPoint.cpp should not get confused about
> >>>> > __tls_get_addr returning a value in the fp stack.
> >>>> >
> >>>> > Cheers,
> >>>> > Rafael
> >>>
> >>>
> >>
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141111/dd308ecb/attachment.html>
More information about the llvm-commits
mailing list