<div dir="ltr">Hi Reid, Rafael.<div><br></div><div>Thanks for the help and the review!</div><div><br></div><div>I have committed revision 221695 with the simplified test that Rafael suggested.</div><div><br></div><div>Cheers,</div><div>    Dario Domizioli</div><div>    SN Systems - Sony Computer Entertainment Group</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 11 November 2014 17:52, Rafael Espíndola <span dir="ltr"><<a href="mailto:rafael.espindola@gmail.com" target="_blank">rafael.espindola@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Yes. It is a valid bug fix. It would still be nice to clean up the x86<br>

implementation, but that is independent.<br>

<br>

My only request for the patch is to reduce the testcase a bit. For<br>

example, you don't need the struct.SomeStruct. You should be able to<br>

test the difference with only<br>

<br>

@x = thread_local global i32 0<br>

define i32 @f() "no-frame-pointer-elim-non-leaf" {<br>

  %a = load i32* @x, align 4<br>

  ret i32 %a<br>

}<br>

<br>

no?<br>

<br>

LGTM with that.<br>

<div class="HOEnZb"><div class="h5"><br>

<br>

On 11 November 2014 12:19, Reid Kleckner <<a href="mailto:rnk@google.com">rnk@google.com</a>> wrote:<br>

> Works for me, I think that definitely fixes a bug in existing code. Rafael,<br>

> do you think this approach is OK?<br>

><br>

> On Tue, Nov 11, 2014 at 7:49 AM, Dario Domizioli <<a href="mailto:dario.domizioli@gmail.com">dario.domizioli@gmail.com</a>><br>

> wrote:<br>

>><br>

>> Good news! (maybe)<br>

>><br>

>> I was focusing too much on the instruction definition and not enough on<br>

>> looking at how the instruction was getting created...<br>

>> While I inspected the ISel lowering code creating the TLS access pseudo<br>

>> (in X86ISelLowering.cpp) to find out how to do something similar to what<br>

>> AArch64 does, I found a comment in the GetTLSADDR() function basically<br>

>> saying that the code there was responsible for informing the<br>

>> MachineFrameInfo that the function had calls... however it was only setting<br>

>> the "adjustsStack" flag and not the "hasCalls" flag.<br>

>><br>

>> This means that I don't strictly need to change the definition of the TLS<br>

>> access pseudo, and I can fix the PR with just a one-line change in<br>

>> X86ISelLowering.cpp.<br>

>><br>

>> Do you think it's acceptable?<br>

>> I attach the new patch. I also changed the comment in the test to explain<br>

>> the situation.<br>

>><br>

>> Cheers,<br>

>>     Dario Domizioli<br>

>>     SN Systems - Sony Computer Entertainment Group<br>

>><br>


>> On 11 November 2014 12:16, Dario Domizioli <<a href="mailto:dario.domizioli@gmail.com">dario.domizioli@gmail.com</a>><br>

>> wrote:<br>

>>><br>

>>> Thanks Rafael!<br>

>>><br>

>>> I'll have a look at the AArch64 version (and the PPC patch) and I'll try<br>

>>> to implement something similar.<br>

>>> I'll post a new patch soon.<br>

>>><br>

>>> Cheers,<br>

>>>     Dario Domizioli<br>

>>>     SN Systems - Sony Computer Entertainment Group<br>

>>><br>


>>> On 11 November 2014 05:45, Rafael Espíndola <<a href="mailto:rafael.espindola@gmail.com">rafael.espindola@gmail.com</a>><br>

>>> wrote:<br>

>>>><br>

>>>> For a related issue, see <a href="http://reviews.llvm.org/D6209" target="_blank">http://reviews.llvm.org/D6209</a><br>

>>>><br>

>>>> On 10 November 2014 16:02, Rafael Espíndola <<a href="mailto:rafael.espindola@gmail.com">rafael.espindola@gmail.com</a>><br>

>>>> wrote:<br>

>>>> > On 10 November 2014 10:48, Dario Domizioli <<a href="mailto:dario.domizioli@gmail.com">dario.domizioli@gmail.com</a>><br>

>>>> > wrote:<br>

>>>> >> Hi Rafael.<br>

>>>> >><br>

>>>> >> I have been looking at this for a while, but I cannot seem to find a way<br>

>>>> >> of preserving an LLVM intrinsic after ISel and expanding it later.<br>

>>>> >> Basically, if I add an LLVM intrinsic for TLS access in<br>

>>>> >> IR/IntrinsicsX86.td, then the backend expects me to provide an<br>

>>>> >> instruction selection rule for lowering it to MachineInstructions;<br>

>>>> >> however, as you said, we cannot do it at the ISel stage because we must<br>

>>>> >> ensure the structure is preserved so that the linker can pattern-match it.<br>

>>>> >><br>

>>>> >> By "intrinsic", do you instead mean an actual function that we<br>

>>>> >> manufacture on the fly in Clang (or provide a declaration for in a<br>

>>>> >> header), and then eliminate / patch up in MC later on?<br>

>>>> >> That might work, but it seems a bit convoluted to me. Also, if we<br>

>>>> >> manufacture the function, we end up with a declaration in the IR that<br>

>>>> >> doesn't correspond to anything in the source; if we instead have the<br>

>>>> >> function in a header, doesn't that require the user to include the<br>

>>>> >> header?<br>

>>>> >> Is that what you're suggesting, or am I getting confused?<br>

>>>> ><br>

>>>> > Sorry, you were right. This has to be an instruction, it is too late<br>

>>>> > to have an intrinsic.<br>

>>>> ><br>

>>>> > Your change to add an isCall is probably also correct. The problem with<br>

>>>> > TLS_addr64 is that it has not been upgraded to use the new call<br>

>>>> > representation with register masks.<br>

>>>> ><br>

>>>> > Probably the best example of how this should work is what AArch64<br>

>>>> > does:<br>

>>>> ><br>

>>>> ><br>

>>>> > -----------------------------------------------------------------------<br>

>>>> > let isCall = 1, Defs = [LR] in<br>

>>>> > def TLSDESC_BLR<br>

>>>> >     : Pseudo<(outs), (ins GPR64:$dest, i64imm:$sym),<br>

>>>> >              [(AArch64tlsdesc_call GPR64:$dest, tglobaltlsaddr:$sym)]>;<br>

>>>> ><br>

>>>> > ------------------------------------------------------------------------<br>

>>>> ><br>

>>>> > so it has isCall, but no long explicit Defs list (only LR). It is created with<br>

>>>> ><br>

>>>> > -------------------------------------<br>

>>>> >   const uint32_t *Mask = ARI->getTLSCallPreservedMask();<br>

>>>> > ....<br>

>>>> ><br>

>>>> >   SmallVector<SDValue, 6> Ops;<br>

>>>> >   Ops.push_back(Chain);<br>

>>>> >   Ops.push_back(Func);<br>

>>>> >   Ops.push_back(SymAddr);<br>

>>>> >   Ops.push_back(DAG.getRegister(AArch64::X0, PtrVT));<br>

>>>> >   Ops.push_back(DAG.getRegisterMask(Mask));<br>

>>>> >   Ops.push_back(Glue);<br>

>>>> ><br>

>>>> >   SDVTList NodeTys = DAG.getVTList(MVT::Other, MVT::Glue);<br>

>>>> >   Chain = DAG.getNode(AArch64ISD::TLSDESC_CALL, DL, NodeTys, Ops);<br>

>>>> > --------------------------------------<br>

>>>> ><br>

>>>> > With this, X86FloatingPoint.cpp should not get confused about<br>

>>>> > __tls_get_addr returning a value in the fp stack.<br>

>>>> ><br>

>>>> > Cheers,<br>

>>>> > Rafael<br>

>>><br>

>>><br>

>><br>

><br>

</div></div></blockquote></div><br></div>
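
The one-line fix described in the thread (also setting the "hasCalls" flag where GetTLSADDR() previously set only "adjustsStack") can be illustrated with a small self-contained C++ sketch. This is not the committed patch: `MockFrameInfo`, `lowerTLSAddrBefore`, `lowerTLSAddrAfter`, and `keepsFramePointer` are hypothetical names, and the frame-pointer rule is an assumed, simplified reading of what the "no-frame-pointer-elim-non-leaf" attribute in the reduced test exercises.

```cpp
// Hypothetical stand-in for the frame info object mentioned in the thread.
// Only the two flags under discussion are modelled.
struct MockFrameInfo {
  bool AdjustsStack = false;
  bool HasCalls = false;

  void setAdjustsStack(bool V) { AdjustsStack = V; }
  void setHasCalls(bool V) { HasCalls = V; }
};

// Models the behaviour described before the fix: the TLS address lowering
// records that the stack is adjusted, but not that the function has calls.
inline void lowerTLSAddrBefore(MockFrameInfo &MFI) {
  MFI.setAdjustsStack(true);
}

// Models the behaviour after the one-line change: both flags are set,
// because the TLS access is eventually emitted as a call.
inline void lowerTLSAddrAfter(MockFrameInfo &MFI) {
  MFI.setAdjustsStack(true);
  MFI.setHasCalls(true);
}

// Assumed, simplified rule: when frame-pointer elimination is disabled only
// for non-leaf functions, the frame pointer is kept if and only if the
// function is known to contain calls.
inline bool keepsFramePointer(const MockFrameInfo &MFI,
                              bool NoFPElimNonLeaf) {
  return NoFPElimNonLeaf && MFI.HasCalls;
}
```

Under these assumptions, a function whose only "call" is the TLS access is treated as a leaf after `lowerTLSAddrBefore`, so `keepsFramePointer(MFI, true)` is false; after `lowerTLSAddrAfter` it is true, which is the difference the reduced test is meant to observe.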