<div dir="ltr">Hi Reid, Rafael.<div><br></div><div>Thanks for the help and the review!</div><div><br></div><div>I have committed revision 221695 with the simplified test that Rafael suggested.</div><div><br></div><div>Cheers,</div><div> Dario Domizioli</div><div> SN Systems - Sony Computer Entertainment Group</div><div><br></div><div><br></div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 11 November 2014 17:52, Rafael Espíndola <span dir="ltr"><<a href="mailto:rafael.espindola@gmail.com" target="_blank">rafael.espindola@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Yes. It is a valid bug fix. It would still be nice to cleanup the x86<br>
implementation, but that is independent.<br>
<br>
My only request for the patch is to reduce the testcase a bit. For<br>
example, you don't need the struct.SomeStruct. You should be able to<br>
test the difference with only<br>
<br>
@x = thread_local global i32 0<br>
define i32 @f() "no-frame-pointer-elim-non-leaf" {<br>
%a = load i32* @x, align 4<br>
ret i32 %a<br>
}<br>
<br>
no?<br>
<br>
LGTM with that.<br>
<div class="HOEnZb"><div class="h5"><br>
<br>
On 11 November 2014 12:19, Reid Kleckner <<a href="mailto:rnk@google.com">rnk@google.com</a>> wrote:<br>
> Works for me, I think that definitely fixes a bug in existing code. Rafael,<br>
> do you think this approach is OK?<br>
><br>
> On Tue, Nov 11, 2014 at 7:49 AM, Dario Domizioli <<a href="mailto:dario.domizioli@gmail.com">dario.domizioli@gmail.com</a>><br>
> wrote:<br>
>><br>
>> Good news! (maybe)<br>
>><br>
>> I was focusing too much on the instruction definition and not enough on<br>
>> looking at how the instruction was getting created...<br>
>> While I inspected the ISel lowering code creating the TLS access pseudo<br>
>> (in X86ISelLowering.cpp) to find out how to do something similar to what<br>
>> AArch64 does, I found a comment in the GetTLSADDR() function basically<br>
>> saying that the code there was responsible for informing the<br>
>> MachineFrameInfo that the function had calls... however it was only setting<br>
>> the "adjustsStack" flag and not the "hasCalls" flag.<br>
>><br>
>> This means that I don't strictly need to change the definition of the TLS<br>
>> access pseudo, and I can fix the PR with just a one-line change in<br>
>> X86ISelLowering.cpp.<br>
>><br>
>> Do you think it's acceptable?<br>
>> I attach the new patch. I also changed the comment in the test to explain<br>
>> the situation.<br>
>><br>
>> Cheers,<br>
>> Dario Domizioli<br>
>> SN Systems - Sony Computer Entertainment Group<br>
>><br>
>><br>
>><br>
>><br>
>><br>
>><br>
>><br>
>><br>
>><br>
>> On 11 November 2014 12:16, Dario Domizioli <<a href="mailto:dario.domizioli@gmail.com">dario.domizioli@gmail.com</a>><br>
>> wrote:<br>
>>><br>
>>> Thanks Rafael!<br>
>>><br>
>>> I'll have a look at the AArch64 version (and the PPC patch) and I'll try<br>
>>> to implement something similar.<br>
>>> I'll post a new patch soon.<br>
>>><br>
>>> Cheers,<br>
>>> Dario Domizioli<br>
>>> SN Systems - Sony Computer Entertainment Group<br>
>>><br>
>>><br>
>>><br>
>>><br>
>>><br>
>>> On 11 November 2014 05:45, Rafael Espíndola <<a href="mailto:rafael.espindola@gmail.com">rafael.espindola@gmail.com</a>><br>
>>> wrote:<br>
>>>><br>
>>>> For a related issue, see <a href="http://reviews.llvm.org/D6209" target="_blank">http://reviews.llvm.org/D6209</a><br>
>>>><br>
>>>> On 10 November 2014 16:02, Rafael Espíndola <<a href="mailto:rafael.espindola@gmail.com">rafael.espindola@gmail.com</a>><br>
>>>> wrote:<br>
>>>> > On 10 November 2014 10:48, Dario Domizioli <<a href="mailto:dario.domizioli@gmail.com">dario.domizioli@gmail.com</a>><br>
>>>> > wrote:<br>
>>>> >> Hi Rafael.<br>
>>>> >><br>
>>>> >> I have been looking at this for a while, but I cannot seem to find a<br>
>>>> >> way of<br>
>>>> >> preserving an LLVM intrinsic after ISel and expanding it later.<br>
>>>> >> Basically, if I add an LLVM intrinsic for TLS access in<br>
>>>> >> IR/IntrinsicsX86.td,<br>
>>>> >> then the backend expects me to provide an instruction selection rule<br>
>>>> >> for<br>
>>>> >> lowering it to MachineInstructions; however as you said we cannot do<br>
>>>> >> it at<br>
>>>> >> ISel stage because we must ensure the structure is preserved so that<br>
>>>> >> the<br>
>>>> >> linker can pattern-match it.<br>
>>>> >><br>
>>>> >> With "intrinsic", do you instead mean an actual function that we<br>
>>>> >> manufacture<br>
>>>> >> on the fly in Clang (or provide a declaration for in a header), and<br>
>>>> >> then<br>
>>>> >> eliminate / patch up in MC later on?<br>
>>>> >> That might work but it seems a bit convoluted to me. Also... if we<br>
>>>> >> manufacture the function, we end up with a declaration in the IR that<br>
>>>> >> doesn't correspond to anything in the source; if we instead have the<br>
>>>> >> function in a header, doesn't that require the user to include the<br>
>>>> >> header?<br>
>>>> >> Is that what you're suggesting, or am I getting confused?<br>
>>>> ><br>
>>>> > Sorry, you were right. This has to be an instruction, it is too late<br>
>>>> > to have an intrinsic.<br>
>>>> ><br>
>>>> > Your change to add a isCall is probably also correct. The problem with<br>
>>>> > TLS_addr64 is that it has not been upgraded to use the new call<br>
>>>> > representation with register masks.<br>
>>>> ><br>
>>>> > Probably the best example of how this should work is what AArch64<br>
>>>> > does:<br>
>>>> ><br>
>>>> ><br>
>>>> > -----------------------------------------------------------------------<br>
>>>> > let isCall = 1, Defs = [LR] in<br>
>>>> > def TLSDESC_BLR<br>
>>>> > : Pseudo<(outs), (ins GPR64:$dest, i64imm:$sym),<br>
>>>> > [(AArch64tlsdesc_call GPR64:$dest,<br>
>>>> > tglobaltlsaddr:$sym)]>;<br>
>>>> ><br>
>>>> > ------------------------------------------------------------------------<br>
>>>> ><br>
>>>> > so it has isCall, but not the explicit Defs list. It is created with<br>
>>>> ><br>
>>>> > -------------------------------------<br>
>>>> > const uint32_t *Mask = ARI->getTLSCallPreservedMask();<br>
>>>> > ....<br>
>>>> ><br>
>>>> > SmallVector<SDValue, 6> Ops;<br>
>>>> > Ops.push_back(Chain);<br>
>>>> > Ops.push_back(Func);<br>
>>>> > Ops.push_back(SymAddr);<br>
>>>> > Ops.push_back(DAG.getRegister(AArch64::X0, PtrVT));<br>
>>>> > Ops.push_back(DAG.getRegisterMask(Mask));<br>
>>>> > Ops.push_back(Glue);<br>
>>>> ><br>
>>>> > SDVTList NodeTys = DAG.getVTList(MVT::Other, MVT::Glue);<br>
>>>> > Chain = DAG.getNode(AArch64ISD::TLSDESC_CALL, DL, NodeTys, Ops);<br>
>>>> > --------------------------------------<br>
>>>> ><br>
>>>> > With this X86FloatingPoint.cpp should not get confused about<br>
>>>> > __tls_get_addr returning a value in the fp stack.<br>
>>>> ><br>
>>>> > Cheers,<br>
>>>> > Rafael<br>
>>><br>
>>><br>
>><br>
><br>
</div></div></blockquote></div><br></div>