<div dir="ltr">Maybe the code emitter will just work, since it already has to detect the register size in order to support hand-written assembly.</div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature" data-smartmail="gmail_signature">~Craig</div></div>
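<div>To make the encoding difference concrete, here is a minimal Python sketch (not LLVM code) that hand-assembles the two loads discussed in this thread for 64-bit mode. The helper name encode_mov_load_eax and the register table are hypothetical and exist only for illustration; the sketch covers only base registers that need neither a REX prefix, a SIB byte, nor a displacement.</div>

```python
# Illustrative only: hand-assembles `mov (base), %eax` for 64-bit mode.
# encode_mov_load_eax is a hypothetical helper, not part of LLVM.

ADDR_SIZE_PREFIX = 0x67  # address-size override: 32-bit addressing in 64-bit mode
MOV_R32_RM32 = 0x8B      # opcode for MOV r32, r/m32

# Register numbers for bases that need no SIB byte and no displacement.
# rsp/rbp and r8-r15 are excluded because they need extra encoding bytes.
SIMPLE_BASES = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "si": 6, "di": 7}
EAX = 0  # destination register number


def encode_mov_load_eax(base: str) -> bytes:
    """Encode `mov (%<base>), %eax`, e.g. base='rcx' or base='ecx'."""
    width, name = base[:1], base[1:]
    if width not in ("r", "e") or name not in SIMPLE_BASES:
        raise ValueError(f"unsupported base register: {base}")
    # ModRM byte: mod=00 (memory operand, no displacement), reg=eax, rm=base.
    modrm = (0b00 << 6) | (EAX << 3) | SIMPLE_BASES[name]
    # A 32-bit base register is selected by prepending the 0x67 prefix.
    prefix = [ADDR_SIZE_PREFIX] if width == "e" else []
    return bytes(prefix + [MOV_R32_RM32, modrm])


print(encode_mov_load_eax("rcx").hex(" "))  # 8b 01
print(encode_mov_load_eax("ecx").hex(" "))  # 67 8b 01
```

<div>The only difference between the two encodings is the leading 0x67 byte, which is why the question comes down to whether the code emitter adds that prefix when it sees a 32-bit base register.</div>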

<br><div class="gmail_quote">On Wed, Aug 2, 2017 at 2:17 PM, Craig Topper <span dir="ltr"><<a href="mailto:craig.topper@gmail.com" target="_blank">craig.topper@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Getting the instruction to actually use (%ecx) as the address requires putting a 0x67 address-size prefix on the instruction. I'm not sure how to convince X86MCCodeEmitter.cpp to do that for you, assuming you want to generate binary and not textual assembly.</div><div class="gmail_extra"><br clear="all"><div><div class="m_4790451667647171290gmail_signature" data-smartmail="gmail_signature">~Craig</div></div>

<br><div class="gmail_quote"><div><div class="h5">On Wed, Aug 2, 2017 at 2:03 PM, Taddeus Kroes via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5"><div lang="en-NL" link="blue" vlink="#954F72"><div class="m_4790451667647171290m_3789710980351367518WordSection1"><p class="MsoNormal"><span lang="EN-US">Hi Eli,<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US">Thanks, I’ll look into that then!<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p><p class="MsoNormal"><span lang="EN-US">Cheers,<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US">Taddeüs<u></u><u></u></span></p><p class="MsoNormal"><u></u> <u></u></p><div style="border:none;border-top:solid #e1e1e1 1.0pt;padding:3.0pt 0cm 0cm 0cm"><p class="MsoNormal" style="border:none;padding:0cm"><b>From: </b><a href="mailto:efriedma@codeaurora.org" target="_blank">Friedman, Eli</a><br><b>Sent: </b>Wednesday, 2 August 2017 19:48<br><b>To: </b><a href="mailto:t.kroes@vu.nl" target="_blank">Taddeus</a>; <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br><b>Subject: </b>Re: [llvm-dev] Efficiently ignoring upper 32 pointer bits when dereferencing</p></div><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">On 8/2/2017 9:03 AM, Taddeus via llvm-dev wrote:</p><p class="MsoNormal">> Hi all,</p><p class="MsoNormal">><u></u> <u></u></p><p class="MsoNormal">> I am experiencing a problem with the representation of addresses in </p><p class="MsoNormal">> the x86_64 TableGen backend and was hoping someone can tell me if it </p><p class="MsoNormal">> is fixable. Any comments or hints to send me in the right direction </p><p class="MsoNormal">> would be greatly appreciated. 
I am using LLVM version 3.8, commit 251286.</p><p class="MsoNormal">><u></u> <u></u></p><p class="MsoNormal">><u></u> <u></u></p><p class="MsoNormal">> I have an IR pass that stores metadata in the upper 32 bits of 64-bit </p><p class="MsoNormal">> pointers in order to implement memory safety. The pass instruments </p><p class="MsoNormal">> loads and stores to do an AND of the address with 0xffffffff to mask </p><p class="MsoNormal">> out that metadata. E.g., when loading a 4-byte value from memory </p><p class="MsoNormal">> pointed to by %rcx, this translates to the following asm:</p><p class="MsoNormal">>     mov    %ecx,%ecx   ; zeroes the upper bits, removing the metadata</p><p class="MsoNormal">>     mov    (%rcx),%eax</p><p class="MsoNormal">><u></u> <u></u></p><p class="MsoNormal">> This leads to considerable overhead (12% on SPEC CPU2006), so I am </p><p class="MsoNormal">> looking into possibilities for backend modifications to optimize this. </p><p class="MsoNormal">> The first mov introduces unnecessary extra cycles and the second mov </p><p class="MsoNormal">> has to wait for its result, potentially stalling the pipeline. On top </p><p class="MsoNormal">> of that, it increases register pressure when the original pointer must </p><p class="MsoNormal">> be preserved for later use (e.g. the mask would be "mov %esi,%ecx" </p><p class="MsoNormal">> after which %rcx is dereferenced, instead of just dereferencing %esi).</p><p class="MsoNormal">><u></u> <u></u></p><p class="MsoNormal">> So, what I would like to generate instead is the following:</p><p class="MsoNormal">>     mov    (%ecx),%eax</p><p class="MsoNormal">> I.e., don't do the masking in a separate mov, but by using a </p><p class="MsoNormal">> subregister for the address (which is zero-extended, effectively </p><p class="MsoNormal">> ignoring the metadata bits). 
As a side note, GCC does emit the second </p><p class="MsoNormal">> snippet as expected.</p><p class="MsoNormal">><u></u> <u></u></p><p class="MsoNormal">><u></u> <u></u></p><p class="MsoNormal">> Looking at the TableGen files I found two problems:</p><p class="MsoNormal">><u></u> <u></u></p><p class="MsoNormal">> 1. The AND of the address with 0xffffffff is replaced with </p><p class="MsoNormal">> SUBREG_TO_REG(MOV32rr (EXTRACT_SUBREG ...)) in </p><p class="MsoNormal">> lib/Target/X86/X86InstrCompiler.td (line 1326). That MOV32rr emits an </p><p class="MsoNormal">> explicit mov instruction later. I think I need to replace this with </p><p class="MsoNormal">> (i32 (EXTRACT_SUBREG ...)) to get rid of the mov, but that produces a </p><p class="MsoNormal">> 32-bit value, which leads me to the next, more general problem:</p><p class="MsoNormal">><u></u> <u></u></p><p class="MsoNormal">> 2. The x86 backend currently does not support dereferencing 32-bit </p><p class="MsoNormal">> addresses in 64-bit mode. Specifically, addresses are defined as an </p><p class="MsoNormal">> iPTR type in X86InstrInfo.td which I assume is expanded to 4 or 8 </p><p class="MsoNormal">> bytes depending on whether 32-bit or 64-bit mode is active:</p><p class="MsoNormal">>     def addr : ComplexPattern<iPTR, 5, "selectAddr", [], </p><p class="MsoNormal">> [SDNPWantParent]>;</p><p class="MsoNormal">> The dereferencing mov instruction looks like this:</p><p class="MsoNormal">>    def MOV32rm : I<0x8B, MRMSrcMem, (outs GR32:$dst), (ins i32mem:$src),</p><p class="MsoNormal">>         "mov{l}\t{$src, $dst|$dst, $src}",</p><p class="MsoNormal">>         [(set GR32:$dst, (loadi32 addr:$src))], IIC_MOV_MEM>, OpSize32;</p><p class="MsoNormal">> So it expects a source address of type 'addr' which is 8 bytes. 
This </p><p class="MsoNormal">> leads to the following code being emitted when I apply my solution to </p><p class="MsoNormal">> problem 1:</p><p class="MsoNormal">>      mov    (%rcx),%eax</p><p class="MsoNormal">> In other words, the upper bits are not ignored.</p><p class="MsoNormal">><u></u> <u></u></p><p class="MsoNormal">><u></u> <u></u></p><p class="MsoNormal">> I am currently not sure what is the best place to solve this problem. </p><p class="MsoNormal">> The best would be to give the 'addr' type a dynamic size but I don't </p><p class="MsoNormal">> know how to do this. Any ideas on this?</p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">A TableGen pattern can only match one specific type; you'll need a </p><p class="MsoNormal">separate pattern to match a 32-bit address.  Yes, this means you'll need </p><p class="MsoNormal">to write your own separate pattern for every load/store instruction, but </p><p class="MsoNormal">there isn't really any way around that.</p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">There are some existing patterns involving MOV32rm, if you want </p><p class="MsoNormal">inspiration; for example, the following pattern is from X86InstrCompiler.td:</p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">def : Pat<(extloadi64i32 addr:$src),</p><p class="MsoNormal">           (SUBREG_TO_REG (i64 0), (MOV32rm addr:$src), sub_32bit)>;</p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">-Eli</p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal">-- </p><p class="MsoNormal">Employee of Qualcomm Innovation Center, Inc.</p><p class="MsoNormal">Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</p><p class="MsoNormal"><u></u> <u></u></p><p class="MsoNormal"><u></u> <u></u></p></div></div><br></div></div>______________________________<wbr>_________________<br>

LLVM Developers mailing list<br>

<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

<br></blockquote></div><br></div>

</blockquote></div><br></div>