[llvm-dev] Efficiently ignoring upper 32 pointer bits when dereferencing
Taddeus Kroes via llvm-dev
llvm-dev at lists.llvm.org
Wed Aug 2 15:38:18 PDT 2017
Good point. Maybe the prefix can be specified next to the opcode in the pattern in X86InstrInfo.td?
Cheers,
Taddeüs
From: Craig Topper
Sent: Wednesday, 2 August 2017 23:22
To: Taddeus Kroes
Cc: Friedman, Eli; llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Efficiently ignoring upper 32 pointer bits when dereferencing
Maybe the code emitter will just work: it has to detect the register size anyway, since we need to support hand-written assembly.
~Craig
On Wed, Aug 2, 2017 at 2:17 PM, Craig Topper <craig.topper at gmail.com> wrote:
Getting the instruction to actually use (%ecx) as the address requires putting a 0x67 (address-size override) prefix on the instruction. I'm not sure how to convince X86MCCodeEmitter.cpp to do that for you, assuming you want to generate binary rather than textual assembly.
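For illustration, the byte-level difference the prefix makes can be sketched as below. This is a minimal C++ sketch and not LLVM code: `encodeLoad32` is a hypothetical helper, and it assumes the standard x86 register numbers (eax = 0, ecx = 1) and only covers the simple register-indirect case without REX, SIB, or displacement.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of LLVM): encode "mov (%base), %dst" with a
// 32-bit destination in 64-bit mode. Only the eight legacy registers are
// handled, and bases 4 and 5 are rejected because ModRM.rm = 4 selects a SIB
// byte and ModRM.rm = 5 with mod = 00 selects RIP-relative addressing.
std::vector<std::uint8_t> encodeLoad32(unsigned dst, unsigned base, bool addr32) {
    assert(dst < 8 && base < 8 && base != 4 && base != 5);
    std::vector<std::uint8_t> bytes;
    if (addr32)
        bytes.push_back(0x67);  // address-size override: use the 32-bit base register
    bytes.push_back(0x8B);      // opcode: MOV r32, r/m32
    // ModRM with mod = 00 (register-indirect), reg = dst, rm = base
    bytes.push_back(static_cast<std::uint8_t>((dst << 3) | base));
    return bytes;
}
```

With these assumptions, `encodeLoad32(0, 1, false)` yields `8B 01` for `mov (%rcx),%eax`, and `encodeLoad32(0, 1, true)` yields `67 8B 01` for `mov (%ecx),%eax`. The two encodings differ only in the leading prefix byte.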
~Craig
On Wed, Aug 2, 2017 at 2:03 PM, Taddeus Kroes via llvm-dev <llvm-dev at lists.llvm.org> wrote:
Hi Eli,
Thanks, I’ll look into that then!
Cheers,
Taddeüs
From: Friedman, Eli
Sent: Wednesday, 2 August 2017 19:48
To: Taddeus; llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Efficiently ignoring upper 32 pointer bits when dereferencing
On 8/2/2017 9:03 AM, Taddeus via llvm-dev wrote:
> Hi all,
>
> I am experiencing a problem with the representation of addresses in
> the x86_64 TableGen backend and was hoping someone could tell me whether
> it is fixable. Any comments or hints to send me in the right direction
> would be greatly appreciated. I am using LLVM version 3.8, commit 251286.
>
>
> I have an IR pass that stores metadata in the upper 32 bits of 64-bit
> pointers in order to implement memory safety. The pass instruments
> loads and stores to do an AND of the address with 0xffffffff to mask
> out that metadata. E.g., when loading a 4-byte value from memory
> pointed to by %rcx, this translates to the following asm:
> mov %ecx,%ecx ; zeroes the upper bits, removing the metadata
> mov (%rcx),%eax
>
> This leads to considerable overhead (12% on SPEC CPU2006), so I am
> looking into possible backend modifications to optimize this.
> The first mov costs extra cycles, and the second mov
> has to wait for its result, potentially stalling the pipeline. On top
> of that, it increases register pressure when the original pointer must
> be preserved for later use (e.g. the mask would be "mov %esi,%ecx"
> after which %rcx is dereferenced, instead of just dereferencing %esi).
>
> So, what I would like to generate instead is the following:
> mov (%ecx),%eax
> I.e., don't do the masking in a separate mov, but by using a
> subregister for the address (which is zero-extended, effectively
> ignoring the metadata bits). As a side note, GCC does emit the second
> snippet as expected.
>
>
> Looking at the TableGen files I found two problems:
>
> 1. The AND of the address with 0xffffffff is replaced with
> SUBREG_TO_REG(MOV32rr (EXTRACT_SUBREG ...)) in
> lib/Target/X86/X86InstrCompiler.td (line 1326). That MOV32rr emits an
> explicit mov instruction later. I think I need to replace this with
> (i32 (EXTRACT_SUBREG ...)) to get rid of the mov, but that produces a
> 32-bit value, which leads me to the next, more general problem:
>
> 2. The x86 backend currently does not support dereferencing 32-bit
> addresses in 64-bit mode. Specifically, addresses are defined as an
> iPTR type in X86InstrInfo.td, which I assume is expanded to 4 or 8
> bytes depending on whether 32-bit or 64-bit mode is active:
> def addr : ComplexPattern<iPTR, 5, "selectAddr", [],
> [SDNPWantParent]>;
> The dereferencing mov instruction looks like this:
> def MOV32rm : I<0x8B, MRMSrcMem, (outs GR32:$dst), (ins i32mem:$src),
> "mov{l}\t{$src, $dst|$dst, $src}",
> [(set GR32:$dst, (loadi32 addr:$src))], IIC_MOV_MEM>, OpSize32;
> So it expects a source address of type 'addr' which is 8 bytes. This
> leads to the following code being emitted when I apply my solution to
> problem 1:
> mov (%rcx),%eax
> In other words, the upper bits are not ignored.
>
>
> I am currently not sure where best to solve this problem.
> Ideally the 'addr' type would have a dynamic size, but I don't
> know how to do this. Any ideas on this?
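For reference, the masking described in the quoted message can be sketched in C++ as below. This is only an illustration under assumptions: the helper names are hypothetical, and the layout (metadata in bits 63..32, address in bits 31..0) is taken from the description above rather than from the actual pass. It shows that the AND with 0xffffffff is the same operation as truncating to 32 bits and zero-extending, which is what a 32-bit mov or a 32-bit address register performs.

```cpp
#include <cstdint>

// Hypothetical layout: metadata in the upper 32 bits, address in the lower 32.
constexpr std::uint64_t tagPointer(std::uint32_t addr, std::uint32_t meta) {
    return (static_cast<std::uint64_t>(meta) << 32) | addr;
}

// What the instrumentation computes in IR: AND with 0xffffffff.
constexpr std::uint64_t maskWithAnd(std::uint64_t tagged) {
    return tagged & 0xffffffffULL;
}

// What "mov %ecx,%ecx" computes: keep the low 32 bits and zero-extend them.
constexpr std::uint64_t maskWithZeroExtend(std::uint64_t tagged) {
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(tagged));
}

// Both forms strip the metadata and leave the same address.
static_assert(maskWithAnd(tagPointer(0x00401000u, 0xdeadbeefu)) == 0x00401000u,
              "AND removes the metadata bits");
static_assert(maskWithZeroExtend(tagPointer(0x00401000u, 0xdeadbeefu)) == 0x00401000u,
              "zero-extension removes the metadata bits");
```

Because the two functions agree for every input, folding the mask into the addressing mode would not change which address is loaded, only how many instructions are needed to form it.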
A TableGen pattern can only match one specific type; you'll need a
separate pattern to match a 32-bit address. Yes, this means you'll need
to write your own separate pattern for every load/store instruction, but
there isn't really any way around that.
There are some existing patterns involving MOV32rm, if you want
inspiration; for example, the following pattern is from X86InstrCompiler.td:
def : Pat<(extloadi64i32 addr:$src),
(SUBREG_TO_REG (i64 0), (MOV32rm addr:$src), sub_32bit)>;
-Eli
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project