[llvm-dev] Efficiently ignoring upper 32 pointer bits when dereferencing
Taddeus via llvm-dev
llvm-dev at lists.llvm.org
Wed Aug 2 09:03:37 PDT 2017
Hi all,
I am experiencing a problem with the representation of addresses in the
x86_64 TableGen backend and was hoping someone can tell me if it is
fixable. Any comments or hints to send me in the right direction
would be greatly appreciated. I am using LLVM version 3.8, commit
251286.
I have an IR pass that stores metadata in the upper 32 bits of 64-bit
pointers in order to implement memory safety. The pass instruments
loads and stores to do an AND of the address with 0xffffffff to mask
out that metadata. E.g., when loading a 4-byte value from memory
pointed to by %rcx, this translates to the following asm:
mov %ecx,%ecx ; zeroes the upper bits, removing the metadata
mov (%rcx),%eax
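
To make the pointer layout concrete, here is a minimal sketch of the tagging and masking arithmetic described above, done on plain integers so that nothing is actually dereferenced. The function names and the constant are hypothetical and are not taken from the pass itself; the only thing assumed from the text is that the address lives in the lower 32 bits and the metadata in the upper 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: low 32 bits hold the address, high 32 bits hold metadata.
constexpr std::uint64_t kAddressMask = 0xffffffffULL;

// Pack a 32-bit metadata value into the upper half of a pointer-sized integer.
inline std::uint64_t tag_pointer(std::uint64_t address, std::uint32_t metadata) {
    assert((address >> 32) == 0 && "address must fit in the lower 32 bits");
    return (static_cast<std::uint64_t>(metadata) << 32) | address;
}

// The AND with 0xffffffff that is inserted before each load and store.
inline std::uint64_t strip_metadata(std::uint64_t tagged) {
    return tagged & kAddressMask;
}

// The same result written as truncate-then-zero-extend, which is what a
// 32-bit register move such as "mov %ecx,%ecx" computes.
inline std::uint64_t strip_metadata_via_truncation(std::uint64_t tagged) {
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(tagged));
}

// Recover the metadata stored in the upper 32 bits.
inline std::uint32_t metadata_of(std::uint64_t tagged) {
    return static_cast<std::uint32_t>(tagged >> 32);
}
```

Both stripping functions return the same value for every input, which is why the AND can be lowered to a 32-bit register move in the first place.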
This leads to quite some overhead (12% on SPEC CPU2006) so I am looking
into possibilities for backend modifications to optimize this. The
first mov introduces unnecessary extra cycles and the second mov has to
wait for its results, potentially stalling the pipeline. On top of
that, it increases register pressure when the original pointer must be
preserved for later use (e.g. the mask would be "mov %esi,%ecx" after
which %rcx is dereferenced, instead of just dereferencing %esi).
So, what I would like to generate instead is the following:
mov (%ecx),%eax
I.e., don't do the masking in a separate mov, but by using a
subregister for the address (which is zero-extended, effectively
ignoring the metadata bits). As a side note, GCC does emit the second
snippet as expected.
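
As a rough model of why the subregister form drops the metadata, the sketch below compares the two address computations on plain integers. It assumes the usual x86-64 behaviour that a memory operand with a 32-bit base register is computed in 32 bits and then zero-extended to 64 bits; the function names are hypothetical and the displacement parameter is only there to show where the two forms could differ.

```cpp
#include <cstdint>

// Model of "mov disp(%rcx),%eax": the effective address uses all 64 bits
// of the base register, so metadata in the upper half stays in the address.
inline std::uint64_t effective_address_64(std::uint64_t base, std::int32_t disp) {
    return base + static_cast<std::uint64_t>(static_cast<std::int64_t>(disp));
}

// Model of "mov disp(%ecx),%eax": the address is computed in 32 bits
// (wrapping modulo 2^32) and the result is zero-extended to 64 bits.
inline std::uint64_t effective_address_32(std::uint64_t base, std::int32_t disp) {
    std::uint32_t truncated =
        static_cast<std::uint32_t>(base) + static_cast<std::uint32_t>(disp);
    return truncated;
}
```

With a zero displacement this matches the explicit mask exactly. With a nonzero displacement the 32-bit form wraps around at 4 GiB, whereas masking first and then adding in 64 bits does not, so the two are only interchangeable when that wraparound cannot occur.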
Looking at the TableGen files I found two problems:
1. The AND of the address with 0xffffffff is replaced with
SUBREG_TO_REG(MOV32rr (EXTRACT_SUBREG ...)) in
lib/Target/X86/X86InstrCompiler.td (line 1326). That MOV32rr emits an
explicit mov instruction later. I think I need to replace this with
(i32 (EXTRACT_SUBREG ...)) to get rid of the mov, but that produces a
32-bit value, which leads me to the next, more general problem:
2. The x86 backend currently does not support dereferencing 32-bit
addresses in 64-bit mode. Specifically, addresses are defined as an
iPTR type in X86InstrInfo.td, which I assume is expanded to 4 or 8 bytes
depending on whether 32-bit or 64-bit mode is active:
def addr : ComplexPattern<iPTR, 5, "selectAddr", [],
[SDNPWantParent]>;
The dereferencing mov instruction looks like this:
def MOV32rm : I<0x8B, MRMSrcMem, (outs GR32:$dst), (ins i32mem:$src),
"mov{l}\t{$src, $dst|$dst, $src}",
[(set GR32:$dst, (loadi32 addr:$src))], IIC_MOV_MEM>, OpSize32;
So it expects a source address of type 'addr' which is 8 bytes. This
leads to the following code being emitted when I apply my solution to
problem 1:
mov (%rcx),%eax
In other words, the upper bits are not ignored.
I am currently not sure where this problem is best solved. Ideally
the 'addr' type would have a dynamic size, but I don't know how to do
this. Any ideas on this?
Cheers,
Taddeüs