[LLVMdev] instruction/intrinsic for segmented addressing

mobi phil mobi at mobiphil.com
Sat Dec 6 08:48:38 PST 2014


Thanks again for your help!


> >>
> >> Probably fairly minimal in most cases (on x86). On ARM there is
> >> definitely a cost.
> >>
> > hm... why? You cannot have indexed addressing?
> What I need is a way to force
> The code that needs to be emitted is roughly:
>     [..."segment"-offset into x1...]
>     mrs x0, tpidr_el0
>     ldr xD, [x0, x1]
>
> That's a more complex addressing mode and an additional MRS
> instruction over the usual sequence. You also lose the ability to fold
> the actual address-computation into the LDR.
>

But isn't this the price you always pay for RISC vs. x86? It is probably
difficult to quantify, but I wonder whether it would add more than a 5%
slowdown to an average program, especially a long-running server-class
application.


>
> > and now the obvious question: for aarch64, is there an addrspace(256)
> > identical declaration for LLVM?
>
> Nope. That's what I meant by saying there's no direct control over
> these features from LLVM.
>
Wouldn't it make sense to add such an addressing instruction at the LLVM IR
level? Have there been no similar requests? I do not know whether there is
any interest, but it would help in implementing a lot of things, such as
pointer size compression on 64-bit (pointers would be kept as 32-bit),
easier data sharing between processes (mmap with segmented addressing), and
position-independent data (loading and saving chunks of data that contain
pointers while keeping the pointer semantics valid).
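To make the position-independent data point concrete, here is a minimal
sketch, assuming links are stored as 32-bit byte offsets from the start of a
chunk instead of raw pointers. The names Node, load_node, store_node,
sum_list and demo_relocated_sum are hypothetical and only for illustration;
copying the chunk stands in for saving and reloading it, or mapping it at a
different address.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical node type: the link is a 32-bit byte offset from the start
// of the chunk, not a raw 64-bit pointer.
struct Node {
    int32_t value;
    uint32_t next;   // byte offset of the next node; 0 means "no next node"
};

// Resolve an offset against whatever base address the chunk has right now.
// memcpy is used instead of a cast to avoid alignment and aliasing issues.
inline Node load_node(const uint8_t *base, uint32_t offset) {
    Node n;
    std::memcpy(&n, base + offset, sizeof n);
    return n;
}

inline void store_node(uint8_t *base, uint32_t offset, const Node &n) {
    std::memcpy(base + offset, &n, sizeof n);
}

// Walk the list starting at first_offset and sum the values.
inline int sum_list(const uint8_t *base, uint32_t first_offset) {
    int total = 0;
    for (uint32_t off = first_offset; off != 0;) {
        Node n = load_node(base, off);
        total += n.value;
        off = n.next;
    }
    return total;
}

// Build a three-node list in one chunk, copy the chunk to a new address,
// and compute the sum from the copy without fixing up any link.
inline int demo_relocated_sum() {
    std::vector<uint8_t> chunk(64, 0);
    assert(chunk.size() >= 24 + sizeof(Node));
    // Offset 0 is reserved as the null link, so nodes start at offset 8.
    store_node(chunk.data(), 8, Node{1, 16});
    store_node(chunk.data(), 16, Node{2, 24});
    store_node(chunk.data(), 24, Node{3, 0});

    std::vector<uint8_t> moved(chunk);   // same bytes, different base address
    return sum_list(moved.data(), 8);
}
```

The links stay valid after the copy because they never encode the base
address. The cost is that every dereference has to add the base, which is
the part a segmented addressing mode would fold into the load itself.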

Knowing this, it means that my compiler has to generate platform-dependent
assembler code inside the IR, which means I would not be able to run such
code inside the LLVM virtual machine.

Another solution to my problem would be to carry the segment address around
as an extra parameter to every function, but that would be a
funny
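For comparison, the extra-parameter alternative could look roughly like the
sketch below. It assumes a hypothetical Offset32 wrapper and helper names
(load_i32, store_i32, add_two, demo_explicit_base) that are not part of any
existing code; the point is only that the segment base has to be threaded
through every function on the call path.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 32-bit "compact pointer": a byte offset into a segment.
struct Offset32 {
    uint32_t off;
};

// Every function that dereferences takes the segment base explicitly.
inline int32_t load_i32(const uint8_t *segment, Offset32 p) {
    int32_t v;
    std::memcpy(&v, segment + p.off, sizeof v);
    return v;
}

inline void store_i32(uint8_t *segment, Offset32 p, int32_t v) {
    std::memcpy(segment + p.off, &v, sizeof v);
}

// Callers that only forward compact pointers still need the base argument.
inline int32_t add_two(const uint8_t *segment, Offset32 a, Offset32 b) {
    return load_i32(segment, a) + load_i32(segment, b);
}

// Store two values in a small local segment and add them through the
// explicit-base helpers.
inline int32_t demo_explicit_base() {
    uint8_t segment[16] = {};
    assert(sizeof segment >= 2 * sizeof(int32_t));
    store_i32(segment, Offset32{0}, 40);
    store_i32(segment, Offset32{4}, 2);
    return add_two(segment, Offset32{0}, Offset32{4});
}
```

This costs one argument register on every call, where a segment register or
a dedicated address space would keep the base implicit.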


>
> It's a very difficult problem. The main issue is that the stack won't
> be in this special address space (at least not without heavy LLVM
> modifications), so you need a way to distinguish stack accesses from
> heap. Without source annotation that's reducible to the halting
> problem. For example:
>
> int load_address(int *addr) {
>   return *addr;
> }
>
> int evil(int *heap_addr) {
>   int local_var = 42;
>   return load_address(rand() % 2 ? heap_addr : &local_var);
> }
>

> Should the code emitted for load_address use gs or not?
>

The stack should not be in this address space, and this addressing should
not apply to the stack. The framework would make every C++ constructor
private (accessible only to some factory methods declared as friends), so
such objects could not be created on the stack, only on the heap. So I
wonder whether it is possible, in an LLVM pass, to track back all pointers
in the IR that were initialized by a certain function (the factory function)
and change the addressing.

I tried to play with a naive approach.

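#include <cstdint>

uint8_t *global_segment;   // base address of the "segment"
#define ainline __attribute__((always_inline))

// 32-bit offset into global_segment that is used like an A*
template<class A>
   class CompactPointer
   {
      uint32_t adr = 0;   // byte offset from global_segment
      public:
      ainline A *operator->() {
         return reinterpret_cast<A*>(global_segment + adr);
      }
   };

// placeholder types so that the example compiles
struct Object {};
struct OtherObject { CompactPointer<Object> cpo; };

int main() {
   alignas(8) static uint8_t storage[64] = {};   // backing memory for the segment
   global_segment = storage;
   CompactPointer<OtherObject> cpoo;
   CompactPointer<Object> cp = cpoo->cpo;
}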


All such dereferencing statements would contain references to
global_segment in the IR. I could track all of those back (with a custom
LLVM pass) and translate them into a "segmented instruction". Having seen
that addrspace(256) is specific to x86, I could generate custom addressing
code for both x86-64 and AArch64. The problem is that by doing so I would
probably ruin the chance of some code getting optimized in other phases,
whereas if I applied such a pass at a later stage, I might no longer be able
to find the patterns.

On x86-64, unless I call some library functions, I have the guarantee that
nobody will change the values in the gs/fs registers. Is there a way to
tell LLVM not to reserve a certain register?

I wish more attention were given to such a design pattern across all
languages and platforms.

Sorry if what I write is a bit confusing; I am still a bit confused myself,
as I do not yet know LLVM and the platforms themselves well, and I would
like to know what my possibilities are before starting to read hundreds of
pages of documentation.

thanks in advance for the answers,

rgrds,
mobi phil

being mobile, but including technology
http://mobiphil.com