[LLVMdev] instruction/intrinsic for segmented addressing

mobi phil mobi at mobiphil.com
Sun Dec 7 07:59:05 PST 2014


>
> The price is paid in different ways, everywhere. All I can say for
> sure is that addressing based on TPIDR is going to be more expensive
> than without. Only benchmarks could quantify it.
>

Well, that will need some experimenting. First I need a 64-bit ARM board (not
sure whether qemu arm64 is good enough for this).


> > wouldn't it make sense to add such an addressing instruction at LLVM IR
> > level? I mean there were no similar requests?
>
> It's not come up before, no. It's not the worst idea I've heard, but
> equally I'm not exactly convinced of the benefit yet. Either way, it
> won't happen unless someone implements it ("patches welcome" as the
> saying goes).
>

I tried to describe at least the projected benefit in my previous emails,
though I have no idea how useful it would be for the whole industry, or how
useful others would find it. Personally I assume that the addressing overhead
is a price worth paying to save a lot of memory, especially in
pointer-intensive applications (is there any class of application that cannot
be called that?). I know that Oracle invested some money in pointer
compression, and JavaScript implementations also have tricks to embed
information in the redundant bits of 64-bit pointers. I am sure there are a
lot of such hacks behind the scenes to reduce the space used by 64-bit
pointers. I already described it in my previous email, but in other words:
such segmented addressing would use only 32-bit pointers (in the extreme case
16-bit!) and the virtual memory area would be split into segments (well, yes,
back to the segmented addressing of x86). With this mechanism in place,
segments could be relocated, saved to and loaded from disk, shared between
processes, migrated to other processes, etc. One could write fast-loading
in-memory database engines on top of it that would suit both 64-bit servers
and phones.
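
To make the relocation argument concrete, here is a minimal sketch in plain
C++, with no compiler or OS support. All names (off32, Segment,
relocation_demo) are hypothetical and only for illustration. Links are stored
as 32-bit offsets from the segment base, so copying the raw bytes to another
address keeps them valid:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>
#include <vector>

// Hypothetical names throughout; this is not an existing LLVM or libc API.
using off32 = std::uint32_t;   // a "short pointer": offset from the segment base
constexpr off32 kNull = 0;     // offset 0 is reserved as the null value

struct Node {
    std::int32_t value;
    off32 next;                // 4 bytes instead of an 8-byte pointer
};

struct Segment {
    std::vector<unsigned char> bytes;
    off32 used;                // bump-allocation cursor

    // Start past offset 0 so that no object ever gets the null offset.
    explicit Segment(std::size_t size) : bytes(size), used(sizeof(Node)) {}

    Node *at(off32 off) { return reinterpret_cast<Node *>(bytes.data() + off); }

    // Allocate a node inside the segment and link it in front of `head`.
    off32 push_front(off32 head, std::int32_t value) {
        assert(used + sizeof(Node) <= bytes.size());
        off32 off = used;
        used += sizeof(Node);
        new (bytes.data() + off) Node{value, head};
        return off;
    }

    // Walk the list using only offsets plus this segment's base.
    std::int32_t sum(off32 head) {
        std::int32_t total = 0;
        for (off32 cur = head; cur != kNull; cur = at(cur)->next)
            total += at(cur)->value;
        return total;
    }
};

int relocation_demo() {
    Segment a(1024);
    off32 head = kNull;
    for (int v = 1; v <= 3; ++v) head = a.push_front(head, v);

    // "Relocate" the segment: copy the raw bytes to a different base address.
    Segment b(1024);
    std::memcpy(b.bytes.data(), a.bytes.data(), a.used);
    b.used = a.used;

    return b.sum(head);        // the same offsets work under the new base
}
```

Calling relocation_demo() from main sums the list through the copied segment.
The same byte copy could just as well go to a file or a shared mapping, which
is the point of the scheme.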


> > Another solution for my problem would be to carry around the segment address
> > as extra function parameter to all functions, but that would be a funny
>
> That's not exactly a terrible idea (I believe GHC might do something
> morally similar). It allows the compiler to spill it if necessary
> unlike reserving a register absolutely (say before a
> performance-critical loop), but its omnipresence probably discourages
> the spilling.
>
> If nothing else, it sounds like a useful way to get yourself up and
> running without backend or OS support.
>

It is just that it would be nicer to write the C++ code (the runtime for my
language) without exposing this detail... Ideally C++ could be extended. Take
this as a utopia for the moment, but imagine introducing a keyword shortp in
front of pointer declarations:

class Class {};

class TheClass {
     shortp char *str;
     shortp Class *cls;
};

Any statement that accesses such pointers would be compiled to "segmented"
addressing. The compiler should forbid allocating such objects on the stack,
though the mechanism could be extended to the stack if another register could
be allocated for stack-segment-based addressing.
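
Without a language extension, the effect of shortp can be approximated in
today's C++ with a wrapper type. The sketch below is only an emulation under
stated assumptions: ShortPtr and g_segment_base are hypothetical names, and
the thread-local base stands in for whatever register would hold the segment
address. Nothing here is an existing clang feature.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical stand-in for the segment base register.
thread_local unsigned char *g_segment_base = nullptr;

template <typename T>
class ShortPtr {
    std::uint32_t off_ = 0;    // offset from g_segment_base; 0 means null
public:
    ShortPtr() = default;
    // The pointee must live inside the current segment (so not on the stack).
    explicit ShortPtr(T *p)
        : off_(p ? static_cast<std::uint32_t>(
                       reinterpret_cast<unsigned char *>(p) - g_segment_base)
                 : 0) {}
    T *get() const {
        return off_ ? reinterpret_cast<T *>(g_segment_base + off_) : nullptr;
    }
    T &operator*() const { return *get(); }
    T *operator->() const { return get(); }
    explicit operator bool() const { return off_ != 0; }
};

struct Class { int id; };

struct TheClass {
    ShortPtr<char> str;
    ShortPtr<Class> cls;
};

static_assert(sizeof(TheClass) == 8,
              "two 32-bit offsets instead of two 64-bit pointers");

int shortp_demo() {
    alignas(8) static unsigned char segment[64];
    g_segment_base = segment;
    // Place objects at offsets 8 and 16; offset 0 stays reserved for null.
    Class *c = new (segment + 8) Class{42};
    TheClass *t = new (segment + 16) TheClass{};
    t->cls = ShortPtr<Class>(c);
    return t->cls->id;         // dereference goes through base + offset
}
```

Every dereference pays one add of the base, which is the overhead discussed
earlier in the thread. A ShortPtr built from a stack address would produce a
meaningless offset, which is why the compiler would have to forbid it.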


> > So I wonder if it is possible in a
> > LLVM pass to track back all pointers in the IR that were initialized with a
> > certain function (factory function) and change the addressing
>
> This is the problem I believe is logically impossible without source
> help, and if you've got that you'd just as well emit different IR to
> begin with.
>

I will study the code emitter in clang in more depth...


> > On x86-64, unless I call some library functions I have the guaranty that
> > nobody would change the values in the gs/fs registers.
>
> You do? I thought both were reserved by Linux. I suppose if you hack
> the kernel and/or libc you could fix them.
>

Well, I am confused. I thought GS was the thread pointer, but it seems that
it is FS, and GS is not affected. I tried the following:


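#include <asm/prctl.h>     /* ARCH_SET_GS */
#include <stdio.h>
#include <sys/syscall.h>   /* SYS_arch_prctl */
#include <unistd.h>        /* syscall */

/* clang, x86-64: address space 256 means GS-relative */
#define GS_RELATIVE __attribute__((address_space(256)))
int GS_RELATIVE *gsr;
int main(){
   int i = 12345;
   /* glibc has no arch_prctl wrapper, so use the raw syscall */
   syscall(SYS_arch_prctl, ARCH_SET_GS, &i);
   gsr = 0;
   printf("our gs relative ... %d\n", *gsr);
}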

Then I created threads and so on, and the value 12345 at gs:0 was printed
correctly, so the kernel does not seem to change the value of gs.
But of course I should clarify this.


> > Is there a way to tell LLVM not to reserve a certain register?
>
> I don't think I follow here. Reserving a register is possible in
> certain limited circumstances (though discouraged, at least by me).
> Unreserving a register isn't, as far as I'm aware.
>

Well, if FS and GS are compromised by the kernel, my thought was to force the
IR to reserve one of the general-purpose registers to hold my segment value
instead of passing it as a parameter to each function. I understand that some
LLVM passes may notice that the value is used often, but I would still have
the overhead of passing a 64-bit pointer on each function call.
But juggling with thread-local storage (__thread), it occurs to me that my
best solution could be to store my segment address as "__thread" data. That
would not use the content of any register (gs/fs) directly, but the generated
machine code would load it into a register whenever I need it and would
optimize the load across several successive references. The question then is
whether there is a way to tell the optimizer: hey, don't worry, nobody will
change this value, so you can consider it valid after function calls as well
and there is no need to reload it from thread-local storage.
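
As a rough sketch of the thread-local variant (names such as tls_segment_base
and sum_offsets are hypothetical): I do not know of a way to promise the
optimizer that the value never changes, so this only shows the manual
workaround of copying the base into a local once per function, which makes
reuse across the loop explicit instead of relying on the compiler to prove it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: the segment base kept in thread-local storage.
thread_local const unsigned char *tls_segment_base = nullptr;

// Sums 32-bit integers located at the given offsets inside the segment.
std::int64_t sum_offsets(const std::uint32_t *offs, int n) {
    // One load from thread-local storage; the local copy cannot be changed
    // by any call made later in this function, so no reload is needed.
    const unsigned char *const base = tls_segment_base;
    std::int64_t total = 0;
    for (int i = 0; i < n; ++i)
        total += *reinterpret_cast<const std::int32_t *>(base + offs[i]);
    return total;
}

int tls_demo() {
    static std::int32_t values[4] = {0, 10, 20, 30};
    tls_segment_base = reinterpret_cast<const unsigned char *>(values);
    const std::uint32_t offs[3] = {4, 8, 12};   // byte offsets of 10, 20, 30
    return static_cast<int>(sum_offsets(offs, 3));
}
```

This still costs one TLS load per function, so it sits somewhere between
passing the base as a parameter everywhere and reserving a register for it.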

thanks a lot for your help
mph