[llvm-dev] RFC: alloca -- specify rounding factor for allocation (and more)

Mon Aug 31 14:23:19 PDT 2015

Thanks a lot for the response.

Reid Kleckner wrote:
> allocas support alignment, and I believe the backend will sort stack
> objects by alignment to try to pack data into the padding. If I'm wrong,
> this is probably a reasonable feature request. That said, 64-byte
> alignment is larger than the default 16-byte stack alignment available
> on most platforms, so you'll end up using a more expensive prologue. I'd
> recommend reducing the alignment back to 16.

I don't really need alignment, I need size :) I can see how alignment
could trick one of two allocas into being of a proper size, but that
wouldn't really work for me.

But I am pretty sure now that I will do the rounding from clang with a
union. It's just easier. I will round up to a multiple of
5 * sizeof( void *), which should give sufficient room for the
interesting cases without wasting too much stack space.

---
// number of s-sized units needed to hold x bytes (ceiling division)
#define  round_up( x, s)   (((x) + ((s) - 1)) / (s))

// that's what I want to pass
struct param_a_b_c_d
{
    long   a, b, c, d;
};

// that creates the size rounding
union alloc_param_a_b_c_d
{
    struct param_a_b_c_d   param;
    void                   *space[ 5 * round_up( sizeof( struct param_a_b_c_d),
                                                 sizeof( void *[5]))];
};

struct param_a_b_c_d_e_f
{
    long   a, b, c, d, e, f;
};

union alloc_param_a_b_c_d_e_f
{
    struct param_a_b_c_d_e_f  param;
    void                      *space[ 5 * round_up( sizeof( struct param_a_b_c_d_e_f),
                                                    sizeof( void *[5]))];
};

---
becomes

%union.alloc_param_a_b_c_d = type { [5 x i8*] }
%union.alloc_param_a_b_c_d_e_f = type { [10 x i8*] }
---
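
A minimal sketch of how the rounding could be checked at compile time. It
repeats the structs and unions from above; GRANULE and padded_size() are
names made up for this example, and static_assert assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Same macro as in the mail: number of s-sized units needed to hold x bytes. */
#define round_up( x, s)   (((x) + ((s) - 1)) / (s))

/* Assumed name for the rounding unit: room for five pointers. */
#define GRANULE   sizeof( void *[5])

struct param_a_b_c_d      { long a, b, c, d; };
struct param_a_b_c_d_e_f  { long a, b, c, d, e, f; };

union alloc_param_a_b_c_d
{
    struct param_a_b_c_d   param;
    void                   *space[ 5 * round_up( sizeof( struct param_a_b_c_d),
                                                 GRANULE)];
};

union alloc_param_a_b_c_d_e_f
{
    struct param_a_b_c_d_e_f  param;
    void                      *space[ 5 * round_up( sizeof( struct param_a_b_c_d_e_f),
                                                    GRANULE)];
};

/* Hypothetical helper: byte size after padding n bytes to whole granules. */
static size_t   padded_size( size_t n)
{
    return( round_up( n, GRANULE) * GRANULE);
}

/* Each union is a whole number of granules and can hold its struct. */
static_assert( sizeof( union alloc_param_a_b_c_d) % GRANULE == 0,
               "size must be a whole number of granules");
static_assert( sizeof( union alloc_param_a_b_c_d) >=
               sizeof( struct param_a_b_c_d),
               "union must be able to hold the struct");
static_assert( sizeof( union alloc_param_a_b_c_d_e_f) % GRANULE == 0,
               "size must be a whole number of granules");
static_assert( sizeof( union alloc_param_a_b_c_d_e_f) >=
               sizeof( struct param_a_b_c_d_e_f),
               "union must be able to hold the struct");
```

The checks are written against GRANULE rather than fixed byte counts, so
they should not depend on the pointer or long size of the target.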

> ----
>
> LLVM will not normally perform tail call optimization if the call takes
> the address of an alloca. TCO deallocates the frame of the calling
> function and all of its allocas before jumping to the callee.

This is good to know. I certainly want TCO, and this is something else
I have to be aware of. Since the alloca is "passed" in, TCO should be
possible, as there is no need to clean up the stack.
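
To illustrate the difference at the C level, here is a rough sketch with
made-up names (params, callee, caller_with_local, caller_forwarding).
Whether a compiler really emits a tail call for the second caller depends
on the optimization level and the target, so take it as an illustration
of the two shapes only.

```c
#include <assert.h>

struct params
{
    long   a, b, c, d;
};

long   callee( struct params *p)
{
    return( p->a + p->b + p->c + p->d);
}

/*
 * The address of a local escapes into the call. The frame has to stay
 * alive until callee returns, which is the situation where LLVM will
 * normally not do TCO.
 */
long   caller_with_local( long a)
{
    struct params   local = { a, 1, 2, 3 };

    return( callee( &local));
}

/*
 * The storage belongs to whoever called us. Nothing in this frame is
 * referenced by the outgoing call, so the call is at least a candidate
 * for a tail call.
 */
long   caller_forwarding( struct params *incoming)
{
    incoming->a += 1;
    return( callee( incoming));
}

/* Small self-check of the values; it says nothing about the generated code. */
void   check_callers( void)
{
    struct params   p = { 1, 1, 2, 3 };

    assert( caller_with_local( 4) == 10);
    assert( caller_forwarding( &p) == 8);
}
```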

>
> To enable TCO, you would need some new transform to replace uses of a
> local alloca with uses of the incoming parameter pack. You will need
> some way to know when the incoming parameter space is big enough for the
> outgoing call.

I am hoping, but am not sure yet, that I can achieve this with what I wrote in
"Re: [llvm-dev] alloca combining, not (yet) possible ?" Message-ID: 
<55E460F2.9060807 at mulle-kybernetik.com>

>
> -----
>
> It sounds like what you really want is something like 'inalloca':
> http://llvm.org/docs/InAlloca.html I strongly advise that you *don't*
> use it in its current state, though, since we added it for 32-bit MSVC
> compatibility, it doesn't generate fast code, the mid-level IR is less
> analyzable, and it's only supported on x86 currently.
>
> inalloca essentially allows you to manually allocate all of the outgoing
> argument memory yourself, and its address is passed in implicitly as the
> incoming stack pointer.

OK, but where does that leave me :) I don't think I would want to use
"inalloca" if it's not well supported and not yet part of some
optimization scheme that would be beneficial to me.

Ciao
    Nat!