[LLVMdev] Is using lots of in-register values in IR bad?

Thu Jul 28 16:06:58 PDT 2011

Erkki Lindpere <villane at gmail.com> writes:

> I want to experiment with avoiding mutable state as far as I can. At
> the moment there are no mutable variables -- only immutable value
> types (numerics, bool, vectors, tuples) and I've been doing everything
> in LLVM registers. The compiler doesn't generate a single alloca, load
> or store at the moment.

Ok.  Do you ever need to grab the address of something on the stack?  If
so you're going to need an alloca.  AFAIK, it's the only way to generate
an address for a local object.  This is by design of the IR and it
greatly simplifies analysis.
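
As a rough source-level illustration of the point above (not from the original thread; `bump` and `bump_local` are made-up names), here is a local whose address escapes to a callee. A front end would typically give `slot` a stack slot, which corresponds to an alloca in LLVM IR, while `x` itself can stay in a register:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callee that takes an address rather than a value. */
static void bump(int *p) {
    assert(p != NULL);
    *p += 1;
}

/* `x` can live in a register, but `slot` has its address taken, so it
   needs a real stack location (an alloca, in LLVM IR terms). */
int bump_local(int x) {
    int slot = x;   /* store the value into the slot */
    bump(&slot);    /* the address escapes to the callee */
    return slot;    /* load the updated value back */
}
```

With these definitions, `bump_local(41)` returns 42.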

How do you handle global data?  That can only be accessed in LLVM IR via
load/store.  A GlobalValue is an address by definition.

> I wonder if it was maybe a bad idea to do it this way? Because a lot
> of stuff in LLVM seems to be only available through pointers. e.g.
> extractvalue takes only constant indices, but GEP can take variables.

Yeah, this is quite a limitation of the current IR.  It is lacking a few
fundamental operations that, for example, vector machines of the '60s
and '70s implemented directly.  Extract/insert from/to variable index
being one of them.  Extractvalue is a little more complicated, of
course, but special cases of it are implemented on x86 (for example) and
other "modernish" targets.

For cases like these, it is best to create a target-specific intrinsic
and use that to represent the operation.  For operations not implemented
directly by the target, an alloca+GEP may be necessary.
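
A minimal sketch of the alloca+GEP fallback, written in C rather than IR and using a hypothetical `Tuple4` type as a stand-in for an aggregate value: the by-value tuple is copied into a stack slot, and the runtime index is applied to the slot's address. The comments name the IR operation each line roughly corresponds to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 4-element tuple, standing in for an LLVM aggregate value. */
typedef struct { int e[4]; } Tuple4;

/* Read element `i` of a by-value tuple when `i` is only known at run time. */
int tuple_get(Tuple4 t, size_t i) {
    assert(i < 4);              /* the caller must pass an in-range index */
    Tuple4 slot = t;            /* alloca + store: spill the value to the stack */
    const int *p = &slot.e[i];  /* GEP with a runtime index */
    return *p;                  /* load */
}
```

This concerns aggregates accessed with extractvalue.  For vector types, as far as I know, LLVM's extractelement and insertelement instructions do accept a runtime index.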

> Some things seem to be possible only by bitcasting pointers, e.g.
> splitting a Vector into equal-sized parts to partially compute the sum
> of its elements with SIMD instructions...

That doesn't seem like the Right Way to do it.  As in the extractvalue
case, the IR has no direct support for vector reductions.  If your
target has these kinds of operations, you should probably use an
intrinsic to implement them.
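
For reference, the computation being discussed has roughly the following shape.  This is only a scalar C sketch under the assumption of an 8-lane integer vector (`sum8` and `LANES` are made-up names): each step adds the upper half of the remaining lanes to the lower half.  In IR the halves could presumably be formed from the register value with shufflevector rather than by bitcasting a pointer, but whether that or an intrinsic gives better code depends on the target.

```c
#include <assert.h>
#include <stddef.h>

enum { LANES = 8 };

/* Sum 8 lanes in three halving steps: 8 -> 4 -> 2 -> 1. */
int sum8(const int v[LANES]) {
    assert(v != NULL);
    int half[4];
    int quarter[2];
    for (int i = 0; i < 4; ++i)
        half[i] = v[i] + v[i + 4];            /* lower half + upper half */
    for (int i = 0; i < 2; ++i)
        quarter[i] = half[i] + half[i + 2];   /* halve again */
    return quarter[0] + quarter[1];           /* final scalar add */
}
```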

Think of target intrinsics as a way to extend the IR for special
operations.  The analysis and transformation passes won't understand
them but typically in these cases you "know" the right sequence to
generate.

> And there may of course be some penalty for passing large(-ish)
> structures by-value. I haven't investigated at which sizes that
> becomes worse than passing pointers.

It is highly target-dependent.  But usually the target's ABI has already
made that decision for you.  In the case of pass-by-address you will
need an alloca.

> Maybe a better alternative would be to allocate memory for every local
> value, and let the mem2reg pass optimize?

That is often simpler.  Then the translation of every object from your
high-level language to LLVM IR looks the same.  But it is not strictly
necessary.
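
To make the two styles concrete, here is a small C analogy (the function names are made up).  The first version treats `r` as a memory slot that is overwritten on some paths, which is what "alloca for every local" looks like.  The second gives every intermediate result its own name, which is approximately what is left after mem2reg promotes the slot to registers:

```c
#include <assert.h>

/* "Memory form": `r` is one slot that is conditionally overwritten. */
int clamp_mem(int x, int lo, int hi) {
    assert(lo <= hi);
    int r;
    r = x;               /* store */
    if (r < lo) r = lo;  /* conditional store */
    if (r > hi) r = hi;  /* conditional store */
    return r;            /* load */
}

/* "Register form": each value is assigned exactly once; the merges
   that would be phi nodes (or selects) show up as ?: expressions. */
int clamp_reg(int x, int lo, int hi) {
    assert(lo <= hi);
    const int r0 = x;
    const int r1 = (r0 < lo) ? lo : r0;
    const int r2 = (r1 > hi) ? hi : r1;
    return r2;
}
```

Both functions compute the same result; the difference is only in which form the front end has to produce.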

> I hope these kinds of questions are appropriate for this list.

Absolutely.  Welcome!

                            -Dave