[llvm-dev] [RFC] The future of the va_arg instruction

Thu Aug 17 03:27:23 PDT 2017

On 14 August 2017 at 21:12, Friedman, Eli <efriedma at codeaurora.org> wrote:
> We don't have any optimizations that touch va_arg, as far as I know.  It's
> an instruction mostly because it got added when LLVM was first written, and
> nobody has bothered to try to get rid of it.

I couldn't find any optimisations that directly touch it either, and
it doesn't sound like people are rushing forwards with examples where
generating IR with explicit va_list manipulation results in pessimised
codegen.
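
For anyone who wants a concrete picture of explicit va_list manipulation
in the simple case, here is a minimal C sketch of what a va_arg expansion
amounts to when va_list is a plain pointer. It is not taken from any real
backend or frontend. The names ptr_va_list, va_arg_generic, va_arg_i32
and va_arg_f64 are hypothetical, and the 4-byte slot size and 8-byte
alignment for double are assumptions standing in for whatever a real ABI
specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed argument slot size for a hypothetical 32-bit target. */
#define SLOT_SIZE 4u

/* Hypothetical pointer-style va_list: a cursor into the argument area. */
typedef unsigned char *ptr_va_list;

/* Round p up to a multiple of align (align must be a power of two). */
static unsigned char *align_up(unsigned char *p, size_t align) {
    uintptr_t v = (uintptr_t)p;
    return (unsigned char *)((v + align - 1) & ~(uintptr_t)(align - 1));
}

/* Fetch one argument of the given size and alignment into out. */
static void va_arg_generic(ptr_va_list *ap, void *out,
                           size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    if (align < SLOT_SIZE)
        align = SLOT_SIZE;

    /* 1. Calculate where the value lives. */
    unsigned char *slot = align_up(*ap, align);

    /* 2. Advance the va_list past the slots the value occupies. */
    size_t slots = (size + SLOT_SIZE - 1) / SLOT_SIZE;
    *ap = slot + slots * SLOT_SIZE;

    /* 3. Load the value. */
    memcpy(out, slot, size);
}

static int32_t va_arg_i32(ptr_va_list *ap) {
    int32_t v;
    va_arg_generic(ap, &v, sizeof v, 4);
    return v;
}

static double va_arg_f64(ptr_va_list *ap) {
    double v;
    va_arg_generic(ap, &v, sizeof v, 8); /* assumed 8-byte alignment */
    return v;
}
```

Each fetch is one address computation, one pointer store and one load,
which is why the pointer case adds little noise to the IR.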

>> va_arg really does three things:
>> * Calculates how to load a value of the given type
>> * Increments the appropriate fields in the va_list struct
>> * Loads a value of the given type
>>
>> The problem I see is it's fairly difficult to specialise its behaviour
>> depending on the target. In one of the many previous threads about ABI
>> lowering, I think someone commented that in LLVM it happens both too
>> early and too late (in the frontend, and on the SelectionDAG). This
>> seems to be the case here: to support targets with a more complex
>> va_list struct featuring separate save areas for GPRs and FPRs,
>> splitting a va_arg into multiple operations (one per element of an
>> aggregate) doesn't seem like it could work without heroic gymnastics
>> in the backend.
>>
>> Converting the va_arg instruction to a new GETVAARG SelectionDAG node
>> plus a series of LOADs seems like it may provide a straightforward
>> path to supporting aggregates on targets that use a pointer for
>> va_list. Of course this ends up exposing loads plus offset generation
>> in the SelectionDAG, just hiding the va_list increment behind
>> GETVAARG. For such an approach to work, you must be able to load the
>> given type from a contiguous region of memory, which won't always be
>> true for targets with a more complex va_list struct.
>
>
> Really, IMO, we shouldn't have a va_arg instruction at all, but deprecating
> it is too much work to be worthwhile. :)
>
> If we are going to keep it around, though, we should really do the lowering
> in IR, before we hit SelectionDAG.  Like you explained, it's just a bunch of
> load and store operations, so there isn't any reason to wait, and
> transforming IR is much easier than lowering in SelectionDAG.

I agree. It seems there's an argument that va_arg could be much more
useful in the future, as part of an IR-level ABI lowering. Until that
exists it's perhaps not a big deal either way. I'm CCing Tim Northover
who committed the Clang AArch64 Darwin ABI lowering, and perhaps has a
view on whether there's much value in using va_arg when possible.

va_list manipulation doesn't produce that much noise in the IR when
va_list is just a pointer. I suspect it's noisier when va_list is a
struct, but there's no clear path for expanding va_arg to handle
aggregates for those cases outside of an IR-level transform. I'm also
adding in Will Dietz, who has been involved in previous discussions
around this topic.
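
To illustrate why the struct case is harder, here is a rough C sketch of
fetching an aggregate when va_list has separate GPR and FPR save areas.
The layout is loosely modelled on the x86-64 System V va_list (gp_offset,
fp_offset, overflow_arg_area, reg_save_area) but is heavily simplified.
The names struct_va_list, mixed_pair, va_arg_i64 and va_arg_mixed_pair
are hypothetical, and the classification rule (integer member in a GPR
slot, double member in an FPR slot, whole aggregate on the stack if
either register class is exhausted) is an assumption for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed save-area layout: 6 x 8-byte GPR slots, then 8 x 16-byte FPR slots. */
#define GP_END 48u
#define FP_END 176u

typedef struct {
    unsigned gp_offset;               /* next unread GPR slot in reg_save_area */
    unsigned fp_offset;               /* next unread FPR slot in reg_save_area */
    unsigned char *overflow_arg_area; /* next stack-passed argument */
    unsigned char *reg_save_area;     /* saved GPRs followed by saved FPRs */
} struct_va_list;

/* An aggregate whose members land in different register classes. */
typedef struct { int64_t i; double d; } mixed_pair;

/* Scalar case: a single contiguous 8-byte source. */
static int64_t va_arg_i64(struct_va_list *ap) {
    unsigned char *src;
    int64_t v;
    if (ap->gp_offset < GP_END) {
        src = ap->reg_save_area + ap->gp_offset;
        ap->gp_offset += 8;
    } else {
        src = ap->overflow_arg_area;
        ap->overflow_arg_area += 8;
    }
    memcpy(&v, src, sizeof v);
    return v;
}

/* Aggregate case: the value may have to be gathered from two places. */
static mixed_pair va_arg_mixed_pair(struct_va_list *ap) {
    mixed_pair v;
    if (ap->gp_offset < GP_END && ap->fp_offset < FP_END) {
        /* Not contiguous: one member per save area. */
        memcpy(&v.i, ap->reg_save_area + ap->gp_offset, sizeof v.i);
        memcpy(&v.d, ap->reg_save_area + ap->fp_offset, sizeof v.d);
        ap->gp_offset += 8;
        ap->fp_offset += 16;
    } else {
        /* Either class exhausted: the whole aggregate is on the stack. */
        memcpy(&v, ap->overflow_arg_area, sizeof v);
        ap->overflow_arg_area += sizeof v;
    }
    return v;
}
```

The register path of va_arg_mixed_pair reads from two unrelated offsets,
so a "get pointer, then load" lowering cannot express it without first
copying the pieces into a temporary.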

Best,

Alex