[llvm-dev] [RFC] The future of the va_arg instruction

Mon Aug 14 02:26:09 PDT 2017

On 9 August 2017 at 19:38, Friedman, Eli <efriedma at codeaurora.org> wrote:
> On 8/9/2017 9:11 AM, Alex Bradbury via llvm-dev wrote:
>>
>> Option 3: Teach va_arg to handle aggregates
>>    * In this option, va_arg might reasonably be expected to handle a struct,
>>    but would not be expected to have detailed ABI-specific knowledge. e.g. it
>>    won't automagically know whether a value of a certain size/type is passed
>>    indirectly or not. In a sense, this would put support for aggregates passed
>>    as varargs on par with aggregates passed in named arguments.
>>    * Casting would be necessary in the same cases casting is required for
>>    named args
>>    * Support for aggregates could be implemented via a new module-level pass,
>>    much like PNaCl.
>>    * Alternatively, the conversion from the va_arg instruction to
>>    SelectionDAG could be modified. It might be desirable to convert the vaarg
>>    instruction to a number of loads and a new node that is responsible only for
>>    manipulating the va_list struct.
>
>
> We could automatically split va_arg on an LLVM struct type into a series of
> va_arg calls for each of the elements of the struct.  Not sure that actually
> helps anyone much, though.
>
> Anything more requires full type information, which isn't currently encoded
> into IR; for example, on x86-64, to properly lower va_arg on a struct, you
> need to figure out whether the struct would be passed in integer registers,
> floating-point registers, or memory.

I've been thinking more about this. Firstly, if anyone has insight
into any cases where the va_arg instruction actually provides better
optimisation opportunities, please do share. The va_arg IR instruction
has been supported in LLVM for over a decade, but Clang doesn't
generate it for the vast majority of the "top tier" targets. I'm
trying to determine if it just needs more love, or if perhaps it
wasn't really the right thing to express at the IR level. Is the main
motivation of va_arg to allow such argument access to be specified
concisely in IR, or is there a particular way it makes life easier for
optimisations or analysis (and if so, which ones and at which point in
compilation)?

va_arg really does three things:
* Calculates how to load a value of the given type
* Increments the appropriate fields in the va_list struct
* Loads a value of the given type
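
To make those three steps concrete, here is a minimal sketch in C for
the simplest case, a va_list that is just a cursor into a contiguous
argument area. The names ptr_va_list, sketch_align_up and sketch_va_arg
are made up for illustration, and the sketch assumes natural alignment
with no rounding of slots up to the register width, which real ABIs
often require.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pointer-style va_list: a base address plus a cursor. */
typedef struct {
    const unsigned char *base; /* start of the variadic argument area */
    size_t offset;             /* byte offset of the next unread argument */
} ptr_va_list;

/* Round off up to the next multiple of align (align must be a power of two). */
static size_t sketch_align_up(size_t off, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (off + align - 1) & ~(align - 1);
}

/* The three things va_arg does, in order (assumption: simple cursor ABI). */
static void sketch_va_arg(ptr_va_list *ap, void *out, size_t size, size_t align) {
    size_t at = sketch_align_up(ap->offset, align); /* 1. work out where to load from */
    ap->offset = at + size;                         /* 2. update the va_list state */
    memcpy(out, ap->base + at, size);               /* 3. load the value */
}
```

Even in this easy case, steps 1 and 2 depend on the size and alignment
rules of the target, which is where the ABI knowledge creeps in.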

The problem I see is that it's fairly difficult to specialise its
behaviour depending on the target. In one of the many previous threads
about ABI lowering, I think someone commented that in LLVM it happens
both too early and too late (in the frontend, and on the SelectionDAG).
This seems to be the case here: for targets with a more complex
va_list struct featuring separate save areas for GPRs and FPRs,
splitting a va_arg into multiple operations (one per element of an
aggregate) doesn't seem like it could work without heroic gymnastics
in the backend.
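
As a rough illustration of why per-element splitting is awkward, here
is a sketch in C modelled on the x86-64 System V va_list (the target
Eli mentions above). The four field names follow that ABI; the names
split_va_list, sketch_pair and sketch_va_arg_pair, and the simulated
save area, are assumptions for illustration only. The point is that an
{i64, double} aggregate is fetched either from both register save
areas or entirely from the overflow area, so the decision has to be
made for the aggregate as a whole.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simulated va_list with separate GPR and FPR save areas. */
typedef struct {
    unsigned gp_offset;                     /* next unread GPR slot */
    unsigned fp_offset;                     /* next unread FPR slot */
    const unsigned char *overflow_arg_area; /* next stack-passed argument */
    const unsigned char *reg_save_area;     /* 6 x 8-byte GPRs, then 8 x 16-byte FPRs */
} split_va_list;

enum { SKETCH_GP_END = 48, SKETCH_FP_END = 176 };

typedef struct { int64_t i; double d; } sketch_pair;

/* Fetch an {i64, double} aggregate: all in registers, or all in memory. */
static sketch_pair sketch_va_arg_pair(split_va_list *ap) {
    sketch_pair out;
    if (ap->gp_offset + 8 <= SKETCH_GP_END && ap->fp_offset + 16 <= SKETCH_FP_END) {
        /* The two elements live in different, non-adjacent save areas. */
        memcpy(&out.i, ap->reg_save_area + ap->gp_offset, sizeof out.i);
        memcpy(&out.d, ap->reg_save_area + ap->fp_offset, sizeof out.d);
        ap->gp_offset += 8;
        ap->fp_offset += 16;
    } else {
        /* Either register class exhausted: the whole aggregate is in memory. */
        memcpy(&out, ap->overflow_arg_area, sizeof out);
        ap->overflow_arg_area += sizeof out;
    }
    return out;
}
```

Two independent va_args, one for the i64 and one for the double, would
get the second case wrong whenever only one register class has run out.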

Converting the va_arg instruction to a new GETVAARG SelectionDAG node
plus a series of LOADs seems like it may provide a straightforward
path to supporting aggregates on targets that use a pointer for
va_list. Of course this ends up exposing loads plus offset generation
in the SelectionDAG, just hiding the va_list increment behind
GETVAARG. For such an approach to work, you must be able to load the
given type from a contiguous region of memory, which won't always be
true for targets with a more complex va_list struct.
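
Modelled in C rather than on the SelectionDAG, the split might look
roughly like the sketch below. sketch_getvaarg is only a stand-in for
the proposed GETVAARG node (it is not an existing LLVM API), and
cursor_va_list, sketch_agg and sketch_load_agg are likewise made up.
The sketch assumes a pointer-style va_list and an argument area that
is already suitably aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pointer-style va_list: just a cursor. */
typedef struct { const unsigned char *next; } cursor_va_list;

/* Stand-in for GETVAARG: return the address of the next argument and
 * advance the va_list. It performs no load itself. */
static const unsigned char *sketch_getvaarg(cursor_va_list *ap, size_t size) {
    assert(size > 0);
    const unsigned char *addr = ap->next;
    ap->next = addr + size;
    return addr;
}

typedef struct { int32_t tag; double value; } sketch_agg;

/* The aggregate is read with one plain load per element, each at a
 * fixed offset from the address handed back by sketch_getvaarg. */
static sketch_agg sketch_load_agg(cursor_va_list *ap) {
    const unsigned char *addr = sketch_getvaarg(ap, sizeof(sketch_agg));
    sketch_agg out;
    memcpy(&out.tag, addr + offsetof(sketch_agg, tag), sizeof out.tag);
    memcpy(&out.value, addr + offsetof(sketch_agg, value), sizeof out.value);
    return out;
}
```

The fixed offsets from a single address are exactly the contiguity
assumption that breaks down for the more complex va_list layouts.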

Best,

Alex