[cfe-dev] Clang/LLVM function ABI lowering

Tue Jun 9 10:24:11 PDT 2020

On Jun 9, 2020, at 2:49 AM, David Chisnall via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> On 04/06/2020 05:54, James Y Knight via cfe-dev wrote:
>> Essentially, I'd like to see the code in Clang responsible for function parameter-type mangling as part of its ABI lowering deleted. Currently, there is a secret "LLVM IR" ABI used between Clang and LLVM, which involves expanding some arguments into multiple arguments, adding a smattering of "inreg" or "byval" attributes, and converting some types into other types. All in a completely target-dependent, complex, and undocumented manner.
> 
> This has been my biggest complaint about LLVM for almost 15 years. We've had long discussions about it and I don't think anyone is actually happy with this situation.  Unfortunately, the path forwards isn't quite so clear cut.

Right, I’m happy to share my perspective.  I think the path here is pretty clear, but I’m not conveying it well.

In short, I think we have to fix the LLVM IR representation first.  Once that is done, fixing the other half is much easier. 

>> So, while the IR function syntax appears at first glance to be generic and target-independent, that's not at all true. Sadly, in some cases, clang must even know how many registers different calling conventions use, and count numbers of available registers left, in order to choose the right set of those "generic" attributes to put on a parameter.
>> So: not only does a frontend need to understand the C ABI rules, they also need to understand that complex dance for how to convert that into LLVM IR -- and that's both completely undocumented, and a huge mess.
> 
> Completely agreed.  It also leads to some subtle gotchas for other front ends.  For example, what is the most efficient way of returning a pair of i32s?  Typically, it's packed in an i64, because most 32-bit back ends will use both of their ABI's return registers for an i64.

This is concerned with the efficiency of generated code when a frontend doesn’t care about a concrete ABI.  Because we have chosen to make unannotated LLVM IR “close” to machine semantics, we get lowerings like this that are implied by C but aren’t the best choice for frontends that don’t care.
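
As a rough illustration of the pair-of-i32s case quoted above, here is a minimal C sketch of the packing a frontend ends up doing by hand.  The helper names (pack_pair, unpack_lo, unpack_hi, check_roundtrip) are made up for this example, and which half lands in which return register is an assumption about the target, not something the sketch can guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack two 32-bit values into one 64-bit value.
   A frontend that does not care about a concrete C ABI might return this
   so that a 32-bit back end uses both of its return registers. */
static uint64_t pack_pair(uint32_t lo, uint32_t hi) {
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}

/* Recover the two halves on the caller side. */
static uint32_t unpack_lo(uint64_t packed) { return (uint32_t)packed; }
static uint32_t unpack_hi(uint64_t packed) { return (uint32_t)(packed >> 32); }

/* Round-trip check: packing then unpacking must give back both inputs. */
static void check_roundtrip(uint32_t lo, uint32_t hi) {
    uint64_t packed = pack_pair(lo, hi);
    assert(unpack_lo(packed) == lo);
    assert(unpack_hi(packed) == hi);
}
```

The point of the sketch is only that the frontend has to know this trick exists; nothing in the IR function signature says that an i64 return is "really" a pair.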

Ignoring compatibility for a second, the right way to fix this problem is to make it so “attribute free” LLVM functions get lowered in a way that is efficient for the target, without caring about ABI compatibility.  This property is also important, because it enables IPO passes to strip ABI attributes without fear of reducing performance (this can’t be guaranteed in all cases of course, but should be generally true in practice).

Now we can’t ignore compatibility with existing IR, but we could introduce a new “nativecall” (bikeshed) target-independent calling convention that provides this.

> This also causes problems for optimisations, because they have to understand the special semantics of sret, the fact that a pair of pointers may be packed into an i64 for return (e.g. i386-unknown-freebsd) but that shouldn't require their alias analysis to treat them as having escaped from the type system, and so on.

This is a different kind of problem: the complexity here is that we have “sret” at all and so transformations need to be aware of it and make sure it is handled correctly.  This is true regardless of how and when it is used - this is inherent complexity introduced to the system that I don’t think we will ever be able to remove.

>> Instead, I believe clang should always pass function parameters in a "naive" fashion. E.g. if a parameter type is "struct X", the llvm function should be lowered to LLVM IR with a function parameter of type %struct.X. The decision on whether to then pass that in a register (or multiple registers), on the stack, padded and then passed on the stack, etc, should be the responsibility of LLVM. Only in the case of C++ types which /must/ be passed indirectly for correctness, independent of calling convention ABI, should clang be explicitly making the decision to pass indirectly.
> 
> C++ is not the only place where this causes problems.  A few off the top of my head:
> 
> - Bitfield layout is target-ABI dependent.
> - Explicitly aligned fields introduce padding.
> - _Atomic-qualified types may be differently aligned than their non-_Atomic variants.
> - Unions need lowering to something else before they can be expressed in LLVM IR at all
> - In some situations, C semantics make structure padding important (e.g. for structure equality comparison), so the decision on whether it needs copying is nontrivial.

Right.  Also, C and C++ are not the only languages with stable ABIs.  Other languages (e.g. Swift, Fortran, ...) also have type systems that are a superset of C and have their own defined ABIs as well.  If you’re looking to generate ABI compatible calls for complex types, then you only have a few choices:

1) You have a union of all the type systems that the world will care about.
2) You punt on this, force things through memory and passed by pointers (so you cannot call something that passes or returns complex types by value).
3) You have an extensible system where, e.g. the C part of the type system is standardized, you provide functionality for high level language authors to desugar their type systems down to C, and provide hooks for them to do custom things where they need to.

To be clear, I’m arguing for #3.

> One of the proposed solutions was to factor this logic out of Clang and expose it as a set of builders, with enough of the type system to be able to handle these cases.

I can’t imagine how this works.  What is “enough” of the type system?  LLVM/Clang are open source projects, you can’t just say “the thing sufficient to cover this one use-case”.  You have to solve for the general case because otherwise features will creep into it over time and you’ll end up with a mess.

>> Of course, the tricky part is that LLVM doesn't -- and shouldn't -- have the full C type system available to it, and the full C type system typically is required to evaluate the ABI rules (e.g., distinguishing a "_Complex float" from a struct containing two floats).
> 
> I think I'd phrase that somewhat differently.  C/C++ ABIs are defined in terms of C types.

Not always.  High level languages can and do have weird things on their own - they aren’t just aggregations of C constructs.  There are lots and lots of examples of this, particularly outside the C family.  Almost all of them need to “speak C” though.

>  Whatever does the lowering *must* have access to the full C type system.  That doesn't mean that the LLVM type system must non-ambiguously, natively represent the entire C type system, only that it must be able to carry that semantic information to the back end somehow.

Agreed - LLVM IR shouldn’t have the C type system.  C itself isn’t a closed system either - people keep adding new things to C over time (thankfully, relatively slowly).  What I’m arguing above is about the “builder” sort of stuff you’re referring to, not LLVM IR.
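
To make the "_Complex float" versus struct-of-two-floats point quoted above concrete, here is a small C11 sketch.  The names (two_floats, ABI_KIND, kind_of_complex, kind_of_struct) are invented for illustration, and the size check is an assumption about common targets rather than something the C standard promises for the struct.

```c
#include <assert.h>

/* A struct that is laid out like _Complex float: two consecutive floats. */
struct two_floats { float re, im; };

/* Assumption: no padding in the struct on the target being compiled for. */
static_assert(sizeof(float _Complex) == sizeof(struct two_floats),
              "expected both types to occupy two floats");

/* The C type system still tells the two apart:
   1 for _Complex float, 2 for struct two_floats, 0 for anything else. */
#define ABI_KIND(x) \
    _Generic((x), float _Complex: 1, struct two_floats: 2, default: 0)

static int kind_of_complex(void) {
    float _Complex z = 0.0f;
    return ABI_KIND(z);
}

static int kind_of_struct(void) {
    struct two_floats s = {0.0f, 0.0f};
    return ABI_KIND(s);
}
```

Both values are just two floats in memory, so a pair-of-floats type alone cannot say which ABI rule applies; that distinction has to be carried down to the back end by some other means.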

> One proposal to remove the implicit contracts with the back ends was to expose explicit register and stack controls in the IR, so that functions would be tagged with attributes so that the front end would handle the lowering to specific registers.
> 
> This approach is an improvement for front end developers (they need to read the ABI docs for targets that they want to support, but they don't need to understand the implicit and undocumented mapping of that into LLVM IR), reduces work for back-end developers (they don't need to know much about calling conventions, the front end explicitly tells them where to put parameters and where to find return values),

I agree, but I don’t see an alternative to this - literally, the only options I see are the hybrid mess we have now (which includes a variety of weird attributes <http://llvm.org/docs/LangRef.html#parameter-attributes> anyway) or to directly add target-specific attributes.  I don’t think that a well considered framework for target specific attributes will make things worse here.

> but imposes extra work on optimisations to preserve this (or not, but to understand when they can change it and how: setting the calling convention to fast_cc is a lot easier than manually tweaking the set of registers that are used for parameters).

I think this is an important problem to solve.  I’d recommend solving it as mentioned above - these sorts of passes can just switch to “native call” or other designated overall calling convention and drop all the attributes (which would include making things like sret explicit).

>> Therefore, in order to communicate the correct ABI information to LLVM, I'd like clang to also emit /explicitly-ABI-specific/ data (metadata?), reflecting the extra information that the ABI rules require the backend to know about the type. E.g., for X86_64, clang needs to inform LLVM of the classification for each parameter's type into MEMORY, INTEGER, SSE, SSEUP, X87, X87UP, COMPLEX_X87. Or, for PPC64 elfv2, Clang needs to inform LLVM when a structure should be treated as a "homogeneous aggregate" of floating-point or vector type. (In both cases, that information cannot correctly be extracted from the LLVM IR struct type, only from the C type system.)
> 
> Metadata can be lost at arbitrary points, so isn't sufficient for this. Function attributes would be adequate, if we only care about calling conventions and not things like structure layout being handled here.

Agreed.  Structure layout is a different problem.  There is no reasonable way to abstract over that in LLVM IR in my opinion.  The scope we can consider is “what registers / stack is a C structure passed/returned in” not what offsets a bitfield or union member lands at.
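
For a sense of what "classification" means in the X86_64 case quoted above, here is a heavily simplified C sketch of per-eightbyte classification.  It is not Clang's implementation: the Field description and the classify function are hypothetical, only the INTEGER, SSE and MEMORY classes are modelled, and it assumes naturally aligned scalar fields with no x87, vector, or non-trivial C++ types.

```c
#include <assert.h>
#include <stddef.h>

/* Subset of the x86-64 SysV argument classes named in the quoted text. */
typedef enum { CLASS_NONE, CLASS_INTEGER, CLASS_SSE, CLASS_MEMORY } ArgClass;

/* Hypothetical flattened description of one scalar field of a struct. */
typedef struct {
    size_t offset;   /* byte offset within the struct */
    size_t size;     /* size in bytes, must be nonzero */
    int is_float;    /* nonzero for float/double, zero for integer/pointer */
} Field;

/* Classify a struct into at most two eightbytes.
   Simplification: anything larger than 16 bytes goes to MEMORY. */
static void classify(const Field *fields, size_t n, size_t struct_size,
                     ArgClass out[2]) {
    out[0] = out[1] = CLASS_NONE;
    if (struct_size > 16) {
        out[0] = out[1] = CLASS_MEMORY;
        return;
    }
    for (size_t i = 0; i < n; i++) {
        size_t eb = fields[i].offset / 8;   /* which eightbyte the field is in */
        assert(eb < 2);
        /* Assumed: a field never straddles an eightbyte boundary. */
        assert((fields[i].offset + fields[i].size - 1) / 8 == eb);
        ArgClass c = fields[i].is_float ? CLASS_SSE : CLASS_INTEGER;
        /* Merge rule: INTEGER wins over SSE within the same eightbyte. */
        if (out[eb] == CLASS_NONE || c == CLASS_INTEGER)
            out[eb] = c;
    }
}
```

With this sketch, a struct of two floats comes out as a single SSE eightbyte, while a struct of an int followed by a float comes out as INTEGER.  The result is the kind of small, per-parameter fact a frontend could attach to a call, leaving the choice of actual registers to the target.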

>> We should document what data is needed, for each architecture/abi. This required data should be as straightforward an application of the ABI document's rules as possible -- and be only the minimum data necessary.
> 
> I think the minimum first step would be documenting what those conventions are currently.

+1 that would be great.

>> If this is done, frontends (either a new one, or Clang itself) who want to use the C ABI have a significantly simpler task. It remains non-trivial -- you do still need to understand ABI-specific rules, and write ABI-specific code to generate ABI-specific metadata. But, at least the interface boundary has become something which is readily-understandable and implementable based on the ABI documents.
> 
> This does seem like it will simplify parts of the clang / LLVM interface.  How will it affect other ABIs?  For example, the Haskell, HiPE, and Swift calling conventions are not defined in terms of C types, will they also need to define that same set of lowering instructions? What would those look like?

If we have target-specific attributes, then they have a choice: they can generate any of the existing LLVM IR stuff (because we can’t remove/break existing things) or they can generate the new attributes.  This would be strictly additive.  In the immediate term, adding target-specific attributes doesn’t make their life any easier; you need the “builder” to make their lives better.

However, the builder only helps them if they only care about C and aggregations of C types, and if they don’t have to handle the weird and thorny parts (because C in its generality is super weird!).

>> All that said, an MLIR encoding of the C type system can still be useful -- it could contain the code which distills the C types into the ABI-specific metadata. But, I  see that as less important than getting the fundamentals in LLVM-IR into a better shape. Even frontends without a C type system representation should still be able to generate LLVM IR which conforms in their own manner to the documented ABIs -- without it being super painful. Also, the code in Clang now is really confusing, and nearly unmaintainable; it would be a clear improvement to be able to eliminate the majority of it, not just move it into an MLIR dialect.
> 
> I am less convinced that the code could be eliminated (equivalent logic would be needed, at least).  I am, however, hugely in favour of moving it closer to the back ends so that someone maintaining a Target doesn't need to also maintain code in Clang's CodeGen layer to do part of the lowering.

I agree that the right path is to get LLVM IR fixed, then figure out how to produce it.

-Chris
