[cfe-dev] [llvm-dev] Clang/LLVM function ABI lowering (was: Re: [RFC] Refactor Clang: move frontend/driver/diagnostics code to LLVM)

James Y Knight via cfe-dev cfe-dev at lists.llvm.org
Thu Jun 4 10:01:03 PDT 2020


On Thu, Jun 4, 2020 at 11:47 AM Eli Friedman <efriedma at quicinc.com> wrote:

> In LLVM, ABI information currently comes from three sources:
>
>    1. The function type
>    2. The calling convention
>    3. Attributes
>
>
> I’m getting the following from your description of what you think needs to
> change:
> 1. ABI attributes shouldn’t be mixed with other attributes; we should have
> some data structure dedicated to ABI information.
> 2. ABI information should be explicitly target-specific: instead of using
> attributes like “inreg” that have target-specific meanings, each
> target-specific ABI attribute should have its own target-specific name.
> 3. We should depend more on explicit ABI information, as opposed to
> depending on each target’s default rules.
>

I think reasonable default rules are likely to remain useful, especially
for basic non-aggregate types. But, yes, all that.
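
As a rough illustration of points 1 and 2 (dedicated, explicitly
target-specific ABI data), here is a minimal sketch in C of what such data
could look like for x86-64, using the class names listed in the quoted message
further down. The names x86_64_class, scalar_kind, abi_param_info and
classify_scalar are hypothetical, invented for this sketch; nothing like
this exists in Clang or LLVM under these names, and only a few scalar types
are handled.

```c
#include <stddef.h>

/* Parameter classes named in the x86-64 System V ABI. */
enum x86_64_class {
    CLASS_MEMORY, CLASS_INTEGER, CLASS_SSE, CLASS_SSEUP,
    CLASS_X87, CLASS_X87UP, CLASS_COMPLEX_X87
};

/* Hypothetical: the handful of scalar C types this sketch understands. */
enum scalar_kind {
    SCALAR_INT, SCALAR_POINTER, SCALAR_FLOAT, SCALAR_DOUBLE, SCALAR_LONG_DOUBLE
};

/* Hypothetical per-parameter record a frontend would emit next to the
 * (ABI-agnostic) IR parameter type: one class per eightbyte. */
struct abi_param_info {
    enum x86_64_class eightbyte[2];
    size_t count; /* number of valid entries in eightbyte[] */
};

struct abi_param_info classify_scalar(enum scalar_kind kind) {
    /* Default: a single INTEGER eightbyte (eightbyte[1] is unused then). */
    struct abi_param_info info = { { CLASS_INTEGER, CLASS_INTEGER }, 1 };
    switch (kind) {
    case SCALAR_INT:
    case SCALAR_POINTER:
        break;
    case SCALAR_FLOAT:
    case SCALAR_DOUBLE:
        info.eightbyte[0] = CLASS_SSE;
        break;
    case SCALAR_LONG_DOUBLE:
        /* Mantissa is X87; exponent plus padding is X87UP. */
        info.eightbyte[0] = CLASS_X87;
        info.eightbyte[1] = CLASS_X87UP;
        info.count = 2;
        break;
    }
    return info;
}
```

Aggregates are where the real work is (and where the C type system is
needed); the sketch leaves them out on purpose.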

> 4. We should document each ABI supported by clang.
> 5. LLVM function types should be fixed to correspond more closely to C
> function types.
>

I'd rephrase 5 as "LLVM function types should be more ABI-agnostic".
Although your bullet fairly reflects what I said in my initial email,
having LLVM function types match C function types is *not* what I meant to
propose. Rather, what I'd like is for the C function type -> IR
function-type mapping to be as independent from the calling convention as
possible.

> I think messing with LLVM function types is a giant sinkhole that would
> destroy any comprehensive proposal, though.  The LLVM type system is not
> the C type system; LLVM structs are not C structs, and LLVM functions are
> not C functions.  And any changes are very high impact: messing with struct
> or function types would impact basically every file in LLVM.
>

I think we do not need to extend the LLVM type system; its existing types
are sufficient to represent what is needed. C unions and structs do not,
and will not, convert 1:1 into LLVM structs. I don't propose to change that.
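
To make the "not 1:1" point concrete, here is a small C sketch built on the
"_Complex float" case from the quoted message below. The two types have the
same size and layout, so a struct-shaped IR type alone cannot say which one
the C source had, yet an ABI may treat them differently. The names
two_floats, make_complex, sum_complex and sum_struct are made up for this
example.

```c
#include <string.h>

/* Hypothetical struct with the same layout as a float _Complex. */
struct two_floats { float re; float im; };

/* C guarantees a complex type is laid out like an array of two reals;
 * on mainstream ABIs the struct has the same size and alignment. */
_Static_assert(sizeof(float _Complex) == sizeof(struct two_floats),
               "same size");
_Static_assert(_Alignof(float _Complex) == _Alignof(struct two_floats),
               "same alignment");

/* Build a complex value from its parts without needing <complex.h>. */
float _Complex make_complex(float re, float im) {
    float parts[2] = { re, im };
    float _Complex z;
    memcpy(&z, parts, sizeof z);
    return z;
}

/* Same bytes in, same result out; only the C-level type differs. */
float sum_complex(float _Complex z) {
    float parts[2];
    memcpy(parts, &z, sizeof parts);
    return parts[0] + parts[1];
}

float sum_struct(struct two_floats s) {
    return s.re + s.im;
}
```

Whether sum_complex and sum_struct receive their argument the same way is
decided by the target's ABI rules, not by the layout, which is why the extra
information has to come from the frontend.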

> From the standpoint of LLVM IR optimizations and lowering, the place we’re
> currently at with function types is actually pretty convenient, mostly,
> even if generating LLVM IR is inconvenient. Making the LLVM IR
> representation closer to the machine, as opposed to the frontend, is good
> for optimization: it’s hard to model the cost of code implicitly generated
> during isel.  And first-class structs/arrays are pretty awful to work with
> in LLVM IR; optimizations strongly prefer working with simple values.
> Really, I think we want to break up arguments more, not less.
>

The IR representation *isn't* close to the machine for function calls today. And,
indeed, this causes some trouble already. For example, we generate
inefficient code for:

void bar(int*);
void foo(int a) { bar(&a); }


If "a" is passed on the stack (e.g. 32-bit x86), we load it from that
parameter slot, allocate a new stack slot, store the value there, then pass
the address of the new stack slot to bar. It's silly. We ought to simply
pass the address where the variable is already being kept.
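
For anyone who wants to look at the generated code, here is a
self-contained version of the example, plus a hand-written C model of the
copy described above. The body of bar, the variable last_seen, and the
function foo_as_lowered_today are additions for illustration only; the
model mirrors the description in this message and is not compiler output.

```c
/* Stand-in for the external bar(); records the value it was handed. */
static int last_seen;
void bar(int *p) { last_seen = *p; }

/* The function from the example. */
void foo(int a) { bar(&a); }

/* Illustrative model of the lowering described above, for a target
 * where `a` arrives in a stack slot: the value is loaded from the
 * incoming slot, stored into a freshly allocated slot, and the address
 * of the new slot is what bar() receives. */
void foo_as_lowered_today(int a_incoming_slot) {
    int a_copy = a_incoming_slot; /* load, then store to a new slot */
    bar(&a_copy);
}
```

Compiling this for 32-bit x86 (for instance with "clang -m32 -O2 -S") and
reading the assembly for foo is one way to check whether the extra copy is
still emitted by a given compiler version.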

And, yes, my proposal is to take this even further -- implementing my
proposal would make this current inefficiency show up much more often than
it does now. Which, yes, means we will need to actually solve the issue! I
don't have an immediate proposal, but I don't see any reason to think it's
unsolvable.

> I agree the way ABI markings are represented in IR is lacking, though, and
> we need ABI-specific documentation for the way the lowering works. Wrapping
> up the current clang code in a friendlier interface only goes so far.
>
>
>
> -Eli
>
>
>
> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of *James Y
> Knight via llvm-dev
> *Sent:* Wednesday, June 3, 2020 9:54 PM
> *To:* Chris Lattner <clattner at nondot.org>
> *Cc:* llvm-dev at lists.llvm.org; cfe-dev <cfe-dev at lists.llvm.org>;
> flang-dev at lists.llvm.org
> *Subject:* [EXT] Re: [llvm-dev] [cfe-dev] Clang/LLVM function ABI
> lowering (was: Re: [RFC] Refactor Clang: move frontend/driver/diagnostics
> code to LLVM)
>
>
>
> While MLIR may be one part of the solution, I think it's also the case
> that the function-ABI interface between Clang and LLVM is just wrong and
> should be fixed -- independently of whether Clang might use MLIR in the
> future.
>
>
>
> I've mentioned this idea before, I think, but never got around to writing
> up a real proposal. And I still haven't. Maybe this email could inspire
> someone else to work on that.
>
>
>
> Essentially, I'd like to see the code in Clang responsible for function
> parameter-type mangling as part of its ABI lowering deleted. Currently,
> there is a secret "LLVM IR" ABI used between Clang and LLVM, which involves
> expanding some arguments into multiple arguments, adding a smattering of
> "inreg" or "byval" attributes, and converting some types into other types.
> All in a completely target-dependent, complex, and undocumented manner.
>
>
>
> So, while the IR function syntax appears at first glance to be generic and
> target-independent, that's not at all true. Sadly, in some cases, clang
> must even know how many registers different calling conventions use, and
> count numbers of available registers left, in order to choose the right set
> of those "generic" attributes to put on a parameter.
>
>
>
> So: not only does a frontend need to understand the C ABI rules, it also
> needs to understand that complex dance for how to convert them into LLVM IR
> -- and that's both completely undocumented, and a huge mess.
>
>
>
> Instead, I believe clang should always pass function parameters in a
> "naive" fashion. E.g. if a parameter type is "struct X", the function
> should be lowered to LLVM IR with a parameter of type %struct.X.
> The decision on whether to then pass that in a register (or multiple
> registers), on the stack, padded and then passed on the stack, etc, should
> be the responsibility of LLVM. Only in the case of C++ types which *must* be
> passed indirectly for correctness, independent of calling convention ABI,
> should clang be explicitly making the decision to pass indirectly.
>
>
>
> Of course, the tricky part is that LLVM doesn't -- and shouldn't -- have
> the full C type system available to it, and the full C type system
> typically is required to evaluate the ABI rules (e.g., distinguishing a
> "_Complex float" from a struct containing two floats).
>
>
>
> Therefore, in order to communicate the correct ABI information to LLVM,
> I'd like clang to also emit *explicitly-ABI-specific* data (metadata?),
> reflecting the extra information that the ABI rules require the backend to
> know about the type. E.g., for X86_64, clang needs to inform LLVM of the
> classification for each parameter's type into MEMORY, INTEGER, SSE, SSEUP,
> X87, X87UP, COMPLEX_X87. Or, for PPC64 ELFv2, Clang needs to inform LLVM
> when a structure should be treated as a "homogeneous aggregate" of
> floating-point or vector type. (In both cases, that information cannot
> correctly be extracted from the LLVM IR struct type, only from the C type
> system.)
>
>
>
> We should document what data is needed, for each architecture/abi. This
> required data should be as straightforward an application of the ABI
> document's rules as possible -- and be only the minimum data necessary.
>
>
>
> If this is done, frontends (either a new one, or Clang itself) that want to
> use the C ABI have a significantly simpler task. It remains non-trivial --
> you do still need to understand ABI-specific rules, and write ABI-specific
> code to generate ABI-specific metadata. But, at least the interface
> boundary has become something which is readily-understandable and
> implementable based on the ABI documents.
>
>
>
> All that said, an MLIR encoding of the C type system can still be useful
> -- it could contain the code which distills the C types into the
> ABI-specific metadata. But, I see that as less important than getting the
> fundamentals in LLVM-IR into a better shape. Even frontends without a C
> type system representation should still be able to generate LLVM IR which
> conforms in their own manner to the documented ABIs -- without it being
> super painful. Also, the code in Clang now is really confusing, and nearly
> unmaintainable; it would be a clear improvement to be able to eliminate the
> majority of it, not just move it into an MLIR dialect.
>
>
>
> On Wed, Jun 3, 2020 at 7:26 PM Chris Lattner via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
> On Jun 2, 2020, at 4:21 PM, comex via cfe-dev <cfe-dev at lists.llvm.org>
> wrote:
>
>
>
> While this is a different area of the codebase, another thing that
> would benefit greatly from being moved out of Clang is function call
> ABI handling.  Currently, that handling is split awkwardly between
> Clang and LLVM proper, forcing frontends that implement C FFI to
> either recreate the Clang parts themselves (like Rust does), depend on
> Clang (like Swift does), or live with FFI just not working with some
> function signatures.  I'm not sure what Flang currently does, but my
> understanding is that Flang does support C FFI, so it would probably
> benefit from this as well.  Just something to consider. :)
>
>
>
> For what it's worth, I think there is a pretty clear path on this, but it
> hinges on Clang moving to MLIR as its code generation backend (an
> intermediary to generating LLVM IR).
>
>
>
> The approach is to factor the ABI lowering part of clang out of Clang itself
> into a specific dialect lowering pass, that works on a generic C type
> system (plus callout to extended type systems).  MLIR has all the infra to
> support this, it is just a massive job to refactor all the things to change
> clang’s architecture.
>
>
>
> I also don’t think there is broad consensus on the direction for Clang
> here, but given that Flang is already using MLIR for this, maybe it would
> make sense to start work there.
>
>
>
> If you’re curious, I co-delivered a talk about this recently, the slides are
> available here
> <https://docs.google.com/presentation/d/11-VjSNNNJoRhPlLxFgvtb909it1WNdxTnQFipryfAPU/edit#slide=id.g7d334b12e5_0_4>
> .
>
>
>
> -Chris
>