[LLVMdev] GCC and LLVM collaboration: Cauldron's feedback

Mon Jul 28 03:21:51 PDT 2014

Folks,

As most of you know, I held a BoF at the Cauldron where we discussed
the interactions between LLVM and GCC, and the overall response was
very positive. Here are some of the highlights, complementing Alex's
great summary on the blog.

1. Driver
=======

This was probably the most contentious issue. Compiler drivers are
complex pieces of spaghetti code and each target's implementation ends
up being done differently. Legacy behaviour on old cores vs. new cores
on the same target is likely to stay as it is, but new cores and
architectures should start clean, with their behaviour well discussed.

AArch64 is a good example of a whole new architecture in need of a
common description, and it seems that both GCC and LLVM have the same
goals, but I don't see much collaboration on that side. I may be
wrong, though.

People agreed that triples mean nothing and create huge confusion. The
main reason is, as usual, they were chosen at random by implementers
(distros, archs, manufacturers) and now they're much more "names" than
accurate descriptions. GCC is "just following" the norm, but for them,
it's part of their build process, so it's a lot harder to change it.
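
To make the "names, not descriptions" point a bit more concrete, here
is a small sketch that naively splits a few triples on "-". The triples
are just examples of spellings seen in the wild, not an authoritative
list, and split_triple is a hypothetical helper, not something either
compiler provides.

```python
# Illustrative only: split_triple is a made-up helper, and the example
# triples are common spellings, not a complete or official list.
def split_triple(triple):
    """Naively split a triple on '-' and return its fields."""
    return triple.split("-")


examples = [
    "arm-linux-gnueabihf",
    "armv7l-unknown-linux-gnueabihf",
    "x86_64-pc-linux-gnu",
    "aarch64-linux-gnu",
]

for t in examples:
    fields = split_triple(t)
    # The field count varies, so "the second field" is sometimes a
    # vendor and sometimes an OS: position alone tells you nothing.
    print(f"{t}: {len(fields)} fields -> {fields}")
```

The same position holds a vendor in one triple and an OS in another,
which is why any tool that wants to *interpret* a triple ends up with a
pile of special cases.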

A unified driver in GCC, where users could have multiple back-ends
(like Clang/LLVM), is probably going to take a long time and a lot of
effort, and most developers are sceptical as to the feasibility of such
a project, but the majority would be *very* happy if that happened.

The overall feeling during the BoF was that people were unwilling to
work it out, but afterwards a number of developers were already
cooking a few ideas as to how to do it in an iterative way.

Others were even considering a Python wrapper for both GCC and LLVM to
keep the interface common. As weird as it sounds, that was actually an
idea that was floated at the US Connect (last November) to work around
Android's build system mess, so it might not be entirely without
merit.
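
A very rough sketch of what such a wrapper might look like, assuming
nothing more than "one front door, two compilers behind it". BACKENDS
and build_command are hypothetical names, and only options that both
gcc and clang already accept (-O<n>, -c, -o) are used. Nothing like
this was actually designed at the BoF.

```python
# Hypothetical sketch: BACKENDS and build_command are made-up names.
# The wrapper only builds the command line; it does not run anything.
BACKENDS = {
    "gcc": "gcc",
    "llvm": "clang",
}


def build_command(backend, source, output, opt_level="2"):
    """Return the argv list for compiling one C file to an object file."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend}")
    return [BACKENDS[backend], f"-O{opt_level}", "-c", source, "-o", output]


for backend in ("gcc", "llvm"):
    print(" ".join(build_command(backend, "foo.c", "foo.o")))
```

The interesting (and hard) part is everything this leaves out: target
selection, defaults and the options that the two compilers spell or
interpret differently.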

2. Standard additions
================

We were pretty much in sync on how discussions should be done when
adding new extensions to the languages, ABIs, macros, asm aliases,
etc. I also think people were comfortable with the idea that
discussing them was not the same as abiding by them.

So, language changes and extensions should be:
 * written "in standard's language", covering every corner case and
delineating undefined behaviour, etc.
 * submitted to either the language standard or a local committee (GCC
has its own, we could use the same for now [1])

If the extension is accepted as a draft by the language standard, then
the likelihood that it'll be accepted in some future version is
higher. But more than that, it means the idea has merit and people
that know the language well think it could be a feasible
implementation and not clash with the current standard (which is the
most important part).

If not, a local committee could still define it as a common
non-standard extension, but still requiring standard's language so
that implementations know *how* to do it, not just have a vague idea.
GCC docs are fairly specific for most of its extensions, but some are
seriously lacking, and GCC itself sometimes doesn't abide by its own
docs. That's an implementation issue, and one that the compiler should
fix.

Such a committee could (should?) be concerned with not just language
extensions, but also other parts of the toolchain that languages and
ABIs can't reach, for instance macros, builtins, intrinsics, flag
behaviour, build system workarounds, etc.

[1] Right now, GCC has a committee to discuss and approve such
features, specifically requiring well written documents to be
discussed. LLVM engineers sending documents to such a committee would
only make sense, IMHO, if LLVM engineers were part of such a committee.
How many people here are actually interested in that line of work is
yet unknown. But if we really want to cooperate, starting a separate
committee won't help. Could this be the beginning of the open compiler
initiative?

3. Deprecation
===========

Legacy sucks, we all know that. GCC takes the view that they can't
remove *any* legacy, which is exactly the opposite of what we do. I
don't see a problem here.

As LLVM becomes more mainstream, code that uses legacy extensions
(such as nested functions and VLAIS) will be left out by at least one
of the compilers, and if the user wants to be compiler agnostic, (s)he
will *have* to use the common denominator, which has to be *at least*
the language standard.

But this also means that if we start adding bells and whistles, we may
never get them widely used if we don't discuss them on the shared
committee and get them accepted somewhere more generic than our own
backyard.

4. Shared features, different behaviour
=============================

I think there was general agreement that any feature we discuss and
propose can't be enforced in any way. Accepting that a feature makes
sense is one thing; deciding what the best implementation is, or even
which state should be the default, is another.

One example is the inline asm validation. Most people agree that
"some" validation is interesting, but also that there are cases where
validation will break things, and most likely stop you from being
efficient. So, having a flag -f{no-}validate-asm would be ok, even if
LLVM defaults to YES and GCC to NO, as long as both understood it. GCC
doesn't have to validate at all, and should emit a warning "we don't
validate inline asm" if the flag forces it to.
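
As a sketch of "same flag, different defaults", assuming the
-fvalidate-asm / -fno-validate-asm pair proposed above (neither is an
existing option as far as this discussion goes) and assuming the last
occurrence on the command line wins. DEFAULTS, validate_asm_enabled and
diagnostics are hypothetical names used only for illustration.

```python
# Hypothetical sketch: the flags are the ones proposed in the text, and
# the per-compiler defaults mirror the example given there.
DEFAULTS = {"llvm": True, "gcc": False}


def validate_asm_enabled(compiler, args):
    """Resolve the flag state, starting from the compiler's default."""
    enabled = DEFAULTS[compiler]
    for arg in args:  # assumption: the last flag on the line wins
        if arg == "-fvalidate-asm":
            enabled = True
        elif arg == "-fno-validate-asm":
            enabled = False
    return enabled


def diagnostics(compiler, args):
    """In this sketch GCC cannot validate, so it warns when asked to."""
    if compiler == "gcc" and validate_asm_enabled(compiler, args):
        return ["warning: we don't validate inline asm"]
    return []


print(validate_asm_enabled("llvm", []))
print(validate_asm_enabled("gcc", []))
print(diagnostics("gcc", ["-fvalidate-asm"]))
```

Both compilers understand the flag, a build script can pass it to
either one, and the only visible difference is the default and the
warning.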

This provides a clear road to sharing the common interface, but
doesn't set dates as to when things have to be done. Sharing ideas and
definitions is one thing, sharing roadmaps is a different thing altogether.
This shared committee will only make sense IFF we keep it simple.

The slides should be in the wiki soon...

https://gcc.gnu.org/wiki/cauldron2014

cheers,
--renato