[llvm-dev] The Trouble with Triples
Eric Christopher via llvm-dev
llvm-dev at lists.llvm.org
Tue Sep 22 12:41:25 PDT 2015
I've been busy working on other things. I replied to your earlier mail
which is much longer and also encapsulates all of the stuff here.
Thanks! :)
-eric
On Tue, Sep 22, 2015 at 6:06 AM Daniel Sanders <Daniel.Sanders at imgtec.com>
wrote:
> The thread has gone quiet for a few days and I need to be making progress
> towards a gcc-compatible toolchain (e.g. a mips-mti-linux-gnu toolchain
> that can target MIPS32/MIPS64 and later for all appropriate ABIs and both
> endians) so I need to chase this a bit earlier than I normally would.
>
>
>
> > Here's the line of thought that I'd like people to start with:
>
> > * Triples don't describe the target. They look like they should, but
> > they don't. They're really just arbitrary strings.
>
> > * LLVM relies on Triple as a description of the target. It defines the
> > backend to use, the binary format to use, OS and Vendor specific quirks
> > to enable/disable, the default CPU, the default ABI, the endian, and
> > countless other details about the target.
>
> > * If LLVM is built on top of an incorrect concept we should fix that but
> > we can't abandon Triples at the user level since every toolchain uses
> > them.
>
> > * But we can't keep using Triples inappropriately either. If the
> > information feeding into LLVM is faulty then the resulting behaviour will
> > be faulty too.
>
> > * So let's start with a Triple, and convert it to a not-broken
> > equivalent as early as possible. We'll call it TargetTuple.
>
> > Are there any disagreements on this part of the thinking?
>
> > If we have agreement on this, then I think that this by itself is ample
> > reason for phases 1-4, and 6 of the plan.
>
> > The justification for the IR serialization in phase 5 is simply that we
> > need to deliver the Triple/TargetTuple to LTO for it to operate correctly
> > and we currently do this by serializing Triple in the IR. If Triple has
> > been replaced by TargetTuple then TargetTuple must be serializable in the
> > IR somehow.
>
>
>
> Are we agreed on this much? If so, I think we should go ahead with this
> part of the work and judge each follow-on task independently on its own
> merits.
>
>
>
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
> Daniel Sanders via llvm-dev
> Sent: 17 September 2015 14:21
> To: Eric Christopher; Renato Golin; Jim Grosbach
> Cc: llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] The Trouble with Triples
>
>
>
> I think we need to take a step further back and re-enter from the right
> starting point. The thing that's bothering me about the push back so far is
> that it's trying to discuss and understand the consequences of resolving
> the core problem while seemingly ignoring the core problem itself. The
> reason I've been steering everything back to GNU Triples being ambiguous
> and inconsistent is because it's the root of all the problems and the fixes
> to the various issues fall out naturally once this core point has been
> addressed.
>
>
>
> Here's the line of thought that I'd like people to start with:
>
> * Triples don't describe the target. They look like they should, but they
>   don't. They're really just arbitrary strings.
>
> * LLVM relies on Triple as a description of the target. It defines the
>   backend to use, the binary format to use, OS and Vendor specific quirks
>   to enable/disable, the default CPU, the default ABI, the endian, and
>   countless other details about the target.
>
> * If LLVM is built on top of an incorrect concept we should fix that but
>   we can't abandon Triples at the user level since every toolchain uses
>   them.
>
> * But we can't keep using Triples inappropriately either. If the
>   information feeding into LLVM is faulty then the resulting behaviour
>   will be faulty too.
>
> * So let's start with a Triple, and convert it to a not-broken equivalent
>   as early as possible. We'll call it TargetTuple.
>
> Are there any disagreements on this part of the thinking? If there are,
> then we should resolve these before proceeding to the rest since everything
> else depends on accepting this core problem exists and can be fixed in this
> way.
>
> If we have agreement on this, then I think that this by itself is ample
> reason for phases 1-4, and 6 of the plan. The justification for the IR
> serialization in phase 5 is simply that we need to deliver the
> Triple/TargetTuple to LTO for it to operate correctly and we currently do
> this by serializing Triple in the IR. If Triple has been replaced by
> TargetTuple then TargetTuple must be serializable in the IR somehow.
>
>
>
> Hopefully, we are agreed so far. Let's assume for the rest of this
> explanation that Phases 1-6 are complete and we now have const TargetTuple
> throughout the API. I'd like to draw particular attention to TargetMachine
> which, like everything else, has had its Triple member (called
> TargetTriple) replaced with a TargetTuple member (named TheTargetTuple).
> This member is used in all the same ways it used to be used when it was a
> Triple (named TargetTriple).
>
>
>
> At this point, in the MC layer we have a number of classes that need to
> know the ABI but lack this information. Our TargetMachine has an accurate
> TargetTuple object that describes the invariants of the desired target. The
> desired ABI is an invariant too so why not have it in the TargetTuple which
> is already plumbed in everywhere we need it? After all, it's a property of
> the target OS/Environment. If we have the ABI in the TargetTuple, then we
> don't need any other means to set the ABI: tools can set it up front in the
> TargetTuple and we don't need any command-line option handling for it in
> the backend.
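
To make the point about carrying the ABI in the tuple concrete, here is a minimal sketch. It assumes a hypothetical TargetTuple class; the class shape, the ABI enumerators and pointerSizeInBytes are illustrative assumptions and are not taken from LLVM or from the patch series under discussion.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration; not LLVM's real classes.
enum class Endian { Big, Little };
enum class ABI { O32, N32, N64 };

// Immutable description of the target, resolved once by the tool and then
// passed by const reference to every layer that needs it.
class TargetTuple {
public:
  TargetTuple(std::string Arch, Endian E, ABI A)
      : Arch(std::move(Arch)), E(E), A(A) {
    assert(!this->Arch.empty() && "a tuple must name an architecture");
  }
  const std::string &getArch() const { return Arch; }
  Endian getEndian() const { return E; }
  ABI getABI() const { return A; }

private:
  std::string Arch;
  Endian E;
  ABI A;
};

// An MC-layer style consumer: it reads the ABI from the tuple it already
// receives, so no separate ABI option has to be plumbed through to it.
unsigned pointerSizeInBytes(const TargetTuple &TT) {
  return TT.getABI() == ABI::N64 ? 8u : 4u;
}
```

A caller would build the tuple once, for example TargetTuple("mips64", Endian::Big, ABI::N32), and hand the same const object to every consumer.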
>
>
>
> Meanwhile, in clang we have a number of command line options that change
> the desired target. Let's say we've constructed a Triple and resolved it to
> TargetTuple (more on that below). We're now processing the -EL option. At
> the moment, we substitute our mips-linux-gnu triple for a mipsel-linux-gnu
> triple, construct a Triple object from it and resolve the new Triple to a
> TargetTuple. But why do we need to bother with that kind of weird hackery
> when we can simply do Obj.setEndian(Little)? This is what Phase 7 of the
> plan is about. We end up with a cleaner way to process target changes that,
> until now, have required weird triple hacking to handle.
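
The contrast between the two ways of handling -EL can be sketched as follows. This is a standalone illustration, not clang's driver code: rewriteTripleForEL and the TargetTuple struct with its setEndian member are hypothetical names chosen to mirror the description above.

```cpp
#include <cassert>
#include <string>

enum class Endian { Big, Little };

// Current style, as described above: handle -EL by rewriting the
// architecture component of the triple string.
std::string rewriteTripleForEL(const std::string &Triple) {
  assert(!Triple.empty() && "expected a non-empty triple");
  const std::string::size_type Dash = Triple.find('-');
  std::string Arch = Triple.substr(0, Dash);
  const std::string Rest =
      Dash == std::string::npos ? std::string() : Triple.substr(Dash);
  if (Arch == "mips")
    Arch = "mipsel";
  else if (Arch == "mips64")
    Arch = "mips64el";
  return Arch + Rest;
}

// Hypothetical mutable tuple in the spirit of Phase 7; field names assumed.
struct TargetTuple {
  std::string Arch; // endian-neutral, e.g. "mips"
  Endian E;
  void setEndian(Endian NewE) { E = NewE; }
};
```

With the string approach the result still has to be re-parsed and re-resolved; with the tuple, the option handler only changes the one field it is about.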
>
>
>
> I skipped the Triple -> TargetTuple resolution a moment ago and I should
> address that now. We already know that mapping Triple to TargetTuple is a
> many-to-many mapping. One Triple has many possible TargetTuples depending
> on the environment. One TargetTuple can be formed from multiple possible
> Triples. In an ideal world, we'd like to bake in all of these mappings so
> that one clang binary supports everything. Unfortunately, because the
> mapping is many-to-many, some of these mappings are mutually exclusive. Note that this
> isn't a new problem resulting from this project. The problem has always
> been there but has been ignored until now. To resolve this, we need to
> provide configure-time and possibly run-time controls for how this
> conversion is disambiguated. This resolution is performed as early as
> possible so that the middle/back-ends don't need to know anything about the
> ambiguity problem.
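
One way to picture the configure-time disambiguation is a per-build table that is consulted before any built-in guess. The sketch below is an assumption-laden illustration: TripleOverrides, resolveTriple and the fallback defaults are hypothetical, cover MIPS only, and do not describe how any existing toolchain actually resolves its triples.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class ABI { O32, N32, N64 };

struct TargetTuple {
  std::string Arch;
  ABI DefaultABI;
};

// Hypothetical configure-time table: for each triple this build treats
// specially, the one tuple it should mean.
using TripleOverrides = std::map<std::string, TargetTuple>;

// Resolve a triple once, up front. Overrides win; otherwise fall back to a
// built-in guess (MIPS only here, and the guesses are assumptions).
TargetTuple resolveTriple(const std::string &Triple,
                          const TripleOverrides &Overrides) {
  assert(!Triple.empty() && "expected a non-empty triple");
  const auto It = Overrides.find(Triple);
  if (It != Overrides.end())
    return It->second;
  if (Triple.compare(0, 6, "mips64") == 0)
    return TargetTuple{"mips64", ABI::N64};
  return TargetTuple{"mips", ABI::O32};
}
```

Two builds configured with different tables can then give the same triple string different meanings, while everything after resolveTriple sees only an unambiguous tuple.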
>
>
>
> ---
>
>
>
> To reply more directly to your email:
>
> > What can't be done to TargetMachine to avoid this serialization?
>
>
>
> TargetMachine already has the serialization (see
> TargetMachine::TargetTriple). We're not doing anything new here. We're
> simply replacing one object holding faulty information with a new object
> holding reliable information.
>
>
>
> > And a followup question: What can't be serialized at the function level
> > in the IR to make certain things clear that aren't global? We already do
> > this for a lot of command line options.
>
>
>
> The data I want to fix is global. I think the bit you may be getting hung
> up on here is that small portions of this global data can also be
> overridden at the function level. Those overrides aren't a problem and
> continue to operate in the same way as they do today.
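
The relationship between a global default and a function-level override can be sketched like this. The names are hypothetical and the choice of MIPS16 as the overridable property is only an example; this is not LLVM's attribute machinery.

```cpp
#include <cassert>

// Hypothetical names throughout; a sketch of "global default, per-function
// override" and nothing more.
enum class Setting { Inherit, On, Off };

struct TargetTuple {
  bool DefaultMips16; // module-wide default carried by the tuple
};

struct FunctionAttrs {
  Setting Mips16 = Setting::Inherit; // per-function override, if any
};

// The override, when present, wins; otherwise the global default applies.
bool useMips16(const TargetTuple &TT, const FunctionAttrs &FA) {
  switch (FA.Mips16) {
  case Setting::On:
    return true;
  case Setting::Off:
    return false;
  case Setting::Inherit:
    break;
  }
  assert(FA.Mips16 == Setting::Inherit && "unhandled Setting value");
  return TT.DefaultMips16;
}
```

The point of the sketch is that the override never modifies the default: if the default in the tuple is wrong, every function that inherits it is wrong too.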
>
>
>
> > And one more: What global options do we need to consider here?
>
>
>
> I'm not certain I understand this question. If you're talking command line
> options, it's things like -EL, -EB, -mips32, -mips32r[2356], -mips64,
> -mips64r[2356], -mabi=…. If you're talking about Triple -> TargetTuple
> mappings, there's quite a wide variety but the main ones for Mips are
> endian, architecture, default CPU, and default ABI.
>
>
>
> > The goal of the configuration level of the TargetMachine is that it
> > controls things that don't change at the object level. This is a fairly
> > recently stated goal, but I think it makes sense for LLVM in general.
> > TargetSubtargetInfo takes care of everything that resides under this (as
> > much as possible, some bits are still in transition, e.g. TargetOptions).
> > This is part of my suggestion to Daniel about the problems with
> > MCSubtargetInfo and the assembler. Targets like Mips and ARM were
> > unfortunately designed to change things on the fly during assembly and
> > need to collate or at least change defaults as we're processing code. I
> > definitely had to deal with a lot of the pain you're talking about when I
> > was rewriting some of the handling there during the TargetSubtargetInfo
> > work.
>
>
>
> I generally agree with this. The key bit I need to draw attention to is
> that the 'defaults' don't change, but are instead overridden. These
> constant defaults are stored in TargetMachine and particularly
> TargetMachine::TargetTriple. These defaults are wrong for some toolchains
> since the information stored in TargetMachine::TargetTriple is wrong. It's
> the defaults I'm trying to fix rather than the overrides.
>
>
>
> I think I understand your proposed plan now and it's a few steps ahead of
> where we are and where we need to be. I agree that overridable state should
> be in TargetSubtargetInfo, however I can't initialize that state without
> the default values which come from the faulty information in
> TargetMachine::TargetTriple. This triple work is a pre-requisite to your
> plan and at first I don't need to override ABIs.
>
>
>
> > Right now I see TargetTuple as trying to take over all of the various
> > arguments to TargetMachine and encapsulate them into a single thing. I
> > also don't see this as bad, but I also don't see it taking all of them
> > right now and I'm not sure how it solves some of the existing problems
> > with data sharing that we've got which is where the push back you're
> > both getting is coming from here. Ultimately library-wise I can agree
> > with some of the directions you're headed - I just don't see the
> > unification and interactions right now.
>
>
>
> I think we'll end up with TargetTuple taking over many arguments to
> TargetMachine but that's not my goal at this stage. My goal is simply to
> fix the faulty information currently held in Triple and use the
> now-accurate information in TargetTuple to fix various blocking issues that
> prevent a proper Mips toolchain product based on Clang/LLVM. At the end of
> Phase 7, it becomes possible to fix a number of issues that are impossible
> to fix right now because the available data we can consult at the moment is
> incorrect.
>
>
>
>
>
> From: Eric Christopher [mailto:echristo at gmail.com]
> Sent: 16 September 2015 23:52
> To: Renato Golin; Jim Grosbach
> Cc: Daniel Sanders; llvm-dev at lists.llvm.org
> Subject: Re: The Trouble with Triples
>
>
>
> Let's take a step back here.
>
>
>
> It appears that you and Daniel are trying to solve some problems. I think
> solving problems is good, I just want to make sure that we're solving them
> in a way that gets us a decent API at the end. I also want to make sure
> we're solving the right problems.
>
>
>
> TargetTuple appears to be related to the TargetParser as you bring up in
> this mail. They're two separate parts of similar problems - people trying
> both to serialize command line options and to communicate target
> information from the front end to the backend.
>
>
>
> This leads me to a question: What can't be done to TargetMachine to avoid
> this serialization?
>
> And a followup question: What can't be serialized at the function level in
> the IR to make certain things clear that aren't global? We already do this
> for a lot of command line options.
>
> And one more: What global options do we need to consider here?
>
>
>
> The goal of the configuration level of the TargetMachine is that it
> controls things that don't change at the object level. This is a fairly
> recently stated goal, but I think it makes sense for LLVM in general.
> TargetSubtargetInfo takes care of everything that resides under this (as
> much as possible, some bits are still in transition, e.g. TargetOptions).
> This is part of my suggestion to Daniel about the problems with
> MCSubtargetInfo and the assembler. Targets like Mips and ARM were
> unfortunately designed to change things on the fly during assembly and need
> to collate or at least change defaults as we're processing code. I
> definitely had to deal with a lot of the pain you're talking about when I
> was rewriting some of the handling there during the TargetSubtargetInfo
> work.
>
>
>
> Now a bit more on TargetParser + TargetTuple:
>
>
>
> TargetParser appears to be trying to solve the parsing in Triple in a nice
> way for ARM and also some of the "what kind of subtarget feature
> canonicalization can we do in llvm that makes sense to communicate to the
> front end". I like this particular idea and have often wanted a library of
> feature handling, but it seems to have stabilized at an ARM specific set of
> code with no defined interface. I can't even figure out how I'd use it in
> lib/Basic right now for any target other than ARM. This isn't a
> condemnation of TargetParser, but I think it's something that needs to be
> thought through a bit more. It's been hooked up well before I'd expected it
> to and right now if we moved it to the ARM backend from Support it'd make
> just as much sense as it does where it is now other than making clang
> depend on the ARM backend as well as the X86 backend :)
>
>
>
> Right now I see TargetTuple as trying to take over all of the various
> arguments to TargetMachine and encapsulate them into a single thing. I also
> don't see this as bad, but I also don't see it taking all of them right now
> and I'm not sure how it solves some of the existing problems with data
> sharing that we've got which is where the push back you're both getting is
> coming from here. Ultimately library-wise I can agree with some of the
> directions you're headed - I just don't see the unification and
> interactions right now.
>
>
>
> As a suggested way forward, let's see if we can get my questions above
> answered and also show some of how the interactions between llvm's
> libraries are going to get fixed, moved to a better place, etc.
>
>
>
> Thanks!
>
>
>
> -eric
>
>
>
>
>
> On Wed, Sep 16, 2015 at 3:02 PM Renato Golin <renato.golin at linaro.org>
> wrote:
>
> On 16 September 2015 at 21:56, Jim Grosbach <grosbach at apple.com> wrote:
> > Why do we care about GAS? We have an assembler.
>
> It's not that simple.
>
> There is a lot of old code out there, including the Linux kernel,
> which we do care a lot about, that only compiles with GAS. We're slowly
> moving the legacy code up to modern standards, and specifically some
> kernel folks are happy to move up not only the asm syntax, but the C
> standard and move away from GNU-specific behaviour. But we're not
> quite there yet, and might not be for a few more years. So, yes, we
> still care about GAS.
>
> But this is not just about GAS.
>
> As I said in my previous email, this is about clearing the bloat in
> target descriptions by both: removing the need for adding numerous CPU
> names, target features, architecture names (xscale, strongarm, etc),
> AND making sure all parties (front/middle/back-ends) speak the same
> language, produced from the same source.
>
> The TargetTuple is that common language, and the TargetParser created
> from the TableGen files is the common source. The Triple becomes a
> legacy constructor value for the Tuple. All other target information
> classes are already (or should be) generated from the TableGen files,
> so the ultimate source becomes the TableGen description, which I think
> is what you were aiming at in your comment.
>
> For simple architectures, like x86, you don't even need a
> TargetParser. You can easily construct the Tuple from a triple and use
> the Tuple as you've always used the triple. No harm done. But for the
> complex ones like ARM and MIPS, having a common interface generated
> from the same place the other interfaces are is important to avoid
> more bridges between front and middle and back end interpretations of
> the same target. Whatever legacy ARM or MIPS carry can be isolated in
> their own implementation, leaving the rest of the targets with a clean
> and simple interface.
>
> cheers,
> --renato
>
>