[llvm-dev] The Trouble with Triples

Eric Christopher via llvm-dev llvm-dev at lists.llvm.org
Wed Sep 23 11:49:53 PDT 2015


On Wed, Sep 23, 2015 at 11:38 AM Daniel Sanders <Daniel.Sanders at imgtec.com>
wrote:

> > OK, I'm going to just reply to the last because I think it's the most
> > important part of all this and would like to try not to have us
> > sidetracked again. If you'd like I can reply to it, but let's take the
> > last part first :)
> >
>
> > > > Could you please provide some examples of things that are impossible
> > > > right now with command lines, how those interact with the
> > > > TargetMachine, and how you see it being impossible to deal with?
> > > There's some examples above but I'll give the detail in the morning.
> > > It's 11:30pm at the moment :-).
> > Let's talk through one of your examples here when you write things up.
> > I think tracing the execution as you see it will be important to coming
> > to a mutual understanding here. I know that you have a solution that you
> > see is going to solve the problems you see, but I think the problems that
> > you and I are seeing are possibly not the same thing. So let's walk
> > through this execution trace and see what we can do.
>
> *ABI*
>
>
>
> Let's start at llvm-mc's main(). It's important to note that llvm-mc does
> not create a TargetMachine. Here's a sketch of what happens:
>

So, we can just stop here.

A couple problems:

a) llvm-mc isn't a supported product, but that's not the real issue.
b) The lack of a TargetMachine at the MC level was something I brought up a
long time ago in this thread with my proposed solutions. This is what needs
to be fixed, especially given that targets can switch ISA, ABI, floating
point, etc within a single assemble action.

I even brought up a lot of these problems originally when I was fixing MIPS
to work with the current subtarget rewrite.

-eric



> ·         Initialize LLVM
>
> ·         Parse the command line
>
> ·         Construct an MCTargetOptions from the flags
>
> ·         Normalize the triple
>
> ·         Construct a llvm::Target
>
> o   If the triple is not given, we fetch the default
>
> o   We normalize the triple
>
> o   We call TargetRegistry::lookupTarget() to get a llvm::Target.
>
> §  If -march is given, and Triple::getArchTypeForLLVMName() doesn't
> return Triple::UnknownArch, the new arch mutates the triple. Otherwise
> it applies the -march correctly but doesn't change the triple to match. In
> this way, it's possible to end up with i586-linux-gnu targeting the foobar
> architecture.
>
> ·         Call createMCRegInfo()
>
> ·         Call createMCAsmInfo()
>
> o   MipsMCAsmInfo::PointerSize is incorrect for the N32 ABI (should be 4
> but gets 8 since it checks for Triple::mips64/mips64el)
>
> o   MipsMCAsmInfo::CalleeSaveStackSlotSize is incorrect for
> mips-linux-gnu -mips64 -mabi=64, since it too checks for
> Triple::mips64/mips64el.
>
> o   MipsMCAsmInfo::PrivateLabelPrefix and
> MipsMCAsmInfo::PrivateGlobalPrefix are wrong (currently "$", should be
> ".L") for N32/N64 but it's possible to fix this. However, O32 should permit
> "$" in addition to ".L". Even if MipsMCAsmInfo supported multiple prefixes
> (which is easy enough to add), checking for Triple::mips/mipsel would not
> yield the correct result on mips64-linux-gnu -mabi=32.
>
> ·         InitMCObjectFileInfo()
>
> o   FDECFEEncoding is incorrect for N32 (should be sdata4 but gets sdata8
> since it checks for Triple::mips64/mips64el)
>
> o   PersonalityEncoding and TTypeEncoding are correct but only because we
> don't have a R_MIPS_PC64 relocation yet. If we had such a relocation this
> would have the same problem as FDECFEEncoding.
>
> ·         createMCInstrInfo()
>
> ·         createMCInstPrinter()
>
> ·         createMCCodeEmitter()
>
> ·         createMCAsmBackend()
>
> ·         If emitting assembly, createMCAsmStreamer()
>
> ·         if emitting object, createMCObjectStreamer()
>
> o   This in turn calls createObjectWriter() and tells it to emit
> ELF32/ELF64 objects. This information comes from MipsAsmBackend and
> ultimately comes from Triple::mips/mipsel vs Triple::mips64/mips64el. This
> is incorrect for N32 (which should be ELF32 but has
> Triple::mips64/mips64el) and for mips-linux-gnu -mips64 (which should be
> ELF32 since it should target O32).
>
> ·         If assembling, createMCAsmParser()
>
> ·         If disassembling:
>
> o   createMCRegInfo() (again)
>
> o   createMCAsmInfo() (again)
>
> §  This has the same issues as the first call.
>
> o   createMCDisassembler()
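
The -march step in the trace above can be modelled in a few lines. This is a rough standalone sketch, not LLVM code: `Arch`, `archFromName`, `Selection` and `applyMArch` are made-up names, and the set of recognised arch names is an assumption chosen only to show how the triple and the selected backend can drift apart.

```cpp
#include <string>

// Hypothetical stand-in for an architecture enum; not llvm::Triple.
enum class Arch { Unknown, X86, Mips, Mips64 };

// Toy name lookup. Names this table doesn't know map to Arch::Unknown,
// even if some registered backend would still accept them.
Arch archFromName(const std::string &Name) {
  if (Name == "x86") return Arch::X86;
  if (Name == "mips") return Arch::Mips;
  if (Name == "mips64") return Arch::Mips64;
  return Arch::Unknown;
}

struct Selection {
  Arch TripleArch;         // what the triple claims
  std::string BackendName; // what was actually selected
};

// Models the behaviour described in the trace: -march always picks the
// backend, but only rewrites the triple's arch when the name is recognised.
Selection applyMArch(Arch TripleArch, const std::string &MArch) {
  Selection S{TripleArch, MArch};
  Arch A = archFromName(MArch);
  if (A != Arch::Unknown)
    S.TripleArch = A;
  return S;
}
```

With this model, `applyMArch(Arch::X86, "foobar")` leaves the triple's arch as X86 while the backend name is "foobar", which is the i586-linux-gnu-targeting-foobar situation described above.
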
>
> Clang does pretty much the same thing as this but additionally has to deal
> with using the correct default ABI for the given triple. I'll cover this
> kind of problem in 'CPU Defaults' below.
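
The N32 items in the trace (PointerSize, ELF32 vs ELF64) share one shape: a property of the ABI is being derived from the architecture. A minimal sketch of that mismatch follows, assuming only what the list says (N32 wants 4-byte pointers and ELF32 even though its arch is mips64/mips64el). The enums, the struct and both functions are hypothetical; LLVM's real classes look different. The FDE CFI encoding follows the same pattern and is left out to keep this short.

```cpp
// Hypothetical enums for illustration; LLVM's real types differ.
enum class Arch { Mips, Mips64 };
enum class ABI { O32, N32, N64 };

struct ObjectLayout {
  unsigned PointerSize; // in bytes
  bool IsELF64;
};

// What the trace describes today: decisions keyed on the architecture.
ObjectLayout layoutFromArch(Arch A) {
  bool Is64 = (A == Arch::Mips64);
  return {Is64 ? 8u : 4u, Is64};
}

// What the ABI requires according to the list above: only N64 is
// 8-byte pointers in an ELF64 container.
ObjectLayout layoutFromABI(ABI B) {
  bool Is64 = (B == ABI::N64);
  return {Is64 ? 8u : 4u, Is64};
}
```

For an N32 target the two disagree: `layoutFromArch(Arch::Mips64)` yields 8-byte pointers and ELF64, while `layoutFromABI(ABI::N32)` yields 4-byte pointers and ELF32.
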
>
>
>
> Other places that get ABI information wrong:
>
> ·         AddressSanitizer: Uses Triple::mips64/mips64el to mean the N64
> ABI. N32 is a Triple::mips64/mips64el that should behave as the
> Triple::mips/mipsel cases do.
>
> ·         DataFlowSanitizer: Is heading down the same road but hasn't
> implemented O32/N32 yet.
>
> ·         MemorySanitizer: Is heading down the same road but hasn't
> implemented O32/N32 yet.
>
> ·         Many places where hasMips64*() or isGP64bit() are used in the
> backend.
>
> o   MSA intrinsic lowering
>
> o   Legalization configuration
>
> o   Instruction selection
>
> o   MipsTargetLowering::getOptimalMemOpType()
>
> o   And many more. I can provide more detail if you want.
>
>
>
> Other notables:
>
> ·         RuntimeDyldELF gets it right but only because it can read the
> ELF headers instead of the Triple. It went down the same road for a while.
>
>
>
> I'll provide a CodeGen example tomorrow if you want. I'd intended to
> include one but this email took longer to type up than I expected.
>
>
>
> *Endian Defaults*
>
>
>
> The toolchain is mips-linux-gnu and targets little endian by default.
> Here's what currently happens:
>
> ·         We parse the triple (mips-linux-gnu) and get Triple::mips
>
> ·         No command line flags modify this
>
> ·         We construct a TargetMachine and all the other objects using
> this llvm::Triple.
>
> ·         The architecture was Triple::mips so everything configures for
> big-endian even though the target was supposed to be little endian.
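
The endian steps above boil down to a rule that looks only at the arch name. Here is a small sketch of that rule next to one possible alternative where a toolchain-level preference is consulted first. `EndianPref` and `resolveEndian` are invented for this illustration and are not part of any existing API.

```cpp
#include <string>

enum class Endian { Big, Little };

// A toolchain's stated preference; ArchDefault means "go with the arch name".
enum class EndianPref { ArchDefault, Big, Little };

// Arch-name-only rule, as in the steps above: "mips" means big-endian.
Endian endianFromArchName(const std::string &ArchName) {
  bool IsLittle = (ArchName == "mipsel" || ArchName == "mips64el");
  return IsLittle ? Endian::Little : Endian::Big;
}

// Hypothetical alternative: an explicit toolchain preference wins,
// and the arch name is only the fallback.
Endian resolveEndian(const std::string &ArchName, EndianPref Pref) {
  switch (Pref) {
  case EndianPref::Big:
    return Endian::Big;
  case EndianPref::Little:
    return Endian::Little;
  case EndianPref::ArchDefault:
    break;
  }
  return endianFromArchName(ArchName);
}
```

With the first function, a mips-linux-gnu toolchain that is meant to be little-endian still comes out big-endian; the second only gets it right if something outside the triple carries the preference.
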
>
>
>
> *CPU Defaults*
>
>
>
> In LLVM, the default CPU is hardcoded to be MIPS32 (in
> MipsABIInfo::computeTargetABI()). In Clang, the default CPU for this triple
> is hardcoded to be MIPS32R2 (in mips::getMipsCPUAndABI()) and clang always
> passes an explicit CPU to the backend via -target-cpu.
>
>
>
> On Debian, the default CPU for mipsel-linux-gnu is MIPS-II. On Fedora, the
> default CPU for mipsel-linux-gnu is MIPS32R2. It is not possible to
> hardcode the default both ways.
>
> How would you resolve this conflict?
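
The conflict can be seen in a toy table of hardcoded defaults keyed by the triple alone. This is only an illustration: `addDefault` is a made-up helper, and the CPU spellings "mips2" and "mips32r2" are placeholders for MIPS-II and MIPS32R2.

```cpp
#include <map>
#include <string>

// Toy table of hardcoded default CPUs, keyed by triple string only.
using DefaultTable = std::map<std::string, std::string>;

// Returns true if the default was recorded. std::map::insert refuses to
// overwrite an existing key, so a second default for the same triple is
// rejected: the key has no room for "which distro?".
bool addDefault(DefaultTable &Table, const std::string &Triple,
                const std::string &CPU) {
  return Table.insert({Triple, CPU}).second;
}
```

Recording Debian's default for mipsel-linux-gnu succeeds; recording Fedora's for the same triple fails, because the triple string is the whole key.
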
>
>
>
> In my opinion, the only choices to resolve this conflict are
> configure-time options or run-time config files. Configure-time options to
> select the default CPU are faster to implement and produce a (slightly)
> faster clang, while run-time config files are more flexible but slower to
> implement and produce a slower clang. To me, configure-time is the
> sensible short-term choice, followed by moving to run-time config files
> once the pressure to achieve an initial release is gone.
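
One way the two options could layer is sketched below, under stated assumptions: `DEFAULT_MIPS_CPU` is a hypothetical configure-time macro (no such option is claimed to exist in clang's build), and the run-time override is reduced to a plain string that a config file reader might supply.

```cpp
#include <string>

// Hypothetical configure-time knob. A distro's build would define it on
// the compiler command line; the fallback here is just for the sketch.
#ifndef DEFAULT_MIPS_CPU
#define DEFAULT_MIPS_CPU "mips32r2"
#endif

// Run-time layer on top: a non-empty value (say, read from a config file)
// overrides whatever was baked in at configure time.
std::string defaultCPU(const std::string &RuntimeOverride) {
  if (!RuntimeOverride.empty())
    return RuntimeOverride;
  return std::string(DEFAULT_MIPS_CPU);
}
```

The configure-time part costs nothing at run time, which matches the "(slightly) faster clang" point; the run-time part is where the extra flexibility and the extra work would go.
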
>
>
>
> Now let's consider JITs. JITs should default to the host CPU as defined
> by the host triple so that they generate code for the same target as the
> rest of the system. There is a reasonable argument that the default CPU
> should be the auto-detected CPU for performance reasons, but it may not be
> possible to auto-detect the CPU in all circumstances. We therefore need a
> default to fall back on. This default should be the same as the default for
> the native compiler on this host (MIPS-II for Debian, MIPS32R2 for Fedora).
>
>
>
> In my opinion, the default CPU is a property of the target platform since
> the platform specifies the minimum CPU it is intended to run on. Our
> representation of the target platform is called llvm::Triple so the default
> CPU belongs in this object. Being in this object means that tools such as
> clang, or APIs such as Target::createTargetMachine(), will always get the
> defaults corresponding to the triple. These defaults, as we discussed above,
> vary according to the OS (MIPS-II on Debian, MIPS32R2 on Fedora).
>
>
>
> This kind of problem also exists in other forms such as Softfloat vs
> Hardfloat defaults, NAN1985 vs NAN2008 defaults, default ABIs, etc.
>
>
>
> *Other things to mention*
>
>
>
> MIPS64 is not a fundamentally different architecture from MIPS32. If we
> had a representation of the ABI in the triple then we wouldn't need
> Triple::mips64/mips64el.
>
>
>
> *From:* Eric Christopher [mailto:echristo at gmail.com]
> *Sent:* 23 September 2015 01:34
>
>
> *To:* Daniel Sanders; Renato Golin; Jim Grosbach
> *Cc:* llvm-dev at lists.llvm.org
> *Subject:* Re: The Trouble with Triples
>
>
>
> OK, I'm going to just reply to the last because I think it's the most
> important part of all this and would like to try not to have us sidetracked
> again. If you'd like I can reply to it, but let's take the last part first
> :)
>
>
>
> > Could you please provide some examples of things that are impossible
> > right now with command lines, how those interact with the TargetMachine,
> > and how you see it being impossible to deal with?
>
> There's some examples above but I'll give the detail in the morning. It's
> 11:30pm at the moment :-).
>
>
>
> Let's talk through one of your examples here when you write things up. I
> think tracing the execution as you see it will be important to coming to a
> mutual understanding here. I know that you have a solution that you see is
> going to solve the problems you see, but I think the problems that you
> and I are seeing are possibly not the same thing. So let's walk through
> this execution trace and see what we can do.
>
>
>
> Thanks!
>
>
>
> -eric
>
>
>
> ------------------------------
>
> *From:* Eric Christopher [echristo at gmail.com]
> *Sent:* 22 September 2015 20:40
> *To:* Daniel Sanders; Renato Golin; Jim Grosbach
> *Cc:* llvm-dev at lists.llvm.org
>
>
> *Subject:* Re: The Trouble with Triples
>
>
>
> On Thu, Sep 17, 2015 at 6:21 AM Daniel Sanders <Daniel.Sanders at imgtec.com>
> wrote:
>
> I think we need to take a step further back and re-enter from the right
> starting point. The thing that's bothering me about the push back so far is
> that it's trying to discuss and understand the consequences of resolving
> the core problem while seemingly ignoring the core problem itself. The
> reason I've been steering everything back to GNU Triples being ambiguous
> and inconsistent is because it's the root of all the problems and the fixes
> to the various issues fall out naturally once this core point has been
> addressed.
>
>
>
> *sigh*
>
>
>
>
>
> Here's the line of thought that I'd like people to start with:
>
> ·         Triples don't describe the target. They look like they should,
> but they don't. They're really just arbitrary strings.
>
>
>
> Triples are used as a starting point, but no more.
>
>
>
> ·         LLVM relies on Triple as a description of the target. It
> defines the backend to use, the binary format to use, OS and Vendor
> specific quirks to enable/disable, the default CPU, the default ABI, the
> endian, and countless other details about the target.
>
>
>
> These two statements aren't necessarily true in whole.
>
>
>
> a) We don't use the Triple to fully specify the target.
>
> b) We don't use the Triple to fully specify the ABI.
>
> c) We don't use the Triple to fully specify the CPU.
>
> d) We do use the triple to handle endianness since most, if not all,
> triples actually bother to encode endianness.
>
> e) The rest of the "countless details" may or may not be relevant, you
> haven't given an example of what you care about.
>
>
>
> From here on your email relies on all of these assumptions being true. So
> I'm going to skip past that part and go to where you answer some of my
> questions.
>
> At this point, in the MC layer we have a number of classes that need to
> know the ABI but lack this information. Our TargetMachine has an accurate
> TargetTuple object that describes the invariants of the desired target. The
> desired ABI is an invariant too so why not have it in the TargetTuple which
> is already plumbed in everywhere we need it? After all, it's a property of
> the target OS/Environment. If we have the ABI in the TargetTuple, then we
> don't need any other means to set the ABI, tools can set it up front in the
> TargetTuple and we don't need any command-line option handling for it in
> the backend.
>
>
>
> This isn't sufficient anyways as I don't want to depend on a weird
> serialization format to deal with something a simple command line can deal
> with (or you've said this in a way that's confused me). I see you saying
> you want:
>
>
>
> -tuple mips-linux-gnu-abio32-el
>
>
>
> to specify on a command line to, say, llvm-mc or a new assembler
> interface, or heck, to clang itself, that you want to compile for:
>
>
>
> -triple mipsel-linux-gnu -mabi=o32
>
>
>
> right? Basically? (Bikeshedding of how to actually serialize things aside?)
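
For what it's worth, the equivalence being asked about can be written down as a sketch. Everything here is hypothetical: the five-field "arch-os-env-abiXXX-el" spelling is taken only from the example above (serialization bikeshedding aside), and `TargetDesc`, `fromTuple` and `fromTripleAndABI` are invented names. The parsing assumes well-formed input and does no real validation.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical resolved description; field names are for illustration only.
struct TargetDesc {
  std::string Arch, OS, Env, ABI;
  bool LittleEndian;
};

bool operator==(const TargetDesc &A, const TargetDesc &B) {
  return A.Arch == B.Arch && A.OS == B.OS && A.Env == B.Env &&
         A.ABI == B.ABI && A.LittleEndian == B.LittleEndian;
}

// Splits "a-b-c" into {"a", "b", "c"}.
static std::vector<std::string> splitDash(const std::string &S) {
  std::vector<std::string> Parts;
  std::stringstream SS(S);
  std::string Item;
  while (std::getline(SS, Item, '-'))
    Parts.push_back(Item);
  return Parts;
}

// Parses the made-up form "arch-os-env-abiXXX-el|eb".
// at() throws std::out_of_range if a field is missing.
TargetDesc fromTuple(const std::string &Tuple) {
  std::vector<std::string> P = splitDash(Tuple);
  return {P.at(0), P.at(1), P.at(2), P.at(3).substr(3), P.at(4) == "el"};
}

// The same information from today's spelling: a triple plus -mabi=.
// Endianness is recovered from an "el" suffix on the arch component.
TargetDesc fromTripleAndABI(const std::string &Triple, const std::string &ABI) {
  std::vector<std::string> P = splitDash(Triple);
  std::string Arch = P.at(0);
  bool Little = Arch.size() > 2 && Arch.compare(Arch.size() - 2, 2, "el") == 0;
  if (Little)
    Arch.erase(Arch.size() - 2);
  return {Arch, P.at(1), P.at(2), ABI, Little};
}
```

Under these assumptions, `fromTuple("mips-linux-gnu-abio32-el")` and `fromTripleAndABI("mipsel-linux-gnu", "o32")` compare equal, which is the "Basically?" being asked.
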
>
>
>
> Meanwhile, in clang we have a number of command line options that change
> the desired target. Let's say we've constructed a Triple and resolved it to
> TargetTuple (more on that below). We're now processing the -EL option. At
> the moment, we substitute our mips-linux-gnu triple for a mipsel-linux-gnu
> triple, construct a Triple object from it and resolve the new Triple to a
> TargetTuple. But why do we need to bother with that kind of weird hackery
> when we can simply do Obj.setEndian(Little)? This is what Phase 7 of the
> plan is about. We end up with a cleaner way to process target changes that,
> until now, have required weird triple hacking to handle.
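
The contrast in the paragraph above, sketched side by side. `applyELToTriple` imitates the string-substitution approach for the two arch names mentioned in this thread; `TargetTupleSketch` is a stand-in with a `setEndian` like the one named above. Neither is the real clang or LLVM code, and the actual TargetTuple interface may well differ.

```cpp
#include <string>

// Today's approach as described: rewrite the arch component of the triple
// string, after which the caller re-parses the result into a new Triple.
std::string applyELToTriple(const std::string &Triple) {
  std::string::size_type Dash = Triple.find('-');
  std::string Arch = Triple.substr(0, Dash);
  std::string Rest =
      (Dash == std::string::npos) ? std::string() : Triple.substr(Dash);
  if (Arch == "mips" || Arch == "mips64")
    Arch += "el";
  return Arch + Rest;
}

enum class Endian { Big, Little };

// Hypothetical tuple-style object: the field is changed directly.
class TargetTupleSketch {
  Endian E = Endian::Big;

public:
  void setEndian(Endian NewE) { E = NewE; }
  Endian getEndian() const { return E; }
};
```

The first form turns "mips-linux-gnu" into "mipsel-linux-gnu" and relies on the arch name having an "el" spelling at all; the second just flips a field.
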
>
>
>
>
>
> This is something else I don't understand. Here is the first time you
> start talking about APIs which is what I'm particularly asking about in my
> earlier mails. I'd like to see how you plan on changing the TargetMachine
> and MC level APIs to deal with this. It seems like the Tuple is going to be
> a way to side-load information around to the MC layer and while I agree
> that something is necessary there, I don't think that this solution is the
> right one. (As I said earlier in the thread)
>
>
>
> I skipped the Triple -> TargetTuple resolution a moment ago and I should
> address that now. We already know that mapping Triple to TargetTuple is a
> many to many mapping. One Triple has many possible TargetTuples depending
> on the environment. One TargetTuple can be formed from multiple possible
> Triples. In an ideal world, we'd like to bake in all of these mappings so
> that one clang binary supports everything. Unfortunately, being a many to
> many mapping, some of these mappings are mutually exclusive. Note that this
> isn't a new problem resulting from this project. The problem has always
> been there but has been ignored until now. To resolve this, we need to
> provide configure-time and possibly run-time controls for how this
> conversion is disambiguated. This resolution is performed as early as
> possible so that the middle/back-ends don't need to know anything about the
> ambiguity problem.
>
>
>
> The minute you start talking about configure time controls we've already
> lost. This, for me, is a non-starter. That said, I'd like to see the
> examples you think show that things are impossible to deal with in the
> current architecture.
>
>
>
> ---
>
>
>
> To reply more directly to your email:
>
>
>
> Thanks :)
>
>
>
> > What can't be done to TargetMachine to avoid this serialization?
>
>
>
> TargetMachine already has the serialization (see
> TargetMachine::TargetTriple). We're not doing anything new here. We're
> simply replacing one object holding faulty information with a new object
> holding reliable information.
>
>
>
>
>
> This is side stepping my question and making it about Triple. I've
> specifically said that TargetMachine does not and is not completely
> dependent upon Triple.
>
>
>
> > And a followup question: What can't be serialized at the function level
> > in the IR to make certain things clear that aren't global? We already do
> > this for a lot of command line options.
>
>
>
> The data I want to fix is global. I think the bit you may be getting hung
> up on here is that small portions of this global data can also be
> overridden at the function level. Those overrides aren't a problem and
> continue to operate in the same way as they do today.
>
>
>
> Examples please.
>
>
>
> > And one more: What global options do we need to consider here?
>
>
>
> I'm not certain I understand this question. If you're talking command line
> options, it's things like -EL, -EB, -mips32, -mips32r[2356], -mips64,
> -mips64r[2356], -mabi=…. If you're talking about Triple -> TargetTuple
> mappings, there's quite a wide variety but the main ones for Mips are
> endian, architecture, default CPU, and default ABI.
>
>
>
> All of these are representable right now in the TargetMachine as far as I
> can tell. What examples are you having problems with?
>
>
>
>
>
> > The goal of the configuration level of the TargetMachine is that it
> > controls things that don't change at the object level. This is a fairly
> > recently stated goal, but I think it makes sense for LLVM in general.
> > TargetSubtargetInfo takes care of everything that resides under this (as
> > much as possible, some bits are still in transition, e.g. TargetOptions).
> > This is part of my suggestion to Daniel about the problems with
> > MCSubtargetInfo and the assembler. Targets like Mips and ARM were
> > unfortunately designed to change things on the fly during assembly and need
> > to collate or at least change defaults as we're processing code. I
> > definitely had to deal with a lot of the pain you're talking about when I
> > was rewriting some of the handling there during the TargetSubtargetInfo
> > work.
>
>
>
> I generally agree with this. The key bit I need to draw attention to is
> that the 'defaults' don't change, but are instead overridden. These
> constant defaults are stored in TargetMachine and particularly
> TargetMachine::TargetTriple. These defaults are wrong for some toolchains
> since the information stored in TargetMachine::TargetTriple is wrong. It's
> the defaults I'm trying to fix rather than the overrides.
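
The defaults-versus-overrides distinction drawn above can be sketched as follows. This is an illustrative model only: `FeatureSet`, `MachineSketch` and the two flags are assumptions loosely based on the softfloat and NaN examples elsewhere in this thread, not the actual TargetMachine or TargetSubtargetInfo interfaces.

```cpp
#include <map>
#include <string>

// Hypothetical bundle of settings that have a platform default.
struct FeatureSet {
  bool SoftFloat;
  bool Nan2008;
};

// Constant defaults fixed at construction, with per-function overrides
// layered on top. The defaults themselves never change.
class MachineSketch {
  const FeatureSet Defaults;
  std::map<std::string, FeatureSet> Overrides;

public:
  explicit MachineSketch(FeatureSet D) : Defaults(D) {}

  void overrideFor(const std::string &Fn, FeatureSet F) { Overrides[Fn] = F; }

  // Functions without an override inherit the defaults unchanged.
  FeatureSet featuresFor(const std::string &Fn) const {
    auto It = Overrides.find(Fn);
    return It == Overrides.end() ? Defaults : It->second;
  }
};
```

In this model a wrong default is inherited by every function that doesn't override it, which is the part of the problem that overrides alone don't reach.
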
>
>
>
>
>
> I don't understand what you mean here.
>
>
>
> I think I understand your proposed plan now and it's a few steps ahead of
> where we are and where we need to be. I agree that overridable state should
> be in TargetSubtargetInfo, however I can't initialize that state without
> the default values which come from the faulty information in
> TargetMachine::TargetTriple. This triple work is a pre-requisite to your
> plan and at first I don't need to override ABIs.
>
>
>
>
>
> Can you provide an example of using a tool that you're having problems
> with?
>
>
>
> > Right now I see TargetTuple as trying to take over all of the various
> > arguments to TargetMachine and encapsulate them into a single thing. I also
> > don't see this as bad, but I also don't see it taking all of them right now
> > and I'm not sure how it solves some of the existing problems with data
> > sharing that we've got which is where the push back you're both getting is
> > coming from here. Ultimately library-wise I can agree with some of the
> > directions you're headed - I just don't see the unification and
> > interactions right now.
>
>
>
> I think we'll end up with TargetTuple taking over many arguments to
> TargetMachine but that's not my goal at this stage. My goal is simply to
> fix the faulty information currently held in Triple and use the
> now-accurate information in TargetTuple to fix various blocking issues that
> prevent a proper Mips toolchain product based on Clang/LLVM. At the end of
> Phase 7, it becomes possible to fix a number of issues that are impossible
> to fix right now because the available data we can consult at the moment is
> incorrect.
>
>
>
>
>
> Could you please provide some examples of things that are impossible right
> now with command lines, how those interact with the TargetMachine, and how
> you see it being impossible to deal with?
>
>
>
> Thanks
>
>
>
> -eric
>
>
>
>
>
> *From:* Eric Christopher [mailto:echristo at gmail.com]
> *Sent:* 16 September 2015 23:52
> *To:* Renato Golin; Jim Grosbach
> *Cc:* Daniel Sanders; llvm-dev at lists.llvm.org
>
>
> *Subject:* Re: The Trouble with Triples
>
>
>
> Let's take a step back here.
>
>
>
> It appears that you and Daniel are trying to solve some problems. I think
> solving problems is good, I just want to make sure that we're solving them
> in a way that gets us a decent API at the end. I also want to make sure
> we're solving the right problems.
>
>
>
> TargetTuple appears to be related to the TargetParser as you bring up in
> this mail. They're two separate parts of similar problems - people trying
> to both serialize command line options and communication from the front end
> to the backend with respect to target information.
>
>
>
> This leads me to a question: What can't be done to TargetMachine to avoid
> this serialization?
>
> And a followup question: What can't be serialized at the function level in
> the IR to make certain things clear that aren't global? We already do this
> for a lot of command line options.
>
> And one more: What global options do we need to consider here?
>
>
>
> The goal of the configuration level of the TargetMachine is that it
> controls things that don't change at the object level. This is a fairly
> recently stated goal, but I think it makes sense for LLVM in general.
> TargetSubtargetInfo takes care of everything that resides under this (as
> much as possible, some bits are still in transition, e.g. TargetOptions).
> This is part of my suggestion to Daniel about the problems with
> MCSubtargetInfo and the assembler. Targets like Mips and ARM were
> unfortunately designed to change things on the fly during assembly and need
> to collate or at least change defaults as we're processing code. I
> definitely had to deal with a lot of the pain you're talking about when I
> was rewriting some of the handling there during the TargetSubtargetInfo
> work.
>
>
>
> Now a bit more on TargetParser + TargetTuple:
>
>
>
> TargetParser appears to be trying to solve the parsing in Triple in a nice
> way for ARM and also some of the "what kind of subtarget feature
> canonicalization can we do in llvm that makes sense to communicate to the
> front end". I like this particular idea and have often wanted a library of
> feature handling, but it seems to have stabilized at an ARM specific set of
> code with no defined interface. I can't even figure out how I'd use it in
> lib/Basic right now for any target other than ARM. This isn't a
> condemnation of TargetParser, but I think it's something that needs to be
> thought through a bit more. It's been hooked up well before I'd expected it
> to and right now if we moved it to the ARM backend from Support it'd make
> just as much sense as it does where it is now other than making clang
> depend on the ARM backend as well as the X86 backend :)
>
>
>
> Right now I see TargetTuple as trying to take over all of the various
> arguments to TargetMachine and encapsulate them into a single thing. I also
> don't see this as bad, but I also don't see it taking all of them right now
> and I'm not sure how it solves some of the existing problems with data
> sharing that we've got which is where the push back you're both getting is
> coming from here. Ultimately library-wise I can agree with some of the
> directions you're headed - I just don't see the unification and
> interactions right now.
>
>
>
> As a suggestion as a way forward here let's see if we can get my questions
> above answered and also show some of how the interactions between llvm's
> libraries are going to get fixed, moved to a better place, etc here.
>
>
>
> Thanks!
>
>
>
> -eric
>
>
>
>
>
> On Wed, Sep 16, 2015 at 3:02 PM Renato Golin <renato.golin at linaro.org>
> wrote:
>
> On 16 September 2015 at 21:56, Jim Grosbach <grosbach at apple.com> wrote:
> > Why do we care about GAS? We have an assembler.
>
> It's not that simple.
>
> There is a lot of old code out there, including the Linux kernel,
> which we do care about a lot, that only compiles with GAS. We're slowly
> moving the legacy code up to modern standards, and specifically some
> kernel folks are happy to move up not only the asm syntax, but the C
> standard and move away from GNU-specific behaviour. But we're not
> quite there yet, and might not be for a few more years. So, yes, we
> still care about GAS.
>
> But this is not just about GAS.
>
> As I said in my previous email, this is about clearing the bloat in
> target descriptions by both: removing the need for adding numerous CPU
> names, target features, architecture names (xscale, strongarm, etc),
> AND making sure all parties (front/middle/back-ends) speak the same
> language, produced from the same source.
>
> The TargetTuple is that common language, and the TargetParser created
> from the TableGen files is the common source. The Triple becomes a
> legacy constructor value for the Tuple. All other target information
> classes are already (or should be) generated from the TableGen files,
> so the ultimate source becomes the TableGen description, which I think
> is what you were aiming at in your comment.
>
> For simple architectures, like x86, you don't even need a
> TargetParser. You can easily construct the Tuple from a triple and use
> the Tuple as you've always used the triple. No harm done. But for the
> complex ones like ARM and MIPS, having a common interface generated
> from the same place the other interfaces are is important to avoid
> more bridges between front and middle and back end interpretations of
> the same target. Whatever legacy ARM or MIPS carry can be isolated in
> their own implementation, leaving the rest of the targets with a clean
> and simple interface.
>
> cheers,
> --renato
>
>