[llvm-dev] Encode target-abi into LLVM bitcode for LTO.

Mon Jan 27 15:04:08 PST 2020

On Mon, Jan 27, 2020 at 2:56 PM David Blaikie via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

>
>
> On Mon, Jan 20, 2020 at 8:05 AM Sam Elliott via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> To follow up on this issue:
>>
>> Our plan is still to encode `target-abi` into the module flags for RISC-V
>> LLVM IR modules. As was pointed out earlier in this thread, the function
>> lowering in Clang is slightly different for the ABIs which support hardware
>> floating point. Therefore adding the `target-abi` metadata at the very
>> least can help catch places where modules with incompatible ABIs are merged.
>>
>> With this in mind, our plan is the following:
>>
>> 1. We have two patches, both prepared by Zakk, one to add the
>> `target-abi` to the module flags (https://reviews.llvm.org/D72755), and
>> one to check that LLVM has been passed the correct `target-abi` to match
>> the module (https://reviews.llvm.org/D72768).
>>   These two are intended to ensure that we are compiling the module as
>> expected, and to protect ourselves from errors that could arise otherwise,
>> and are entirely within the RISC-V target.
>>   We would like these to be backported to LLVM 10.0 once they are landed.
>>
>
> OK - so this means the target-abi module flag wouldn't be used to motivate
> the codegen, just to check that the command line flag is correct, and the
> command line flag would be what's used to motivate the codegen choices?
>
> Sounds OK - but I guess the existence of that command line flag (rather
> than using the module metadata) is a workaround for not being able to use
> the module metadata at the time when it's needed (that was something being
> debated/discussed in the various code reviews, yes?). I'd still be curious
> to have a well documented explanation for why that was complicated or not
> possible - and specifically why the target-triple and data layout which are
> also part of the Module are easier to access/use at the right times (is the
> target triple/data layout only needed later than the target abi? or is it
> that there's some workaround for accessing triple/data layout earlier &
> that workaround isn't as easy to apply to the module metadata which is a
> bit more module-centric, in some way?)
>

FWIW if we're going to put something in the module I would want us to use
it.

-eric

>
>
>> 2. We will try to make all the changes required for RISC-V LTO in time
>> for the LLVM 11.0 branch date. This includes the target-independent changes
>> to TargetMachine initialisation (the design of which is not yet agreed).
>>   Enabling LTO will require and use the `target-abi` module flag.
>>
>
> Ah, OK so this is the bit I was referencing above then. So your goal is to
> get LTO to be motivated by the target-abi module flag, where as in (1)
> it'll be motivated by a command line flag, and the module flag will just be
> checked later that it's consistent with the command line flag?
>
> So then this step (2) is where I'd be interested to know the answers to my
> above questions re: what's different about target triple/data layout and
> the target abi. Are they all needed around the same time, but their
> representation (target triple and data layout being first class properties
> of the module, but maybe they're parsed separatrely in some codepath to get
> them earlier than otherwise?) different? Or are their representations the
> same, but target-abi is needed earlier/in a different place than
> triple/data layout?
>
> Anyway, I'd like to see that (2) discussion in more detail - probably
> here/in an llvm-dev thread, to make sure it's well worked through,
> especially if it means adding significant new layering.
>
>
>> One advantage I see of this is that other frontends (rustc, LDC, zig, and
>> others) have a whole release cycle (10.0 -> 11.0) in order to add the right
>> target-abi metadata to LLVM modules that they produce, in order to enable
>> LTO. This also means we don’t need to rush target-independent changes to
>> TargetMachine into LLVM 10.0.
>>
>
> (1) seems fine/trivial-ishly OK to me & yeah, if it paves the way for (2)
> that's fine by me (pending other people who have more context than I do
> about any of this).
>
> Thanks for summarizing/looping back here!
>
> - Dave
>
>
>>
>> I welcome feedback on this plan.
>>
>> Sam
>>
>>
>> > On 15 Jan 2020, at 9:37 am, Zakk <zakk0610 at gmail.com> wrote:
>> >
>> >
>> >
>> > David Blaikie <dblaikie at gmail.com> 於 2020年1月14日 週二 上午2:15寫道：
>> >
>> >
>> > On Mon, Jan 13, 2020 at 6:12 AM Zakk <zakk0610 at gmail.com> wrote:
>> >
>> >
>> > David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> 於 2020年1月11日 週六
>> 上午2:03寫道：
>> > Ah, OK - thanks for walking me through that.
>> >
>> > Fair enough, I think I understand the issue/tradeoff now - and that the
>> other module level metadata don't currently influence the target
>> configuration at this level?
>> >
>> >
>> > I'm not sure, I only know that the target-abi is decided and computed
>> when instantiating the TargetMachine, and TragetMachine constructor doesn't
>> have IR as argument.
>> >
>> > Could you look into the other module level metadata to
>> understand/explain that, as it may help inform the design here?
>> >
>> > I grep "addModuleFlag" in clang and list all module level metadata:
>> >
>> > "NumRegisterParameters":  implement __attribute((regparm(3))) which is
>> x86 specific feature and used in X86TargetLowering phase.
>> > "cfguard": implement Control Flow Guard (Check mechanism) in IR level
>> pass.
>> > "StrictVTablePointers", "StrictVTablePointersRequirement": there is no
>> code use this info in llvm source tree
>> > "min_enum_size", "wchar_size":  indicate C type size, ARMAsmPrinter use
>> it to emit corresponded asm attributes.
>> > "Cross-DSO CFI", "CFI Canonical Jump Tables": implement control flow
>> integrity (CFI) schemes in IR level pass.
>> > "cf-protection-return" and "cf-protection-branch": implement control
>> flow architecture protection in x86, used in X86ISelLowering and
>> X86AsmPrinter phases.
>> > "nvvm-reflect-ftz":  implement optimized code paths that flush
>> subnormals to zero in NVPTX custom IR pass.
>> > "EnableSplitLTOUnit", "ThinLTO": LTO related stuff, I didn't look that
>> because I think they will not influence the target configuration.
>> >
>> >
>> > I didn't dig in those "Dwarf Version", "CodeView", "CodeViewGHash",
>> "Debug Info Version" and others Objective-C related info in module level
>> metadata.
>> >
>> > But I found several metadata info are handled at
>> AsmPrinter::EmitStartOfAsmFile(Module &M), but there is no any similar
>> interface in MC layer. ex. MCAsmBackend or MCStreamer.
>> >
>> >
>> > I guess this starts to come from the other direction for me, then: what
>> purpose does the "target triple" in the module textual IR serve? I guess
>> it's not used to create the backend, but at best used to validate that it's
>> correct/the same as what the backend's been configured with? (or perhaps
>> that is accessible before the Module proper is parsed?)
>> >
>> > I'm not sure too. For RISC-V as Sam mention before, it's not easy to
>> add ABI to the triple... but Yes, the target triple is accessible before
>> the Module proper is parsed in current codegen flow.
>> > so in my PoC https://reviews.llvm.org/D72245#change-sHyISc6hOqcy, I
>> need to changed the flow but I think static method is not a good idea.
>> >
>> > And this too. If the triple from the IR file is used in the places you
>> need this new ABI parameter, it might inform how best to pass this other
>> data through too.
>> >
>> > On Fri, Jan 10, 2020 at 9:09 AM Sam Elliott via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>> > I also work on the RISC-V backend, and have been doing a little work on
>> the ELF psABI document for RISC-V.
>> >
>> > I agree that, conceptually, the psABI choice should be in the module
>> metadata.
>> >
>> > Zakk, however, has discovered a phase ordering issue within LLVM that
>> relates to this approach. The phase ordering problem is that the LTO
>> backend is currently setup without interrogating the current module for any
>> information which might affect its setup. He has two patches that propose
>> two different ways forward, but he thinks neither is satisfactory. It would
>> be useful to have some guidance as to which approach is preferred by LLVM.
>> >
>> > Of course, it would be great if any such adopted approach was
>> compatible with Mips, given that backend seems to have a similar kind of
>> problem (though they have also managed to solve it different ways).
>> >
>> > As an aside, adding the ABI to the triple is not particularly easy, as,
>> for instance, there’s already half an ABI choice in the
>> ‘riscv64-unknown-linux-gnu` “triple”. The gnu/musl ABI choice is orthogonal
>> to the RISC-V calling convention choice around the use of software/hardware
>> floating point ABIs, and saying you have to specify both in the triple
>> seems like it will only confuse users.
>> >
>> > tldr: Module metadata seems like the right approach, feedback on which
>> of the two of Zakk’s patches (from the original email) forms the better
>> approach to get this information into the LTO backend setup would be
>> helpful.
>> >
>> > Sam
>> >
>> > > On 10 Jan 2020, at 12:23 am, David Blaikie via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>> > >
>> > > On Wed, Jan 8, 2020 at 5:57 PM Eric Christopher <echristo at gmail.com>
>> wrote:
>> > > Right. I think that's what we ended up doing rather than a more
>> general attribute on the module itself.
>> > >
>> > > What sort of further generality did you have in mind?
>> > >
>> > > *shrugs* Probably ok? I'd probably prefer not to have to have target
>> code to do the evaluation if possible,
>> > >
>> > > Target code to evaluate what, sorry? I'm not following (this whole
>> conversation's started being a bit confusing to me - from what I can see,
>> module metadata pretty well captures the feature request here (a module
>> level target-specific attribute that can fail during IR linking if linking
>> modules with inconsistent values for this attribute)
>> > >
>> > > but everything is weird and an edge case - mips abis more than some :)
>> > >
>> > > -eric
>> > >
>> > > On Wed, Jan 8, 2020 at 8:58 AM David Blaikie <dblaikie at gmail.com>
>> wrote:
>> > > Oh, I should say - the module flags metadata also has support for
>> "error if you try to merge two modules with different values for this flag".
>> > >
>> > > On Wed, Jan 8, 2020 at 8:57 AM David Blaikie <dblaikie at gmail.com>
>> wrote:
>> > >
>> > >
>> > > On Tue, Jan 7, 2020 at 5:27 PM Eric Christopher <echristo at gmail.com>
>> wrote:
>> > >
>> > >
>> > > On Tue, Jan 7, 2020 at 3:18 PM Daniel Sanders via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>> > >
>> > >
>> > >> On Jan 7, 2020, at 13:57, David Blaikie <dblaikie at gmail.com> wrote:
>> > >>
>> > >>
>> > >>
>> > >> On Mon, Jan 6, 2020 at 6:05 PM Daniel Sanders <
>> daniel_l_sanders at apple.com> wrote:
>> > >>
>> > >>
>> > >>> On Jan 6, 2020, at 14:29, David Blaikie via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>> > >>>
>> > >>>
>> > >>>
>> > >>> On Mon, Jan 6, 2020 at 5:58 AM Zakk <zakk0610 at gmail.com> wrote:
>> > >>>
>> > >>>
>> > >>> David Blaikie <dblaikie at gmail.com> 於 2020年1月6日 週一 下午2:23寫道：
>> > >>> If this is something that can vary per file in a compilation and
>> resolve correctly when one object file is built with one ABI and another
>> object file is built with a different ABI (that seems to be antithetical to
>> the concept of "ABI" Though) - then it should be a subtarget feature.
>> > >>>
>> > >>> ABI is generally something that has to be agreed upon across object
>> files - so it wouldn't make sense to link two object files with two
>> different ABIs. What's going on here that makes that valid in this case?
>> > >>>
>> > >>>
>> > >>> Are you talking about that "[mips] Pass ABI name via -target-abi
>> instead of target-features"?
>> > >>>
>> > >>> I'm not talking about that patch in particular (I have no specific
>> knowledge of mips or its implementation) - but speaking about the general
>> design of LLVM's subtarget features.
>> > >>>
>> > >>> Might be interesting to know why that change was made & may help
>> explain what's going on here.
>> > >>
>> > >> It's been a while so I don't remember the detail but IIRC one of the
>> reasons was that mips had a feature bit per ABI and had a lot of duplicated
>> code sanity checking that only one bit was enabled and deriving the ABI
>> from the feature bits. The -target-abi option already existed and using
>> that prevented the possibility of having more than one ABI selected.
>> > >>
>> > >> There was a lot of code (some of which didn't have access to target
>> features) in the backend that tried to derive the ABI from the arch
>> component of the triple (e.g. mips64 => n64 ABI) even though there were
>> multiple possible ABI's for each arch (mips64 => o32, n32, or n64 ABI's)
>> and there isn't a canonical choice for any given triple (it varies between
>> linux distributions and toolchains in general). Settling on -target-abi
>> allowed us to sort out the inconsistencies in the backends opinion of what
>> the selected ABI was. It also allowed us to move the selection of the ABI
>> into the frontend where disagreements between distributions/toolchains on
>> what each triple means was easier to deal with.
>> > >>
>> > >> Is this something that can vary per function in a program? (that
>> seems confusing to me - ABI is usually, sort of by definition, the thing
>> that all parts of the program have to agree with (at least on either side
>> of any function call - I suppose different functions could have different
>> ABIs as long as the function declarations carried ABI information so
>> callers could cooperate, etc)) It sounds to me like that's what Zakk is
>> suggesting/grappling with.
>> > >
>> > > No, it was a per-binary thing for mips and was stored in the ELF
>> header. Ignoring a couple quirks*, every object in the program had to agree
>> on the ABI in order to link.
>> > >
>> > > I'm not particularly familiar with LTO but going by the description
>> of the problem it seems to me that the overall issue is that for 1, 2, and
>> 5, each module fails to completely describe the contents. They each have a
>> label saying it's riscv64, elf, etc. but it doesn't mention lp64d anywhere.
>> As a result you can't check that you aren't trying to mix incompatible
>> modules and can only trust (and require) the command line option. It's
>> worth mentioning that DataLayout tends to change for different ABI's so the
>> ABI is kind-of there but there isn't anything that really guarantees that
>> there's a 1:1 relationship.
>> > >
>> > > 3 and 4 fix the problem of the missing labels but the snag with 4 is
>> that target features are overridable at the function level too and that
>> doesn't really make sense for ABI's (it's fine for calling conventions but
>> that's only part of the ABI and calling conventions are described elsewhere
>> in the IR anyway). Without changing the IR, 3 looks like the only one that
>> solves the overall problem but then you have potential for problems where
>> the official triple for a platform doesn't match what needs to be in the
>> triple metadata in the IR. For example, mips64-linux-gnu can be N32 or N64
>> ABI (or more rarely O32) depending on the
>> OS/distribution/toolchain/version. FWIW, back when I worked on it, we were
>> generally moving towards the idea of canonical triples which contained the
>> ABI and some lowering code on the user facing interfaces to disambiguate
>> things like mips64-linux-gnu to mips64-linux-gnuabin32.
>> > >
>> > >
>> > > To reply here a bit:
>> > >
>> > > I worry about target triple being used, but I think I do/did agree
>> that it's probably the best we can move to in the near term. My concern is
>> that we will have diverged from more "canonical" triples that are used in
>> other places just for our compilation model. I'd love to be able to encode
>> ABI into the module in some way and make it an error to link two modules
>> that have incompatible ABIs and am definitely up to ideas on how to encode
>> target specific module level data into the module. I'd like to avoid
>> metadata if possible. Any thoughts here on how you'd like to see it encoded
>> for the long term?
>> > >
>> > > I was going to say - module metadata has those semantics (but, yeah,
>> do have the "this is load bearing metadata" problem).
>> > >
>> > > But let's see what else is already there - NumRegisterParameters (no
>> idea what that is, but that sounds like an ABI feature/not something you
>> could drop & maintain correctness), Dwarf Version (kinda), CodeView
>> (kinda), PIC level (probably load bearing), PIE level (similar), Code Model
>> (load bearing).
>> > >
>> > > https://llvm.org/docs/LangRef.html#module-flags-metadata - yeah, I
>> think this is probably the right tool for this now, given what else is
>> already here. If someone wants to make this solution not metadata (but I
>> think this "global flags metadata" is specifically treated to have all the
>> semantics we want here, so I'm not sure what else would be created that
>> didn't look basically the same) then it can be done & all these things can
>> be ported over to whatever that new thing is.
>> > >
>> > >
>> > > *Just for completeness, the quirks I can remember off-hand were:
>> > > - IEEE754 1985 and 2008 would successfully cross-link unless you used
>> a flag indicating that it mattered. This was because we wanted to omit the
>> 1985 standard from newer chips but there were many ecosystems using it due
>> to historical reasons. In practice, very few programs care about the tiny
>> details (does negation trap, etc.) so we essentially force-migrated whole
>> ecosystems by relaxing the link requirements and changing the default.
>> > > - Along the same lines, we also supported cross-linking specific
>> variants of the O32 ABI. There was only supposed to be one O32 but an
>> unfortunate mis-reading of the ABI spec coupled with a failure to catch it
>> with conformance tests split it in two. Luckily, Matthew Fortune found a
>> way to reunite them without breaking either one by adding a third that
>> followed the original intent of the spec and was compatible with either one
>> (but not both at once) and then migrating everyone to that.
>> > >
>> > >
>> > > *sigh* I remember that. :)
>> > >
>> > > Thanks for chiming in Daniel!
>> > >
>> > > -eric
>> > >
>> > >> If it can vary per function, then the ABI information shouldn't be
>> used outside the per-function context (ie: no global variables/other output
>> could depend on the ABI because which function's ABI would it depend on?).
>> > >>
>> > >>
>> > >>> I don't know WHY -target-abi is passing via different option, not
>> via -mattr (subtarget feature)
>> > >>> maybe usually subtarget feature is used to manages different
>> specific ISA.
>> > >>>
>> > >>>
>> > >>> On Sun, Jan 5, 2020 at 10:04 PM Zakk via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>> > >>> Hi all.
>> > >>>
>> > >>> There are two steps in LTO codegen so the problem is how to pass
>> ABI info into LTO code generator.
>> > >>>
>> > >>> The easier way is pass -target-abi via option to LTO codegen, but
>> there is linking issue when linking two bitcodes generated by different
>> -mabi option. (see https://reviews.llvm.org/D71387#1792169)
>> > >>>
>> > >>> Usually the ABI info for a file is derived from target triple, mcpu
>> or -mabi, but in RISC-V, target-abi is only derived from -mabi and -mattr
>> option, so the one of solutions is encoding target-abi in IR via LLVM
>> module flags metadata.
>> > >>>
>> > >>> But there is an another issue in assembler. In current LLVM design,
>> there is no mechanism to extract info from IR before AsmBackend
>> construction, so I use some little weird approach to init target-abi option
>> before construct AsmBackend[1] or reassign target-abi option in
>> getSubtargetImpl and do some hack in backend[2].
>> > >>>
>> > >>> 1. https://reviews.llvm.org/D72245#change-sHyISc6hOqcy (see
>> llc.cpp)
>> > >>> 2. https://reviews.llvm.org/D72246 (see RISCVAsmBackend.h)
>> > >>>
>> > >>> I think [1] and [2] are not good enough, the other ideals like
>> > >>>
>> > >>> 3. encode target abi info in triple name. ex.
>> riscv64-unknown-elf-lp64d
>> > >>> 4. encode target-abi into in target-feature (maybe it's not a good
>> ideal because mips reverted this approach
>> > >>> before.
>> http://llvm.org/viewvc/llvm-project?view=revision&revision=227583)
>> > >>>
>> > >>> 5. users should pass target-abi themselves. (append
>> -Wl,-plugin-opt=-target-abi=ipl32f when compiling with -mabi=ilp32f)
>> > >>>
>> > >>> Is it a good idea to encode target-abi into bitcode?
>> > >>> If yes, is there another good approach to fix AsmBackend issue?
>> > >>> I’d appreciate any help or suggestions.
>> > >>>
>> > >>> Thanks.
>> > >>>
>> > >>> --
>> > >>> Best regards,
>> > >>> Kuan-Hsu
>> > >>>
>> > >>>
>> > >>>
>> > >>> _______________________________________________
>> > >>> LLVM Developers mailing list
>> > >>> llvm-dev at lists.llvm.org
>> > >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> > >>>
>> > >>>
>> > >>> --
>> > >>> Best regards,
>> > >>> Kuan-Hsu
>> > >>>
>> > >>>
>> > >>> _______________________________________________
>> > >>> LLVM Developers mailing list
>> > >>> llvm-dev at lists.llvm.org
>> > >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> > >
>> > > _______________________________________________
>> > > LLVM Developers mailing list
>> > > llvm-dev at lists.llvm.org
>> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> > > _______________________________________________
>> > > LLVM Developers mailing list
>> > > llvm-dev at lists.llvm.org
>> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >
>> > --
>> > Sam Elliott
>> > Software Developer - LLVM
>> > lowRISC CIC
>> > --
>> >
>> > _______________________________________________
>> > LLVM Developers mailing list
>> > llvm-dev at lists.llvm.org
>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> > _______________________________________________
>> > LLVM Developers mailing list
>> > llvm-dev at lists.llvm.org
>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >
>> >
>> > --
>> > Best regards,
>> > Kuan-Hsu
>> >
>> >
>> >
>> >
>> > --
>> > Best regards,
>> > Kuan-Hsu
>>
>> --
>> Sam Elliott
>> Software Developer - LLVM
>> lowRISC CIC
>> --
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200127/a40c3256/attachment-0001.html>