[llvm-dev] Encode target-abi into LLVM bitcode for LTO.
Zakk via llvm-dev
llvm-dev at lists.llvm.org
Wed Jan 15 01:37:44 PST 2020
David Blaikie <dblaikie at gmail.com> 於 2020年1月14日 週二 上午2:15寫道:
>
>
> On Mon, Jan 13, 2020 at 6:12 AM Zakk <zakk0610 at gmail.com> wrote:
>
>>
>>
>> David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> 於 2020年1月11日 週六
>> 上午2:03寫道:
>>
>>> Ah, OK - thanks for walking me through that.
>>>
>>> Fair enough, I think I understand the issue/tradeoff now - and that the
>>> other module level metadata don't currently influence the target
>>> configuration at this level?
>>>
>>>
>> I'm not sure, I only know that the target-abi is decided and computed
>> when instantiating the TargetMachine, and TragetMachine constructor doesn't
>> have IR as argument.
>>
>
> Could you look into the other module level metadata to understand/explain
> that, as it may help inform the design here?
>
>
I grep "addModuleFlag" in clang and list all module level metadata:
"NumRegisterParameters": implement __attribute((regparm(3))) which is x86
specific feature and used in X86TargetLowering phase.
"cfguard": implement Control Flow Guard (Check mechanism) in IR level pass.
"StrictVTablePointers", "StrictVTablePointersRequirement": there is no code
use this info in llvm source tree
"min_enum_size", "wchar_size": indicate C type size, ARMAsmPrinter use it
to emit corresponded asm attributes.
"Cross-DSO CFI", "CFI Canonical Jump Tables": implement control flow
integrity (CFI) schemes in IR level pass.
"cf-protection-return" and "cf-protection-branch": implement control flow
architecture protection in x86, used in X86ISelLowering and X86AsmPrinter
phases.
"nvvm-reflect-ftz": implement optimized code paths that flush subnormals
to zero in NVPTX custom IR pass.
"EnableSplitLTOUnit", "ThinLTO": LTO related stuff, I didn't look that
because I think they will not influence the target configuration.
I didn't dig in those "Dwarf Version", "CodeView", "CodeViewGHash", "Debug
Info Version" and others Objective-C related info in module level metadata.
But I found several metadata info are handled at
AsmPrinter::EmitStartOfAsmFile(Module &M), but there is no any similar
interface in MC layer. ex. MCAsmBackend or MCStreamer.
> I guess this starts to come from the other direction for me, then: what
>>> purpose does the "target triple" in the module textual IR serve? I guess
>>> it's not used to create the backend, but at best used to validate that it's
>>> correct/the same as what the backend's been configured with? (or perhaps
>>> that is accessible before the Module proper is parsed?)
>>>
>>
>> I'm not sure too. For RISC-V as Sam mention before, it's not easy to add
>> ABI to the triple... but Yes, the target triple is accessible before the
>> Module proper is parsed in current codegen flow.
>> so in my PoC https://reviews.llvm.org/D72245#change-sHyISc6hOqcy, I need
>> to changed the flow but I think static method is not a good idea.
>>
>
> And this too. If the triple from the IR file is used in the places you
> need this new ABI parameter, it might inform how best to pass this other
> data through too.
>
>
>> On Fri, Jan 10, 2020 at 9:09 AM Sam Elliott via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> I also work on the RISC-V backend, and have been doing a little work on
>>>> the ELF psABI document for RISC-V.
>>>>
>>>> I agree that, conceptually, the psABI choice should be in the module
>>>> metadata.
>>>>
>>>> Zakk, however, has discovered a phase ordering issue within LLVM that
>>>> relates to this approach. The phase ordering problem is that the LTO
>>>> backend is currently setup without interrogating the current module for any
>>>> information which might affect its setup. He has two patches that propose
>>>> two different ways forward, but he thinks neither is satisfactory. It would
>>>> be useful to have some guidance as to which approach is preferred by LLVM.
>>>>
>>>> Of course, it would be great if any such adopted approach was
>>>> compatible with Mips, given that backend seems to have a similar kind of
>>>> problem (though they have also managed to solve it different ways).
>>>>
>>>> As an aside, adding the ABI to the triple is not particularly easy, as,
>>>> for instance, there’s already half an ABI choice in the
>>>> ‘riscv64-unknown-linux-gnu` “triple”. The gnu/musl ABI choice is orthogonal
>>>> to the RISC-V calling convention choice around the use of software/hardware
>>>> floating point ABIs, and saying you have to specify both in the triple
>>>> seems like it will only confuse users.
>>>>
>>>> tldr: Module metadata seems like the right approach, feedback on which
>>>> of the two of Zakk’s patches (from the original email) forms the better
>>>> approach to get this information into the LTO backend setup would be
>>>> helpful.
>>>>
>>>> Sam
>>>>
>>>> > On 10 Jan 2020, at 12:23 am, David Blaikie via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>> >
>>>> > On Wed, Jan 8, 2020 at 5:57 PM Eric Christopher <echristo at gmail.com>
>>>> wrote:
>>>> > Right. I think that's what we ended up doing rather than a more
>>>> general attribute on the module itself.
>>>> >
>>>> > What sort of further generality did you have in mind?
>>>> >
>>>> > *shrugs* Probably ok? I'd probably prefer not to have to have target
>>>> code to do the evaluation if possible,
>>>> >
>>>> > Target code to evaluate what, sorry? I'm not following (this whole
>>>> conversation's started being a bit confusing to me - from what I can see,
>>>> module metadata pretty well captures the feature request here (a module
>>>> level target-specific attribute that can fail during IR linking if linking
>>>> modules with inconsistent values for this attribute)
>>>> >
>>>> > but everything is weird and an edge case - mips abis more than some :)
>>>> >
>>>> > -eric
>>>> >
>>>> > On Wed, Jan 8, 2020 at 8:58 AM David Blaikie <dblaikie at gmail.com>
>>>> wrote:
>>>> > Oh, I should say - the module flags metadata also has support for
>>>> "error if you try to merge two modules with different values for this flag".
>>>> >
>>>> > On Wed, Jan 8, 2020 at 8:57 AM David Blaikie <dblaikie at gmail.com>
>>>> wrote:
>>>> >
>>>> >
>>>> > On Tue, Jan 7, 2020 at 5:27 PM Eric Christopher <echristo at gmail.com>
>>>> wrote:
>>>> >
>>>> >
>>>> > On Tue, Jan 7, 2020 at 3:18 PM Daniel Sanders via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>> >
>>>> >
>>>> >> On Jan 7, 2020, at 13:57, David Blaikie <dblaikie at gmail.com> wrote:
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Mon, Jan 6, 2020 at 6:05 PM Daniel Sanders <
>>>> daniel_l_sanders at apple.com> wrote:
>>>> >>
>>>> >>
>>>> >>> On Jan 6, 2020, at 14:29, David Blaikie via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On Mon, Jan 6, 2020 at 5:58 AM Zakk <zakk0610 at gmail.com> wrote:
>>>> >>>
>>>> >>>
>>>> >>> David Blaikie <dblaikie at gmail.com> 於 2020年1月6日 週一 下午2:23寫道:
>>>> >>> If this is something that can vary per file in a compilation and
>>>> resolve correctly when one object file is built with one ABI and another
>>>> object file is built with a different ABI (that seems to be antithetical to
>>>> the concept of "ABI" Though) - then it should be a subtarget feature.
>>>> >>>
>>>> >>> ABI is generally something that has to be agreed upon across object
>>>> files - so it wouldn't make sense to link two object files with two
>>>> different ABIs. What's going on here that makes that valid in this case?
>>>> >>>
>>>> >>>
>>>> >>> Are you talking about that "[mips] Pass ABI name via -target-abi
>>>> instead of target-features"?
>>>> >>>
>>>> >>> I'm not talking about that patch in particular (I have no specific
>>>> knowledge of mips or its implementation) - but speaking about the general
>>>> design of LLVM's subtarget features.
>>>> >>>
>>>> >>> Might be interesting to know why that change was made & may help
>>>> explain what's going on here.
>>>> >>
>>>> >> It's been a while so I don't remember the detail but IIRC one of the
>>>> reasons was that mips had a feature bit per ABI and had a lot of duplicated
>>>> code sanity checking that only one bit was enabled and deriving the ABI
>>>> from the feature bits. The -target-abi option already existed and using
>>>> that prevented the possibility of having more than one ABI selected.
>>>> >>
>>>> >> There was a lot of code (some of which didn't have access to target
>>>> features) in the backend that tried to derive the ABI from the arch
>>>> component of the triple (e.g. mips64 => n64 ABI) even though there were
>>>> multiple possible ABI's for each arch (mips64 => o32, n32, or n64 ABI's)
>>>> and there isn't a canonical choice for any given triple (it varies between
>>>> linux distributions and toolchains in general). Settling on -target-abi
>>>> allowed us to sort out the inconsistencies in the backends opinion of what
>>>> the selected ABI was. It also allowed us to move the selection of the ABI
>>>> into the frontend where disagreements between distributions/toolchains on
>>>> what each triple means was easier to deal with.
>>>> >>
>>>> >> Is this something that can vary per function in a program? (that
>>>> seems confusing to me - ABI is usually, sort of by definition, the thing
>>>> that all parts of the program have to agree with (at least on either side
>>>> of any function call - I suppose different functions could have different
>>>> ABIs as long as the function declarations carried ABI information so
>>>> callers could cooperate, etc)) It sounds to me like that's what Zakk is
>>>> suggesting/grappling with.
>>>> >
>>>> > No, it was a per-binary thing for mips and was stored in the ELF
>>>> header. Ignoring a couple quirks*, every object in the program had to agree
>>>> on the ABI in order to link.
>>>> >
>>>> > I'm not particularly familiar with LTO but going by the description
>>>> of the problem it seems to me that the overall issue is that for 1, 2, and
>>>> 5, each module fails to completely describe the contents. They each have a
>>>> label saying it's riscv64, elf, etc. but it doesn't mention lp64d anywhere.
>>>> As a result you can't check that you aren't trying to mix incompatible
>>>> modules and can only trust (and require) the command line option. It's
>>>> worth mentioning that DataLayout tends to change for different ABI's so the
>>>> ABI is kind-of there but there isn't anything that really guarantees that
>>>> there's a 1:1 relationship.
>>>> >
>>>> > 3 and 4 fix the problem of the missing labels but the snag with 4 is
>>>> that target features are overridable at the function level too and that
>>>> doesn't really make sense for ABI's (it's fine for calling conventions but
>>>> that's only part of the ABI and calling conventions are described elsewhere
>>>> in the IR anyway). Without changing the IR, 3 looks like the only one that
>>>> solves the overall problem but then you have potential for problems where
>>>> the official triple for a platform doesn't match what needs to be in the
>>>> triple metadata in the IR. For example, mips64-linux-gnu can be N32 or N64
>>>> ABI (or more rarely O32) depending on the
>>>> OS/distribution/toolchain/version. FWIW, back when I worked on it, we were
>>>> generally moving towards the idea of canonical triples which contained the
>>>> ABI and some lowering code on the user facing interfaces to disambiguate
>>>> things like mips64-linux-gnu to mips64-linux-gnuabin32.
>>>> >
>>>> >
>>>> > To reply here a bit:
>>>> >
>>>> > I worry about target triple being used, but I think I do/did agree
>>>> that it's probably the best we can move to in the near term. My concern is
>>>> that we will have diverged from more "canonical" triples that are used in
>>>> other places just for our compilation model. I'd love to be able to encode
>>>> ABI into the module in some way and make it an error to link two modules
>>>> that have incompatible ABIs and am definitely up to ideas on how to encode
>>>> target specific module level data into the module. I'd like to avoid
>>>> metadata if possible. Any thoughts here on how you'd like to see it encoded
>>>> for the long term?
>>>> >
>>>> > I was going to say - module metadata has those semantics (but, yeah,
>>>> do have the "this is load bearing metadata" problem).
>>>> >
>>>> > But let's see what else is already there - NumRegisterParameters (no
>>>> idea what that is, but that sounds like an ABI feature/not something you
>>>> could drop & maintain correctness), Dwarf Version (kinda), CodeView
>>>> (kinda), PIC level (probably load bearing), PIE level (similar), Code Model
>>>> (load bearing).
>>>> >
>>>> > https://llvm.org/docs/LangRef.html#module-flags-metadata - yeah, I
>>>> think this is probably the right tool for this now, given what else is
>>>> already here. If someone wants to make this solution not metadata (but I
>>>> think this "global flags metadata" is specifically treated to have all the
>>>> semantics we want here, so I'm not sure what else would be created that
>>>> didn't look basically the same) then it can be done & all these things can
>>>> be ported over to whatever that new thing is.
>>>> >
>>>> >
>>>> > *Just for completeness, the quirks I can remember off-hand were:
>>>> > - IEEE754 1985 and 2008 would successfully cross-link unless you used
>>>> a flag indicating that it mattered. This was because we wanted to omit the
>>>> 1985 standard from newer chips but there were many ecosystems using it due
>>>> to historical reasons. In practice, very few programs care about the tiny
>>>> details (does negation trap, etc.) so we essentially force-migrated whole
>>>> ecosystems by relaxing the link requirements and changing the default.
>>>> > - Along the same lines, we also supported cross-linking specific
>>>> variants of the O32 ABI. There was only supposed to be one O32 but an
>>>> unfortunate mis-reading of the ABI spec coupled with a failure to catch it
>>>> with conformance tests split it in two. Luckily, Matthew Fortune found a
>>>> way to reunite them without breaking either one by adding a third that
>>>> followed the original intent of the spec and was compatible with either one
>>>> (but not both at once) and then migrating everyone to that.
>>>> >
>>>> >
>>>> > *sigh* I remember that. :)
>>>> >
>>>> > Thanks for chiming in Daniel!
>>>> >
>>>> > -eric
>>>> >
>>>> >> If it can vary per function, then the ABI information shouldn't be
>>>> used outside the per-function context (ie: no global variables/other output
>>>> could depend on the ABI because which function's ABI would it depend on?).
>>>> >>
>>>> >>
>>>> >>> I don't know WHY -target-abi is passing via different option, not
>>>> via -mattr (subtarget feature)
>>>> >>> maybe usually subtarget feature is used to manages different
>>>> specific ISA.
>>>> >>>
>>>> >>>
>>>> >>> On Sun, Jan 5, 2020 at 10:04 PM Zakk via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>> >>> Hi all.
>>>> >>>
>>>> >>> There are two steps in LTO codegen so the problem is how to pass
>>>> ABI info into LTO code generator.
>>>> >>>
>>>> >>> The easier way is pass -target-abi via option to LTO codegen, but
>>>> there is linking issue when linking two bitcodes generated by different
>>>> -mabi option. (see https://reviews.llvm.org/D71387#1792169)
>>>> >>>
>>>> >>> Usually the ABI info for a file is derived from target triple, mcpu
>>>> or -mabi, but in RISC-V, target-abi is only derived from -mabi and -mattr
>>>> option, so the one of solutions is encoding target-abi in IR via LLVM
>>>> module flags metadata.
>>>> >>>
>>>> >>> But there is an another issue in assembler. In current LLVM design,
>>>> there is no mechanism to extract info from IR before AsmBackend
>>>> construction, so I use some little weird approach to init target-abi option
>>>> before construct AsmBackend[1] or reassign target-abi option in
>>>> getSubtargetImpl and do some hack in backend[2].
>>>> >>>
>>>> >>> 1. https://reviews.llvm.org/D72245#change-sHyISc6hOqcy (see
>>>> llc.cpp)
>>>> >>> 2. https://reviews.llvm.org/D72246 (see RISCVAsmBackend.h)
>>>> >>>
>>>> >>> I think [1] and [2] are not good enough, the other ideals like
>>>> >>>
>>>> >>> 3. encode target abi info in triple name. ex.
>>>> riscv64-unknown-elf-lp64d
>>>> >>> 4. encode target-abi into in target-feature (maybe it's not a good
>>>> ideal because mips reverted this approach
>>>> >>> before.
>>>> http://llvm.org/viewvc/llvm-project?view=revision&revision=227583)
>>>> >>>
>>>> >>> 5. users should pass target-abi themselves. (append
>>>> -Wl,-plugin-opt=-target-abi=ipl32f when compiling with -mabi=ilp32f)
>>>> >>>
>>>> >>> Is it a good idea to encode target-abi into bitcode?
>>>> >>> If yes, is there another good approach to fix AsmBackend issue?
>>>> >>> I’d appreciate any help or suggestions.
>>>> >>>
>>>> >>> Thanks.
>>>> >>>
>>>> >>> --
>>>> >>> Best regards,
>>>> >>> Kuan-Hsu
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> _______________________________________________
>>>> >>> LLVM Developers mailing list
>>>> >>> llvm-dev at lists.llvm.org
>>>> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> >>>
>>>> >>>
>>>> >>> --
>>>> >>> Best regards,
>>>> >>> Kuan-Hsu
>>>> >>>
>>>> >>>
>>>> >>> _______________________________________________
>>>> >>> LLVM Developers mailing list
>>>> >>> llvm-dev at lists.llvm.org
>>>> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> >
>>>> > _______________________________________________
>>>> > LLVM Developers mailing list
>>>> > llvm-dev at lists.llvm.org
>>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> > _______________________________________________
>>>> > LLVM Developers mailing list
>>>> > llvm-dev at lists.llvm.org
>>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>> --
>>>> Sam Elliott
>>>> Software Developer - LLVM
>>>> lowRISC CIC
>>>> --
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>
>>
>> --
>> Best regards,
>> Kuan-Hsu
>>
>>
>>
--
Best regards,
Kuan-Hsu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200115/abe5d054/attachment.html>
More information about the llvm-dev
mailing list