[llvm-dev] Encode target-abi into LLVM bitcode for LTO.

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Wed Jan 8 08:58:00 PST 2020


Oh, I should say - the module flags metadata also has support for "error if
you try to merge two modules with different values for this flag".
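
To make that merge behavior concrete, here is a toy Python model of it. This is only a sketch and not LLVM's API: the function `merge_module_flags`, the exception `ModuleFlagConflict`, the "error" behavior string and the flag name "target-abi" are all assumptions made up for illustration. It models just the one behavior mentioned above, where two modules that carry different values for the same flag cannot be merged.

```python
class ModuleFlagConflict(Exception):
    """Raised when two modules disagree on a flag with "error" behavior."""


def merge_module_flags(dst, src):
    """Return the flags of dst merged with the flags of src.

    Both arguments map a flag name to a (behavior, value) pair. Only the
    hypothetical "error" behavior is modeled: if both modules define the
    flag and the values differ, the merge fails.
    """
    merged = dict(dst)
    for name, (behavior, value) in src.items():
        if name not in merged:
            # Flag only present in src: carry it over unchanged.
            merged[name] = (behavior, value)
            continue
        _, existing_value = merged[name]
        if behavior == "error" and existing_value != value:
            raise ModuleFlagConflict(
                f"flag {name!r}: {existing_value!r} vs {value!r}"
            )
    return merged


# Two modules built with the same ABI merge cleanly.
module_a = {"target-abi": ("error", "lp64d")}
module_b = {"target-abi": ("error", "lp64d")}
merged_flags = merge_module_flags(module_a, module_b)

# A module built with a different ABI is rejected at merge time.
module_c = {"target-abi": ("error", "ilp32f")}
try:
    merge_module_flags(module_a, module_c)
    conflict_detected = False
except ModuleFlagConflict:
    conflict_detected = True
```

The point of the sketch is that the mismatch is caught when the modules are merged, rather than relying on the user to pass a consistent option on the link line.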

On Wed, Jan 8, 2020 at 8:57 AM David Blaikie <dblaikie at gmail.com> wrote:

>
>
> On Tue, Jan 7, 2020 at 5:27 PM Eric Christopher <echristo at gmail.com>
> wrote:
>
>>
>>
>> On Tue, Jan 7, 2020 at 3:18 PM Daniel Sanders via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>>
>>>
>>> On Jan 7, 2020, at 13:57, David Blaikie <dblaikie at gmail.com> wrote:
>>>
>>>
>>>
>>> On Mon, Jan 6, 2020 at 6:05 PM Daniel Sanders <
>>> daniel_l_sanders at apple.com> wrote:
>>>
>>>>
>>>>
>>>> On Jan 6, 2020, at 14:29, David Blaikie via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>
>>>>
>>>> On Mon, Jan 6, 2020 at 5:58 AM Zakk <zakk0610 at gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Jan 6, 2020 at 2:23 PM David Blaikie <dblaikie at gmail.com> wrote:
>>>>>
>>>>>> If this is something that can vary per file in a compilation and
>>>>>> resolve correctly when one object file is built with one ABI and another
>>>>>> object file is built with a different ABI (that seems to be antithetical to
>>>>>> the concept of "ABI" though) - then it should be a subtarget feature.
>>>>>>
>>>>>> ABI is generally something that has to be agreed upon across object
>>>>>> files - so it wouldn't make sense to link two object files with two
>>>>>> different ABIs. What's going on here that makes that valid in this case?
>>>>>>
>>>>>>
>>>>> Are you referring to the patch "[mips] Pass ABI name via -target-abi
>>>>> instead of target-features"?
>>>>>
>>>>
>>>> I'm not talking about that patch in particular (I have no specific
>>>> knowledge of mips or its implementation) - but speaking about the general
>>>> design of LLVM's subtarget features.
>>>>
>>>> Might be interesting to know why that change was made & may help
>>>> explain what's going on here.
>>>>
>>>>
>>>> It's been a while so I don't remember the detail but IIRC one of the
>>>> reasons was that mips had a feature bit per ABI and had a lot of duplicated
>>>> code sanity checking that only one bit was enabled and deriving the ABI
>>>> from the feature bits. The -target-abi option already existed and using
>>>> that prevented the possibility of having more than one ABI selected.
>>>>
>>>> There was a lot of code (some of which didn't have access to target
>>>> features) in the backend that tried to derive the ABI from the arch
>>>> component of the triple (e.g. mips64 => n64 ABI) even though there were
>>>> multiple possible ABIs for each arch (mips64 => o32, n32, or n64 ABIs)
>>>> and there isn't a canonical choice for any given triple (it varies between
>>>> Linux distributions and toolchains in general). Settling on -target-abi
>>>> allowed us to sort out the inconsistencies in the backend's opinion of what
>>>> the selected ABI was. It also allowed us to move the selection of the ABI
>>>> into the frontend, where disagreements between distributions/toolchains on
>>>> what each triple means were easier to deal with.
>>>>
>>>
>>> Is this something that can vary per function in a program? (that seems
>>> confusing to me - ABI is usually, sort of by definition, the thing that all
>>> parts of the program have to agree with (at least on either side of any
>>> function call - I suppose different functions could have different ABIs as
>>> long as the function declarations carried ABI information so callers could
>>> cooperate, etc)) It sounds to me like that's what Zakk is
>>> suggesting/grappling with.
>>>
>>>
>>> No, it was a per-binary thing for mips and was stored in the ELF header.
>>> Ignoring a couple quirks*, every object in the program had to agree on the
>>> ABI in order to link.
>>>
>>> I'm not particularly familiar with LTO but going by the description of
>>> the problem it seems to me that the overall issue is that for 1, 2, and 5,
>>> each module fails to completely describe the contents. They each have a
>>> label saying it's riscv64, elf, etc. but it doesn't mention lp64d anywhere.
>>> As a result you can't check that you aren't trying to mix incompatible
>>> modules and can only trust (and require) the command line option. It's
>>> worth mentioning that DataLayout tends to change for different ABIs so the
>>> ABI is kind-of there but there isn't anything that really guarantees that
>>> there's a 1:1 relationship.
>>>
>>> 3 and 4 fix the problem of the missing labels but the snag with 4 is
>>> that target features are overridable at the function level too and that
>>> doesn't really make sense for ABIs (it's fine for calling conventions but
>>> that's only part of the ABI and calling conventions are described elsewhere
>>> in the IR anyway). Without changing the IR, 3 looks like the only one that
>>> solves the overall problem but then you have potential for problems where
>>> the official triple for a platform doesn't match what needs to be in the
>>> triple metadata in the IR. For example, mips64-linux-gnu can be N32 or N64
>>> ABI (or more rarely O32) depending on the
>>> OS/distribution/toolchain/version. FWIW, back when I worked on it, we were
>>> generally moving towards the idea of canonical triples which contained the
>>> ABI and some lowering code on the user facing interfaces to disambiguate
>>> things like mips64-linux-gnu to mips64-linux-gnuabin32.
>>>
>>>
>> To reply here a bit:
>>
>> I worry about target triple being used, but I think I do/did agree that
>> it's probably the best we can move to in the near term. My concern is that
>> we will have diverged from more "canonical" triples that are used in other
>> places just for our compilation model. I'd love to be able to encode ABI
>> into the module in some way and make it an error to link two modules that
>> have incompatible ABIs and am definitely open to ideas on how to encode
>> target-specific module-level data into the module. I'd like to avoid
>> metadata if possible. Any thoughts here on how you'd like to see it encoded
>> for the long term?
>>
>
> I was going to say - module metadata has those semantics (but, yeah, does
> have the "this is load bearing metadata" problem).
>
> But let's see what else is already there - NumRegisterParameters (no idea
> what that is, but that sounds like an ABI feature/not something you could
> drop & maintain correctness), Dwarf Version (kinda), CodeView (kinda), PIC
> level (probably load bearing), PIE level (similar), Code Model (load
> bearing).
>
> https://llvm.org/docs/LangRef.html#module-flags-metadata - yeah, I think
> this is probably the right tool for this now, given what else is already
> here. If someone wants a solution that is not metadata, that can be done and
> all these things can be ported over to whatever that new thing is (though I
> think this "global flags metadata" is specifically treated to have all the
> semantics we want here, so I'm not sure what else would be created that
> didn't look basically the same).
>
>
>>
>>
>>> *Just for completeness, the quirks I can remember off-hand were:
>>> - IEEE754 1985 and 2008 would successfully cross-link unless you used a
>>> flag indicating that it mattered. This was because we wanted to omit the
>>> 1985 standard from newer chips but there were many ecosystems using it due
>>> to historical reasons. In practice, very few programs care about the tiny
>>> details (does negation trap, etc.) so we essentially force-migrated whole
>>> ecosystems by relaxing the link requirements and changing the default.
>>> - Along the same lines, we also supported cross-linking specific
>>> variants of the O32 ABI. There was only supposed to be one O32 but an
>>> unfortunate mis-reading of the ABI spec coupled with a failure to catch it
>>> with conformance tests split it in two. Luckily, Matthew Fortune found a
>>> way to reunite them without breaking either one by adding a third that
>>> followed the original intent of the spec and was compatible with either one
>>> (but not both at once) and then migrating everyone to that.
>>>
>>>
>> *sigh* I remember that. :)
>>
>> Thanks for chiming in Daniel!
>>
>> -eric
>>
>>
>>> If it can vary per function, then the ABI information shouldn't be used
>>> outside the per-function context (ie: no global variables/other output
>>> could depend on the ABI because which function's ABI would it depend on?).
>>>
>>>
>>>>
>>>>> I don't know why -target-abi is passed via a separate option rather
>>>>> than via -mattr (subtarget features). Perhaps subtarget features are
>>>>> usually used to manage ISA-specific differences.
>>>>>
>>>>>
>>>>>
>>>>>> On Sun, Jan 5, 2020 at 10:04 PM Zakk via llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> Hi all.
>>>>>>>
>>>>>>> There are two steps in LTO codegen, so the problem is how to pass ABI
>>>>>>> info into the LTO code generator.
>>>>>>>
>>>>>>> The easier way is to pass -target-abi as an option to LTO codegen, but
>>>>>>> there is a linking issue when linking two bitcode files generated with
>>>>>>> different -mabi options. (see https://reviews.llvm.org/D71387#1792169)
>>>>>>>
>>>>>>> Usually the ABI info for a file is derived from the target triple, mcpu
>>>>>>> or -mabi, but in RISC-V, target-abi is derived only from the -mabi and
>>>>>>> -mattr options, so one solution is to encode target-abi in the IR via
>>>>>>> LLVM module flags metadata.
>>>>>>>
>>>>>>> But there is another issue in the assembler. In the current LLVM design,
>>>>>>> there is no mechanism to extract info from the IR before AsmBackend
>>>>>>> construction, so I use a somewhat odd approach: either initialize the
>>>>>>> target-abi option before constructing AsmBackend[1], or reassign the
>>>>>>> target-abi option in getSubtargetImpl and do some hacks in the backend[2].
>>>>>>>
>>>>>>> 1. https://reviews.llvm.org/D72245#change-sHyISc6hOqcy (see llc.cpp)
>>>>>>> 2. https://reviews.llvm.org/D72246 (see RISCVAsmBackend.h)
>>>>>>>
>>>>>>> I think [1] and [2] are not good enough. Other ideas include:
>>>>>>>
>>>>>>> 3. encode target-abi info in the triple name, e.g.
>>>>>>> riscv64-unknown-elf-lp64d
>>>>>>> 4. encode target-abi in target-features (maybe not a good idea,
>>>>>>> because mips reverted this approach before:
>>>>>>> http://llvm.org/viewvc/llvm-project?view=revision&revision=227583)
>>>>>>>
>>>>>>> 5. users pass target-abi themselves (append
>>>>>>> -Wl,-plugin-opt=-target-abi=ilp32f when compiling with -mabi=ilp32f)
>>>>>>>
>>>>>>> Is it a good idea to encode target-abi into bitcode?
>>>>>>> If yes, is there another good approach to fix AsmBackend issue?
>>>>>>> I’d appreciate any help or suggestions.
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Kuan-Hsu
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> llvm-dev at lists.llvm.org
>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Kuan-Hsu
>>>>>
>>>>>