[PATCH] D31528: [ELF][MIPS] Multi-GOT implementation

Sean Silva via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 19 17:12:17 PDT 2017


After looking quickly at the patch, I think it is actually fairly
reasonable. The complexity seems to be mostly contained inside
mips-specific code. In fact, it does not even add an "if MIPS" (it actually
removes one). However, I can see how this would add extra complexity to
parallelizing, so we probably want to understand the interaction there
better.

It seems this feature is needed for LLD to be a production MIPS linker, so
we should support it. We just need to think about how best to do it so that
it doesn't become too much a burden. I think we should consider this patch
on its own merits and independently from things like the "isMIPS64EL"
craziness.

-- Sean Silva

On Monday, April 10, 2017, Rui Ueyama via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

> On Mon, Apr 10, 2017 at 5:13 PM, Davide Italiano <davide at freebsd.org
> <javascript:_e(%7B%7D,'cvml','davide at freebsd.org');>> wrote:
>
>> On Mon, Apr 10, 2017 at 4:58 PM, Rui Ueyama via llvm-commits
>> <llvm-commits at lists.llvm.org
>> <javascript:_e(%7B%7D,'cvml','llvm-commits at lists.llvm.org');>> wrote:
>> > On Sat, Apr 8, 2017 at 12:08 AM, Simon Atanasyan <simon at atanasyan.com
>> <javascript:_e(%7B%7D,'cvml','simon at atanasyan.com');>>
>> > wrote:
>> >>
>> >> On Tue, Apr 4, 2017 at 9:09 PM, Rui Ueyama <ruiu at google.com
>> <javascript:_e(%7B%7D,'cvml','ruiu at google.com');>> wrote:
>> >> > On Tue, Apr 4, 2017 at 5:46 AM, Simon Atanasyan <simon at atanasyan.com
>> <javascript:_e(%7B%7D,'cvml','simon at atanasyan.com');>>
>> >> > wrote:
>> >> >>
>> >> >> On Sat, Apr 1, 2017 at 4:20 AM, Rui Ueyama via Phabricator
>> >> >> <reviews at reviews.llvm.org
>> <javascript:_e(%7B%7D,'cvml','reviews at reviews.llvm.org');>> wrote:
>> >> >> >
>> >> >> > This is not your fault, but I have to say that this MIPS GOT
>> layout
>> >> >> > is
>> >> >> > very odd,
>> >> >> > too different from other architectures, and too complicated. I
>> want
>> >> >> > to
>> >> >> > avoid supporting
>> >> >> > this unless I'm convinced that it is absolutely necessary. It
>> seems
>> >> >> > to
>> >> >> > me that MIPS
>> >> >> > needs a clean, common new ABI. Only the MIPS ABI imposes a lot of
>> >> >> > restrictions
>> >> >> > on the size of GOT sections and the order of GOT section members,
>> >> >> > even
>> >> >> > though MIPS
>> >> >> > as a processor is an ordinary RISC ISA. This change would really
>> hurt
>> >> >> > maintainability
>> >> >> > of LLD which I already found some MIPS-specific behavior is hard
>> to
>> >> >> > keep
>> >> >> > it right
>> >> >> > when editing code for all the other architectures.
>> >> >>
>> >> >> MIPS will not always use old, obsoleted ABIs. It will switch to new
>> >> >> one. But it does not
>> >> >> happen this year or so. Besides other obstacles, there is a hardware
>> >> >> problem prevents from
>> >> >> fast switching and common acceptance of the new ABI. Historically
>> many
>> >> >> MIPS instructions
>> >> >> are partitioned as 16 bit for opcode and 16 bit bit for
>> address/index.
>> >> >> That is one of
>> >> >> the source of GOT size limitation and reason of multi-GOT invention.
>> >> >>
>> >> >> The biggest part of the patch isolated in the MipsGotSection class.
>> It
>> >> >> adds some new
>> >> >> MIPS specific code like new constructor of the DynamicReloc class.
>> But
>> >> >> at the same
>> >> >> time it removes some `if (Config->EMachine == EM_MIPS)` statements
>> and
>> >> >> MIPS specific
>> >> >> fields from the `SymbolBody` class.
>> >> >
>> >> >
>> >> > It is isolated as a separate class, but we still need to understand
>> and
>> >> > modify it when we need to do something for relocation processing. I'm
>> >> > actually trying to change the design of relocation processing, to
>> >> > increase
>> >> > parallelism of relocation processing. We can't parallelize it
>> entirely,
>> >> > but
>> >> > some part (such as making a decision whether a symbol needs a GOT
>> slot
>> >> > or
>> >> > not) can be processed per-file or per-relocation basis.
>> >> >
>> >> > Then I found that this part of code is very complex and has grown
>> >> > organically. I tried to reduce its complexity and found that keeping
>> >> > everything right for MIPS is hard. I'm really don't want to increase
>> >> > complexity of this code. If you increase the complexity, I won't be
>> able
>> >> > to
>> >> > refactor it anymore because I'm struggling to do that even for the
>> >> > current
>> >> > code.
>> >> >
>> >> > In addition to that, the MIPS multi-GOT ABI doesn't seem a right
>> design
>> >> > to
>> >> > me. If multi-GOT is in use, only the first GOT is recognized as a
>> real
>> >> > GOT
>> >> > by the dynamic linker, and secondary GOTs are just some sections that
>> >> > simulates GOT. It's too hacky, isn't it?
>> >> >
>> >> >> > I wonder what is the performance penalty you would have to pay
>> when
>> >> >> > you
>> >> >> > use the -mxgot
>> >> >> > option. With the option, you'll need three instructions as
>> opposed to
>> >> >> > a
>> >> >> > single instruction
>> >> >> > to access an GOT entry. Does that actually make observable
>> difference
>> >> >> > in
>> >> >> > performance?
>> >> >>
>> >> >> Regular (without -mxgot) access to GOT requires a single
>> instruction:
>> >> >>
>> >> >> lw  t9,0(gp)
>> >> >>
>> >> >> I was wrong when say about two instructions. With -mxgot option we
>> get
>> >> >> three instructions.
>> >> >>
>> >> >> lui     at,0x0
>> >> >> addu    at,at,gp
>> >> >> lw      t9,0(at)
>> >> >>
>> >> >> In case of MIPS global offset table is used not only to call
>> external
>> >> >> functions / access
>> >> >> external data but for local calls / access under some conditions. So
>> >> >> using -mxgot we can
>> >> >> easily grow the code size and reduce performance.
>> >> >
>> >> >
>> >> > How much is the actual performance hit?
>> >>
>> >> Multi-GOT is an attempt to bypass say limitation of MIPS architecture.
>> >> It's not my invention, this feature was implemented in GNU linker more
>> >> than ten years ago. Every time when GOT exceeds ~64KB limit BFD and
>> >> gold linkers create multi-GOT layout.
>> >>
>> >> I do not think that my implementation of multi-GOT makes LLD much more
>> >> complicated. General idea remains the same - collect information about
>> >> various type of required GOT entries, layout GOT entries, write this
>> >> layout. Merging multiple GOT created for each file into larger GOT is
>> >> rather complicated routine though. From another side, creating a
>> >> separate GOT for each input file makes possible to parallelize this
>> >> process. Current implementation, where MipsGotSection maintains a
>> >> single `GotEntries` vector for all files, does not allow to process
>> >> multiple input files at the same time without some sort of "locks".
>> >
>> >
>> > I understand that you are just trying to implement a MIPS ABI, and I
>> also
>> > understand that you made your effort to write good code. Your code
>> seems to
>> > be a straightforward implementation of the ABI if I understand it
>> correctly.
>> > But still new code inevitably adds complexity, and that's particularly
>> true
>> > for this patch that introduces a new notion of "multi-GOT" only for
>> MIPS.
>> > Also, it is not my fault to say that this feature is too odd, because I
>> > think it's a consequence of MIPS ABI's peculiarities. I believe many
>> > peculiarities in the MIPS ABI could have been fixed by now since they
>> were
>> > implemented more than 10 years ago.
>> >
>> > I really do not want to add this much complexity to our relocation
>> > processing code which is already too complicated. Even I don't
>> understand
>> > the exact behavior of the current code, and I'm am trying to refactor
>> that
>> > code now. This patch could make my refactoring impossible.
>> >
>> >> Performance degradation in case of using -mxgot depends on
>> >> application. My tests show that application use -mxgot slower on
>> >> 1%-4%. But it's more important that there are large applications which
>> >> cannot be linked without multi-GOT at all even if they built with
>> >> -mxgot option. Because there are some relocations which operate by
>> >> 16-bit GOT index only.
>> >>
>>
>> I don't necessarily disagree, but what you're saying is true for a
>> bunch of features in MIPS, which is the in-tree lld backend with more,
>> let's say, "peculiarities". I'm under the impression that as a project
>> we should make a call and decide whether we want to support the MIPS
>> ABI entirely or just don't support it at all (unless there's an OK
>> reason to drop some features). In this case, unless I'm reading the
>> patch incorrectly, this is needed to link/self-host clang/lld, so this
>> feature seems needed (in some form). What do you think?
>>
>
> We should support MIPS, but my point is that 1) the current code is
> already complicated that even I do not understand, and 2) this patch will
> likely to make it impossible to hack anymore because of an additional
> complexity. So adding this code right now is not a good idea. I want to
> explore other way to circumvent this ABI, and if it is impossible to do, I
> want to implement it in a really good way, probably after cleaning up the
> current code with the multi-GOT in mind.
>
> Also, the MIPS ABI needs updating. Only MIPS is very weird among ELF ABIs.
> Someone really needs to make an effort to streamline it. I'm ok to accept
> MIPS peculiarities if it is transient ("transient" can be like 10 years),
> but if no effort is being made to make it compatible with other ELF ABIs, I
> think I can say "no, this really needs fixing."
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170419/9d37bf85/attachment.html>


More information about the llvm-commits mailing list