[llvm-dev] RFC: On removing magic numbers assuming 8-bit bytes

James Y Knight via llvm-dev llvm-dev at lists.llvm.org
Thu May 23 14:02:12 PDT 2019


You have raised an interesting question here about model for
decision-making, which unfortunately I don't think anyone really has an
answer for. I certainly don't.

Anyways, yes, my response was indeed intended to be mostly-positive
encouragement.

It seems to me that LLVM, as a project, generally encourages doing as much
development in the upstream project as possible, while also being generally
welcoming and helpful to closed-source downstream forks. That said, we
cannot ignore the critical requirement that upstream developers continue to
be able to make changes without having to remember all of the arbitrary
untested/undocumented constraints closed-source downstream project may have.

That's why I like the formulation of this proposal as a "code clean-up",
rather than a "feature". As a feature, it really cannot exist without a
backend using it upstream, because we have no ability to test it, so it
will be necessarily be broken. On the other hand, as a cleanup, it's
entirely acceptable that only some of the code has abstracted away the
number 8.

I'd suggest that this proposed cleanup should be allowed to happen -- but
that upstream, we do NOT expose it as an overrideable value (e.g., don't
put it in a target hook). Instead, make a non-overrideble function which
always just returns 8. Document it as returning the number of bits in a
character -- and that it's always 8 right now, but with the possibility
that someday llvm may handle other bit-widths.

Any downstream users who wish to change the value now, will need to patch
it to be a target hook (or whatever) in their fork -- and also fix anything
that's not using the abstraction properly, because we don't actually
support any other values upstream (due to no ability to test them).

On Tue, May 14, 2019 at 7:09 AM Jesper Antonsson via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Thanks to everyone that have weighed in on this! There have been
> remarks along the lines that the proposal lacks consensus to move
> forward, and that might very well be true. However, the decision model
> is not fully clear to me, so before giving up, I'd like to summarize
> the discussion to the best of my ability and ask for clarification on
> how the community makes its decisions.
>
> (First, I would like to emphasize that this is merely my
> interpretation, which may be flawed. If so, my apologies.) The RFC
> tl;dr is "without supporting non-8-bit bytes, we want to abstract the
> magic 8's used for bytesize to a DataLayout getter".
>
> It garnered some five supporters, including four companies representing
> out-of-tree backends: my own Ericsson, Synopsys, CML Microsystems and
> Codasip. I also count James Y Knight's (Google) remarks as positive.
>
> Two commenters I interpret as clearly against, among other things
> citing the need for a committment for an in-tree-backend using non-8-
> bit bytes to undertake such a change: JF Bastien and Philip Reames.
>
> Nine additional commenters provided side remarks to the discussion, and
> my interpretation was that five were in some small regard abstraction-
> positive as they provided input on the details of the abstraction,
> whereas two were more skeptical, and two were neither.
>
>
> Now to the decision model: If full consensus is required, we clearly
> don't have it, and we can stop here. However, if the abstraction were
> in place, we'd have a large majority against replacing them with 8s.
> And I guess it would be hard to move forward on very many things if
> everyone in a huge community can veto?
>
> Or should we take an insider-outsider perspective, then perhaps the
> opposers are in  majority as their in-tree voices are what counts?
>
> Or should we take a company perspective, then there's four companies
> backing this, while I'm unsure if e.g. JF is speaking for Apple?
>
> Or should we apply some reputation weights, so some people can veto?
>
> Or is it an informal mix of all of the above? I'm sorry if I'm being
> obnoxious, I really don't mean to. I'd just sleep better with clarity
> in closure, so that the discussion don't just fizzle out and I give up
> while having a sense that the community is leaning toward the positive
> on this.
>
> BR,
> Jesper
>
>
> On Thu, 2019-05-02 at 12:20 +0000, Jesper Antonsson via llvm-dev wrote:
> >    A. This RFC outlines a proposal regarding non-8-bit-byte support
> > that
> >       got positive reception at a Round Table at EuroLLVM19. The
> > general
> >       topic has been brought up several times before and one good
> > overview
> >       can be found in a FOSDEM 2017 presentation by Jones and Cook:
> >
>
> https://protect2.fireeye.com/url?k=0092ea36-5c19e105-0092aaad-865bb277df6a-f2acface76df5bb9&u=https://archive.fosdem.org/2017/schedule/event/llvm_16_bit/
> >
> > In a nutshell, the proposal is for the llvm community to
> > allow/encourage interested parties to gradually remove "magic
> > numbers",
> > e.g. assumptions on the size of bytes from the codebase. Overview,
> > rationale and some example refactorings follows.
> >
> > Overview:
> >
> > LLVM currently assumes 8-bit bytes, while there exist a few out-of-
> > tree
> > llvm targets that utilize bytes of other sizes, including our
> > (Ericsson's) proprietary target. The main issues are the magic number
> > 8
> > and "/8" and "*8" all over the place and the use of i8 pointers.
> >
> > There's considerable agreement that the use of magic numbers is not
> > good coding style, and removing these ones would be of particular
> > benefit, even though the effort would not be complete and no in-tree
> > target with tests exist to guarantee that all gains are maintained.
> >
> > Ericsson is willing to drive this effort. During EuroLLVM19, there
> > seemed to be sufficient positive interest from other companies for us
> > to expect help with reviewing patch sets. Ericsson has been
> > performing
> > nightly integration towards top-of-tree with this backend for years,
> > catching and fixing new 8-bit-byte continuously. Thus we're able to
> > commit to doing similar upstream fixes for the long haul in a no-
> > drama
> > way.
> >
> > Rationale:
> >
> > Benefits of moving toward a byte-size agnostic llvm include:
> > * Less magic numbers in the codebase.
> > * A reduced effort to maintain out-of-tree targets with non-8-bit
> > bytes
> > as contributors follow the established patterns. (One company has
> > told
> > us that they created but eventually gave up on a 16-bit byte target
> > due
> > to too-high integration burden.)
> > * A reduction in duplicate efforts as some of the adaptation work
> > would
> > happen in-tree rather than in several out-of-tree targets.
> > * For up-and-coming targets that have non-8-bit-byte sizes, time to
> > market using llvm would be far quicker.
> > * A higher probability of LLVM being the compiler of choice for such
> > targets.
> > * Eventually, as the patch set required to make llvm fully byte size
> > agnostic becomes small enough, the effort to provide a mock in-tree
> > target with some other byte size should be surmountable.
> >
> > As cons, one could see a burden for the in-tree community to maintain
> > whatever gains that have been had. However the onus should be on
> > interested parties to mend any bit-rot. The impact of not having as
> > much magic numbers and such should if anything make the code more
> > easy
> > to understand. The permission to go ahead would be under the
> > condition
> > that significant added complexities are avoided. Another con would be
> > added compilation time e.g. in cases where the byte size is a run-
> > time
> > variable rather than a constant. However, this cost seems negligible
> > in
> > practice.
> >
> > Refactoring examples:
> > https://reviews.llvm.org/D61432
> >
> > Best Regards,
> > Jesper
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190523/f44a6110/attachment.html>


More information about the llvm-dev mailing list