[llvm-dev] RFC: On removing magic numbers assuming 8-bit bytes

Thu May 9 12:46:49 PDT 2019

I agree that consensus seems to be missing. There are definitely some
assumptions, and in particular API and usage assumptions, around
8-bit bytes in the backends. Also: How do you plan on keeping these
assumptions from creeping back in?

-eric

On Thu, May 9, 2019 at 10:30 AM JF Bastien via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
>
>
> On May 9, 2019, at 5:29 AM, Jesper Antonsson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On Wed, 2019-05-08 at 11:12 -0700, Philip Reames wrote:
>
> On 5/8/19 1:25 AM, Jesper Antonsson wrote:
>
> On Mon, 2019-05-06 at 15:56 -0700, Philip Reames via llvm-dev
> wrote:
>
> On 5/6/19 2:43 AM, Tim Northover via llvm-dev wrote:
>
> On Mon, 6 May 2019 at 10:13, James Courtier-Dutton via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> Although the above mentions bytes, the "/ 8" and "& 0x7" make it look
> like the author meant octets and not bytes.
> A byte can be any number of bits.
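
For readers following the thread, here is a minimal sketch of the kind of
cleanup under discussion, assuming a single named constant for the width
of an addressable unit. The names BitsPerAddrUnit, wholeUnits, bitInUnit
and exactUnits are hypothetical and are not taken from LLVM or from the
patches mentioned in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant: 8 for every target discussed as in-tree here;
// a downstream target with wider addressable units would change it.
constexpr unsigned BitsPerAddrUnit = 8;

// Whole addressable units covered by a bit offset (replaces "Offset / 8").
constexpr std::uint64_t wholeUnits(std::uint64_t BitOffset) {
  return BitOffset / BitsPerAddrUnit;
}

// Bit position inside the unit (replaces "Offset & 0x7").
// The modulo form stays correct even if the width is not a power of two.
constexpr std::uint64_t bitInUnit(std::uint64_t BitOffset) {
  return BitOffset % BitsPerAddrUnit;
}

// Converts a size in bits to addressable units, insisting that the size
// is a whole number of units instead of silently truncating.
inline std::uint64_t exactUnits(std::uint64_t SizeInBits) {
  assert(SizeInBits % BitsPerAddrUnit == 0 &&
         "size is not a whole number of addressable units");
  return SizeInBits / BitsPerAddrUnit;
}
```

With the constant left at 8 these helpers compute exactly what the magic
numbers did, which is why such a change could in principle be made
gradually without affecting existing targets.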
>
>
> I don't think you'll have much luck trying to make that stick for a
> general audience, or even a general compiler-writer audience. Byte is
> far too strongly associated with 8 bits these days.
>
>
> +1 Please don't try; insisting on a distinction will confuse many new
> contributors.
>
>
> Yes, my interpretation is that the community is leaning toward
> addressable unit as terminology.
>
> Octets are only ever 8 bits.
>
>
> You might be able to convert all uses of byte to octet and abandon
> byte entirely, but at that point why bother? It feels like a change
> just for the sake of pedantry.
>
> I like the "addressable unit" name, though it's a bit long (AddrUnit
> seems OK). It at least signals to a reader that there might be
> something weird going on. Getting someone writing new code to think in
> those terms is a different matter, of course, but I don't think any of
> the changes under discussion really help there.
>
> BTW, is there an open source backend (in a fork, I assume) that does
> this? So that we can get some kind of idea of the real scope of the
> changes needed.
>
>
> Strongly agreed.
>
> My personal take is that this is an invasive enough change, with enough
> likely ongoing maintenance fallout, to require substantial justification
> before the work is undertaken upstream.
>
>
> My hope and belief is that having good names instead of these magical
> numbers won't be a burden but rather an improvement long-term.
>
> An open source backend proposed for inclusion upstream would be one
> part of that.
>
>
> That is not on the table right now. However, as the work required to
> make such an inclusion happen will be reduced by this cleanup, the
> likelihood that it happens in the future should increase.
>
>
> Given this, I'm not sure the community as a whole should take on the
> burden of supporting non-8-bit addressable units.  I see this as a
> precondition.  To be clear, I don't care *which* backend there is; it
> doesn't have to be yours, but the presence of at least one would seem
> necessary for testing if nothing else.
>
>
> I agree that an in-tree target is needed for actual support. However,
> we're merely suggesting a gradual cleanup of magic numbers in order to
> make the code a bit more readable and make life easier for a number of
> downstream targets. It will not result in support, but it would make
> any effort to create support (or maintain support downstream)
> significantly smaller. This would also make it a bit more likely that
> LLVM is the compiler of choice for such targets, some of which might
> want to upstream eventually.
>
> The onus is on interested parties to maintain any gains, and Ericsson
> is offering to do that in a no-drama way with the help of other
> companies that have voiced their interest. We continuously merge and
> test against top of tree and would act accordingly, if allowed.
>
> As the discussion is subsiding, I'm unsure about how to conclude this
> RFC. Several parties have said they support this effort, others have
> pitched in with suggestions on terminology and such (which perhaps
> indicates that they are not opposed in general). JF Bastien and you ask
> for in-tree targets, although JF did indicate that it made sense to
> first clean up.
>
>
> I don’t think you have consensus to move forward at this point in time. My expectation, which I think represents LLVM’s historical approach, is that a path to full support be planned out before this effort starts. Concretely, I expect a real-world backend to be committed to LLVM as a necessary step. What I meant upthread was: yes it makes sense to do cleanups before landing a backend, but someone has to commit to upstreaming a backend before you start the cleanups. When I say a backend I don’t mean a toy, I mean a real backend.
>
> Right now we have no commitment on anybody landing a backend, and we don’t really have a concrete idea of what you’re even proposing to change or how. You’re focusing on “magic numbers” like everyone agrees 8 is the root of all evil, and it’s really not. Let’s say someone promises to upstream a backend, what concretely do you need to change, and in which projects, to get there? Are you changing clang, and how? What about libc++? Linker? LLVM, and how? Is IR going to change? If not, do you keep all the i8* around, and how do you work around not having void* in IR?
>
> The above is, I think, necessary but not sufficient for moving forward.
>
>
> On "byte" vs "addressable unit", we've been thinking a bit and are
> leaning toward staying with the prevalent "byte" terminology for as
> long as upstream is 8-bit-only to avoid mixed terminology or larger
> patches. However, we're flexible on this, and I've uploaded a twin
> patch (in D61725) to my original showcase showing how "addressable
> unit" could look.
>
>
> Active contribution from the sponsors in other areas would also be a
> key factor.
>
>
> I'm not sure how to interpret that, but our team here at Ericsson is
> fairly large, we have been working with this out-of-tree backend since
> 2011, and as a group we contribute to upstream, e.g. by helping out with
> the fixed-point upstreaming, by solving and filing TRs (we're pretty
> good at testing, I'd say), improving debug information and more.
>
>
> Ok.
>
> p.s. If my wording came across as implying any disrespect, sorry!  I was
> making a general point, not thinking about how it might be read in
> context.
>
>
> No problem, thanks!
>
>
>
> Cheers.
>
> Tim.
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>