[llvm-dev] RFC: On removing magic numbers assuming 8-bit bytes

JF Bastien via llvm-dev llvm-dev at lists.llvm.org
Thu May 9 10:29:34 PDT 2019

> On May 9, 2019, at 5:29 AM, Jesper Antonsson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> On Wed, 2019-05-08 at 11:12 -0700, Philip Reames wrote:
>> On 5/8/19 1:25 AM, Jesper Antonsson wrote:
>>> On Mon, 2019-05-06 at 15:56 -0700, Philip Reames via llvm-dev
>>> wrote:
>>>> On 5/6/19 2:43 AM, Tim Northover via llvm-dev wrote:
>>>>> On Mon, 6 May 2019 at 10:13, James Courtier-Dutton via llvm-dev
>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>> Although the above is mentioning bytes, looking at the "/
>>>>>> 8"   and "& 0x7" makes it look like the author meant octets
>>>>>> and
>>>>>> not bytes.
>>>>>> Bytes can be any size of bits.
>>>>> I don't think you'll have much luck trying to make that stick
>>>>> for a
>>>>> general audience, or even a general compiler-writer audience.
>>>>> Byte
>>>>> is
>>>>> far too strongly associated with 8 bits these days.
>>>> +1 Please don't try; insisting on a distinction will confuse many
>>>> new
>>>> contributors.
>>> Yes, my interpretation is that the community is leaning toward
>>> addressable unit as terminology.
>>>>>> Octets are only ever 8 bits.
>>>>> You might be able to convert all uses of byte to octet and
>>>>> abandon
>>>>> byte entirely, but at that point why bother? It feels like a
>>>>> change
>>>>> just for the sake of pedantry.
>>>>> I like the "addressable unit" name, though it's a bit long
>>>>> (AddrUnit
>>>>> seems OK). It at least signals to a reader that there might be
>>>>> something weird going on. Getting someone writing new code to
>>>>> think
>>>>> in
>>>>> those terms is a different matter, of course, but I don't think
>>>>> any
>>>>> of
>>>>> the changes under discussion really help there.
>>>>> BTW, is there an open source backend (in a fork, I assume) that
>>>>> does
>>>>> this? So that we can get some kind of idea of the real scope of
>>>>> the
>>>>> changes needed.
>>>> Strongly agreed.
>>>> My personal take is this is an invasive enough change with enough
>>>> likely
>>>> ongoing maintenance fall out to require substantial justification
>>>> before
>>>> the work was undertaken upstream.
>>> My hope and belief is that having good names instead of these
>>> magical
>>> numbers won't be a burden but rather an improvement long-term. 
>>>>  A open source backend proposed for
>>>> inclusion upstream would be one part of that.
>>> That is not on the table right now. However, as the work required
>>> to
>>> make such an inclusion happen will be reduced by this cleanup, the
>>> likelihood that it happens in the future should increase.
>> Given this, I'm not sure the community as a whole should take on the
>> burden of supporting non b-byte addressable units.  I see this as a
>> precondition.  To be clear, I don't care *which* backend there is,
>> doesn't have to be yours, but the presence of at least one would seem
>> necessary for testing if nothing else.
> I agree that an in-tree target is needed for actual support. However,
> we're merely suggesting a gradual cleanup of magic numbers in order to
> make the code a bit more readable and make life easier for a number of
> downstream targets. It will not result in support, but it would make
> any effort to create support (or maintain support downstream)
> significantly smaller. This would also make it a bit more likely that
> LLVM is the compiler of choice for such targets, some of which might
> want to upstream eventually.
> The onus is on interested parties to maintain any gains, and Ericsson
> is offering to do that in a no-drama way with the help of other
> companies that have voiced their interest. We continuously merge and
> test against top of tree and would act accordingly, if allowed.
> As the discussion is subsiding, I'm unsure about how to conclude this
> RFC. Several parties have said they support this effort, others have
> pitched in with suggestions on terminology and such (which perhaps
> indicates that they are not opposed in general). JF Bastien and you ask
> for in-tree targets, although JF did indicate that it made sense to
> first clean up.

I don’t think you have consensus to move forward at this point in time. My expectation, which I think represents LLVM’s historical approach, is that a path to full support be planned out before this effort starts. Concretely, I expect a real-world backend to be committed to LLVM as a necessary step. What I meant upthread was: yes it makes sense to do cleanups before landing a backend, but someone has to commit to upstreaming a backend before you start the cleanups. When I say a backend I don’t mean a toy, I mean a real backend.

Right now we have no commitment on anybody landing a backend, and we don’t really have a concrete idea of what you’re even proposing to change or how. You’re focusing on “magic numbers” like everyone agrees 8 is the root of all evil, and it’s really not. Let’s say someone promises to upstream a backend, what concretely do you need to change, and in which projects, to get there? Are you changing clang, and how? What about libc++? Linker? LLVM, and how? Is IR going to change? If not, do you keep all the i8* around, and how do you work around not having void* in IR?

The above is, I think, necessary but not sufficient to moving forward.

> On "byte" vs "addressable unit", we've been thinking a bit and are
> leaning toward staying with the prevalent "byte" terminology for as
> long as upstream is 8-bit-only to avoid mixed terminology or larger
> patches. However, we're flexible on this, and I've uploaded a twin
> patch (in D61725) to my original showcase showing how "addressable
> unit" could look. 
>>>>  Active contribution from
>>>> the sponsors in other areas would also be a key factor.
>>> I'm not sure how to interpret that, but our team here at Ericsson
>>> is
>>> fairly large, we have been working with this out-of-tree backend
>>> since
>>> 2011 and as a group, we contribute to upstream e.g. by helping out
>>> with
>>> the fixedpoint upstreaming, by solving and filing TRs (we're pretty
>>> good at testing I'd say), improving debug information and more.
>> Ok.
>> p.s. If my wording came across as implying any disrespect, sorry!  I
>> was
>> making a general point, not thinking about how it might be read in
>> context. 
> No problem, thanks!
>>>>> Cheers.
>>>>> Tim.
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190509/0b1134d5/attachment-0001.html>

More information about the llvm-dev mailing list