<div dir="ltr"><div dir="ltr">Hi Jesper,<br><br>Thank you for working on this. My company (Codasip) would definitely be interested in having this feature upstream. I think that this is actually important for a surprisingly large number of people who currently have to maintain their changes downstream. I have a couple of questions and comments:<br><br>1. Do you plan on supporting truly arbitrary values as the byte size, or are there in fact going to be limitations (e.g. the value has to be a multiple of 8 and less than or equal to 64)? I recall that we had a customer asking about 36-bit bytes.<br>2. If you define a byte to be e.g. 16 bits wide, does it mean that "char" is also 16 bits wide? If yes, then how do you define types like int8_t from stdint.h?<br>3. Have you thought about the possibility of supporting different byte sizes for data and code?<br>4. I realize that this is a separate issue, but fully supporting non-8-bit bytes also requires changes to other parts of a typical toolchain, namely the linker (ld/lld) and the debugger (gdb/lldb). Do you maintain out-of-tree changes in this area as well?<br><br>Thank you,<br>Pavel<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, May 2, 2019 at 2:20 PM Jesper Antonsson via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">   A. This RFC outlines a proposal regarding non-8-bit-byte support that<br>

      got positive reception at a Round Table at EuroLLVM19. The general<br>

      topic has been brought up several times before and one good overview<br>

      can be found in a FOSDEM 2017 presentation by Jones and Cook:<br>

<a href="https://archive.fosdem.org/2017/schedule/event/llvm_16_bit/" rel="noreferrer" target="_blank">https://archive.fosdem.org/2017/schedule/event/llvm_16_bit/</a><br>

<br>

In a nutshell, the proposal is for the llvm community to<br>

allow/encourage interested parties to gradually remove "magic numbers",<br>

e.g. assumptions about the size of bytes, from the codebase. An overview,<br>

rationale and some example refactorings follow.<br>

<br>

Overview:<br>

<br>

LLVM currently assumes 8-bit bytes, while there exist a few out-of-tree <br>

llvm targets that utilize bytes of other sizes, including our<br>

(Ericsson's) proprietary target. The main issues are the magic number 8,<br>

the "/8" and "*8" scattered throughout the code, and the use of i8 pointers.<br>

<br>
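As a minimal sketch of the kind of change this implies, the "/8" and "*8" pattern could be replaced by a named, target-supplied byte width. The struct TargetByteInfo and the helper names below are hypothetical and are not taken from LLVM or from the linked review:<br>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a per-target description; not an LLVM class.
struct TargetByteInfo {
  unsigned BitsPerByte; // 8 on mainstream targets, e.g. 16 on some DSPs
};

// Before: the byte width is baked in as the magic numbers 7 and 8.
inline uint64_t storeSizeInBytesMagic(uint64_t SizeInBits) {
  return (SizeInBits + 7) / 8;
}

// After: the same round-up division, with the byte width supplied by the target.
inline uint64_t storeSizeInBytes(uint64_t SizeInBits, const TargetByteInfo &TI) {
  assert(TI.BitsPerByte != 0 && "byte width must be non-zero");
  return (SizeInBits + TI.BitsPerByte - 1) / TI.BitsPerByte;
}

// The "*8" direction: convert a size in bytes back to a size in bits.
inline uint64_t bytesToBits(uint64_t SizeInBytes, const TargetByteInfo &TI) {
  return SizeInBytes * TI.BitsPerByte;
}
```

Under these assumptions, a 17-bit value occupies 3 bytes on an 8-bit-byte target and 2 bytes on a 16-bit-byte target.<br>

<br>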

There's considerable agreement that the use of magic numbers is not<br>

good coding style, and removing these would be of particular<br>

benefit, even though the effort would not be complete and no in-tree<br>

target with tests exists to guarantee that all gains are maintained.<br>

<br>

Ericsson is willing to drive this effort. During EuroLLVM19, there<br>

seemed to be sufficient positive interest from other companies for us<br>

to expect help with reviewing patch sets. Ericsson has been performing<br>

nightly integration towards top-of-tree with this backend for years,<br>

catching and fixing new 8-bit-byte assumptions continuously. Thus we're able to<br>

commit to doing similar upstream fixes for the long haul in a no-drama<br>

way.<br>

<br>

Rationale:<br>

<br>

Benefits of moving toward a byte-size agnostic llvm include:<br>

* Fewer magic numbers in the codebase.<br>

* A reduced effort to maintain out-of-tree targets with non-8-bit bytes<br>

as contributors follow the established patterns. (One company has told<br>

us that they created but eventually gave up on a 16-bit byte target due<br>

to the excessive integration burden.)<br>

* A reduction in duplicate efforts as some of the adaptation work would<br>

happen in-tree rather than in several out-of-tree targets.<br>

* For up-and-coming targets that have non-8-bit-byte sizes, time to<br>

market using llvm would be far shorter.<br>

* A higher probability of LLVM being the compiler of choice for such<br>

targets.<br>

* Eventually, as the patch set required to make llvm fully byte size<br>

agnostic becomes small enough, the effort to provide a mock in-tree<br>

target with some other byte size should be surmountable.<br>

<br>

As cons, one could see a burden for the in-tree community to maintain<br>

whatever gains have been made. However, the onus should be on<br>

interested parties to mend any bit-rot. Having fewer magic<br>

numbers should, if anything, make the code easier<br>

to understand. The permission to go ahead would be under the condition<br>

that significant added complexities are avoided. Another con would be<br>

added compilation time e.g. in cases where the byte size is a run-time<br>

variable rather than a constant. However, this cost seems negligible in<br>

practice.<br>

<br>
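To illustrate the constant versus run-time trade-off mentioned above, here is a minimal sketch. The names kBitsPerByte, ByteWidth, bitsToBytesConst and bitsToBytesDynamic are hypothetical and are not LLVM APIs:<br>

```cpp
#include <cassert>
#include <cstdint>

// Option 1: the byte width is a compile-time constant.
constexpr unsigned kBitsPerByte = 8;

constexpr uint64_t bitsToBytesConst(uint64_t Bits) {
  return Bits / kBitsPerByte;
}

// Option 2: the byte width is a value that is only known at run time,
// e.g. because it is read from a target description.
struct ByteWidth {
  unsigned Bits;
};

inline uint64_t bitsToBytesDynamic(uint64_t Bits, ByteWidth W) {
  assert(W.Bits != 0 && "byte width must be non-zero");
  return Bits / W.Bits;
}

// The constant variant can be evaluated entirely at compile time.
static_assert(bitsToBytesConst(64) == 8, "64 bits are eight 8-bit bytes");
```

With a constant power-of-two byte width, compilers can typically turn the division into a shift, whereas the run-time variant generally needs a real division. That difference is presumably the kind of small cost the paragraph above has in mind.<br>

<br>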

Refactoring examples:<br>

<a href="https://reviews.llvm.org/D61432" rel="noreferrer" target="_blank">https://reviews.llvm.org/D61432</a><br>

<br>

Best Regards,<br>

Jesper<br>

</blockquote></div></div>