[llvm-dev] RFC: On removing magic numbers assuming 8-bit bytes

Sean Kilmurray via llvm-dev llvm-dev at lists.llvm.org
Fri May 3 01:27:12 PDT 2019


Hi Jesper,

My company (CML Microsystems) would definitely be interested in having this feature upstream too. 

We currently maintain an out-of-tree backend with a minimum addressable size of 16 bits, implemented using the method outlined by Jones and Cook of Embecosm that you refer to in the RFC.

Our implementation differs slightly from the one you’ve proposed in that we use the concept of bitPerChar and only support multiples of 8 bits for that char width.

I would be happy to help out with the work in any way I can.

Regards
Sean Kilmurray

-----Original Message-----
From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Jesper Antonsson via llvm-dev
Sent: 02 May 2019 13:21
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] RFC: On removing magic numbers assuming 8-bit bytes

This RFC outlines a proposal regarding non-8-bit-byte support that
got a positive reception at a Round Table at EuroLLVM19. The general
topic has been brought up several times before, and one good overview
can be found in a FOSDEM 2017 presentation by Jones and Cook:
https://archive.fosdem.org/2017/schedule/event/llvm_16_bit/

In a nutshell, the proposal is for the LLVM community to
allow/encourage interested parties to gradually remove "magic numbers",
such as assumptions about the size of bytes, from the codebase. An
overview, rationale and some example refactorings follow.

Overview:

LLVM currently assumes 8-bit bytes, while there exist a few out-of-tree
LLVM targets that utilize bytes of other sizes, including our
(Ericsson's) proprietary target. The main issues are the magic number 8
(including "/8" and "*8") scattered all over the codebase, and the use
of i8 pointers.
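
As a rough illustration of the kind of refactoring this implies (a
sketch only: the helper names bytesToBits and bitsToBytesCeil and the
BitsPerByte parameter are hypothetical and are not existing LLVM APIs),
a hard-coded "(Bits + 7) / 8" could be expressed in terms of an explicit
byte width:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not LLVM APIs: the byte width is passed in
// explicitly instead of being hard-coded as the magic number 8.

// Replaces patterns such as "Bytes * 8".
constexpr uint64_t bytesToBits(uint64_t Bytes, unsigned BitsPerByte) {
  return Bytes * BitsPerByte;
}

// Replaces patterns such as "(Bits + 7) / 8": the number of addressable
// units needed to hold Bits bits, rounded up.
constexpr uint64_t bitsToBytesCeil(uint64_t Bits, unsigned BitsPerByte) {
  return (Bits + BitsPerByte - 1) / BitsPerByte;
}

// With 8-bit bytes the result matches the old magic-number form.
static_assert(bitsToBytesCeil(33, 8) == (33 + 7) / 8,
              "8-bit case must agree with the hard-coded expression");
```

With BitsPerByte set to 16, the same call sites would compute sizes in
16-bit units without any further source changes.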

There's considerable agreement that the use of magic numbers is not
good coding style, and removing these would be of particular
benefit, even though the effort would not be complete and no in-tree
target with tests exists to guarantee that all gains are maintained.

Ericsson is willing to drive this effort. During EuroLLVM19, there
seemed to be sufficient positive interest from other companies for us
to expect help with reviewing patch sets. Ericsson has been performing
nightly integration towards top-of-tree with this backend for years,
continuously catching and fixing new 8-bit-byte assumptions. Thus we're
able to commit to doing similar upstream fixes for the long haul in a
no-drama way.

Rationale:

Benefits of moving toward a byte-size-agnostic LLVM include:
* Fewer magic numbers in the codebase.
* A reduced effort to maintain out-of-tree targets with non-8-bit bytes
as contributors follow the established patterns. (One company has told
us that they created but eventually gave up on a 16-bit byte target due
to a too-high integration burden.)
* A reduction in duplicate efforts as some of the adaptation work would
happen in-tree rather than in several out-of-tree targets.
* For up-and-coming targets that have non-8-bit-byte sizes, time to
market using LLVM would be far shorter.
* A higher probability of LLVM being the compiler of choice for such
targets.
* Eventually, as the patch set required to make LLVM fully
byte-size-agnostic becomes small enough, the effort to provide a mock
in-tree target with some other byte size should be surmountable.

As for cons, one could see a burden on the in-tree community to
maintain whatever gains have been made. However, the onus should be on
interested parties to mend any bit-rot. Having fewer magic numbers
should, if anything, make the code easier to understand. The permission
to go ahead would be under the condition that significant added
complexities are avoided. Another con would be added compilation time,
e.g. in cases where the byte size is a run-time variable rather than a
constant. However, this cost seems negligible in practice.
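
To make the constant-versus-run-time distinction concrete, here is a
minimal sketch (storeSizeConst and ByteLayout are hypothetical names
chosen for illustration, not taken from LLVM or from the patch under
review). In the first form the byte width is a compile-time constant; in
the second it is a value looked up at run time, which is where the
possible extra cost would come from:

```cpp
#include <cassert>
#include <cstdint>

// Compile-time variant (hypothetical): the byte width is a constant, so
// an optimizing compiler can typically turn the division into a shift
// when the width is a power of two.
template <unsigned BitsPerByte>
constexpr uint64_t storeSizeConst(uint64_t Bits) {
  static_assert(BitsPerByte > 0, "byte width must be non-zero");
  return (Bits + BitsPerByte - 1) / BitsPerByte;
}

// Run-time variant (hypothetical): the byte width is a field that would
// be filled in from some target description, so the division uses a
// value only known when the compiler runs.
struct ByteLayout {
  unsigned BitsPerByte;

  constexpr uint64_t storeSize(uint64_t Bits) const {
    return (Bits + BitsPerByte - 1) / BitsPerByte;
  }
};
```

Both forms compute the same result for the same width; they differ only
in when the width becomes known.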

Refactoring examples:
https://reviews.llvm.org/D61432

Best Regards,
Jesper
_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
