[llvm-dev] RFC: On non 8-bit bytes and the target for it

Finkel, Hal J. via llvm-dev llvm-dev at lists.llvm.org
Fri Nov 1 04:17:41 PDT 2019

On 11/1/19 5:41 AM, Dmitriy Borisenkov via llvm-dev wrote:
A summary of the discussion so far.

There seem to be two possible ways to move forward with non-8-bit bytes:

1. Commit changes without tests. Chris Lattner, Mikael Holmen, Jeroen Dobbelaere, Jesper Antonsson support this idea.
James Y Knight says that magic numbers should be removed "at least where it arguably helps code clarity". This may not be exactly within the scope of the changes discussed, but it is probably worth discussing code clarity once there are concrete patches.
GCC (according to James Y Knight) follows the same practice: non-8-bit bytes are supported, but there are no tests upstream. We also have downstream contributors who will fix bugs if they appear in the LLVM core.
David Chisnall raised the question of what counts as a byte (which defines the scope of the changes), and we suggest using all 5 criteria he listed:
> - The smallest unit that can be loaded / stored at a time.
> - The smallest unit that can be addressed with a raw pointer in a specific address space.
> - The largest unit whose encoding is opaque to anything above the ISA.
> - The type used to represent `char` in C.
> - The type that has a size that all other types are a multiple of.
But if DSPs are less restrictive about what a byte is, some of the criteria could be removed.
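
As a rough illustration of the criteria above: only the last two (the type behind `char`, and the unit every other size is a multiple of) are visible from inside C or C++. The first three are properties of the ISA and cannot be checked this way. A minimal C++ sketch, in which the helper name bitWidthOf is invented for illustration, might look like:

```cpp
#include <climits>
#include <cstddef>

// Width of T in bits on the host that compiles this code. sizeof counts in
// units of char, so the result scales with CHAR_BIT rather than assuming 8.
template <typename T>
constexpr std::size_t bitWidthOf() {
  return sizeof(T) * CHAR_BIT;
}

// `char` is the unit sizeof counts in, so every object size is a whole
// number of chars and sizeof(char) is 1 by definition.
static_assert(sizeof(char) == 1, "sizeof is measured in units of char");

// The C and C++ standards require at least 8 bits per char, not exactly 8.
static_assert(CHAR_BIT >= 8, "char must have at least 8 bits");

// Portable code derives widths from CHAR_BIT instead of hard-coding 8.
static_assert(bitWidthOf<unsigned char>() == CHAR_BIT,
              "unsigned char is exactly one byte wide");
```

On a target with 16-bit bytes the same assertions would still hold, with CHAR_BIT equal to 16.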

2. Use an iconic target. PDP10 was suggested as a candidate. This opinion found support from Tim Northover, Joerg Sonnenberger, Mehdi AMINI, Philip Reames. It is not clear, though, whether this opinion opposes upstreaming non-8-bit byte support without tests, or only opposes the dummy and TVM target options.

So if there is no strong opposition to solution 1 from the people supporting the iconic target option, we could probably move on to the patches.

I also support option 1, although having an in-tree target would certainly be better. That having been said, I'm assuming that the byte size will be part of the DataLayout and, as a result, we can certainly have unit tests and tests for IR-level changes in the usual way. We should only omit tests where we can't currently have them.
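
To make the testing point concrete, here is a minimal C++ sketch of how a byte width carried by the data layout could be exercised in ordinary unit tests without any in-tree target. SketchLayout and its members are hypothetical names, not part of llvm::DataLayout, and the field BitsPerByte is only an assumption about how such a setting might be exposed:

```cpp
#include <cstdint>

// Hypothetical stand-in for a data layout that records the byte width.
// This is NOT llvm::DataLayout; all names are invented for illustration.
struct SketchLayout {
  unsigned BitsPerByte; // e.g. 8 on common targets, 16 on some DSPs

  // Number of addressable units needed to hold a value of SizeInBits bits,
  // rounded up to a whole unit.
  std::uint64_t storeSizeInUnits(std::uint64_t SizeInBits) const {
    return (SizeInBits + BitsPerByte - 1) / BitsPerByte;
  }

  // Bits occupied once the value is padded out to whole addressable units.
  std::uint64_t storeSizeInBits(std::uint64_t SizeInBits) const {
    return storeSizeInUnits(SizeInBits) * BitsPerByte;
  }
};
```

With this shape, a test can construct a layout with 16-bit bytes and check that a 32-bit value needs 2 addressable units, whereas a layout with 8-bit bytes needs 4.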


Kind regards, Dmitry Borisenkov

On Thu, Oct 31, 2019 at 8:51 AM Mikael Holmén via llvm-dev <llvm-dev at lists.llvm.org> wrote:
On Wed, 2019-10-30 at 15:30 -0700, Chris Lattner via llvm-dev wrote:
> > On Oct 30, 2019, at 3:07 AM, Jeroen Dobbelaere via llvm-dev <
> > llvm-dev at lists.llvm.org> wrote:
> >
> > > From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of JF
> > > Bastien via
> >
> > [..]
> > > Is it relevant to any modern compiler though?
> > >
> > > I strongly agree with Tim. As I said in previous threads, unless
> > > people will have
> > > actual testable targets for this type of thing, I think we
> > > shouldn’t add
> > > maintenance burden. This isn’t really C or C++ anymore because so
> > > much code
> > > assumes CHAR_BIT == 8, or at a minimum CHAR_BIT % 8 == 0, that
> > > we’re
> > > supporting a different language. IMO they should use a different
> > > language, and
> > > C / C++ should only allow CHAR_BIT % 8 == 0 (and only for small
> > > values of
> > > CHAR_BIT).
> >
> > We (Synopsys ASIP Designer team) and our customers tend to
> > disagree: our customers do create plenty of cpu architectures
> > with non-8-bit characters (and non-8-bit addressable memories). We
> > are able to provide them with a working c/c++ compiler solution.
> > Maybe some support libraries are not supported out of the box, but
> > for these kinds of architectures that is acceptable.
> > (Besides that, llvm is also more than just c/c++)
> I agree - there are a lot of weird accelerators with LLVM backends,
> many of them aren’t targeted by C compilers/code.  The ones that do
> have C frontends often use weird dialects or lots of builtins, but
> they are still useful to support.
> I find this thread to be a bit confusing: it seems that people are
> aware that such chips exist (even today) but some folks are reticent
> to add generic support for them.  While I can see the concern about
> inventing new backends just for testing, I don’t see an argument
> against generalizing the core and leaving it untested (in
> master).  If any bugs creep in, then people with downstream targets
> can fix them in core.

Thanks Chris! This is what we would like to see as well!

We have a 16bit byte target downstream and we live pretty much on top-
of-tree since we pull from llvm every day. Every now and then we find
new 8bit byte assumptions in the code that break things for us that we
fix downstream.
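
For illustration only, an 8-bit assumption of the general kind described above often shows up as a hard-coded 8 or 0xFF. The following C++ sketch (splitIntoBytes is a hypothetical helper, not taken from LLVM or from any downstream patch) takes the byte width as a parameter instead:

```cpp
#include <cstdint>
#include <vector>

// Split Value into byte-sized units, least significant unit first.
// BitsPerByte is a parameter rather than a hard-coded 8. This sketch assumes
// 0 < BitsPerByte < 64, ValueBits <= 64, and that Value has no bits set at
// or above position ValueBits.
std::vector<std::uint64_t> splitIntoBytes(std::uint64_t Value,
                                          unsigned ValueBits,
                                          unsigned BitsPerByte) {
  // Generalizes the familiar 0xFF mask to any byte width below 64.
  const std::uint64_t Mask = (std::uint64_t{1} << BitsPerByte) - 1;
  std::vector<std::uint64_t> Units;
  for (unsigned Shift = 0; Shift < ValueBits; Shift += BitsPerByte)
    Units.push_back((Value >> Shift) & Mask);
  return Units;
}
```

For example, splitting 0x12345678 as a 32-bit value yields 0x78, 0x56, 0x34, 0x12 with 8-bit bytes and 0x5678, 0x1234 with 16-bit bytes.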

If we were allowed, we would be happy to upstream such fixes which
would make life easier both for us (as we would need to maintain fewer
downstream diffs) and (hopefully) for others living downstream with
other non-8bit byte targets.

Now, while we try to fix things in ways that would work for several
different byte sizes, what _we_ actually really test is 16bit bytes, so
I'm sure we fail to generalize things enough for all sizes, but at
least our contributions will make things more general than today.

And I imagine that if other downstream targets use other byte sizes
than us they would also notice when things break and would also pitch
in and generalize it further so that it in the end works for all users.


> -Chris
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org

Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory