[llvm-dev] RFC: On non 8-bit bytes and the target for it

Adrian Prantl via llvm-dev llvm-dev at lists.llvm.org
Fri Nov 1 08:40:23 PDT 2019

> On Nov 1, 2019, at 3:41 AM, Dmitriy Borisenkov via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> A summary of the discussion so far.
> It seems that there are two possible solutions on how to move forward with non 8 bits byte:
> 1. Commit changes without tests. Chris Lattner, Mikael Holmen, Jeroen Dobbelaere, Jesper Antonsson support this idea.
> James Y Knight says that at least magic numbers should be removed "at least where it arguably helps code clarity". This might be not exactly the scope of the changes discussed, but it's probably worth do discuss code clarity having concrete patches.
> GCC (according to James Y Knight) has the same practice meaning non-8 bits byte is supported but there are no tests in upstream and we have downstream contributors who will fix the bugs if they appear in the LLVM core.
> David Chisnall raised a question about what to count as a byte (which defines the scope of the changes) and we suggest to use all 5 criteria he granted:
> > - The smallest unit that can be loaded / stored at a time.
> > - The smallest unit that can be addressed with a raw pointer in a specific address space.
> > - The largest unit whose encoding is opaque to anything above the ISA.
> > - The type used to represent `char` in C.
> > - The type that has a size that all other types are a multiple of.
> But if DSPs are less restrictive about byte, some of the criteria could be removed.
> 2. Use an iconic target. PDP10 was suggested as a candidate. This opinion found support from Tim Northover, Joerg Sonenberger, Mehdi AMINI, Philip Reames. It's not clear though does this opinion oppose upstreaming non-8-bits byte without tests or just a dummy and TVM targets options.
> So if there is no strong opposition to the solution 1 from the people supporting an iconic target option, we could probably move to the patches.

I'm in camp (2). Any changes that are not tested are an invitation to upstream developers to "simplify" the code, not knowing that those changes are important. Anyone who commits untested changes to LLVM will inevitably face an uphill battle against benevolent NFC refactorings that break these changes because the expectation of how the code is supposed to behave is not codified in a test. In the short term option (1) sounds more appealing because they can start right away, but I'm going to predict that it will be more expensive for the downstream maintainers of non 8-bit targets in the long term.

-- adrian

More information about the llvm-dev mailing list