[llvm-dev] RFC: On non 8-bit bytes and the target for it

Mon Nov 4 16:09:57 PST 2019

> But if the failure is for, say, non-8-bit-bytes that are only actually
exercised downstream, I'm less OK with patches being reverted upstream no
matter the severity of the breakage to such targets downstream.

I think that's totally reasonable and is the same yardstick applied already
- reversion is only justifiable if a testcase can be provided. Anything
else would be undue burden on the original committer and would require
shotgun debugging.

On Mon, 4 Nov 2019 at 16:07, David Blaikie <dblaikie at gmail.com> wrote:

>
>
> On Mon, Nov 4, 2019 at 4:00 PM James Molloy <james at jamesmolloy.co.uk>
> wrote:
>
>> Hi,
>>
>> To throw my hat into the ring, we also maintain a downstream target that
>> has a subset of this problem - it has non-8-bit-addressable memory. This
>> means that %uglygeps don't work, pointers can't be casted to i8* and expect
>> to be byte-addressed (conversions to memcpy/memset that use i8* aren't
>> welcome either), and GEP lowering code is slightly different.
>>
>> We maintain this currently as a pile of known technical debt. We have a
>> CGP pass to decompose GEPs and custom-expand them taking our word size into
>> account, but everything before that is just waiting for InstCombine to
>> break something. AliasAnalysis also makes offsetting assumptions that are
>> totally bogus because our pointers are word-addressed and therefore so are
>> pointer offsets.
>>
>> We'd be really keen to help here. We're keen to upstream anything we
>> possibly can (and have been, over the past few months). We've have several
>> discussions about how best to approach upstream with this, and the sticking
>> point has always been lack of a testing target. It's always felt to me that
>> the idea of addressable granule should be a fairly reasonable DataLayout
>> addition; We can test DataLayout changes purely via opt without requiring a
>> target that uses them. Lowering to instructions was always the testing
>> sticking point.
>>
>> We'd be keen to help out what the community decides to do here. I
>> personally feel it's reasonable that:
>>   - LangRef/DataLayout is updated with semantically coherent changes.
>>   - The midend optimizer is updated by someone who cares about those
>> changes and tests are added that use the new DataLayout.
>>   - Developers that don't care about those changes maintain a best-effort
>> approach, which is exactly what we do right now; there are features that
>> are tested but are still esoteric enough that I might reasonably break them
>> without realising (OperandBundles come to mind), so I don't think there's
>> any change in mindset here.
>>   - Developers that care perform downstream testing and provide review
>> feedback / revert if really bad / fixes. Again, this is how LLVM works
>> right now - I'd guess that >80% of our real-world test coverage comes from
>> downstream users deploying ToT LLVM rather than the upstream LIT tests /
>> builders.
>>
>
> This last bit is the bit I'd worry about if the bug were only visible in
> an out-of-tree target. If it's visible with a DataLayout change that can be
> tested upstream, then I'd be OK with an upstream patch being reverted given
> a test case with custom DataLayout showing the failure.
>
> But if the failure is for, say, non-8-bit-bytes that are only actually
> exercised downstream, I'm less OK with patches being reverted upstream no
> matter the severity of the breakage to such targets downstream.
>
> - Dave
>
>
>
>>
>> Cheers,
>>
>> James
>>
>> On Mon, 4 Nov 2019 at 12:16, David Blaikie via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>>
>>>
>>> On Sat, Nov 2, 2019 at 12:45 AM Jorg Brown via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> On Fri, Nov 1, 2019 at 8:40 AM Adrian Prantl via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> > On Nov 1, 2019, at 3:41 AM, Dmitriy Borisenkov via llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>> > It seems that there are two possible solutions on how to move
>>>>> forward with non 8 bits byte:
>>>>> >
>>>>> > 1. Commit changes without tests. Chris Lattner, Mikael Holmen,
>>>>> Jeroen Dobbelaere, Jesper Antonsson support this idea.
>>>>> > James Y Knight says that at least magic numbers should be removed
>>>>> "at least where it arguably helps code clarity". This might be not exactly
>>>>> the scope of the changes discussed, but it's probably worth do discuss code
>>>>> clarity having concrete patches.
>>>>> > GCC (according to James Y Knight) has the same practice meaning
>>>>> non-8 bits byte is supported but there are no tests in upstream and we have
>>>>> downstream contributors who will fix the bugs if they appear in the LLVM
>>>>> core.
>>>>> > David Chisnall raised a question about what to count as a byte
>>>>> (which defines the scope of the changes) and we suggest to use all 5
>>>>> criteria he granted:
>>>>> > > - The smallest unit that can be loaded / stored at a time.
>>>>> > > - The smallest unit that can be addressed with a raw pointer in a
>>>>> specific address space.
>>>>> > > - The largest unit whose encoding is opaque to anything above the
>>>>> ISA.
>>>>> > > - The type used to represent `char` in C.
>>>>> > > - The type that has a size that all other types are a multiple of.
>>>>> > But if DSPs are less restrictive about byte, some of the criteria
>>>>> could be removed.
>>>>> >
>>>>> > 2. Use an iconic target. PDP10 was suggested as a candidate. This
>>>>> opinion found support from Tim Northover, Joerg Sonenberger, Mehdi AMINI,
>>>>> Philip Reames. It's not clear though does this opinion oppose upstreaming
>>>>> non-8-bits byte without tests or just a dummy and TVM targets options.
>>>>> >
>>>>> > So if there is no strong opposition to the solution 1 from the
>>>>> people supporting an iconic target option, we could probably move to the
>>>>> patches.
>>>>>
>>>>> I'm in camp (2). Any changes that are not tested are an invitation to
>>>>> upstream developers to "simplify" the code, not knowing that those changes
>>>>> are important. Anyone who commits untested changes to LLVM will inevitably
>>>>> face an uphill battle against benevolent NFC refactorings that break these
>>>>> changes because the expectation of how the code is supposed to behave is
>>>>> not codified in a test. In the short term option (1) sounds more appealing
>>>>> because they can start right away, but I'm going to predict that it will be
>>>>> more expensive for the downstream maintainers of non 8-bit targets in the
>>>>> long term.
>>>>>
>>>>
>>>> I've worked on multiple codebases where an option existed in order to
>>>> satisfy an extremely small userbase, with little or no testing,
>>>>
>>>
>>> In those situations, were the core developers responsible for those
>>> features/users? Yeah, if I needed to support a certain observable feature
>>> of clang continuing to work, I'd want tests (I'm pretty serious about
>>> testing, FWIW).
>>>
>>>
>>>> and as such, I'm adamantly opposed to repeating it.
>>>>
>>>
>>>>
>>> In addition to what Adrian said, where the weird option exists but is
>>>> constantly being incidentally broken, I've seen the opposite problem:
>>>> people become afraid to refactor a section of code because it might break
>>>> the weird option.  You might say "if there aren't any tests, people should
>>>> feel free to refactor the code; their only responsibility is to make sure
>>>> the tests will still work", but honestly, I've seen the opposite: without
>>>> tests, it's just presumed that touching certain parts of code is likely to
>>>> break things, and so after a time, an aura of untouchability starts to
>>>> surround regions of the code.  And then, the more time goes by, the more
>>>> that code becomes unfamiliar to everyone (because no one is actively
>>>> maintaining it).  In the long run, the cost of an unmaintained option may
>>>> be far more than the cost of a maintained one.
>>>>
>>>
>>>
>>> I'm not actually opposed to this situation - LLVM as a project is pretty
>>> happy about making big structural changes to the codebase & holding the
>>> test coverage and downstream users accountable for ensuring quality. We
>>> rarely avoid changes due to risk of breakage & as a community push back a
>>> fair bit on reviewers suggesting we should - if someone can't demonstrate
>>> the breakage in an upstream test (or has a pretty good track record of true
>>> positives that may take some time to investigate - and thus it might be
>>> better in the short term to revert while waiting for that evidence to be
>>> provided) the changes tend to go in and stay in.
>>>
>>> Yeah, I think some parts of the code may become complicated enough to
>>> warrant separate testing - but most of the code that might move to
>>> constants for byte width, iterate over bits to that byte width, etc, will
>>> be tested on one value & might have bugs on other values that will be found
>>> (or not) downstream - best effort and all. But in cases where the code to
>>> handle novel byte widths becomes more complicated - some abstraction and
>>> unit testing would seem quite appropriate.
>>>
>>>
>>>
>>>> In short: please don't commit changes without tests.  Even if the test
>>>> is nothing but making sure this works:
>>>>
>>>> int main(int argc, char *argv[]) {
>>>>   return argv[argc - 1][0];
>>>> }
>>>>
>>>> That at least would give some freedom from the guilt of breaking
>>>> something important.
>>>>
>>>
>>> It's hard to make sure that works in a meaningful sense in this context
>>> - without a non-8-bit-byte target in upstream LLVM, which is the point of
>>> contention/discussion. It's unclear if there's a suitable target/community
>>> to provide/maintain such a target upstream. I don't think there's a
>>> "cheap"/stub/trivial target we could create that would provide what you're
>>> suggesting without bitrotting quickly and being removed (more quickly than,
>>> I think, the sort of patches to support but not provide, non-8-bit-byte
>>> targets).
>>>
>>> Though it's hard to guess without seeing the sort of patches that'd be
>>> needed.
>>>
>>> - Dave
>>>
>>>
>>>>
>>>> -- Jorg
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20191104/f34766a6/attachment.html>