<div dir="ltr">Hi,<div><br></div><div>To throw my hat into the ring, we also maintain a downstream target that has a subset of this problem: its memory is not 8-bit-addressable. This means that %uglygeps don't work, pointers can't be cast to i8* and be expected to be byte-addressed (conversions to memcpy/memset that use i8* aren't welcome either), and GEP lowering code is slightly different.</div><div><br></div><div>We maintain this currently as a pile of known technical debt. We have a CGP pass to decompose GEPs and custom-expand them taking our word size into account, but everything before that is just waiting for InstCombine to break something. AliasAnalysis also makes offsetting assumptions that are totally bogus because our pointers are word-addressed and therefore so are pointer offsets.</div><div><br></div><div>We'd be really keen to help here. We're keen to upstream anything we possibly can (and have been, over the past few months). We've had several discussions about how best to approach upstream with this, and the sticking point has always been the lack of a testing target. It's always felt to me that the idea of an addressable granule should be a fairly reasonable DataLayout addition; we can test DataLayout changes purely via opt without requiring a target that uses them. Lowering to instructions was always the testing sticking point.</div><div><br></div><div>We'd be keen to help out with whatever the community decides to do here. 
I personally feel it's reasonable that:</div><div> - LangRef/DataLayout is updated with semantically coherent changes.</div><div> - The midend optimizer is updated by someone who cares about those changes and tests are added that use the new DataLayout.</div><div> - Developers that don't care about those changes maintain a best-effort approach, which is exactly what we do right now; there are features that are tested but are still esoteric enough that I might reasonably break them without realising (OperandBundles come to mind), so I don't think there's any change in mindset here.</div><div> - Developers that care perform downstream testing and provide review feedback / revert if really bad / fixes. Again, this is how LLVM works right now - I'd guess that >80% of our real-world test coverage comes from downstream users deploying ToT LLVM rather than the upstream LIT tests / builders.</div><div><br></div><div>Cheers,</div><div><br></div><div>James</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 4 Nov 2019 at 12:16, David Blaikie via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Nov 2, 2019 at 12:45 AM Jorg Brown via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">On Fri, Nov 1, 2019 at 8:40 AM Adrian Prantl via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:</div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid 
rgb(204,204,204);padding-left:1ex">
> On Nov 1, 2019, at 3:41 AM, Dmitriy Borisenkov via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br>> It seems that there are two possible ways to move forward with non-8-bit bytes:<br>
> <br>
> 1. Commit changes without tests. Chris Lattner, Mikael Holmen, Jeroen Dobbelaere, Jesper Antonsson support this idea.<br>
> James Y Knight says that magic numbers, at least, should be removed "at least where it arguably helps code clarity". This might not be exactly the scope of the changes discussed, but it's probably worth discussing code clarity once we have concrete patches.<br>
> GCC (according to James Y Knight) follows the same practice: non-8-bit bytes are supported but there are no upstream tests. We likewise have downstream contributors who will fix the bugs if they appear in the LLVM core.<br>
> David Chisnall raised a question about what to count as a byte (which defines the scope of the changes) and we suggest using all 5 criteria he listed:<br>
> > - The smallest unit that can be loaded / stored at a time.<br>
> > - The smallest unit that can be addressed with a raw pointer in a specific address space.<br>
> > - The largest unit whose encoding is opaque to anything above the ISA.<br>
> > - The type used to represent `char` in C.<br>
> > - The type that has a size that all other types are a multiple of.<br>
> But if DSPs are less restrictive about what counts as a byte, some of the criteria could be removed.<br>
> <br>
> 2. Use an iconic target. PDP10 was suggested as a candidate. This opinion found support from Tim Northover, Joerg Sonenberger, Mehdi AMINI, Philip Reames. It's not clear, though, whether this opinion opposes upstreaming non-8-bit-byte support without tests, or only the dummy and TVM target options.<br>
> <br>
> So if there is no strong opposition to solution 1 from the people supporting the iconic-target option, we could probably move on to the patches.<br>
<br>
I'm in camp (2). Any changes that are not tested are an invitation to upstream developers to "simplify" the code, not knowing that those changes are important. Anyone who commits untested changes to LLVM will inevitably face an uphill battle against benevolent NFC refactorings that break these changes because the expectation of how the code is supposed to behave is not codified in a test. In the short term option (1) sounds more appealing because they can start right away, but I'm going to predict that it will be more expensive for the downstream maintainers of non 8-bit targets in the long term.<br></blockquote><div><br></div><div>I've worked on multiple codebases where an option existed in order to satisfy an extremely small userbase, with little or no testing, </div></div></div></blockquote><div><br>In those situations, were the core developers responsible for those features/users? Yeah, if I needed to support a certain observable feature of clang continuing to work, I'd want tests (I'm pretty serious about testing, FWIW).<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div>and as such, I'm adamantly opposed to repeating it. </div></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div> </div></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div>In addition to what Adrian said, where the weird option exists but is constantly being incidentally broken, I've seen the opposite problem: people become afraid to refactor a section of code because it might break the weird option. 
You might say "if there aren't any tests, people should feel free to refactor the code; their only responsibility is to make sure the tests will still work", but honestly, I've seen the opposite: without tests, it's just presumed that touching certain parts of code is likely to break things, and so after a time, an aura of untouchability starts to surround regions of the code. And then, the more time goes by, the more that code becomes unfamiliar to everyone (because no one is actively maintaining it). In the long run, the cost of an unmaintained option may be far more than the cost of a maintained one.<br></div></div></div></blockquote><div><br></div><div><br>I'm not actually opposed to this situation - LLVM as a project is pretty happy about making big structural changes to the codebase & holding the test coverage and downstream users accountable for ensuring quality. We rarely avoid changes due to risk of breakage & as a community push back a fair bit on reviewers suggesting we should - if someone can't demonstrate the breakage in an upstream test (or has a pretty good track record of true positives that may take some time to investigate - and thus it might be better in the short term to revert while waiting for that evidence to be provided) the changes tend to go in and stay in.<br><br>Yeah, I think some parts of the code may become complicated enough to warrant separate testing - but most of the code that might move to constants for byte width, iterate over bits to that byte width, etc, will be tested on one value & might have bugs on other values that will be found (or not) downstream - best effort and all. 
But in cases where the code to handle novel byte widths becomes more complicated - some abstraction and unit testing would seem quite appropriate.<br><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div></div><div>In short: please don't commit changes without tests. Even if the test is nothing but making sure this works:<br></div><div><br></div><div>int main(int argc, char *argv[]) {</div><div> return argv[argc - 1][0];</div><div>}</div><div><br></div><div>That at least would give some freedom from the guilt of breaking something important.</div></div></div></blockquote><div><br>It's hard to make sure that works in a meaningful sense in this context - without a non-8-bit-byte target in upstream LLVM, which is the point of contention/discussion. It's unclear if there's a suitable target/community to provide/maintain such a target upstream. I don't think there's a "cheap"/stub/trivial target we could create that would provide what you're suggesting without bitrotting quickly and being removed (more quickly than, I think, the sort of patches to support but not provide, non-8-bit-byte targets).<br><br>Though it's hard to guess without seeing the sort of patches that'd be needed.<br><br>- Dave<br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div><br></div><div>-- Jorg</div><div><br></div></div></div>
_______________________________________________<br>
LLVM Developers mailing list<br>
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
</blockquote></div></div>
</blockquote></div>
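<div>The word-addressing point made at the top of this message can be illustrated with a small C sketch. This is only an illustration under stated assumptions, not LLVM code: addr_units_for and element_offset are hypothetical helpers, and granule_bits stands in for the "addressable granule" that a DataLayout extension might carry. It shows how the same array index yields a different pointer offset once the addressable unit is no longer 8 bits.</div>

```c
/* Hypothetical helpers for illustration only; none of this is an LLVM API. */

/* Number of address units needed to hold size_bits bits when each
 * address names granule_bits bits of storage (rounded up). */
static unsigned long long addr_units_for(unsigned long long size_bits,
                                         unsigned long long granule_bits) {
    return (size_bits + granule_bits - 1) / granule_bits;
}

/* Offset, in address units, of element `index` in an array whose
 * elements are elem_bits wide, roughly what a single-index GEP
 * would have to compute on such a target. */
static unsigned long long element_offset(unsigned long long index,
                                         unsigned long long elem_bits,
                                         unsigned long long granule_bits) {
    return index * addr_units_for(elem_bits, granule_bits);
}
```

<div>With 32-bit elements, index 3 sits at offset 12 when the granule is 8 bits but at offset 6 when the granule is 16 bits, so an analysis that hard-codes the first answer would misjudge offsets on the second kind of target.</div>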