[LLVMdev] running clang format on the Mips target

Fri Jan 3 08:05:27 PST 2014

> From: Dr D. Chisnall [dc552 at hermes.cam.ac.uk] on behalf of David Chisnall [David.Chisnall at cl.cam.ac.uk]
> Sent: 24 December 2013 14:33
> To: Daniel Sanders
> Cc: Reed Kotler; LLVMdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] running clang format on the Mips target
> 
> On 24 Dec 2013, at 13:01, Daniel Sanders <Daniel.Sanders at imgtec.com> wrote:
> 
> > I'm keen to get the bugfixes you mention upstreamed. Some of my colleagues have had a look at your git repo
> > and tell me that some of the bugfixes lack testcases but should otherwise be ok to upstream. It also sounds like
> > you have working MIPS-IV support. If this is the case then I'm keen to get that upstreamed as well since it will
> > go some way towards properly supporting the older ISAs (MIPS-II and MIPS-III in particular) in use by some
> > Linux distros, OpenBSD, etc.
> 
> Some have test cases, but the lack is the main reason why I haven't pushed them upstream yet.  We have to
> demo our platform to DARPA in early January, so my current priority is getting it working enough for the demo
> code to build, even if that requires some quite embarrassing hacks.  After that deadline is passed, I 
> intend to tidy up the code and separate out all of the generic-MIPS parts from the parts specific to our CPU.

That sounds good to me. I hope the demo goes well.

It sounds like late January or early/mid February may be a good time to do the proposed clang-formatting since the
generic parts of your work will likely be upstream by then. Does that sound good to you?

> We do have MIPS IV working, as our base processor (BERI, which we've almost finished the paperwork required
> to open source and should be pushing out very soon) is a MIPS IV implementation.  I've not poked at MIPS III, but
> I now have a Loongson 2F to play with, so I may have a poke at some point.
>
> This would have been easier if a more systematic approach had been taken to the instruction definitions, starting
> with an ISA reference and defining them, along with their minimum version, and then working on patterns once the
> assembler was working.

I agree that it would have been better to have the complete ISA version information in the .td files even with the
decision in LLVM 3.0 (prior to which the MIPS target was experimental) to not support the older ISAs. The older
ISAs still wouldn't be officially supported but it would have been easier to add them and it's likely that a considerable
amount would have worked with only minor patches.

> > Yes, all patches are reviewed either before or after commit. A wider review of the backend would be a sensible thing
> > to do at some point in the near future. However, I have to balance the desirability of a perfect design and
> > implementation with the impracticality of achieving it and the business needs of our company.
> 
> Some design decisions, such as the way 64-bit registers are treated as 64-bit registers with 32-bit subregisters rather
> than as registers capable of containing 64-bit or 32-bit subvalues, have complicated things a lot.  Some of this is
> simplified, but I still find I have to duplicate patterns or add some hacky code because SelectionDAG decides that
> something should be an i32 and then it's regarded as a different register to something that takes an i64.
> 
> Lots of things look like they're work-arounds for TableGen or SelectionDAG limitations, but don't document in the code
> what these limitations are.

Unfortunately, allowing multiple types in the same register has similar issues. When there are multiple choices of types
(after type inference) it requires you to explicitly specify types to resolve it to a single choice. This then forces you back
into duplicating the patterns to cover all the types you wanted to cover. I've hit this problem in a few places in the
implementation of MSA.

> > I will therefore need to work this around existing schedules and deadlines without disrupting them. At the moment, my
> > aim is to free up developer-time from style issues in patches. The time saved in this area is likely to be significant since
> > our team is split between three/four timezones and as a result discussions about patches can take multiple days.
> >
> > I'm also keen to hear specific criticisms from outside our team. Could you elaborate on the issues you mention?
> 
> There are almost no comments in the code.  Things like the expansion of JALR rely on some magic and some documentation
> would be very helpful.
> 
> There have been some very odd implementation choices, for example deciding to special-case hardware register 29,
> when it is actually less code to support all 31 hardware registers.
> 
> There are loads of test cases for [d]la with immediate arguments, but [d]la is almost always used with a symbol as the
> argument and this case wasn't working.
> 
> No effort has been made over the course of development to ensure that we get the same output via the assembler as we
> do via direct object code emission.  This is a very simple and obvious sanity check, but the back end can't consume its
> own output, which means that bugs are very easy to slip in.
> 
> David

Thanks, those are fair criticisms.

Many of them are a result of us using binutils or direct object emission in our LLVM-based toolchains. This has resulted in a
strong focus on code generation and direct object emission but the assembler has had far less attention. This has been fine
for our purposes so far but may be something we ought to re-visit. I'll discuss this with the managers when I'm back in the office.
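
As a rough illustration of the sanity check David describes, something along these lines could compare the two emission
paths. This is only a sketch under assumptions: the helper names and the direct.o/via_asm.s/via_asm.o file names are made
up for the example, llc and llvm-mc are assumed to be on PATH, and a byte-for-byte comparison is stricter than necessary
since benign differences between the two objects would also be reported as a mismatch.

```python
import filecmp
import os
import subprocess


def direct_object_cmd(ir_path, obj_path, march="mips"):
    """Command line for emitting an object file directly from llc."""
    return ["llc", "-march=" + march, "-filetype=obj", ir_path, "-o", obj_path]


def via_assembler_cmds(ir_path, asm_path, obj_path, march="mips"):
    """Command lines for emitting assembly with llc, then assembling it with llvm-mc."""
    return [
        ["llc", "-march=" + march, "-filetype=asm", ir_path, "-o", asm_path],
        ["llvm-mc", "-arch=" + march, "-filetype=obj", asm_path, "-o", obj_path],
    ]


def round_trip_matches(ir_path, workdir=".", march="mips"):
    """Return True if both emission paths produce byte-identical object files.

    The output file names are hypothetical and only used for this example.
    """
    direct_obj = os.path.join(workdir, "direct.o")
    asm_file = os.path.join(workdir, "via_asm.s")
    asm_obj = os.path.join(workdir, "via_asm.o")

    # Path 1: direct object emission.
    subprocess.run(direct_object_cmd(ir_path, direct_obj, march), check=True)
    # Path 2: textual assembly fed back through the integrated assembler.
    for cmd in via_assembler_cmds(ir_path, asm_file, asm_obj, march):
        subprocess.run(cmd, check=True)

    # shallow=False forces a comparison of the file contents.
    return filecmp.cmp(direct_obj, asm_obj, shallow=False)
```

Calling round_trip_matches("test.ll") on each input of an existing test directory would flag the cases where the back end
cannot consume its own output (llvm-mc fails) or where the two paths disagree (the function returns False).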

The comments and documentation situation has started to improve but has a long way to go. Jack Carter in particular has
been raising the lack of API and implementation level comments in non-trivial functions in several internal patch reviews.
We currently lack documentation for many user-level issues (e.g. what intrinsics are supported, the constraints available to
inline assembly, etc.) and higher level design issues (e.g. the meaning of 'SE', magic holding things together, valid but unexpected
code-selection, etc.). I need to think about the best way to do some of this since inaccurate or out-of-date documentation is
arguably worse than no documentation and it can be difficult to identify the documentation affected by a given piece of code.