[PATCH] D23561: [RISCV 4/10] Add basic RISCV{InstrFormats, InstrInfo, RegisterInfo, }.td

Thu Aug 18 08:09:31 PDT 2016

asb added inline comments.

================
Comment at: lib/Target/RISCV/RISCVInstrFormats.td:33
@@ +32,3 @@
+
+class FR<bits<7> opcode, bits<3> funct3, bits<7> funct7, dag outs, dag ins,
+         string asmstr, list<dag> pattern> : RISCVInst<outs, ins, asmstr, pattern>
----------------
t.p.northover wrote:
> Nit: since the spec seems to list instructions from bit 31 down to 0 (as do TableGen 0bxxxx literals), definitions might read more naturally if you make the order "funct7, funct3, opcode".
That's a good point. Obviously the advantage of listing the opcode first is it reads more logically when writing something in RISCVInstrInfo that inherits from it. i.e. you first specify the high-level opcode then might select a more specific options by varying the funct bits. More closely matching the presentation in the spec is attractive though.

================
Comment at: lib/Target/RISCV/RISCVInstrFormats.td:97
@@ +96,3 @@
+{
+  bits<13> imm13;
+  bits<5> rs1;
----------------
t.p.northover wrote:
> This immediate doesn't appear to have a bit 0 in the spec (it fits into exactly the same fields as imm12 does).
> 
> So what we're really deciding here is what the immediate in the MCInst representation of "BEQ #x" should be. Valid answers are "x" and "the bits representing how x is encoded", with newer targets tending to favour the latter.
> 
> This is mostly because it makes it harder by design to construct an invalid MCInst when you're doing it manually. Other than that, it's just punting complexity around the various MC layer components:
>   
> | MC component | with "x" | with bits |
> |--|--|--|
> | AsmParser | nop | encodes |
> | Disassembler | decodes | nop |
> | AsmPrinter | nop | decodes |
> | Fixup handling | encodes | nop |
> 
> Either way "imm13" is probably misleading as a name and it's probably a good idea to be consistent between formats if possible (I note FU has taken the second approach).
I am incredibly glad you raised this issue, as it's something I've agonised back and forth over. I considered writing something up about this in the patch description, but thought it may be of limited interest. The same issue you raise applies to FUJ vs FU.

I started going in the direction of `imm13` because:
1) this better matches the spec which describes immediates in terms of slicing the integer logically represented 
2) the idea that the MCInst representation of 'BEQ #x` should be 'x' appealed to me.

The logical value is a 13-bit immediate with the restriction that the LSB must be 0, but of course this is encoded in 12 bits. Arguably imm13m0 would be more descriptive (imm13, masked bit0) but this description is perhaps a little opaque to most. As you've spotted, I ended up going a different direction with FU. After deciding `imm32` was misleading, I tried out with `imm20shl12` as a description of the value that's logically a 20-bit immediate shifted left 12 places. However I think I found that for whatever reason the code manipulating this didn't read cleanly. 

The argument that now FU has its arguments defined in terms of the bit encoding, FSB and FUJ are inconsistent with it is persuasive. I'll modify so FSB takes an imm12 and FUJ takes an imm20.

https://reviews.llvm.org/D23561