[PATCH] D23561: [RISCV 4/10] Add basic RISCV{InstrFormats, InstrInfo, RegisterInfo, }.td

Wed Aug 31 15:39:39 PDT 2016

jyknight added inline comments.

================
Comment at: lib/Target/RISCV/RISCVRegisterInfo.td:64
@@ +63,3 @@
+  (sequence "X%u", 0, 31)
+)>;
+
----------------
asb wrote:
> There have been some responses to this that haven't been picked up by phabricator: [1](http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160829/386399.html), [2](http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160829/386425.html).
> 
> To restate the discussion there - I would like to initially focus on MC concerns in this patchset and only introduce codegen concerns when they can be tested. Would you be happy with a comment here explaining that the allocation order should be changed when codegen is implemented?
Yes, a TODO note sounds fine. I just didn't want it to get forgotten just because a future change might not touch this line of code.

================
Comment at: lib/Target/RISCV/RISCVRegisterInfo.td:66
@@ +65,2 @@
+
+def GPR32 : RegisterClass<"RISCV", [i32], 32, (add GPR)>;
----------------
asb wrote:
> asb wrote:
> > Could you elaborate on where things go wrong for SPARC? If it's conceptual failing of having a single set of register IDs, what precisely is the issue? There doesn't for instance seem to be an assumption that any given `Register` is a member of only a single `RegisterClass`. Or is it likely due to bugs in tablegen or elsewhere?
> > 
> > In PPC and AArch64, the 32-bit registers are defined as subregisters of the 64-bit GPRs but I'm not sure if this is as correct for RISC-V. The compiler is either targeting RV32 or RV64, and when targeting RV64 there is no instruction that will modify only 32-bits of the register. Values are always held in sign-extended format so bit 31 will be copied in bits 32-63 if you for instance execute an `ADDW` or one of the other RV64 instructions defined to work on 32-bit quantities. Load instructions defined to either zero bits 32-63 or to sign-extend the loaded value.
> Ok, having had a look at this some more I think I mostly answered my own question. When writing to a register in RV64 (e.g. with `ADDW`) we do indeed need to model in-register sign extension, which I think is similar to MIPS. But the operands to `ADDW` or `SUBW` come from truncating the 64-bit register. Arguably the most sensible way of modelling this is by having a 32-bit register which is a subregister of the matching 64-bit GPR.
> 
> What I really want is that almost all instructions are defined as taking register operands from the 'GPR' set, which depending on if the target is RV32 or RV64 may be GPR32 or GPR64 (or in the future with RV128, GPR128). A small number of instructions as `SLLW` are defined to take a 32-bit subregisters, and in the future with RV128 we would have `ADDD`, `SLLD` and friends which take the 64-bit subregister of one of the GPR128 register set.
> 
> I'll have more of a think about what I want to do here.
Originally, the SPARC 64bit support was written by reusing the 32bit instruction patterns in r178527. This definitely didn't work, because the instructions specified the 32-bit register class. It apparently "almost" worked though -- until the return value needed to be spilled, and got stored in 4-bytes of memory instead of 8.

That particular thing been fixed since -- by adding separate 64-bit instructions -- but the legacy of specifying a single register set still remains. I'm not really sure what exact problems it causes, I've not really pushed on sparc64, since I only really care about sparc32.

But I do think the right thing really is to specify separate 32bit and 64bit registers (with a subreg relationship), and separately specify the integer instructions for each mode.

As far as modeling ADDW, I believe it's fine to model it as having the 32-bit subregs as input/outputs; that it mangles the upper bits while doing so should be fine. 

https://reviews.llvm.org/D23561