[llvm-dev] [RFC] Implementing a general purpose 64-bit target (RISC-V 64-bit) with i64 as the only legal integer type

Thu Oct 4 10:21:33 PDT 2018

Hi Alex,

I don't have anything to add with respect to the base instruction
sets, but you asked me to comment on possible interactions with the
vector extension. For context, like many SIMD instruction sets, RVV
supports sub-GPR data widths. That is, the vector unit can operate on
elements that are 8, 16, 32, or 64 bit wide (the last one only on
RV64I or RV32IFD) and vectors of i8, i16, etc. intuitively should be
legal.

That raises questions about whether the corresponding scalar types
should be legal too, not just because code will be inserting and
extracting vector elements, but also because RVV directly supports
scalar operations of the same element widths too. For example, if you
have a vector register holding 8 bit elements, you can also use it do
to scalar 8 bit integer operations (taking your inputs from lane 0 and
broadcasting the result back into all lanes). So in the sense of what
operations are supported in hardware, arguably all of i8, i16, i32 and
i64 could be legal on RV64IV. In reality that's probably unacceptable
because it would make lots of purely scalar code enable the vector
unit and move scalars back and forth between vector registers and
GPRs, instead of just legalizing the smaller integers in GPRs. There's
also other complications relating to how the element width of the
vector registers is determined (this part of the spec is still in flux
currently).

I have not done any significant work on these sub-XLEN vector element
widths yet, partly because of the aforementioned details still being
in flux. Still, considering the severe disadvantages of "i8, i16, i32,
i64 legal", I am leaning towards keeping XLenVT as the only legal
integer type in RVV. The ensuing legalization of smaller scalars is a
a bit of an obstacle if some scalar operations should happen in the
vector register field instead of in GPRs, but that seems like the
least bad option. It's not great if we have to reverse engineer from
legalization output that something was e.g. an 8 bit ISD::MULHU to be
able to emit an 8 bit "scalar vmulhu" instruction, but it beats the
above alternative.

So in summary, I don't know for sure but it looks like we'll want to
stick with "only i32 is legal for RV32, only i64 is legal for RV64"
despite RVV bringing in vectors of i8, i16, etc. and even native
support for the corresponding scalars.

Cheers,
Robin

On Wed, 3 Oct 2018 at 11:27, Alex Bradbury via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
> # Purpose of this RFC
> This RFC describes the challenges of modelling the 64-bit RISC-V target (RV64)
> and details the two most obvious implementation choices:
> 1) Having i64 as the only legal integer type
> 2) Introducing i32 subregisters
>
> I've worked on implementing both approaches and fleshed out a pretty complete
> implementation of 1), which is my preferred option. With this RFC, I would
> welcome further feedback and insight, as well as suggestions or comments on
> the target-independent modifications (e.g. TargetInstrInfo hooks) I suggest as
> worthwhile.
>
> # Background: RV64
> The RISC-V instruction set is structured as a set of bases (RV32I, RV32E,
> RV64I, RV128I) with a series of optional extensions (e.g. M for
> multiply/divide, A for atomics, F+D for single+double precision floating
> point). It's important to note that RV64I is not just RV32I with some
> additional instructions, it's a completely different base where operations
> work on 64-bit rather than 32-bit values. RV64I also introduces 10 new
> instructions: ld/sd (64-bit load/store), addiw, slliw, srliw, sraiw, addw,
> subw, sllw, srlw, sraw. The `*W` instructions all produce a sign-extended
> result and take the lower 32-bits of their operands as inputs. Unlike MIPS64,
> there is no requirement that inputs to these `*W` are sign-extended in order
> to avoid unpredictable behaviour.
>
> # Background: RISC-V backend implementation.
> Other backends aiming to support both 32-bit and 64-bit architecture variants
> handle this by defining two versions of each instruction with overlapping
> encodings, with one marked as isCodeGenOnly.  This leads to unwanted
> duplication, both in terms of tablegen descriptions and throughout the C++
> implementation of the backend (e.g. any code checking for RISCV::ADD would
> also want to check for RISCV::ADD64). Fortunately we can avoid this thanks to
> the work Krzysztof Parzyszek contributed to support variable-sized register
> classes <http://lists.llvm.org/pipermail/llvm-dev/2016-September/105027.html>.
> The in-tree RISC-V backend exploits this, parameterising the base instruction
> definitions by XLEN (the size of the general purpose registers).
>
> # Option 1: Have i64 as the only legal type
> ## Approach
> Every register class in RISCVRegisterInfo.td is parameterised by XLenVT, which
> is i32 for RV32 and i64 for RV64. No subregisters are defined, meaning i32 is
> not a legal type. Patterns for the `*W` instructions tend to look something
> like:
>
>     def : Pat<(sext_inreg (add GPR:$rs1, GPR:$rs2), i32),
>               (ADDW GPR:$rs1, GPR:$rs2)>;
>
> Essentially all patterns for RV32I are also valid for RV64I.
>
> ## Changes needed
> * Introduction of new patterns, RV64I-specific immediate materialisation
>
> * A number of SelectionDAG nodes generated from LLVM intrinsics take i32
> arguments and the DAG legalizer doesn't currently know how to legalize them.
> Promoting these arguments is trivial but requires additions to
> LegalizeIntegerTypes.cpp. So far I've had to do this for
> frameaddr/returnaddr/prefetch, but there are likely more.
>
> * The shift amount type is i64. If the shift amount operand is smaller than
> this, SelectionDAGBuilder will zero-extend it (changed from any-extend in
> rL125457). i32->i64 zero-extension is more expensive than sign-extension, but
> it's unnecessary anyway as only the lower 6 bits are used. Introduce
> TargetLowering::getExtendForShiftAmount which is called during
> SelectionDAGBuilder::visitShift.
>
> * When promoting setcc operands, DAGTypeLegalizer::PromoteSetCCOperands makes
> the arbitrary choice to zero-extend. It is cheaper to sign-extend from i32 to
> i64, so introduce TargetLowering::isSExtCheaperThanZExt(FromTY, ToTy). For now
> this is only used through PromoteSetCCOperands, but perhaps there are other
> cases where it would be useful?
>
> * When 32-bit srl is legalized, the dag combiner will try to reduce the bits
> in the mask in: (srl (and val, 0xffffffff), imm) based on the knowledge of the
> lower bits that will be shifted out. This means a tablegen pattern matching
> 0xffffff won't work. Custom selection code in RISCVDAGToDAGISel can recognize
> when this has happened and produce SRLIW.
>
> * New i64 versions of the target-specific intrinsics added to aid the lowering
> of part-word atomicrmw must be defined.
>
> * RV64F (single-precision floating point) requires a little extra work due to
> the fact i32 is not a legal type. When call lowering happens post-legalisation
> (e.g. when an intrinsic was inserted during legalisation). A bitcast from f32
> to i32 can't be introduced. There's a similar challenge for RV32D. Introduce
> target-specific DAG nodes that perform bitcast+sext for f32->i64 and
> trunc+bitcast for i64->f32. Custom-lower ISD::BITCAST to ensure these nodes
> are selected.
>
> ## Questions
> Does anyone have any reservations about this approach of having i64 as the
> only legal type?
>
> Some of the target hooks could perhaps be replaced with more heroics in the
> backend. What are people's feelings here?
>
> # Option 2: Model 32-bit subregs
> ## Approach
> Define 32-bit subregisters for the GPRs that can be used in patterns and
> instruction definitions. The following node types are potentially useful:
> * `EXTRACT_SUBREG`: Supports getting the lower 32-bits of a 64-bit register
> * `INSERT_SUBREG`: Assumes only the lower bits are modified. Can be used with
> `IMPLICIT_DEF` to indicate that the upper bits are undefined. You can't
> directly represent sign-extension, but you can do what Mips64 does and define
> extra patterns to catch redundant sign-extension after one of the `*W`
> instructions.
> * `SUBREG_TO_REG`: a constant argument asserts the value of the bits left in
> the upper portion of the register. This is perfect for zero-extension, and not
> much good for the sign-extension RISC-V performs.
>
> You end up with patterns like:
>
>     def : Pat<(anyext GPR32:$reg),
>               (SUBREG_TO_REG (i64 0), GPR32:$reg, sub_32)>;
> def : Pat<(trunc GPR:$reg), (EXTRACT_SUBREG GPR:$reg, sub_32)>;
>
> def : Pat<(add GPR32:$src, GPR32:$src2),
> (ADDW GPR32:$src, GPR32:$src2)>;
>
> def : Pat<(add GPR32:$rs1, simm12_i32:$imm12),
> (ADDIW GPR32:$rs1, simm12_i32:$imm12)>;
>
> ## Changes needed
> * 32-bit subregisters must be defined. Some register classes need GPR32
> versions, e.g. GPR, GPRNoX0, GPRC.
>
> * The RISCVAsmParser and RISCVDisassembler must be modified to support the new
> register classes used for the 32-bit subregs.
>
> * The calling convention implementation must handle promotion of i32
> arguments/returns to i64.
>
> * The `*W` instructions must be defined using GPR32.
>
> * New `Operand<i32>` types must be defined and used in the `*W` instructions.
>
> * When defining a variable-sized register class you specify a DefaultMode.
> This must be set to i64 to avoid breaking RV32 compilation.
>
> * This gives enough to define working support for the `*W` operations, but to
> enable codegen for the other integer instructions requires either duplication
> or smarts. To write patterns using i32 you need to define a new variant of the
> instruction. TableGen changes might remove the need for this. Even with such
> support, it's not particularly desirable to write a bunch of new patterns for
> instructions other than the `*W` ones.
>
> I'm sure solutions are possible, but given that the i64-only approach
> seems to work very well, I'm not sure it's worth pushing further.
>
> # Conclusion
> Taking full advantage of support for variable-sized register classes and
> sticking with i64 as the only legal integer type seems very workable and is
> definitely my preference based on the work I've done. I'd be really interested
> if anyone has any particular concerns or advice, or feedback on the suggested
> new target hooks.
>
> Best,
>
> Alex Bradbury, lowRISC CIC
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev