[llvm-dev] [RFC] Implementing a general purpose 64-bit target (RISC-V 64-bit) with i64 as the only legal integer type

Bruce Hoult via llvm-dev llvm-dev at lists.llvm.org
Wed Oct 3 18:48:04 PDT 2018


Only having i64 seems cleaner to me. Of course you can still have i32 in
the code up until legalisation.

I think the only real downside is that you can end up with 64-bit arithmetic
on things that are actually 32-bit, followed by a sext? That can be cleaned
up to a *w instruction in most cases, and already is.

Example:

----------- ops.c
int add(int a, int b){return a+b;}
int sub(int a, int b){return a-b;}
int mul(int a, int b){return a*b;}
int div(int a, int b){return a/b;}

unsigned addu(unsigned a, unsigned b){return a+b;}
unsigned subu(unsigned a, unsigned b){return a-b;}
unsigned mulu(unsigned a, unsigned b){return a*b;}
unsigned divu(unsigned a, unsigned b){return a/b;}
-----------
bruce@nuc:~/riscv/tests$ clang -O -c ops.c --target=riscv64 -march=rv64gc
bruce@nuc:~/riscv/tests$ riscv64-unknown-elf-objdump -d ops.o

ops.o:     file format elf64-littleriscv


Disassembly of section .text:

0000000000000000 <add>:
   0: 9d2d                addw a0,a0,a1
   2: 8082                ret

0000000000000004 <sub>:
   4: 9d0d                subw a0,a0,a1
   6: 8082                ret

0000000000000008 <mul>:
   8: 02a58533          mul a0,a1,a0
   c: 2501                sext.w a0,a0
   e: 8082                ret

0000000000000010 <div>:
  10: 02b54533          div a0,a0,a1
  14: 2501                sext.w a0,a0
  16: 8082                ret

0000000000000018 <addu>:
  18: 9d2d                addw a0,a0,a1
  1a: 8082                ret

000000000000001c <subu>:
  1c: 9d0d                subw a0,a0,a1
  1e: 8082                ret

0000000000000020 <mulu>:
  20: 02a58533          mul a0,a1,a0
  24: 2501                sext.w a0,a0
  26: 8082                ret

0000000000000028 <divu>:
  28: 00000637          lui a2,0x0
  2c: 0006069b          sext.w a3,a2
  30: 1682                slli a3,a3,0x20
  32: 367d                addiw a2,a2,-1
  34: 1602                slli a2,a2,0x20
  36: 9201                srli a2,a2,0x20
  38: 8e55                or a2,a2,a3
  3a: 8df1                and a1,a1,a2
  3c: 8d71                and a0,a0,a2
  3e: 02b55533          divu a0,a0,a1
  42: 2501                sext.w a0,a0
  44: 8082                ret

The divu is pretty bad. The add/sub/addu/subu are perfect.
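
Reading the divu listing step by step, the long prologue appears only to build the constant 0xffffffff and mask both operands with it, i.e. it zero-extends the two inputs before a 64-bit divu. A minimal C sketch of that reading follows. It replays the listing one instruction per statement, assumes the lui immediate really is 0 (no relocation pending in the .o), and relies on the usual two's-complement behaviour of narrowing casts. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of x to 64 bits, as sext.w does. */
int64_t sext_w(uint64_t x) { return (int64_t)(int32_t)(uint32_t)x; }

/* Replays the constant-building part of the divu listing, one statement per
   instruction. Assumption: the lui immediate really is 0. */
uint64_t divu_mask(void) {
    uint64_t a2 = 0;                           /* lui    a2,0x0     */
    uint64_t a3 = (uint64_t)sext_w(a2);        /* sext.w a3,a2      */
    a3 <<= 32;                                 /* slli   a3,a3,0x20 */
    a2 = (uint64_t)sext_w(a2 + (uint64_t)-1);  /* addiw  a2,a2,-1   */
    a2 <<= 32;                                 /* slli   a2,a2,0x20 */
    a2 >>= 32;                                 /* srli   a2,a2,0x20 */
    a2 |= a3;                                  /* or     a2,a2,a3   */
    return a2;
}

/* The whole divu body as emitted; the low 32 bits of a1 must be non-zero. */
int64_t divu_as_emitted(uint64_t a0, uint64_t a1) {
    uint64_t mask = divu_mask();
    a1 &= mask;                                /* and  a1,a1,a2 */
    a0 &= mask;                                /* and  a0,a0,a2 */
    return sext_w(a0 / a1);                    /* divu + sext.w */
}

/* Model of divuw: unsigned divide of the low 32 bits, result sign-extended. */
int64_t divuw(uint64_t a0, uint64_t a1) {
    return sext_w((uint32_t)a0 / (uint32_t)a1);
}

/* Hypothetical self-check: the mask is 0xffffffff and both forms agree. */
void check_divu(void) {
    assert(divu_mask() == 0xffffffffull);
    assert(divu_as_emitted(0xdeadbeef00000064ull, 7) ==
           divuw(0xdeadbeef00000064ull, 7));
}
```

If this reading is right, a single divuw would compute the same value, since it ignores the upper 32 bits of both operands.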

The mul/mulu and div could all be cleaned up to use a *w instruction and
drop the sext.w. Is this not happening because the information has been
lost that the inputs are restricted to i32?
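
As a sanity check on the claim that mul/mulu and div could use a *w instruction, here is a small C model (a sketch, not compiler output). For mul the low 32 bits of the product depend only on the low 32 bits of the inputs, so mul + sext.w and mulw agree for any register contents. For div the equivalence leans on the inputs already being sign-extended, which I believe the RV64 calling convention guarantees for int arguments. Division by zero is left out, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of x to 64 bits, as sext.w does. */
int64_t sext_w(uint64_t x) { return (int64_t)(int32_t)(uint32_t)x; }

/* mul followed by sext.w, as in the listing. */
int64_t mul_then_sext(uint64_t a, uint64_t b) { return sext_w(a * b); }

/* Model of mulw: multiply the low 32 bits, sign-extend the 32-bit result. */
int64_t mulw(uint64_t a, uint64_t b) {
    return sext_w((uint32_t)a * (uint32_t)b);
}

/* div followed by sext.w; a and b are int values already sign-extended to
   64 bits, and b is non-zero. */
int64_t div_then_sext(int64_t a, int64_t b) {
    return sext_w((uint64_t)(a / b));
}

/* Model of divw for non-zero b: signed divide of the low 32 bits. The one
   overflow case, INT32_MIN / -1, yields INT32_MIN. */
int64_t divw(int64_t a, int64_t b) {
    int32_t x = (int32_t)a, y = (int32_t)b;
    if (x == INT32_MIN && y == -1)
        return INT32_MIN;
    return x / y;
}

/* Hypothetical self-check covering garbage upper bits and the overflow case. */
void check_mul_div(void) {
    assert(mul_then_sext(0x1234567800010000ull, 0x10000) ==
           mulw(0x1234567800010000ull, 0x10000));
    assert(div_then_sext(INT32_MIN, -1) == divw(INT32_MIN, -1));
    assert(div_then_sext(-7, 2) == divw(-7, 2));
}
```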

I did this test using branch "experimental" at
https://github.com/brucehoult/llvm-project-20170507 which contains recent
(Sep 21) LLVM ToT with lowRISC patches applied.

On Wed, Oct 3, 2018 at 2:27 AM, Alex Bradbury via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> # Purpose of this RFC
> This RFC describes the challenges of modelling the 64-bit RISC-V target
> (RV64) and details the two most obvious implementation choices:
> 1) Having i64 as the only legal integer type
> 2) Introducing i32 subregisters
>
> I've worked on implementing both approaches and fleshed out a pretty
> complete implementation of 1), which is my preferred option. With this
> RFC, I would welcome further feedback and insight, as well as suggestions
> or comments on the target-independent modifications (e.g. TargetInstrInfo
> hooks) I suggest as worthwhile.
>
> # Background: RV64
> The RISC-V instruction set is structured as a set of bases (RV32I, RV32E,
> RV64I, RV128I) with a series of optional extensions (e.g. M for
> multiply/divide, A for atomics, F+D for single+double precision floating
> point). It's important to note that RV64I is not just RV32I with some
> additional instructions, it's a completely different base where operations
> work on 64-bit rather than 32-bit values. RV64I also introduces new
> instructions, including ld/sd (64-bit load/store), addiw, slliw, srliw,
> sraiw, addw,
> subw, sllw, srlw, sraw. The `*W` instructions all produce a sign-extended
> result and take the lower 32-bits of their operands as inputs. Unlike
> MIPS64, there is no requirement that inputs to these `*W` instructions are
> sign-extended in order to avoid unpredictable behaviour.
>
> # Background: RISC-V backend implementation.
> Other backends aiming to support both 32-bit and 64-bit architecture
> variants handle this by defining two versions of each instruction with
> overlapping encodings, with one marked as isCodeGenOnly. This leads to
> unwanted duplication, both in terms of tablegen descriptions and throughout
> the C++ implementation of the backend (e.g. any code checking for
> RISCV::ADD would also want to check for RISCV::ADD64). Fortunately we can
> avoid this thanks to the work Krzysztof Parzyszek contributed to support
> variable-sized register classes
> <http://lists.llvm.org/pipermail/llvm-dev/2016-September/105027.html>.
> The in-tree RISC-V backend exploits this, parameterising the base
> instruction definitions by XLEN (the size of the general purpose
> registers).
>
> # Option 1: Have i64 as the only legal type
> ## Approach
> Every register class in RISCVRegisterInfo.td is parameterised by XLenVT,
> which is i32 for RV32 and i64 for RV64. No subregisters are defined,
> meaning i32 is not a legal type. Patterns for the `*W` instructions tend to
> look something like:
>
>     def : Pat<(sext_inreg (add GPR:$rs1, GPR:$rs2), i32),
>               (ADDW GPR:$rs1, GPR:$rs2)>;
>
> Essentially all patterns for RV32I are also valid for RV64I.
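
To make the ADDW pattern quoted above concrete, here is a minimal C sketch of why sext_inreg of a 64-bit add is the same value ADDW produces, whatever sits in the upper halves of the inputs. The function names are hypothetical, and the shift pair assumes the compiler implements >> on signed values as an arithmetic shift, which is common but implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* sext_inreg(x, i32) on an i64: replicate bit 31 into bits 63..32.
   Written as the shift pair a target without a *W form would use. */
int64_t sext_inreg_i32(uint64_t x) {
    return (int64_t)(x << 32) >> 32;
}

/* Left-hand side of the pattern: a full 64-bit add wrapped in sext_inreg. */
int64_t sext_inreg_of_add(uint64_t rs1, uint64_t rs2) {
    return sext_inreg_i32(rs1 + rs2);
}

/* Model of ADDW: add the low 32 bits, sign-extend the 32-bit sum. */
int64_t addw(uint64_t rs1, uint64_t rs2) {
    return (int64_t)(int32_t)((uint32_t)rs1 + (uint32_t)rs2);
}

/* Hypothetical self-check: 32-bit wrap-around and garbage upper bits. */
void check_addw(void) {
    assert(sext_inreg_of_add(0x7fffffff, 1) == addw(0x7fffffff, 1));
    assert(sext_inreg_of_add(0xabcdef0100000005ull, 0x1111111100000003ull) ==
           addw(0xabcdef0100000005ull, 0x1111111100000003ull));
}
```
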
>
> ## Changes needed
> * Introduction of new patterns and of RV64I-specific immediate
> materialisation.
>
> * A number of SelectionDAG nodes generated from LLVM intrinsics take i32
> arguments and the DAG legalizer doesn't currently know how to legalize
> them. Promoting these arguments is trivial but requires additions to
> LegalizeIntegerTypes.cpp. So far I've had to do this for
> frameaddr/returnaddr/prefetch, but there are likely more.
>
> * The shift amount type is i64. If the shift amount operand is smaller than
> this, SelectionDAGBuilder will zero-extend it (changed from any-extend in
> rL125457). i32->i64 zero-extension is more expensive than sign-extension,
> but it's unnecessary anyway as only the lower 6 bits are used. Introduce
> TargetLowering::getExtendForShiftAmount which is called during
> SelectionDAGBuilder::visitShift.
>
> * When promoting setcc operands, DAGTypeLegalizer::PromoteSetCCOperands
> makes the arbitrary choice to zero-extend. It is cheaper to sign-extend
> from i32 to i64, so introduce
> TargetLowering::isSExtCheaperThanZExt(FromTy, ToTy). For now this is only
> used through PromoteSetCCOperands, but perhaps there are other cases where
> it would be useful?
>
> * When 32-bit srl is legalized, the dag combiner will try to reduce the
> bits in the mask in: (srl (and val, 0xffffffff), imm) based on the
> knowledge of the lower bits that will be shifted out. This means a tablegen
> pattern matching 0xffffffff won't work. Custom selection code in
> RISCVDAGToDAGISel can recognize when this has happened and produce SRLIW.
>
> * New i64 versions of the target-specific intrinsics added to aid the
> lowering of part-word atomicrmw must be defined.
>
> * RV64F (single-precision floating point) requires a little extra work due
> to the fact i32 is not a legal type. When call lowering happens
> post-legalisation (e.g. when an intrinsic was inserted during
> legalisation), a bitcast from f32 to i32 can't be introduced. There's a
> similar challenge for RV32D. Introduce target-specific DAG nodes that
> perform bitcast+sext for f32->i64 and trunc+bitcast for i64->f32.
> Custom-lower ISD::BITCAST to ensure these nodes are selected.
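
Two of the integer points in the list above can be checked with a few lines of C. This is a sketch under assumptions: the function names are invented, and the models only cover immediates in the range a 32-bit shift allows. The first function shows that the kind of extension applied to a shift amount cannot matter when only the low 6 bits are read. The others show that the narrowed mask the DAG combiner produces for (srl (and val, 0xffffffff), imm) gives the same value as the full mask, and that for imm from 1 to 31 both match a model of SRLIW.

```c
#include <assert.h>
#include <stdint.h>

/* RV64 sll: only the low 6 bits of the shift-amount register are read. */
uint64_t sll(uint64_t x, uint64_t amt) { return x << (amt & 63); }

/* (srl (and val, 0xffffffff), imm) with 0 <= imm < 32. */
uint64_t srl_masked(uint64_t val, unsigned imm) {
    return (val & 0xffffffffull) >> imm;
}

/* The same node after mask bits that get shifted out have been dropped,
   e.g. imm = 4 turns the mask into 0xfffffff0. */
uint64_t srl_narrow_mask(uint64_t val, unsigned imm) {
    uint64_t mask = 0xffffffffull & ~((1ull << imm) - 1);
    return (val & mask) >> imm;
}

/* Model of SRLIW: logical shift of the low 32 bits, result sign-extended.
   For imm >= 1 bit 31 of the result is 0, so it equals srl_masked. */
int64_t srliw(uint64_t val, unsigned imm) {
    return (int64_t)(int32_t)((uint32_t)val >> imm);
}

/* Hypothetical self-check for both observations. */
void check_shifts(void) {
    uint64_t v = 0xdeadbeefcafef00dull;
    assert(sll(1, 5) == sll(1, 0xffffffff00000005ull));
    for (unsigned imm = 1; imm < 32; imm++) {
        assert(srl_masked(v, imm) == srl_narrow_mask(v, imm));
        assert((uint64_t)srliw(v, imm) == srl_masked(v, imm));
    }
}
```
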
>
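
The RV64F item above describes two conversions: bitcast+sext for f32->i64 and trunc+bitcast for i64->f32. In C terms they look roughly like the sketch below. The helper names are made up, float is assumed to be a 32-bit IEEE-754 type, and this only illustrates the value mapping, not how the DAG nodes would be implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* f32 -> i64: reinterpret the 32 bits of the float, then sign-extend. */
int64_t f32_bits_to_i64(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (int64_t)(int32_t)bits;
}

/* i64 -> f32: truncate to the low 32 bits, then reinterpret as a float. */
float i64_to_f32_bits(int64_t x) {
    uint32_t bits = (uint32_t)x;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical self-check: the round trip preserves the float value. */
void check_f32_roundtrip(void) {
    assert(i64_to_f32_bits(f32_bits_to_i64(-2.5f)) == -2.5f);
    assert(i64_to_f32_bits(f32_bits_to_i64(1.0f)) == 1.0f);
}
```

Note that a negative float ends up with all upper 32 bits set, which is the same register form a sign-extended i32 would have.
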
> ## Questions
> Does anyone have any reservations about this approach of having i64 as the
> only legal type?
>
> Some of the target hooks could perhaps be replaced with more heroics in the
> backend. What are people's feelings here?
>
> # Option 2: Model 32-bit subregs
> ## Approach
> Define 32-bit subregisters for the GPRs that can be used in patterns and
> instruction definitions. The following node types are potentially useful:
> * `EXTRACT_SUBREG`: Supports getting the lower 32-bits of a 64-bit register
> * `INSERT_SUBREG`: Assumes only the lower bits are modified. Can be used
> with `IMPLICIT_DEF` to indicate that the upper bits are undefined. You
> can't directly represent sign-extension, but you can do what Mips64 does
> and define extra patterns to catch redundant sign-extension after one of
> the `*W` instructions.
> * `SUBREG_TO_REG`: a constant argument asserts the value of the bits left
> in the upper portion of the register. This is perfect for zero-extension,
> and not much good for the sign-extension RISC-V performs.
>
> You end up with patterns like:
>
>     def : Pat<(anyext GPR32:$reg),
>               (SUBREG_TO_REG (i64 0), GPR32:$reg, sub_32)>;
>     def : Pat<(trunc GPR:$reg), (EXTRACT_SUBREG GPR:$reg, sub_32)>;
>
>     def : Pat<(add GPR32:$src, GPR32:$src2),
>               (ADDW GPR32:$src, GPR32:$src2)>;
>
>     def : Pat<(add GPR32:$rs1, simm12_i32:$imm12),
>               (ADDIW GPR32:$rs1, simm12_i32:$imm12)>;
>
> ## Changes needed
> * 32-bit subregisters must be defined. Some register classes need GPR32
> versions, e.g. GPR, GPRNoX0, GPRC.
>
> * The RISCVAsmParser and RISCVDisassembler must be modified to support the
> new register classes used for the 32-bit subregs.
>
> * The calling convention implementation must handle promotion of i32
> arguments/returns to i64.
>
> * The `*W` instructions must be defined using GPR32.
>
> * New `Operand<i32>` types must be defined and used in the `*W`
> instructions.
>
> * When defining a variable-sized register class you specify a DefaultMode.
> This must be set to i64 to avoid breaking RV32 compilation.
>
> * This gives enough to define working support for the `*W` operations, but
> enabling codegen for the other integer instructions requires either
> duplication or smarts. To write patterns using i32 you need to define a new
> variant of the instruction. TableGen changes might remove the need for
> this. Even with such support, it's not particularly desirable to write a
> bunch of new patterns for instructions other than the `*W` ones.
>
> I'm sure solutions are possible, but given that the i64-only approach
> seems to work very well, I'm not sure it's worth pushing further.
>
> # Conclusion
> Taking full advantage of support for variable-sized register classes and
> sticking with i64 as the only legal integer type seems very workable and is
> definitely my preference based on the work I've done. I'd be really
> interested if anyone has any particular concerns or advice, or feedback on
> the suggested new target hooks.
>
> Best,
>
> Alex Bradbury, lowRISC CIC