[PATCH] D102505: [RISCV] Support vector types in combination with fastcc

Thu May 27 09:12:27 PDT 2021

frasercrmck added inline comments.

================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:7160
+    if (unsigned Reg =
+            allocateRVVReg(ValVT, ValNo, FirstMaskArgument, State, TLI)) {
+      // Fixed-length vectors are located in the corresponding scalable-vector
----------------
frasercrmck wrote:
> HsiangKai wrote:
> > frasercrmck wrote:
> > > frasercrmck wrote:
> > > > HsiangKai wrote:
> > > > > Should we allow all the vector registers used to pass vector arguments under fastcc, instead of limiting to v8 to v23?
> > > > Sure, that's a possibility. I think it opens up some interesting new situations if we combine that with the first mask going to v0, like (with max-lmul=8):
> > > > 
> > > > ```
> > > > define fastcc <4 x i1> @foo(<32 x i32> %x, <8 x i32> %y, <32 x i32> %z, <32 x i32> %w, <4 x i1> %m1, <4 x i1> %m2, <4 x i1> %m3) {
> > > > ; %x -> $v8m8
> > > > ; %y -> $v2m2
> > > > ; %z -> $v16m8
> > > > ; %w -> $v24m8
> > > > ; %m1 -> $v0
> > > > ; %m2 -> $v1
> > > > ; %m3 -> $v4
> > > > ```
> > > > 
> > > > Do you think that's worth it? It's slightly harder to reason about (finding which operand goes to which register takes a bit of back-and-forth) but the allocation is certainly improved. I suppose that's the goal of fastcc.
> > > Now I'm seeing a surprising register allocation failure when I allocate all registers to function arguments.
> > > 
> > > ```
> > > define fastcc <vscale x 32 x i32> @foo(<vscale x 32 x i32> %x, <vscale x 32 x i32> %y, <vscale x 32 x i32> %z, i32 %w) {
> > > ```
> > > 
> > > Gives me the following lowering code, with %x -> v0m8/v8m8, %y -> v16m8/v24m8 and %z -> indirect(x10) and indirect(x12):
> > > 
> > > ```
> > > bb.0 (%ir-block.0):
> > >   liveins: $v0m8, $v8m8, $v16m8, $v24m8, $x10, $x12
> > >   %5:gpr = COPY $x12
> > >   %4:gpr = COPY $x10
> > >   %3:vrm8 = COPY $v24m8
> > >   %2:vrm8 = COPY $v16m8
> > >   %1:vrm8 = COPY $v8m8
> > >   %0:vrm8 = COPY $v0m8
> > >   %6:vrm8 = VL8RE32_V %4:gpr :: (load unknown-size, align 64)
> > >   %7:gpr = ADDI %4:gpr, 64
> > >   %8:vrm8 = VL8RE32_V %7:gpr :: (load unknown-size, align 64)
> > > ```
> > > 
> > > Then the machine scheduler decides to do this:
> > > 
> > > ```
> > > 0B	bb.0 (%ir-block.0):
> > > 	  liveins: $v0m8, $v8m8, $v16m8, $v24m8, $x10, $x12
> > > 32B	  %4:gpr = COPY $x10
> > > 128B	  %7:gpr = ADDI %4:gpr, 64
> > > 136B	  %6:vrm8 = VL8RE32_V %4:gpr :: (load unknown-size, align 64)
> > > 144B	  %8:vrm8 = VL8RE32_V %7:gpr :: (load unknown-size, align 64)
> > > 152B	  %5:gpr = COPY $x12
> > > 160B	  %3:vrm8 = COPY $v24m8
> > > 168B	  %2:vrm8 = COPY $v16m8
> > > 176B	  %1:vrm8 = COPY $v8m8
> > > 184B	  %0:vrm8 = COPY $v0m8
> > > ```
> > > 
> > > And then register allocator is unable to allocate for `%6` or `%8` since all physical registers are occupied. I'm surprised the scheduler made that change.
> > > 
> > > I'm not the most familiar with other targets and their calling conventions, but perhaps this isn't supported. A compromise could be to leave `v24m8` free but continue to use `v1-v7`, which is still more than the base calling convention supports.
> > > 
> > > Any thoughts?
> > I am curious why the register allocator does not spill the vector registers?
> > 
> > By the way, it is strange to add 64 to the base address($x10) for the arguments. It should add `8 x vlenb`, right? I know it is another issue not related to the patch. We could prepare another patch to fix it.
> Are you asking why it doesn't spill one of the `liveins` so it can move `%6` into it? I'm not sure. I'm not an expert on the register allocator in any way, but I have seen issues with calling conventions + regalloc in the past. I wonder if it isn't able to spill physical livein registers.
> 
> Yes, it's a bit suspicious that it loads from a fixed offset from the base address. Probably not caused by this patch but I'll look into that.
Please see D103262 for a fix for that offset issue.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102505/new/

https://reviews.llvm.org/D102505