[PATCH] D93364: [RISCV] Load/Store vector mask types.

Wed Dec 16 23:04:08 PST 2020

HsiangKai added inline comments.

================
Comment at: llvm/test/CodeGen/RISCV/rvv/load-mask.ll:14
+; CHECK-NEXT:    ret
+  %a = load <vscale x 64 x i1>, <vscale x 64 x i1>* %pa
+  store <vscale x 64 x i1> %a, <vscale x 64 x i1>* %pb
----------------
craig.topper wrote:
> HsiangKai wrote:
> > rogfer01 wrote:
> > > rogfer01 wrote:
> > > > craig.topper wrote:
> > > > > How do we ensure that the location we're loading/storing is the right size for this? The mask is (vlenb/8) * 64 * 1 bits. But the load/store size is (vlenb/8)*64*8 bits.
> > > > I imagine we could set `vl=max(1,vlenb/(8*8)),sew=8` in this case rather than `vl=vlmax,sew=8`. We still have to load/store at least one `i8` (hence the `max`). This is what already happens when a scalar load/store of `i1` appears in IR.
> > > > 
> > > > However I'm not sure whether this scenario in IR will happen very often. If it doesn't then I imagine the slightly less straightforward code generation may be OK?
> > > I forgot to account for `lmul`, so I think a reasonable `vl` would be `vl=max(1, (vlenb*lmul)/(8*8))`
> > I configure the load/store using e8,m1. The load/store size is (vlenb/8)*8*8 bits, not (vlenb/8)*64*8. (vlenb/8)*64*8 is e8,m8. You could imagine that I treat the load/store as a load/store of type <vscale x 8 x i8>.
> > 
> > Why do we need to consider LMUL here? All mask types are stored in one vector register. All loads/stores of mask types should use PseudoVLE#sew#_V_M1/PseudoVSE#sew#_V_M1.
> > 
> > We will reserve a whole vector register size in stack for mask types. Using e8,m1 with vl=VLMAX should correctly read out the mask values.
> You're right, I did do that math wrong. So the vscale x 64 x i1 is ok. But we're using e8,m1 with vlmax for types smaller than vscale x 64 x i1 as well, right?
> 
> When you say "We will reserve a whole vector register size in stack for mask types", do you mean for spills and reloads? That's a different case from these IR tests, right?
I think so. For vscale x 32 x i1, we still use e8,m1 with vlmax to read the whole vector out.

Yeah, what I have in mind is spilling and argument passing through the stack. We have not implemented frame handling upstream yet, so I created the test cases this way. I will prepare the frame handling for RISC-V V later.
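
To make the sizes in this thread concrete, here is a rough sketch of the arithmetic. It is not part of the patch. The helper names are hypothetical, and it assumes vscale = VLEN/64 (i.e. vlenb/8), as in the formulas quoted above.

```python
# Illustrative only: hypothetical helpers, not LLVM APIs.
# Assumption: vscale = VLEN / 64 = vlenb / 8, where vlenb is VLEN in bytes.

def mask_bytes(n: int, vlenb: int) -> int:
    """Bytes occupied by <vscale x n x i1>, rounded up to at least one byte."""
    vscale = vlenb // 8
    bits = vscale * n
    return max(1, (bits + 7) // 8)

def e8_m1_vlmax_bytes(vlenb: int) -> int:
    """Bytes touched by a unit-stride e8,m1 load/store with vl=VLMAX.

    VLMAX for SEW=8, LMUL=1 is VLEN/8 elements of one byte each,
    which is exactly one vector register (vlenb bytes).
    """
    return vlenb

if __name__ == "__main__":
    for vlen in (64, 128, 256):
        vlenb = vlen // 8
        for n in (1, 8, 32, 64):
            print(f"VLEN={vlen} <vscale x {n} x i1>: "
                  f"mask={mask_bytes(n, vlenb)}B, "
                  f"e8,m1 access={e8_m1_vlmax_bytes(vlenb)}B")
```

Under these assumptions the two sizes agree for vscale x 64 x i1. For smaller mask types the e8,m1 access with vl=VLMAX touches a whole register's worth of bytes, more than the mask itself occupies, which is why a whole vector register is reserved on the stack.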

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93364/new/

https://reviews.llvm.org/D93364