[PATCH] D97895: [RISCV] Starting fixing issues that prevent us from testing vXi64 intrinsics on RV32.
Fraser Cormack via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 9 01:33:58 PST 2021
frasercrmck added inline comments.
================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:2454
+ // point.
+ // vmv.v.x vX, hi
+ // vsll.vx vX, vX, /*32*/
----------------
craig.topper wrote:
> frasercrmck wrote:
> > craig.topper wrote:
> > > frasercrmck wrote:
> > > > craig.topper wrote:
> > > > > I was thinking maybe we just need two slide1ups using SEW=32 with VL set to 2 so that we don't slide anything but the scalars we're inserting.
> > > > Crafty; I like it. Doing that later along with INSERT_VECTOR_ELT would be my preferred way to go.
> > > I was also wondering if we could do something like this for the splat
> > >
> > > ```
> > > vmv.v.x vX, hi // using SEW=64
> > > vsll.vx vX, vX, /*32*/ // clear the lower 32 bits like we're doing now.
> > > vsetvli e32 // same vl with half the lmul of the 64 bit type.
> > > vwaddu.wx vX, vX, lo // widening add zero-extends the lo value to 64 bits. Since we cleared the lower 32 bits above, this is equivalent to OR.
> > > vsetvli e64 // back to original sew/lmul
> > > ```
> > >
> > > The nice advantage it has is that it can be done in one physical register or physical register group. The current sequence requires two.
> > Sounds like that'd work, yep.
> >
> > Off the top of my head, could you not also extend the INSERT_VECTOR_ELT sequence with a `vrgather.vi vd, vs2, 0` to splat the first element? Perhaps fewer instructions but you'd still need that second register due to the non-overlap constraint. I'm not sure which is better: perhaps it's situational.
>
> I'm not following. INSERT_ELEMENT has a scalar register input. How did it get to element 0 for `vrgather.vi vd, vs2, 0`?
>
Ah sorry, I might have skipped a step. I was proposing extending the two-slide1up sequence you proposed earlier with a `vrgather.vi` to get the value from element zero to all elements.
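
To make the comparison concrete, here is a rough Python model of the two splat sequences discussed in this thread. It is only a sketch of the arithmetic, not the lowering code: the function names are made up for illustration, vectors are plain lists, tail elements are ignored, and I'm assuming little-endian element layout when two SEW=32 elements are reinterpreted as one SEW=64 element.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sext32_to_64(x):
    """Sign-extend a 32-bit scalar to 64 bits (what vmv.v.x does when SEW > XLEN)."""
    x &= MASK32
    return (x | ~MASK32) & MASK64 if x & 0x80000000 else x


def splat_shift_add(hi, lo, vl):
    """Hypothetical model of: vmv.v.x hi (SEW=64); vsll.vx 32; widening unsigned add of lo."""
    elem = sext32_to_64(hi)              # vmv.v.x vX, hi
    elem = (elem << 32) & MASK64         # vsll.vx clears the low 32 bits
    summed = (elem + (lo & MASK32)) & MASK64  # lo is zero-extended by the widening add
    # The low half is zero, so the add cannot carry and matches a bitwise OR.
    assert summed == elem | (lo & MASK32)
    return [summed] * vl


def slide1up(vs2, scalar, vl):
    """Model of vslide1up.vx for the first vl elements: vd[0] = scalar, vd[i] = vs2[i-1]."""
    return [scalar & MASK32] + [vs2[i - 1] for i in range(1, vl)]


def splat_slide_gather(hi, lo, vl):
    """Hypothetical model of: two slide1ups at SEW=32 with VL=2, then vrgather.vi ..., 0."""
    v = [0, 0]                   # initial contents do not matter
    v = slide1up(v, hi, 2)       # [hi, old element 0]
    v = slide1up(v, lo, 2)       # [lo, hi]
    elem0 = v[0] | (v[1] << 32)  # view the pair as SEW=64 element 0
    return [elem0] * vl          # vrgather.vi with index 0 splats element 0
```

Both models should produce the same vector for any `hi`/`lo` pair; the difference that matters for the real sequences is register usage, since the slide and gather instructions cannot overlap source and destination.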
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D97895/new/
https://reviews.llvm.org/D97895