[PATCH] D151468: [RISCV] Use v(f)slide1up for shuffle+insert idiom
Fraser Cormack via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun May 28 02:43:26 PDT 2023
frasercrmck accepted this revision.
frasercrmck added a comment.
This revision is now accepted and ready to land.
LGTM with a minor typo
In D151468#4376270 <https://reviews.llvm.org/D151468#4376270>, @reames wrote:
> In D151468#4375309 <https://reviews.llvm.org/D151468#4375309>, @frasercrmck wrote:
>
>> ; This one matches
>> %vb = insertelement <4 x i8> poison, i8 %b, i64 0
>> %v1 = shufflevector <4 x i8> %vb, <4 x i8> poison, <4 x i32> zeroinitializer
>> %v2 = shufflevector <4 x i8> %v1, <4 x i8> %v, <4 x i32> <i32 0, i32 4, i32 5, i32 6>
>>
>> ; I wouldn't have thought the splat index should matter
>> %vb = insertelement <4 x i8> poison, i8 %b, i64 0
>> %v1 = shufflevector <4 x i8> %vb, <4 x i8> poison, <4 x i32> zeroinitializer
>> %v2 = shufflevector <4 x i8> %v1, <4 x i8> %v, <4 x i32> <i32 1, i32 4, i32 5, i32 6>
>
> This is an artifact of where I put the matching code. This isn't a sub-vector insert, and I'd placed the vslide1up match below that check. Good catch. If you're okay with it, I'd like to land this as is, and then handle this in a separate patch. I need to write dedicated matching for the vslide1down case anyways, and I'd like to have the test coverage in tree already.
Yes that sounds good.
>> ; Should be the same as the first example, but doesn't match
>> %vb = insertelement <4 x i8> poison, i8 %b, i64 0
>> %v2 = shufflevector <4 x i8> %vb, <4 x i8> %v, <4 x i32> <i32 0, i32 4, i32 5, i32 6>
>
> This looks identical to the new vslide1up_4xi8_swapped test. It does match on RV64, are you maybe testing this on RV32? There's something a bit odd going on with that one; we're widening the op type on the insert. I have not looked into that one yet.
Yes good spot - I was testing only with RV32, which I don't normally do when playing around with things.
================
Comment at: llvm/test/CodeGen/RISCV/rvv/fixed-vector-shuffle-vslide1up.ll:291
+; The length of the shift is less than the suffix, since we'd have to
+; materailize the splat, using the vslide1up doesn't help us.
define <4 x i32> @vslide1up_4xi32_neg1(<4 x i32> %v, i32 %b) {
----------------
typo: materialize
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D151468/new/
https://reviews.llvm.org/D151468
More information about the llvm-commits
mailing list