[PATCH] D130487: [PowerPC] Fix vector_shuffle combines when inputs are scalar_to_vector of differing types.

Nemanja Ivanovic via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 31 07:31:53 PDT 2022


nemanjai added a comment.

Can you also comment on whether this was thoroughly tested on both little-endian and big-endian systems (bootstrap, test-suite, SPEC, additional internal tests)?



================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:14917-14918
+    // to the size of the scalar input to the SCALAR_TO_VECTOR later on.
+    int LHSScalarSize = 128;
+    int RHSScalarSize = 128;
 
----------------
I don't follow why we need these here. They both seem to only be needed in the respective conditions (i.e. depending on whether the LHS/RHS are `scalar_to_vector` nodes). And within those conditional blocks, they are reset before they're used.

So why do we need to define them here and initialize them to the width of a vector?
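
A minimal standalone sketch of the narrower scoping this comment suggests. The `Operand` struct and the `scalarSizeIfS2V` helper are hypothetical stand-ins for illustration only; they are not LLVM types and not code from the patch.

```cpp
#include <optional>

// Hypothetical stand-in for a shuffle operand; NOT the LLVM SDValue type.
struct Operand {
  bool IsScalarToVector; // true if the operand is a scalar_to_vector node
  int ScalarBits;        // width of the scalar input; meaningful only then
};

// Hypothetical helper: yields the scalar width only for scalar_to_vector
// operands, so no 128-bit placeholder has to live in the enclosing scope.
std::optional<int> scalarSizeIfS2V(const Operand &Op) {
  if (!Op.IsScalarToVector)
    return std::nullopt;
  int ScalarSize = Op.ScalarBits; // declared at the point it is first set
  return ScalarSize;
}
```

With this shape, a value exists only in the branch that actually handles a `scalar_to_vector` input, and no default of 128 is needed.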


================
Comment at: llvm/test/CodeGen/PowerPC/v16i8_scalar_to_vector_shuffle.ll:267
 
 define <16 x i8> @test_v16i8_v8i16(i16 %arg, i8 %arg1) {
 ; CHECK-LE-P8-LABEL: test_v16i8_v8i16:
----------------
The code for this one gets worse on big endian. Do we know why?


================
Comment at: llvm/test/CodeGen/PowerPC/v16i8_scalar_to_vector_shuffle.ll:347
 
 define <16 x i8> @test_v8i16_v16i8(i16 %arg, i8 %arg1) {
 ; CHECK-LE-P8-LABEL: test_v8i16_v16i8:
----------------
The code for this one gets worse on big endian. Do we know why?


================
Comment at: llvm/test/CodeGen/PowerPC/v16i8_scalar_to_vector_shuffle.ll:578
 
 define <16 x i8> @test_v16i8_v4i32(i8 %arg, i32 %arg1, <16 x i8> %a, <4 x i32> %b) {
 ; CHECK-LE-P8-LABEL: test_v16i8_v4i32:
----------------
The code for this one gets worse on big endian. Do we know why?


================
Comment at: llvm/test/CodeGen/PowerPC/v16i8_scalar_to_vector_shuffle.ll:659
 
 define <16 x i8> @test_v4i32_v16i8(i32 %arg, i8 %arg1) {
 ; CHECK-LE-P8-LABEL: test_v4i32_v16i8:
----------------
The code for this one gets worse on big endian. Do we know why?


================
Comment at: llvm/test/CodeGen/PowerPC/v16i8_scalar_to_vector_shuffle.ll:1431
 
 define <16 x i8> @test_v8i16_v4i32(<8 x i16> %a, <4 x i32> %b, i16 %arg, i32 %arg1) {
 ; CHECK-LE-P8-LABEL: test_v8i16_v4i32:
----------------
The code for this one gets worse on big endian. Do we know why?


================
Comment at: llvm/test/CodeGen/PowerPC/v16i8_scalar_to_vector_shuffle.ll:1658
 
 define <16 x i8> @test_v4i32_v8i16(i32 %arg, i16 %arg1) {
 ; CHECK-LE-P8-LABEL: test_v4i32_v8i16:
----------------
The code for this one gets worse on big endian. Do we know why?

There are probably a number of other tests where the same thing happens. Can you please review what is happening in those as well? I'll stop adding further similar comments.


================
Comment at: llvm/test/CodeGen/PowerPC/v4i32_scalar_to_vector_shuffle.ll:123
 
 define void @test_v8i16_none(ptr %a) {
 ; CHECK-LE-P8-LABEL: test_v8i16_none:
----------------
The code generated for this one gets worse on all subtargets. Do we know why?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130487/new/

https://reviews.llvm.org/D130487


