[PATCH] D92789: [PPC] Check for PPC64 when emitting 64bit specific VSX nodes when pattern matching built vectors

Fri Dec 11 06:11:18 PST 2020

ZarkoCA marked 4 inline comments as done.
ZarkoCA added inline comments.

================
Comment at: llvm/lib/Target/PowerPC/PPCInstrVSX.td:2412
 // [HasVSX]
 // [HasVSX, IsBigEndian]
 // [HasVSX, IsLittleEndian]
----------------
Xiangling_L wrote:
> I am wondering why do we leave alone `[HasVSX, IsBigEndian]` and `[HasVSX, HasOnlySwappingMemOps, IsBigEndian]` and `HasVSX, HasP8Vector, NoP9Vector, IsBigEndian]`?
That's a good question, I tried disabling these so as to have minimal disruption to existing code. And did it by elimination.  

So, as far as I can see the potentially big problem in the pattern matching is later implementations like `[HasVSX, IsISA3_0, HasDirectMove, IsBigEndian]` were implemented without considering 32bit codegen and I added a check there and moved on to earlier ones. At some point, I started hitting test case failures and found that codegen in those cases was correct. And in those cases I found that there were checks added for 32bit like in this patch: https://reviews.llvm.org/D17711.  

In short, I tried to do the safe thing for 32bit VSX without major disruption to existing test cases and codegen unless I could plainly tell it wasn't correct. Which in those cases I wasn't able to. 

================
Comment at: llvm/test/CodeGen/PowerPC/ppc-32bit-build-vector.ll:67
+    %0 = insertelement <16 x i32> <i32 undef, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>, i32 %newelement, i32 0
+    %load = load <16 x i8>, <16 x i8>* undef, align 1
+    %1 = zext <16 x i8> %load to <16 x i32>
----------------
Xiangling_L wrote:
> Suggestion: The testcase can be further simplified into the following by removing tedious `zext` instruction and shortening the numbers of vector.
> ```
> define dso_local fastcc void @BuildVectorICE() unnamed_addr #0 {
>   entry:
>     br label %while.body
>     while.body:                                       ; preds = %while.body, %entry
>     %newelement = phi i32 [ 0, %entry ], [ %5, %while.body ]
>     %0 = insertelement <4 x i32> <i32 undef, i32 0, i32 0, i32 0>, i32 %newelement, i32 0
>     %1 = load <4 x i32>, <4 x i32>* undef, align 1
>     %2 = add <4 x i32> %1, %0
>     %3 = shufflevector <4 x i32> %2, <4 x i32> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
>     %4 = add <4 x i32> %2, %3
>     %5 = extractelement <4 x i32> %4, i32 0
>     br label %while.body
> }
> 
> ```
Thanks a lot! That simplifies it and I'm still able to get the compiler error without the patch.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92789/new/

https://reviews.llvm.org/D92789