[PATCH] D118376: [x86] try harder to scalarize a vector load with extracted integer op uses

Thu Jan 27 12:57:50 PST 2022

spatel marked an inline comment as done.
spatel added inline comments.

================
Comment at: llvm/test/CodeGen/X86/avx512-shuffles/partial_permute.ll:1725
+; CHECK-NEXT:    vpinsrd $3, 8(%rdi), %xmm0, %xmm0
 ; CHECK-NEXT:    retq
   %vec = load <16 x i32>, <16 x i32>* %vp
----------------
RKSimon wrote:
> yuck
Yes, this diff suggests we should limit the fold a bit more. In this case, the extracts are immediately converted back to a vector via scalar_to_vector or insert_vector_elt, and the sequence doesn't become a shuffle until later.

I suspect there's no single right answer about where to draw the line, but this case is easy to avoid: we can just keep a list of bailout user opcodes (in this first draft, the only one was ISD::STORE).
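
As a rough illustration of the bailout-list idea, here is a standalone sketch rather than the actual patch code. The `Opcode` enum is a hypothetical stand-in for the ISD opcodes, the helper names `isBailoutUser` and `shouldScalarizeLoad` are invented for this example, and putting scalar_to_vector and insert_vector_elt on the list next to the store is an assumption based on the regression above, not something the patch is confirmed to do.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Hypothetical stand-ins for a handful of ISD opcodes (not LLVM's real enum).
enum class Opcode { Store, ScalarToVector, InsertVectorElt, Add, And };

// Users for which scalarizing the vector load is assumed to be unprofitable.
// Only the store was in the first draft; the other two are assumptions.
constexpr std::array<Opcode, 3> BailoutUserOpcodes = {
    Opcode::Store, Opcode::ScalarToVector, Opcode::InsertVectorElt};

// True if a user with this opcode should block the fold.
bool isBailoutUser(Opcode Op) {
  return std::find(BailoutUserOpcodes.begin(), BailoutUserOpcodes.end(), Op) !=
         BailoutUserOpcodes.end();
}

// Allow the fold only if no user of the extracted elements is on the list.
bool shouldScalarizeLoad(const std::vector<Opcode> &ExtractUsers) {
  return std::none_of(ExtractUsers.begin(), ExtractUsers.end(), isBailoutUser);
}
```

With this shape, moving the line later only means editing the list, not the fold logic.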

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118376/new/

https://reviews.llvm.org/D118376