[PATCH] D118376: [x86] try harder to scalarize a vector load with extracted integer op uses

Thu Jan 27 08:19:29 PST 2022

spatel created this revision.
spatel added reviewers: RKSimon, pengfei, craig.topper.
Herald added subscribers: steven.zhang, hiraditya, mcrosier.
spatel requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

extract_vec_elt (load X), C --> scalar load (X+C)

As noted in the comment, DAGCombiner has this fold -- and the code in this patch is adapted from `DAGCombiner::scalarizeExtractedVectorLoad()` -- but x86 should benefit even if the loaded vector has other uses as long as we apply some other x86-specific conditions. The motivating example from #50310 is shown in vec_int_to_fp.ll.
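
To illustrate the fold at the memory level, here is a minimal standalone C++ sketch, not code from the patch. It assumes a 4 x i32 vector and a constant in-range lane index; the names `V4i32`, `extractFromVectorLoad`, and `scalarLoadAtOffset` are hypothetical. The `X+C` in the summary is shorthand: in the sketch the lane index is scaled by the element size to get a byte offset.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 128-bit vector of four i32 lanes.
using V4i32 = std::array<std::int32_t, 4>;

// "extract_vec_elt (load X), C": load the whole vector, then pick lane C.
inline std::int32_t extractFromVectorLoad(const unsigned char *X,
                                          std::size_t C) {
  V4i32 Vec;
  std::memcpy(&Vec, X, sizeof(Vec));
  return Vec[C];
}

// "scalar load (X+C)": load only the requested lane, at a byte offset of
// C * sizeof(element) from the base address (assumption: C is in range).
inline std::int32_t scalarLoadAtOffset(const unsigned char *X,
                                       std::size_t C) {
  std::int32_t Elt;
  std::memcpy(&Elt, X + C * sizeof(Elt), sizeof(Elt));
  return Elt;
}
```

Both functions return the same value for any in-range lane; the second form touches only the bytes of the one element, which is why it can be cheaper when the extracted value feeds a scalar integer op.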

I'm still looking over the diffs, but they all seem like wins so far.

https://reviews.llvm.org/D118376

Files:
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/test/CodeGen/X86/2011-12-26-extractelement-duplicate-load.ll
  llvm/test/CodeGen/X86/avx512-cvt.ll
  llvm/test/CodeGen/X86/avx512-shuffles/partial_permute.ll
  llvm/test/CodeGen/X86/bitcast-vector-bool.ll
  llvm/test/CodeGen/X86/oddsubvector.ll
  llvm/test/CodeGen/X86/pr45378.ll
  llvm/test/CodeGen/X86/scalar_widen_div.ll
  llvm/test/CodeGen/X86/shrink_vmul.ll
  llvm/test/CodeGen/X86/vec_cast.ll
  llvm/test/CodeGen/X86/vec_int_to_fp.ll
  llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-6.ll
  llvm/test/CodeGen/X86/vector-shuffle-avx512.ll
