[PATCH] D65538: [X86] Add DAG combine to fold any_extend_vector_inreg+truncstore to an extractelement+store

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 31 13:55:51 PDT 2019


craig.topper created this revision.
craig.topper added reviewers: spatel, RKSimon.
Herald added subscribers: llvm-commits, hiraditya.
Herald added a project: LLVM.

We have custom code that ignores the normal promoting type legalization on less than 128-bit vector types like v4i8 to emit pavgb, paddusb, psubusb since we don't have the equivalent instruction on a larger element type like v4i32. If this operation appears before a store, we can be left with an any_extend_vector_inreg followed by a truncstore after type legalization. When truncstore isn't legal, this will normally be decomposed into shuffles and a non-truncating store. This will then combine away the any_extend_vector_inreg and shuffle leaving just the store. On avx512, truncstore is legal so we don't decompose it and we had no combines to fix it.

This patch adds a new DAG combine to detect this case and emit either an extract_store for 64-bit stoers or a extractelement+store for 32 and 16 bit stores. This makes the avx512 codegen match the avx2 codegen for these situations. I'm restricting to only when -x86-experimental-vector-widening-legalization is false. When we're widening we're not likely to create this any_extend_inreg+truncstore combination. This means we should be able to remove this code when we flip the default. I would like to flip the default soon, but I need to investigate some performance regressions its causing in our branch that I wasn't seeing on trunk.


https://reviews.llvm.org/D65538

Files:
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/test/CodeGen/X86/f16c-intrinsics.ll
  llvm/test/CodeGen/X86/paddus.ll
  llvm/test/CodeGen/X86/psubus.ll
  llvm/test/CodeGen/X86/sadd_sat_vec.ll
  llvm/test/CodeGen/X86/ssub_sat_vec.ll
  llvm/test/CodeGen/X86/uadd_sat_vec.ll
  llvm/test/CodeGen/X86/usub_sat_vec.ll
  llvm/test/CodeGen/X86/vector-half-conversions.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D65538.212659.patch
Type: text/x-patch
Size: 51722 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190731/7b5bcf29/attachment.bin>


More information about the llvm-commits mailing list