[PATCH] D37737: [SLPVectorizer] Merge subsequent gather loads.

Tue Sep 12 05:23:33 PDT 2017

fhahn created this revision.
Herald added subscribers: kristof.beyls, javed.absar, rengolin, aemerson.

This patch updates SLPVectorizer to try to combine subsequent scalar gather
loads into vector loads. I think this change makes the IR simpler
(after instcombine is run); it replaces a chain of insertelement
instructions with a shufflevector instruction that uses the result of the
vector load.
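
To make the shape of the rewrite concrete, here is a minimal C++ model of
the two forms. It is only a sketch, not the IR from the patch: the function
names, the 2-element load width and the `{0, 0, 1, 1}` shuffle mask are
assumptions chosen for illustration.

```cpp
#include <array>
#include <cstddef>

using Vec2 = std::array<float, 2>;
using Vec4 = std::array<float, 4>;

// Before (illustrative): two scalar loads feed a chain of
// insertelement-style writes that build the vector lane by lane.
Vec4 gatherWithInserts(const float *p) {
  float s0 = p[0]; // scalar load
  float s1 = p[1]; // scalar load from the adjacent address
  Vec4 v{};
  v[0] = s0; // insertelement, lane 0
  v[1] = s0; // insertelement, lane 1
  v[2] = s1; // insertelement, lane 2
  v[3] = s1; // insertelement, lane 3
  return v;
}

// After (illustrative): one vector load followed by a single shuffle.
// The mask below is an assumed example, not taken from the patch.
Vec4 loadThenShuffle(const float *p) {
  Vec2 loaded{p[0], p[1]}; // models a single <2 x float> load
  constexpr std::array<std::size_t, 4> mask{0, 0, 1, 1};
  Vec4 v{};
  for (std::size_t i = 0; i < mask.size(); ++i)
    v[i] = loaded[mask[i]]; // shufflevector picks lanes by mask
  return v;
}
```

Both functions produce the same lanes for any two adjacent floats; the
second form needs one memory access instead of two.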

The specific case I want to optimize is function `test1` in
`/test/Transforms/SLPVectorizer/AArch64/merge-gather-loads.ll`. Code
like that is generated for some SGEMM kernels.
Combining the scalar loads into a vector load is beneficial in this case,
as the users of the scalar values (mul) support indexed vector
operands on AArch64 and there is no need to duplicate the loaded scalar
values in separate vector registers. For instructions that do not
support indexed vector operands (like add in `test_add`), this makes
things worse, as we have to do a vector load + 2 dups.

In addition, on architectures with complex instruction sets (e.g. X86)
this could also make things worse if the users of the scalar values
support scalar memory operands (e.g. the assembly generated for some
functions in `test/Transforms/SLPVectorizer/X86/operandorder.ll`
uses memory operands for some scalar values).

This is my first patch in this area and I am not sure how to address the
issues mentioned above properly. Whether vectorizing the loads is beneficial
depends on the vector instructions available on the architecture. Would
it be better to have this as part of a target-specific pass? There is a
LoadStoreVectorizer, which might serve as a base for that. Or should
backends report, as part of TargetTransformInfo, for which instructions
this transformation is beneficial?
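
As a rough illustration of the TargetTransformInfo option, the decision
could be modelled as below. This is a hypothetical sketch: `TargetModel`,
`userTakesIndexedOperand` and `shouldMergeGatherLoads` are made-up names,
not existing LLVM APIs, and the policy (merge only when every user accepts
an indexed vector operand and the target does not fold scalar memory
operands into users) is one conservative reading of the trade-offs
described above.

```cpp
#include <algorithm>
#include <vector>

enum class Opcode { Mul, Add };

// Hypothetical per-target description; none of these fields exist in
// LLVM's TargetTransformInfo.
struct TargetModel {
  bool mulTakesIndexedVectorOperand;  // e.g. a by-element multiply exists
  bool addTakesIndexedVectorOperand;  // e.g. a by-element add exists
  bool usersFoldScalarMemoryOperands; // users can read the scalar from memory
};

// Returns true if an instruction with this opcode can read a single lane
// of a vector register directly, so no dup of the scalar is needed.
bool userTakesIndexedOperand(const TargetModel &t, Opcode op) {
  switch (op) {
  case Opcode::Mul:
    return t.mulTakesIndexedVectorOperand;
  case Opcode::Add:
    return t.addTakesIndexedVectorOperand;
  }
  return false;
}

// Conservative policy (an assumption): merge the scalar loads only if it
// cannot cost extra dups or lose a folded memory operand.
bool shouldMergeGatherLoads(const TargetModel &t,
                            const std::vector<Opcode> &users) {
  if (users.empty() || t.usersFoldScalarMemoryOperands)
    return false;
  return std::all_of(users.begin(), users.end(), [&](Opcode op) {
    return userTakesIndexedOperand(t, op);
  });
}
```

With a model where only mul takes an indexed operand, this accepts the
`test1`-style mul users and rejects the `test_add`-style add users.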

https://reviews.llvm.org/D37737

Files:
  lib/Transforms/Vectorize/SLPVectorizer.cpp
  test/Transforms/SLPVectorizer/AArch64/merge-gather-loads.ll
  test/Transforms/SLPVectorizer/X86/jumbled-load-multiuse.ll
  test/Transforms/SLPVectorizer/X86/operandorder.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D37737.114803.patch
Type: text/x-patch
Size: 15369 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170912/153c41f8/attachment.bin>