[PATCH] D125712: [SLP][X86] Improve reordering to consider alternate instruction bundles
Alexey Bataev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 25 09:46:40 PDT 2022
ABataev added inline comments.
================
Comment at: llvm/test/Transforms/SLPVectorizer/X86/reorder_with_external_users.ll:121-123
+; CHECK-NEXT: [[TMP2:%.*]] = fsub <2 x double> [[TMP1]], <double 1.100000e+00, double 1.200000e+00>
+; CHECK-NEXT: [[TMP3:%.*]] = fadd <2 x double> [[TMP1]], <double 1.100000e+00, double 1.200000e+00>
+; CHECK-NEXT: [[TMP4:%.*]] = shufflevector <2 x double> [[TMP2]], <2 x double> [[TMP3]], <2 x i32> <i32 0, i32 3>
----------------
vporpo wrote:
> ABataev wrote:
> > I don't quite understand what the difference is here. Could you explain, please?
> Before this patch the pattern `shuffle + fadd + fsub` lowers to 3 instructions: blend + vector add + vector sub (the shuffle selects TMP3[0], TMP2[1], which is fadd[0], fsub[1], the inverse of the addsub pattern).
>
> With this patch `shuffle + fadd + fsub` lowers to a single addsub instruction (the shuffle selects TMP2[0], TMP3[1], which is fsub[0], fadd[1]).
> This saves 2 instructions, so reordering should keep track of this pattern: reordering it can increase the overhead.
OK, why not fix this in the backend? The new function you added does not affect the cost, but ignoring the shuffle actually increases the cost of the tree.
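
For reference, the lane selection discussed above can be sketched in standalone C++. This is only an illustration of the check being described, not code from the patch or from LLVM: `LaneOp` and `isAddSubBlend` are hypothetical names, and the sketch assumes both shuffle sources have as many lanes as the mask.

```cpp
#include <vector>

// Hypothetical illustration only; none of these names exist in LLVM.
enum class LaneOp { FSub, FAdd };

// Returns true when a two-source shuffle keeps every lane in place (a blend)
// and takes even lanes from an fsub and odd lanes from an fadd, which is the
// lane layout of the x86 addsub instructions.
// Mask indices follow shufflevector semantics: for N-lane sources, index
// i < N reads lane i of the first source and index i >= N reads lane i - N
// of the second source. Both sources are assumed to have mask.size() lanes.
bool isAddSubBlend(const std::vector<int> &mask, LaneOp first, LaneOp second) {
  const int n = static_cast<int>(mask.size());
  if (n == 0 || n % 2 != 0)
    return false;
  for (int lane = 0; lane < n; ++lane) {
    const int idx = mask[lane];
    if (idx < 0 || idx >= 2 * n)
      return false; // out-of-range index: give up
    if (idx % n != lane)
      return false; // the lane moved, so this is not a plain blend
    const LaneOp op = (idx < n) ? first : second;
    const LaneOp expected = (lane % 2 == 0) ? LaneOp::FSub : LaneOp::FAdd;
    if (op != expected)
      return false;
  }
  return true;
}
```

With TMP2 (fsub) as the first source and TMP3 (fadd) as the second, the mask `<0, 3>` from the CHECK lines passes this check. A mask of `<2, 1>`, which is an assumed encoding of the fadd[0], fsub[1] order described for the pre-patch output, does not.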
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D125712/new/
https://reviews.llvm.org/D125712