[PATCH] D125712: [SLP][X86] Improve reordering to consider alternate instruction bundles

Wed May 25 09:46:40 PDT 2022

ABataev added inline comments.

================
Comment at: llvm/test/Transforms/SLPVectorizer/X86/reorder_with_external_users.ll:121-123
+; CHECK-NEXT:    [[TMP2:%.*]] = fsub <2 x double> [[TMP1]], <double 1.100000e+00, double 1.200000e+00>
+; CHECK-NEXT:    [[TMP3:%.*]] = fadd <2 x double> [[TMP1]], <double 1.100000e+00, double 1.200000e+00>
+; CHECK-NEXT:    [[TMP4:%.*]] = shufflevector <2 x double> [[TMP2]], <2 x double> [[TMP3]], <2 x i32> <i32 0, i32 3>
----------------
vporpo wrote:
> ABataev wrote:
> > I don't quite understand what the difference is here. Could you explain, please?
> Before this patch, the pattern `shuffle + fadd + fsub` lowers to 3 instructions: blend + vector add + vector sub (the shuffle selects TMP3[0], TMP2[1], which is fadd[0], fsub[1], the inverse of the addsub pattern).
> 
> With this patch, `shuffle + fadd + fsub` lowers to a single addsub instruction (the shuffle selects TMP2[0], TMP3[1], which is fsub[0], fadd[1]).
> This saves 2 instructions, which means that during reordering we should keep track of this pattern, since reordering it can increase the overhead.
OK, why not fix it in the backend? This new function you added does not affect the cost, but ignoring the shuffle actually increases the cost of the tree.
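
To make the lane layout under discussion concrete, here is a rough sketch of the mask check, not code from the patch. `isAddSubMask` is a hypothetical name, and the sketch assumes the shuffle operands are ordered (fsub, fadd) as in the CHECK lines above and that the mask has no undef lanes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of LLVM or of this patch.
// Returns true if `mask`, applied to shufflevector(sub, add), keeps every lane
// in place and takes even lanes from `sub` and odd lanes from `add`, which is
// the lane layout of the x86 addsub instructions.
// Assumption: the mask contains no undef (-1) entries.
bool isAddSubMask(const std::vector<int> &mask) {
  if (mask.empty())
    return false;
  const int numLanes = static_cast<int>(mask.size());
  for (int lane = 0; lane < numLanes; ++lane) {
    assert(mask[lane] >= 0 && mask[lane] < 2 * numLanes &&
           "mask index out of range");
    // Indices [0, N) address the first operand (sub),
    // indices [N, 2N) address the second operand (add).
    const int expected = (lane % 2 == 0) ? lane : numLanes + lane;
    if (mask[lane] != expected)
      return false;
  }
  return true;
}
```

With these assumptions, the mask `<0, 3>` from the new CHECK lines passes the check, while the inverse selection (fadd[0], fsub[1], which is `<2, 1>` in this operand order) does not.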

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125712/new/

https://reviews.llvm.org/D125712