[PATCH] D125712: [SLP][X86] Improve reordering to consider alternate instruction bundles

Wed May 25 09:24:29 PDT 2022

vporpo added inline comments.

================
Comment at: llvm/test/Transforms/SLPVectorizer/X86/reorder_with_external_users.ll:121-123
+; CHECK-NEXT:    [[TMP2:%.*]] = fsub <2 x double> [[TMP1]], <double 1.100000e+00, double 1.200000e+00>
+; CHECK-NEXT:    [[TMP3:%.*]] = fadd <2 x double> [[TMP1]], <double 1.100000e+00, double 1.200000e+00>
+; CHECK-NEXT:    [[TMP4:%.*]] = shufflevector <2 x double> [[TMP2]], <2 x double> [[TMP3]], <2 x i32> <i32 0, i32 3>
----------------
ABataev wrote:
> I don't quite understand what's the difference here. Could you explain, please?
Before this patch, the pattern `shuffle + fadd + fsub` lowers to 3 instructions: blend + vector add + vector sub. The shuffle selects TMP3[0], TMP2[1], which is fadd[0], fsub[1], the inverse of the addsub pattern.

With this patch, `shuffle + fadd + fsub` lowers to a single addsub instruction: the shuffle mask `<0, 3>` selects TMP2[0], TMP3[1], which is fsub[0], fadd[1].
This saves 2 instructions, which means that during reordering we should keep track of this pattern, since reordering it can increase the overhead.
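
To make the lane selection concrete, here is a small standalone sketch. It is not SLP vectorizer code, and the names `Lane`, `selectLanes` and `isAddSubPattern` are hypothetical. It assumes the operand order from the CHECK lines above (first shuffle operand is the fsub result, second is the fadd result) and models a 2-lane shuffle mask, where indices 0..1 pick from the first operand and 2..3 from the second.

```cpp
#include <array>
#include <cassert>

// Which vector operation a selected lane comes from.
enum class Op { FSub, FAdd };

// One lane of the shuffle result: the source operation and the lane index
// inside that operation's result vector.
struct Lane {
  Op op;
  int index;
};

// Hypothetical helper: decode a 2-lane shuffle mask.
// Assumption: operand 0 is the fsub result, operand 1 is the fadd result,
// so mask values 0..1 refer to fsub lanes and 2..3 refer to fadd lanes.
std::array<Lane, 2> selectLanes(const std::array<int, 2> &mask) {
  std::array<Lane, 2> out{};
  for (int i = 0; i < 2; ++i) {
    out[i] = mask[i] < 2 ? Lane{Op::FSub, mask[i]}
                         : Lane{Op::FAdd, mask[i] - 2};
  }
  return out;
}

// The addsub form subtracts in lane 0 and adds in lane 1, each lane taken
// from the same position in its source vector.
bool isAddSubPattern(const std::array<Lane, 2> &lanes) {
  return lanes[0].op == Op::FSub && lanes[0].index == 0 &&
         lanes[1].op == Op::FAdd && lanes[1].index == 1;
}
```

Under these assumptions the mask `<0, 3>` from the CHECK line decodes to fsub[0], fadd[1] and matches. The mask `<2, 1>`, used here only as a hypothetical encoding of the inverse selection fadd[0], fsub[1] with the same operand order, does not match.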

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125712/new/

https://reviews.llvm.org/D125712