[PATCH] D59710: [SLP] remove lower limit for forming reduction patterns

Wed Nov 6 10:18:27 PST 2019

spatel marked an inline comment as done.
spatel added inline comments.

================
Comment at: llvm/test/Transforms/SLPVectorizer/X86/hadd.ll:93
+; SLM-NEXT:    [[R01:%.*]] = shufflevector <2 x i64> [[BIN_RDX2]], <2 x i64> [[BIN_RDX]], <2 x i32> <i32 0, i32 2>
 ; SLM-NEXT:    ret <2 x i64> [[R01]]
 ;
----------------
spatel wrote:
> ABataev wrote:
> > RKSimon wrote:
> > > SLM has really poor v2i64 add costs - so I'm surprised this happened - we may need SLM special handling in getArithmeticReductionCost?
> > I think this is a cost model problem; maybe the SLM cost model is not aware of the very expensive v2i64 add cost?
> Taking a look... the debug output shows:
> 
> ```
> SLP: Calculating cost for tree of size 1.
> SLP: Adding cost -2 for bundle that starts with   %a0 = extractelement <2 x i64> %a, i32 0.
> SLP: Spill Cost = 0.
> SLP: Extract Cost = 0.
> SLP: Total Cost = -2.
> SLP: Adding cost 1 for reduction that starts with   %a0 = extractelement <2 x i64> %a, i32 0 (It is a splitting reduction)
> SLP: Vectorizing horizontal reduction at cost:-1. (HorRdx)
> 
> ```
@RKSimon improved the SLM costs with:
rGa091f7061068

So that will remove this test diff from this patch.

Based on the x86 asm, we actually do want to vectorize this example, but that's yet another cost model problem.
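
For anyone reading the debug output above, the arithmetic can be sketched as follows. This is a simplified illustration, not the code in SLPVectorizer.cpp: the names `kCostThreshold`, `totalCost`, `isProfitable`, and `replayDebugOutput` are hypothetical, and the sketch assumes the default threshold of 0 and that a transform is accepted only when its cost is strictly below the negated threshold.

```cpp
#include <cassert>

// Hypothetical stand-in for the SLP cost threshold; 0 is assumed here.
constexpr int kCostThreshold = 0;

// Combine the tree cost with the cost of the reduction itself, mirroring
// the "Total Cost" and "Adding cost ... for reduction" lines in the log.
constexpr int totalCost(int treeCost, int reductionCost) {
  return treeCost + reductionCost;
}

// Assumed profitability rule: vectorize only when the combined cost is
// strictly below the negated threshold (negative cost means a win).
constexpr bool isProfitable(int cost, int threshold = kCostThreshold) {
  return cost < -threshold;
}

// Replay the numbers from the SLM debug output: tree cost -2, reduction
// cost +1, combined cost -1, so the reduction is considered profitable.
inline void replayDebugOutput() {
  const int cost = totalCost(/*treeCost=*/-2, /*reductionCost=*/1);
  assert(cost == -1);
  assert(isProfitable(cost));
}
```

With these numbers the margin is a single cost unit, so a small increase in the SLM v2i64 add cost is enough to flip the decision, which is consistent with the test diff going away after the cost model update.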

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59710/new/

https://reviews.llvm.org/D59710