[PATCH] D29540: Scalarization overhead estimation in getIntrinsicInstrCost() improved

Mon Mar 13 06:41:35 PDT 2017

hfinkel added inline comments.

================
Comment at: test/Transforms/LoopVectorize/AArch64/interleaved_cost.ll:173
 ; VF_2-NEXT:    Found an estimated cost of 0 for VF 2 For instruction: %tmp3 = load i64, i64* %tmp1, align 8
-; VF_2-NEXT:    Found an estimated cost of 10 for VF 2 For instruction: store i64 0, i64* %tmp0, align 8
-; VF_2-NEXT:    Found an estimated cost of 10 for VF 2 For instruction: store i64 0, i64* %tmp1, align 8
----------------
jonpa wrote:
> hfinkel wrote:
> > Why do these change? (they're not intrinsics).
> getOperandsScalarizationOverhead() has been improved so that it doesn't count extraction
> costs for constants.
> 
> ARM: Each extract costs 2, so for VF 4 the change 40 -> 32 makes sense,
> and likewise for VF 8. The interleaved cost was 40, and scalarizing the
> memop now costs 32. (AArch64: Same)
> 
> To me these changes look ok.
> 
> 
Okay, please proceed. However, please include this explanation of the test case changes in the commit message, so it is clear what's going on.
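
For readers following the arithmetic, here is a minimal sketch of the idea described above: operand extraction is charged per lane, and constant operands are skipped. This is a toy model, not the LLVM implementation. The `Operand` struct, the function name `operandsScalarizationOverhead`, and the `SkipConstants` flag are hypothetical names introduced only for illustration. With an extract cost of 2 and VF 4, skipping one constant operand removes 4 * 2 = 8, which would be consistent with the 40 -> 32 change quoted for ARM.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an instruction operand; not an LLVM type.
struct Operand {
  bool IsConstant;
};

// Toy model of operand scalarization overhead (an assumption, not LLVM code):
// every operand that must be pulled out of a vector costs one extract per
// lane. A constant is already available as a scalar, so when SkipConstants
// is set it contributes nothing.
unsigned operandsScalarizationOverhead(const std::vector<Operand> &Ops,
                                       unsigned VF, unsigned ExtractCost,
                                       bool SkipConstants) {
  assert(VF > 0 && "vectorization factor must be positive");
  unsigned Cost = 0;
  for (const Operand &Op : Ops) {
    if (SkipConstants && Op.IsConstant)
      continue; // no extraction needed for a constant operand
    Cost += VF * ExtractCost;
  }
  return Cost;
}
```

For a store such as `store i64 0, i64* %tmp0`, the stored value is a constant and the address is not, so in this model only the address operand is charged once constants are skipped.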

https://reviews.llvm.org/D29540