[PATCH] D29540: Scalarization overhead estimation in getIntrinsicInstrCost() improved

Mon Mar 13 00:40:09 PDT 2017

jonpa added inline comments.

================
Comment at: test/Transforms/LoopVectorize/AArch64/interleaved_cost.ll:173
 ; VF_2-NEXT:    Found an estimated cost of 0 for VF 2 For instruction: %tmp3 = load i64, i64* %tmp1, align 8
-; VF_2-NEXT:    Found an estimated cost of 10 for VF 2 For instruction: store i64 0, i64* %tmp0, align 8
-; VF_2-NEXT:    Found an estimated cost of 10 for VF 2 For instruction: store i64 0, i64* %tmp1, align 8
----------------
hfinkel wrote:
> Why do these change? (they're not intrinsics).
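getOperandsScalarizationOverhead() has been improved so that it no longer counts
extraction costs for constant operands. These stores write the constant 0, so
their scalarization cost drops even though they are not intrinsics.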

ARM: Each extract costs 2, so for VF 4 the drop from 40 to 32 makes sense
(4 extracts * 2 = 8), and the same reasoning holds for VF 8. The interleaved
cost was 40, and scalarizing the memop now costs 32.
(AArch64: same.)
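
To spell out the arithmetic, here is a minimal sketch of the idea, not the actual
LLVM implementation. The Operand struct, the operandsScalarizationOverhead()
helper and its SkipConstants flag are hypothetical names made up for
illustration, and the model assumes one extract per lane for each operand that
is counted.

```cpp
#include <vector>

// Hypothetical stand-in for an instruction operand; this is not an LLVM type.
struct Operand {
  bool IsConstant;
};

// Sketch of the operand part of the scalarization overhead: one extract per
// vector lane for every operand that is counted. ExtractCost is the assumed
// per-lane extract cost (2 in the ARM case discussed above).
// SkipConstants = false models the old behaviour, true models the new one.
unsigned operandsScalarizationOverhead(const std::vector<Operand> &Ops,
                                       unsigned VF, unsigned ExtractCost,
                                       bool SkipConstants) {
  unsigned Cost = 0;
  for (const Operand &Op : Ops) {
    if (SkipConstants && Op.IsConstant)
      continue; // A constant needs no extract from a vector register.
    Cost += VF * ExtractCost;
  }
  return Cost;
}
```

For a store of a constant through a non-constant pointer, i.e. operands
{constant, non-constant}, with VF 4 and an extract cost of 2, this sketch gives
16 when constants are counted and 8 when they are skipped. The difference of 8
matches the 40 -> 32 change; the remaining cost components are outside this
sketch.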

To me these changes look ok.

https://reviews.llvm.org/D29540