[PATCH] D55251: [X86] Enable -x86-experimental-vector-widening-legalization by default.

Tue Dec 4 01:12:25 PST 2018

chandlerc added a subscriber: asbirlea.
chandlerc added a comment.

Most of the test case changes make sense to me.

The places where we now emit many more instructions are cases where we need to zero-extend because the test uses odd vector types (`<4 x i8>`) that have no realistic model on X86. Not worrisome at all.
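
To make the `<4 x i8>` point concrete, here is a minimal sketch. The function is hypothetical, not one of the tests in this patch, and the comments describe the two legalization strategies as I understand them rather than actual output:

```
; Hypothetical reduced case, not taken from this patch.
; <4 x i8> is only 32 bits wide and is not a legal X86 vector type.
;   Promotion (old default): each element grows to i32, giving <4 x i32>.
;   Widening (this patch):   elements stay i8 and lanes are padded, giving <16 x i8>.
define <4 x i32> @zext_v4i8(<4 x i8> %a) {
  ; Under widening, the four i8 values stay packed in the low 32 bits of the
  ; register, so producing i32 lanes needs an explicit zero-extend.
  %r = zext <4 x i8> %a to <4 x i32>
  ret <4 x i32> %r
}
```

Running something like this through `llc -mtriple=x86_64-- -mattr=+avx` with `-x86-experimental-vector-widening-legalization` set each way should show the difference in the generated code.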

Some of the cost model increases are surprising to me; I've flagged them below.

Any benchmark data? We can try to get some with this flag flipped. @asbirlea might be able to get some good data for you with Halide, which has a tendency to stress-test these kinds of legalization issues because it generates large vectors and relies on legalization to shard them and lay them out into pipelinable vector ops.

================
Comment at: test/Analysis/CostModel/X86/fptoui.ll:296-297
 ; AVX-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %V4I8 = fptoui <4 x float> undef to <4 x i8>
-; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V8I8 = fptoui <8 x float> undef to <8 x i8>
-; AVX-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V16I8 = fptoui <16 x float> undef to <16 x i8>
+; AVX-NEXT:  Cost Model: Found an estimated cost of 24 for instruction: %V8I8 = fptoui <8 x float> undef to <8 x i8>
+; AVX-NEXT:  Cost Model: Found an estimated cost of 49 for instruction: %V16I8 = fptoui <16 x float> undef to <16 x i8>
 ; AVX-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
----------------
This seems... a bit surprising: the `<8 x i8>` cost goes from 1 to 24 and the `<16 x i8>` cost from 3 to 49.

================
Comment at: test/Analysis/CostModel/X86/reduce-add.ll:86
 ; AVX-LABEL: 'reduce_i32'
-; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2 = call i32 @llvm.experimental.vector.reduce.add.i32.v2i32(<2 x i32> undef)
+; AVX-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V2 = call i32 @llvm.experimental.vector.reduce.add.i32.v2i32(<2 x i32> undef)
 ; AVX-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i32 @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x i32> undef)
----------------
This also seems a bit surprising: the `<2 x i32>` reduction now costs 3, up from 1 and the same as the `<4 x i32>` one.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55251/new/

https://reviews.llvm.org/D55251