[PATCH] D26556: [InstCombine] don't widen most selects by hoisting an extend

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 12 12:14:46 PST 2016


spatel added a comment.

In https://reviews.llvm.org/D26556#620167, @efriedma wrote:

> I'm not sure this approach is really right... narrower isn't always better.  You're just getting lucky with your AVX2 example: `<4 x i1>` as an argument happens to get passed as a 128-bit vector.  If the compare were a 64-bit comparison, you'd be making the code worse; consider:
>
>   define <4 x i64> @f(<4 x i32> %a, <4 x i64> %b, <4 x i64> %c) {
>     %cmp = icmp sgt <4 x i64> %b, %c
>     %ext = sext <4 x i32> %a to <4 x i64>
>     %sel = select <4 x i1> %cmp, <4 x i64> %ext, <4 x i64> <i64 42, i64 42, i64 42, i64 42>
>     ret <4 x i64> %sel
>   }
>   
>
> This is perfectly straightforward code of the sort you could write using intrinsics in C.  Move the sext, and the generated code becomes worse.


Would you say this is a backend pattern-matching hole, or do you object to the IR transform itself? I.e., if we can fix the backend, would this be a valid patch? My view is that a narrower op in IR is better for value tracking and can be thought of as a strength-reduction optimization, so this is the theoretically correct approach for the IR... but as always, let me know if I'm off in the weeds. :)
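
To make the value-tracking argument concrete, here is a minimal scalar sketch (the function names and constants are made up for illustration, not taken from the patch): with the narrow form, known-bits analysis of the zext immediately sees that the upper 24 bits of the result are zero; with the widened form, the same fact has to be recovered by recursing into both arms of the i32 select, where analysis depth limits can get in the way.

  define i32 @narrow_form(i1 %cmp, i8 %a) {
    ; select stays in i8; the zext afterwards makes the top 24 bits
    ; of the result trivially known to be zero
    %sel = select i1 %cmp, i8 %a, i8 42
    %ext = zext i8 %sel to i32
    ret i32 %ext
  }

  define i32 @wide_form(i1 %cmp, i8 %a) {
    ; hoisted extend: proving the same known bits now requires
    ; looking through both arms of the i32 select
    %ext = zext i8 %a to i32
    %sel = select i1 %cmp, i32 %ext, i32 42
    ret i32 %sel
  }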

For reference, if we're looking at AVX2 codegen, we have this:

  define <4 x i64> @f(<4 x i32> %a, <4 x i64> %b, <4 x i64> %c) {
    %cmp = icmp sgt <4 x i64> %b, %c
    %ext = sext <4 x i32> %a to <4 x i64>
    %sel = select <4 x i1> %cmp, <4 x i64> %ext, <4 x i64> <i64 42, i64 42, i64 42, i64 42>
    ret <4 x i64> %sel
  }
  
  define <4 x i64> @g(<4 x i32> %a, <4 x i64> %b, <4 x i64> %c) {
    %cmp = icmp sgt <4 x i64> %b, %c
    %sel = select <4 x i1> %cmp, <4 x i32> %a, <4 x i32> <i32 42, i32 42, i32 42, i32 42>
    %ext = sext <4 x i32> %sel to <4 x i64>
    ret <4 x i64> %ext
  }

Which becomes:

  _f:                                     ## @f
  vpcmpgtq	%ymm2, %ymm1, %ymm1
  vpmovsxdq	%xmm0, %ymm0
  vbroadcastsd	LCPI0_0(%rip), %ymm2
  vblendvpd	%ymm1, %ymm0, %ymm2, %ymm0
  retq
  _g:                                     ## @g
  vpcmpgtq	%ymm2, %ymm1, %ymm1
  vextracti128	$1, %ymm1, %xmm2
  vpacksswb	%xmm2, %xmm1, %xmm1
  vbroadcastss	LCPI1_0(%rip), %xmm2
  vblendvps	%xmm1, %xmm0, %xmm2, %xmm0
  vpmovsxdq	%xmm0, %ymm0
  retq


https://reviews.llvm.org/D26556