[PATCH] D20897: [AVX512/AVX][Intrinsics] Fix Variable Bit Shift Right Arithmetic intrinsic lowering.

Mon Jun 13 03:55:46 PDT 2016

igorb added a comment.

In http://reviews.llvm.org/D20897#453507, @RKSimon wrote:

> Except this isn't a correctness issue, its an optimization no? The code will run fine at -O0 or higher as it will lower to the vpsrav intrinsic which supports the out-of-range shift value and will give the correct result.

No, I believe it is a correctness issue. The vpsrav intrinsic lowering with a constant out-of-range shift value is incorrect.

For example:

  define <4 x i32> @test_x86_avx2_psrav_d_fold(<4 x i32> %a0, <4 x i32> %a1) {
    %res = call <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32> <i32 2, i32 9, i32 -12, i32 23>, <4 x i32> <i32 1, i32 18, i32 35, i32 52>)
    ret <4 x i32> %res
  }
  declare <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32>, <4 x i32>) nounwind readnone

Without the patch we get an incorrect result (only lane 0 is set, so lane 2 becomes 0 instead of 0xffffffff):

  movl    $1, %eax
  movd    %eax, %xmm0
  retq

With the patch:

  .LCPI0_0:
  .long   1                       # 0x1
  .long   0                       # 0x0
  .long   4294967295              # 0xffffffff
  .long   0                       # 0x0

  movaps  .LCPI0_0(%rip), %xmm0   # xmm0 = [1,0,4294967295,0]
  retq
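
To make the expected values easy to check by hand, here is a minimal sketch of the per-lane semantics as I understand them for the 32-bit variable arithmetic shift right: the count is treated as unsigned, and any count above 31 fills the lane with the sign bit. The helper name psrav_d is only for illustration and is not part of the patch.

```python
def psrav_d(values, counts):
    """Illustrative model of a per-lane 32-bit variable arithmetic shift right."""
    result = []
    for value, count in zip(values, counts):
        # Normalize the lane to a signed 32-bit integer.
        value &= 0xFFFFFFFF
        if value >= 0x80000000:
            value -= 0x100000000
        # The count is unsigned; counts above 31 behave like a shift by 31,
        # which leaves only copies of the sign bit.
        shift = min(count & 0xFFFFFFFF, 31)
        # Python's >> on a negative int is an arithmetic shift.
        result.append((value >> shift) & 0xFFFFFFFF)
    return result


if __name__ == "__main__":
    # Same constants as in the IR example above.
    print(psrav_d([2, 9, -12, 23], [1, 18, 35, 52]))
```

For the constants in the test this yields [1, 0, 4294967295, 0], which matches the constant pool emitted with the patch. Lane 2 is the interesting one: -12 shifted by 35 is out of range, so the result is all sign bits.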

Repository:
  rL LLVM

http://reviews.llvm.org/D20897