[PATCH] D25121: [AMDGPU] Promote uniform i16 bitreverse intrinsic to i32

Sun Oct 2 20:54:45 PDT 2016

arsenm added inline comments.

> AMDGPUCodeGenPrepare.cpp:288
> +  assert(I.getIntrinsicID() == Intrinsic::bitreverse && "I must be bitreverse");
> +  assert(isI16Ty(I.getType()) && "I must be 16 bits");
> +

This should be able to handle any bitwidth < 32. We will want this for the < i16 types that will want to be legalized to i16 as well

> AMDGPUCodeGenPrepare.cpp:295-298
> +  Value *ExtOp = nullptr;
> +  Value *ExtRes = nullptr;
> +  Value *LShrOp = nullptr;
> +  Value *TruncRes = nullptr;

You should just define these when declared instead of resetting them below

> AMDGPUCodeGenPrepare.cpp:430-431
> +
> +  if (I.getIntrinsicID() == Intrinsic::bitreverse)
> +    Changed |= visitBitreverseIntrinsicInst(I);
> +

This should be a switch since other intrinsics are likely, and it can just return the result of visitBitreverseIntrinsicInst directly

> amdgpu-codegenprepare-i16-to-i32.ll:433
>  
> +declare i16 @llvm.bitreverse.i16(i16)
> +; VI-LABEL: @bitreverse_i16(

Should have a test with smaller bitwidth

> amdgpu-codegenprepare-i16-to-i32.ll:434-439
> +; VI-LABEL: @bitreverse_i16(
> +; VI: %[[A_32:[0-9]+]] = zext i16 %a to i32
> +; VI: %[[R_32:[0-9]+]] = call i32 @llvm.bitreverse.i32(i32 %[[A_32]])
> +; VI: %[[S_32:[0-9]+]] = lshr i32 %[[R_32]], 16
> +; VI: %[[R_16:[0-9]+]] = trunc i32 %[[S_32]] to i16
> +; VI: ret i16 %[[R_16]]

There should be check lines for the non-VI case for the unprompted case

https://reviews.llvm.org/D25121