[PATCH] D51925: [AMDGPU] Fix issue for zext of f16 to i32

Tue Sep 11 06:43:27 PDT 2018

dstuttard added a comment.

Looking again at the code - you're correct that it attempts to only do this transformation if the high bits are zero.
However, the code that checks this has the following telling comment:

  // (i32 zext (i16 (bitcast f16:$src))) -> fp16_zext $src
  // FIXME: It is not universally true that the high bits are zeroed on gfx9.
  if (Src.getOpcode() == ISD::BITCAST) {
    SDValue BCSrc = Src.getOperand(0);
    if (BCSrc.getValueType() == MVT::f16 &&
        fp16SrcZerosHighBits(BCSrc.getOpcode()))
      return DCI.DAG.getNode(AMDGPUISD::FP16_ZEXT, SDLoc(N), VT, BCSrc);
  }

In this particular case the BCSrc operation was an fptrunc which passes the fp16SrcZerosHighBits test - but that eventually ends up as v_mad_mixlo_f16 which doesn't ensure that the high bits are zero.

Any suggestions on how to proceed? I agree that it seems a shame to have to insert the extra AND operation blindly.

Repository:
  rL LLVM

https://reviews.llvm.org/D51925