[PATCH] D51925: [AMDGPU] Fix issue for zext of f16 to i32
David Stuttard via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 11 06:43:27 PDT 2018
dstuttard added a comment.
Looking again at the code - you're correct that it only attempts this transformation if the high bits are zero.
However, the code that checks this has the following telling comment:
  // (i32 zext (i16 (bitcast f16:$src))) -> fp16_zext $src
  // FIXME: It is not universally true that the high bits are zeroed on gfx9.
  if (Src.getOpcode() == ISD::BITCAST) {
    SDValue BCSrc = Src.getOperand(0);
    if (BCSrc.getValueType() == MVT::f16 &&
        fp16SrcZerosHighBits(BCSrc.getOpcode()))
      return DCI.DAG.getNode(AMDGPUISD::FP16_ZEXT, SDLoc(N), VT, BCSrc);
  }
In this particular case the BCSrc operation was an fptrunc, which passes the fp16SrcZerosHighBits test. However, that fptrunc eventually ends up as v_mad_mixlo_f16, which doesn't ensure that the high bits are zero.
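To illustrate why this matters, here is a rough standalone sketch (plain C++, not LLVM code). It assumes an instruction that writes only the low 16 bits of a 32-bit register and leaves the high 16 bits untouched, which is how I understand v_mad_mixlo_f16 to behave. The names writeLow16 and zextWithAnd are hypothetical helpers made up for the example.

```cpp
#include <cstdint>

// Hypothetical model of a 32-bit register; none of this is LLVM API.

// Models an instruction that writes an f16 bit pattern into the low 16 bits
// of the destination and keeps the old high 16 bits (assumed behaviour for
// v_mad_mixlo_f16).
constexpr uint32_t writeLow16(uint32_t dst, uint16_t f16Bits) {
  return (dst & 0xFFFF0000u) | f16Bits;
}

// Explicit zero-extension: mask off the high 16 bits with an AND.
constexpr uint32_t zextWithAnd(uint32_t reg) {
  return reg & 0x0000FFFFu;
}

// Destination register that still holds stale data in its high half.
constexpr uint32_t staleReg = 0xDEAD0000u;
// Bit pattern of 1.0 in IEEE half precision.
constexpr uint16_t halfOne = 0x3C00;

constexpr uint32_t afterWrite = writeLow16(staleReg, halfOne);

// Treating the register as already zero-extended would give the wrong i32.
static_assert(afterWrite == 0xDEAD3C00u, "high bits survive the low write");
// The explicit AND recovers the intended zext result.
static_assert(zextWithAnd(afterWrite) == 0x00003C00u, "AND clears high bits");
```

In this model, folding the zext away is only safe when the producing instruction is known to clear the high half, so the AND is only redundant for those producers.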
Any suggestions on how to proceed? I agree that it seems a shame to have to insert the extra AND operation blindly.
Repository:
rL LLVM
https://reviews.llvm.org/D51925