[PATCH] D117875: [AMDGPU][InstCombine] Use D16 if only f16 precision is needed

Sebastian Neubauer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 24 05:45:32 PST 2022


sebastian-ne abandoned this revision.
sebastian-ne added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp:90-91
   if (match(&V, m_FPExt(PatternMatch::m_Value(CastSrc))) ||
       match(&V, m_SExt(PatternMatch::m_Value(CastSrc))) ||
       match(&V, m_ZExt(PatternMatch::m_Value(CastSrc)))) {
+    Type *CastSrcTy = CastSrc->getType()->getScalarType();
----------------
foad wrote:
> I know you haven't changed this but it seems dangerous: for a d16 input treated as unsigned you should not match sext here, and for a d16 input treated as signed you should not match zext. Or do all d16 integer inputs ignore the high bits?
You’re right, this seems to be incorrect. Good catch.
As far as I understand, the format field in the image descriptor decides whether float or int conversion is used, so the compiler can’t combine d16 at all.
For A16, it’s a uint for instructions without a sampler or a float for instructions with a sampler. I’ll fix that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117875/new/

https://reviews.llvm.org/D117875
