[llvm] [AMDGPU] Support D16 folding for image.sample with multiple extractelement and fptrunc users (PR #141758)
Harrison Hao via llvm-commits
llvm-commits at lists.llvm.org
Wed May 28 21:04:31 PDT 2025
================
@@ -269,6 +269,68 @@ simplifyAMDGCNImageIntrinsic(const GCNSubtarget *ST,
ArgTys[0] = User->getType();
});
}
+ } else {
----------------
harrisonGPU wrote:
I don't think this makes sense. If an image sample returns a vector and its only user is an `fptrunc` from a float vector to a half vector, we should not fall back to the "old code" path. For example:
```llvm
%38 = call reassoc nnan nsz arcp contract afn <4 x float> @llvm.amdgcn.image.sample.l.2d.v4f32.f32.v8i32.v4i32(i32 15, float %36, float %37, float 0.000000e+00, <8 x i32> %34, <4 x i32> %35, i1 false, i32 0, i32 0)
%39 = fptrunc <4 x float> %38 to <4 x half>
```
In this case, the image sample result is a vector that is truncated directly to a half vector. That fits neither the new `extractelement` + `fptrunc` pattern nor the old code path, which expects scalar values. So checking `isVectorTy()` alone is not sufficient to distinguish the cases.
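
To make the distinction concrete, here is a minimal standalone sketch of classifying by how the result is used rather than by its type alone. It is not the code in this PR and does not use LLVM's API: `UserKind`, `Pattern`, `SampleResult`, and `classify` are hypothetical names, and `IsVector` merely stands in for the `isVectorTy()` check.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified model of how each user consumes the sample result.
enum class UserKind {
  FPTruncToHalf,      // fptrunc applied directly to the result
  ExtractThenFPTrunc, // extractelement whose value feeds an fptrunc
  Other               // anything else
};

// Hypothetical names for the shapes discussed in this thread.
enum class Pattern { DirectFPTrunc, ExtractThenFPTrunc, None };

struct SampleResult {
  bool IsVector;               // stands in for isVectorTy() on the result type
  std::vector<UserKind> Users; // one entry per user of the result
};

Pattern classify(const SampleResult &R) {
  auto IsExtract = [](UserKind K) {
    return K == UserKind::ExtractThenFPTrunc;
  };

  // Sanity check for this model: extractelement only applies to vectors.
  bool ScalarHasExtract =
      !R.IsVector && std::any_of(R.Users.begin(), R.Users.end(), IsExtract);
  assert(!ScalarHasExtract);

  // A single fptrunc user, whether the result is a scalar or a vector.
  if (R.Users.size() == 1 && R.Users.front() == UserKind::FPTruncToHalf)
    return Pattern::DirectFPTrunc;

  // A vector whose users are all extractelement + fptrunc chains.
  if (R.IsVector && !R.Users.empty() &&
      std::all_of(R.Users.begin(), R.Users.end(), IsExtract))
    return Pattern::ExtractThenFPTrunc;

  return Pattern::None;
}
```

In this model, the example above (`IsVector` true with a single `FPTruncToHalf` user) and a vector with only `ExtractThenFPTrunc` users both have a vector type, yet they classify differently, so the branch has to look at the users.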
https://github.com/llvm/llvm-project/pull/141758