[llvm] [AMDGPU] Support D16 folding for image.sample with multiple extractelement and fptrunc users (PR #141758)

Harrison Hao via llvm-commits llvm-commits at lists.llvm.org
Wed May 28 21:04:31 PDT 2025


================
@@ -269,6 +269,68 @@ simplifyAMDGCNImageIntrinsic(const GCNSubtarget *ST,
                                        ArgTys[0] = User->getType();
                                      });
         }
+      } else {
----------------
harrisonGPU wrote:

I don't think this makes sense. If we encounter an image sample that returns a vector and whose only user is an `fptrunc` from a float vector to a half vector, we should not fall back to the "old code" path. For example:
```llvm
  %38 = call reassoc nnan nsz arcp contract afn <4 x float> @llvm.amdgcn.image.sample.l.2d.v4f32.f32.v8i32.v4i32(i32 15, float %36, float %37, float 0.000000e+00, <8 x i32> %34, <4 x i32> %35, i1 false, i32 0, i32 0)
  %39 = fptrunc <4 x float> %38 to <4 x half>
```
In this case, the image sample result is a vector that is truncated directly to a half vector. That fits neither the new `extractelement` + `fptrunc` pattern nor the old code path, which expects scalar values. So checking `isVectorTy()` alone is not sufficient to distinguish the cases.
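
For contrast, here is a rough sketch of the shape I mean by the `extractelement` + `fptrunc` pattern, going by the PR title. The value names (`%v`, `%e0`, `%h0`, `%s`, `%t`, `%rsrc`, `%samp`) are made up for illustration and are not taken from the patch or its tests:
```llvm
  ; Hypothetical example: the vector result is never truncated as a whole.
  ; Each lane is extracted first, and each extracted scalar has its own fptrunc.
  %v = call <4 x float> @llvm.amdgcn.image.sample.l.2d.v4f32.f32.v8i32.v4i32(i32 15, float %s, float %t, float 0.000000e+00, <8 x i32> %rsrc, <4 x i32> %samp, i1 false, i32 0, i32 0)
  %e0 = extractelement <4 x float> %v, i64 0
  %h0 = fptrunc float %e0 to half
  %e1 = extractelement <4 x float> %v, i64 1
  %h1 = fptrunc float %e1 to half
```
Both this and the earlier example start from a `<4 x float>` result, so the return type alone cannot tell them apart; the users have to be inspected as well.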

https://github.com/llvm/llvm-project/pull/141758

