[llvm] [AMDGPU] Optimize image sample followed by llvm.amdgcn.cvt.pkrtz into d16 variant (PR #145203)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Sun Jul 6 22:33:37 PDT 2025


================
@@ -247,6 +247,42 @@ simplifyAMDGCNImageIntrinsic(const GCNSubtarget *ST,
                                        ArgTys[0] = User->getType();
                                      });
         }
+
+        // Fold image.sample + cvt.pkrtz -> extractelement idx0 into a single
+        // d16 image sample.
+        // Pattern to match:
+        //   %sample = call float @llvm.amdgcn.image.sample...
+        //   %pack = call <2 x half> @llvm.amdgcn.cvt.pkrtz(float %sample,
+        //   float %any)
+        //   %low = extractelement <2 x half> %pack, i64 0
+        // Replacement:
+        //   call half @llvm.amdgcn.image.sample
----------------
arsenm wrote:

[LLVM has llvm.fptrunc.round](https://llvm.org/docs/LangRef.html#llvm-fptrunc-round-intrinsic)

https://github.com/llvm/llvm-project/pull/145203

