[llvm] [AMDGPU] Eliminate unnecessary packing in wider f16 vectors for sdwa/opsel-able instruction (PR #137137)
Vikash Gupta via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 8 22:31:33 PST 2025
================
@@ -207,6 +227,52 @@ class SDWADstPreserveOperand : public SDWADstOperand {
#endif
};
+class SDWAFP16ChainOperand : public SDWAOperand {
+private:
+ SIPeepholeSDWA &Parent;
+ FP16PackCandidate Candidate;
----------------
vg0204 wrote:
Currently, this handles only the packed fp16 data type (see isSrcDestFP16Bits(); the packed instruction's input type is also fp16, as in V_PACK_B32_F16).
https://github.com/llvm/llvm-project/pull/137137