[llvm] [AMDGPU] Eliminate unnecessary packing in wider f16 vectors for sdwa/opsel-able instruction (PR #137137)

Mon Dec 8 22:31:33 PST 2025

================
@@ -207,6 +227,52 @@ class SDWADstPreserveOperand : public SDWADstOperand {
 #endif
 };
 
+class SDWAFP16ChainOperand : public SDWAOperand {
+private:
+  SIPeepholeSDWA &Parent;
+  FP16PackCandidate Candidate;
----------------
vg0204 wrote:

Currently, this handles only the packed fp16 data type (see isSrcDestFP16Bits(); the pack instruction's input type is also fp16, as in V_PACK_B32_F16).

https://github.com/llvm/llvm-project/pull/137137