[llvm] [AMDGPU][True16] Support V_CEIL_F16. (PR #73108)

Fri Dec 1 05:55:45 PST 2023

================
@@ -7150,8 +7155,14 @@ void SIInstrInfo::moveToVALUImpl(SIInstrWorklist &Worklist,
     if (AMDGPU::getNamedOperandIdx(NewOpcode,
                                    AMDGPU::OpName::src0_modifiers) >= 0)
       NewInstr.addImm(0);
-    if (AMDGPU::getNamedOperandIdx(NewOpcode, AMDGPU::OpName::src0) >= 0)
-      NewInstr->addOperand(Inst.getOperand(1));
+    if (AMDGPU::hasNamedOperand(NewOpcode, AMDGPU::OpName::src0)) {
+      MachineOperand Src = Inst.getOperand(1);
+      if (AMDGPU::isTrue16Inst(NewOpcode) && ST.useRealTrue16Insts() &&
+          Src.isReg() && RI.isVGPR(MRI, Src.getReg()))
+        NewInstr.addReg(Src.getReg(), 0, AMDGPU::lo16);
----------------
kosarev wrote:

I don't quite see how that would make a difference. Adding a VGPR_16 operand would just create a COPY extracting that same subregister from that same super-register, so it looks like that would only mean more work folding the operand?
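
To illustrate the point, here is a rough toy model of the two alternatives. It is only a sketch: `Operand`, `Instr`, `buildDirect`, `buildViaCopy` and `foldCopies` are hypothetical stand-ins invented for this example, not LLVM classes or the actual folding logic, and the opcode string is just a label. Under those assumptions, folding the COPY form ends up with the same operand as using `lo16` on the 32-bit register directly.

```cpp
#include <string>
#include <vector>

// Toy model only: hypothetical stand-ins, not LLVM's MachineOperand/MachineInstr.
enum class SubReg { None, Lo16 };

struct Operand {
  int Reg;     // virtual register number
  SubReg Sub;  // subregister index applied to Reg
  bool operator==(const Operand &O) const {
    return Reg == O.Reg && Sub == O.Sub;
  }
};

struct Instr {
  std::string Opcode;
  int Def;      // register defined by this instruction
  Operand Src;  // single source operand
  bool operator==(const Instr &I) const {
    return Opcode == I.Opcode && Def == I.Def && Src == I.Src;
  }
};

// Alternative A: use the lo16 half of the 32-bit register directly.
std::vector<Instr> buildDirect(int Src32, int Dst) {
  return {Instr{"V_CEIL_F16_t16", Dst, Operand{Src32, SubReg::Lo16}}};
}

// Alternative B: COPY the lo16 half into a separate 16-bit register first.
std::vector<Instr> buildViaCopy(int Src32, int Tmp16, int Dst) {
  return {Instr{"COPY", Tmp16, Operand{Src32, SubReg::Lo16}},
          Instr{"V_CEIL_F16_t16", Dst, Operand{Tmp16, SubReg::None}}};
}

// Simplified folding: forward each COPY's source into later users of its
// def, then drop the COPY.
std::vector<Instr> foldCopies(const std::vector<Instr> &In) {
  std::vector<Instr> Copies, Out;
  for (const Instr &I : In) {
    if (I.Opcode == "COPY") {
      Copies.push_back(I);
      continue;
    }
    Instr New = I;
    for (const Instr &C : Copies)
      if (New.Src.Reg == C.Def && New.Src.Sub == SubReg::None)
        New.Src = C.Src;
    Out.push_back(New);
  }
  return Out;
}

// True when folding alternative B yields exactly alternative A.
bool foldedMatchesDirect() {
  return foldCopies(buildViaCopy(1, 2, 3)) == buildDirect(1, 3);
}
```

In this model the extra COPY buys nothing once it is folded; it only adds an instruction that has to be cleaned up later.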

https://github.com/llvm/llvm-project/pull/73108