[llvm] [AMDGPU][True16][CodeGen] support v_mov_b16 and v_swap_b16 in true16 format (PR #102198)
Brox Chen via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 7 11:31:55 PDT 2024
================
@@ -1460,7 +1460,15 @@ bool SIFoldOperands::tryFoldFoldableCopy(
return false;
}
- MachineOperand &OpToFold = MI.getOperand(1);
+ MachineOperand *OpToFoldPtr;
+ if (MI.getOpcode() == AMDGPU::V_MOV_B16_t16_e64) {
----------------
broxigarchen wrote:
Hi Matt. v_mov_b32_e64 only has 1 source operand, so we don't need the filtering there.
V_MOV_B16_t16_e64 has three input operands, and I think it is the only instruction in the FoldableCopy list that has more than 1 operand:
`inoperandlist: src_modifers, src0, op_sel`
so it needs some special handling here.
I think we could move the filtering to isFoldableCopy, but we would still need the special handling to select the correct operand. What do you think?
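
To make the operand selection concrete, here is a minimal standalone sketch. It does not use the real LLVM API: `Opcode`, `Operand`, `Instr` and `selectOpToFold` are hypothetical stand-ins, and operand 0 is assumed to be the destination. Rejecting the fold when `src_modifiers` or `op_sel` is non-zero is also an assumption about what the filtering would do, not something taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not the LLVM MachineInstr API.
enum class Opcode { V_MOV_B32_e64, V_MOV_B16_t16_e64 };

struct Operand {
  std::string Name;
  long Imm;
};

struct Instr {
  Opcode Opc;
  std::vector<Operand> Operands; // Assumed layout: Operands[0] is the destination.
};

// Returns the source operand to fold, or nullptr when the copy is filtered out.
const Operand *selectOpToFold(const Instr &MI) {
  if (MI.Opc == Opcode::V_MOV_B16_t16_e64) {
    // Assumed layout: vdst, src_modifiers, src0, op_sel.
    assert(MI.Operands.size() == 4 && "unexpected operand count");
    // Assumption: a copy with modifiers or op_sel set is not a plain copy.
    if (MI.Operands[1].Imm != 0 || MI.Operands[3].Imm != 0)
      return nullptr;
    return &MI.Operands[2]; // src0 sits after src_modifiers.
  }
  // Single-source copies such as v_mov_b32_e64: vdst, src0.
  assert(MI.Operands.size() >= 2 && "copy needs a source operand");
  return &MI.Operands[1];
}
```

With this shape the filtering could live in either place; only the index choice (2 instead of 1) has to stay next to the operand access.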
https://github.com/llvm/llvm-project/pull/102198