[llvm] [AMDGPU][True16][CodeGen] fix a predicate bug in VGPRImm with f16/bf16 (PR #144942)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 19 11:52:19 PDT 2025
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-backend-amdgpu
Author: Brox Chen (broxigarchen)
<details>
<summary>Changes</summary>
Fixed a typo where the f16/bf16 VGPRImm pattern was not guarded by the True16Predicate scope: the closing curly brace of that scope was misplaced, so the `foreach vt = [f16, bf16]` patterns ended up outside it.
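To illustrate the scoping issue, here is a minimal, self-contained TableGen sketch. It is not the real AMDGPU code: `Pred`, `Pat`, `AlwaysTrue`, `NotHasTrue16` and `UseFakeTrue16` are hypothetical stand-ins, and the enclosing `let True16Predicate = pred in { ... }` is assumed from the description above, since the diff does not show that line. The point is only that a `def` picks up a `let` override when it sits inside the `let` braces and keeps the class default otherwise.

``````````tablegen
// Hypothetical stand-ins; these are NOT the real AMDGPU classes.
class Pred<string n> { string PredName = n; }
def AlwaysTrue    : Pred<"AlwaysTrue">;
def NotHasTrue16  : Pred<"NotHasTrue16BitInsts">;
def UseFakeTrue16 : Pred<"UseFakeTrue16Insts">;

class Pat<string d> {
  string Desc = d;
  Pred True16Predicate = AlwaysTrue; // default when no `let` applies
}

// Before the fix (sketch): the `let` closes right after the i16 pattern,
// so the f16/bf16 patterns keep the default predicate.
foreach pred = [NotHasTrue16, UseFakeTrue16] in {
  let True16Predicate = pred in {
    def : Pat<"i16 imm (guarded)">;
  }
  foreach vt = ["f16", "bf16"] in {
    def : Pat<!strconcat(vt, " fpimm (unguarded)")>;
  }
}

// After the fix (sketch): the inner foreach sits inside the `let`,
// so every pattern carries True16Predicate = pred.
foreach pred = [NotHasTrue16, UseFakeTrue16] in {
  let True16Predicate = pred in {
    def : Pat<"i16 imm (guarded)">;
    foreach vt = ["f16", "bf16"] in {
      def : Pat<!strconcat(vt, " fpimm (guarded)")>;
    }
  }
}
``````````

Dumping the records from a sketch like this with `llvm-tblgen` should show the difference in the `True16Predicate` field of the anonymous records.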
---
Full diff: https://github.com/llvm/llvm-project/pull/144942.diff
1 File Affected:
- (modified) llvm/lib/Target/AMDGPU/SIInstructions.td (+9-9)
``````````diff
diff --git a/llvm/lib/Target/AMDGPU/SIInstructions.td b/llvm/lib/Target/AMDGPU/SIInstructions.td
index 56b15c11a6694..d852d09b556d1 100644
--- a/llvm/lib/Target/AMDGPU/SIInstructions.td
+++ b/llvm/lib/Target/AMDGPU/SIInstructions.td
@@ -2187,17 +2187,17 @@ foreach pred = [NotHasTrue16BitInsts, UseFakeTrue16Insts] in {
       (VGPRImm<(i16 imm)>:$imm),
       (V_MOV_B32_e32 imm:$imm)
     >;
-  }
-  // FIXME: Workaround for ordering issue with peephole optimizer where
-  // a register class copy interferes with immediate folding. Should
-  // use s_mov_b32, which can be shrunk to s_movk_i32
+    // FIXME: Workaround for ordering issue with peephole optimizer where
+    // a register class copy interferes with immediate folding. Should
+    // use s_mov_b32, which can be shrunk to s_movk_i32
-  foreach vt = [f16, bf16] in {
-    def : GCNPat <
-      (VGPRImm<(vt fpimm)>:$imm),
-      (V_MOV_B32_e32 (vt (bitcast_fpimm_to_i32 $imm)))
-    >;
+    foreach vt = [f16, bf16] in {
+      def : GCNPat <
+        (VGPRImm<(vt fpimm)>:$imm),
+        (V_MOV_B32_e32 (vt (bitcast_fpimm_to_i32 $imm)))
+      >;
+    }
   }
 }
``````````
</details>
https://github.com/llvm/llvm-project/pull/144942