[PATCH] D107187: [amdgpu] Add an enhanced conversion from i64 to f32.
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 2 06:23:07 PDT 2021
arsenm added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/cvt_f32_ubyte.ll:1085
; SI-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; SI-NEXT: s_movk_i32 s6, 0xff
-; SI-NEXT: v_and_b32_e32 v0, s6, v0
-; SI-NEXT: v_add_i32_e32 v0, vcc, 0, v0
-; SI-NEXT: v_ffbh_u32_e32 v2, v0
-; SI-NEXT: v_addc_u32_e64 v1, s[4:5], 0, 0, vcc
-; SI-NEXT: v_add_i32_e32 v2, vcc, 32, v2
-; SI-NEXT: v_ffbh_u32_e32 v3, v1
-; SI-NEXT: v_cmp_eq_u32_e32 vcc, 0, v1
-; SI-NEXT: v_cndmask_b32_e32 v2, v3, v2, vcc
-; SI-NEXT: v_mov_b32_e32 v3, 0xbe
-; SI-NEXT: v_sub_i32_e32 v4, vcc, v3, v2
-; SI-NEXT: v_lshl_b64 v[2:3], v[0:1], v2
-; SI-NEXT: v_cmp_ne_u64_e32 vcc, 0, v[0:1]
-; SI-NEXT: v_and_b32_e32 v1, 0x7fffffff, v3
-; SI-NEXT: v_cndmask_b32_e32 v0, 0, v4, vcc
-; SI-NEXT: s_mov_b32 s4, 0
-; SI-NEXT: v_and_b32_e32 v3, s6, v3
-; SI-NEXT: s_movk_i32 s5, 0x80
-; SI-NEXT: v_lshrrev_b32_e32 v1, 8, v1
-; SI-NEXT: v_lshlrev_b32_e32 v0, 23, v0
-; SI-NEXT: v_or_b32_e32 v0, v0, v1
-; SI-NEXT: v_cmp_eq_u64_e32 vcc, s[4:5], v[2:3]
-; SI-NEXT: v_and_b32_e32 v1, 1, v0
-; SI-NEXT: v_cndmask_b32_e32 v1, 0, v1, vcc
-; SI-NEXT: v_cmp_lt_u64_e32 vcc, s[4:5], v[2:3]
-; SI-NEXT: v_cndmask_b32_e64 v1, v1, 1, vcc
-; SI-NEXT: v_add_i32_e32 v0, vcc, v0, v1
+; SI-NEXT: v_ffbh_i32_e32 v2, 0
+; SI-NEXT: v_cmp_ne_u32_e32 vcc, -1, v2
----------------
foad wrote:
> Not related to your patch, but we should generate v_cvt_f32_ubyte0 here, shouldn't we?
Yes, but nothing is trying to reduce the bitwidth of anything right now.
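
For illustration only: the removed check lines mask the input with 0xff before the 64-bit conversion, so only byte 0 can be nonzero, and every value in 0..255 is exactly representable in f32. That is why a byte conversion such as v_cvt_f32_ubyte0 would be expected to give the same result as the full i64 to f32 sequence here. The Python sketch below models both paths; the helper names (to_f32, cvt_u64_masked, cvt_ubyte0, paths_agree) are hypothetical and this is not how the backend implements or selects either conversion.

```python
import struct


def to_f32(value: int) -> float:
    """Round an integer to the nearest IEEE-754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]


def cvt_u64_masked(x: int) -> float:
    """Wide path: treat x as a 64-bit value, keep the low byte, convert to f32."""
    masked = (x & 0xFFFFFFFFFFFFFFFF) & 0xFF
    return to_f32(masked)


def cvt_ubyte0(x: int) -> float:
    """Narrow path: convert byte 0 only (a rough model of v_cvt_f32_ubyte0)."""
    # 0..255 needs at most 8 significant bits, so no rounding can occur.
    return float(x & 0xFF)


def paths_agree(samples) -> bool:
    """Check that both paths give the same f32 result for every sample."""
    return all(cvt_u64_masked(x) == cvt_ubyte0(x) for x in samples)
```

Because the two paths agree for every possible low byte, a narrowing step that recognised the known-zero upper bits could, in principle, replace the wide conversion with the byte form.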
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D107187/new/
https://reviews.llvm.org/D107187