[PATCH] D107187: [amdgpu] Add an enhanced conversion from i64 to f32.

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 2 06:23:07 PDT 2021


arsenm added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/cvt_f32_ubyte.ll:1085
 ; SI-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; SI-NEXT:    s_movk_i32 s6, 0xff
-; SI-NEXT:    v_and_b32_e32 v0, s6, v0
-; SI-NEXT:    v_add_i32_e32 v0, vcc, 0, v0
-; SI-NEXT:    v_ffbh_u32_e32 v2, v0
-; SI-NEXT:    v_addc_u32_e64 v1, s[4:5], 0, 0, vcc
-; SI-NEXT:    v_add_i32_e32 v2, vcc, 32, v2
-; SI-NEXT:    v_ffbh_u32_e32 v3, v1
-; SI-NEXT:    v_cmp_eq_u32_e32 vcc, 0, v1
-; SI-NEXT:    v_cndmask_b32_e32 v2, v3, v2, vcc
-; SI-NEXT:    v_mov_b32_e32 v3, 0xbe
-; SI-NEXT:    v_sub_i32_e32 v4, vcc, v3, v2
-; SI-NEXT:    v_lshl_b64 v[2:3], v[0:1], v2
-; SI-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[0:1]
-; SI-NEXT:    v_and_b32_e32 v1, 0x7fffffff, v3
-; SI-NEXT:    v_cndmask_b32_e32 v0, 0, v4, vcc
-; SI-NEXT:    s_mov_b32 s4, 0
-; SI-NEXT:    v_and_b32_e32 v3, s6, v3
-; SI-NEXT:    s_movk_i32 s5, 0x80
-; SI-NEXT:    v_lshrrev_b32_e32 v1, 8, v1
-; SI-NEXT:    v_lshlrev_b32_e32 v0, 23, v0
-; SI-NEXT:    v_or_b32_e32 v0, v0, v1
-; SI-NEXT:    v_cmp_eq_u64_e32 vcc, s[4:5], v[2:3]
-; SI-NEXT:    v_and_b32_e32 v1, 1, v0
-; SI-NEXT:    v_cndmask_b32_e32 v1, 0, v1, vcc
-; SI-NEXT:    v_cmp_lt_u64_e32 vcc, s[4:5], v[2:3]
-; SI-NEXT:    v_cndmask_b32_e64 v1, v1, 1, vcc
-; SI-NEXT:    v_add_i32_e32 v0, vcc, v0, v1
+; SI-NEXT:    v_ffbh_i32_e32 v2, 0
+; SI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v2
----------------
foad wrote:
> Not related to your patch, but we should generate v_cvt_f32_ubyte0 here, shouldn't we?
Yes, but nothing is trying to reduce the bitwidth of anything right now.
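
For reference, the removed SI lines quoted above read like a generic unsigned 64-bit to f32 expansion: count leading zeros, normalize, build the exponent from 0xbe (127 + 63), keep the top 23 fraction bits, then round to nearest even on the remaining 40 bits. The following is only a rough C++ sketch of that reading, not the code from the patch; the names clz64 and u64_to_f32_sketch are made up for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: number of leading zero bits in a 64-bit value.
// Only called with x != 0 below.
static unsigned clz64(uint64_t x) {
  unsigned n = 0;
  while (!(x >> 63)) {
    x <<= 1;
    ++n;
  }
  return n;
}

// Illustrative sketch (an assumption about what the removed sequence
// computes): convert an unsigned 64-bit integer to IEEE-754 binary32
// with round-to-nearest-even.
float u64_to_f32_sketch(uint64_t x) {
  if (x == 0)
    return 0.0f;

  unsigned lz = clz64(x);
  uint32_t exponent = 0xbe - lz;  // 127 (bias) + 63 - leading zeros
  uint64_t norm = x << lz;        // leading 1 now sits at bit 63

  // Drop the implicit leading 1 and keep the next 23 bits as the fraction.
  uint32_t fraction =
      static_cast<uint32_t>((norm & 0x7fffffffffffffffULL) >> 40);
  uint64_t rest = norm & 0xffffffffffULL;  // the 40 bits that get rounded off
  const uint64_t half = 0x8000000000ULL;   // exactly halfway (bit 39 set)

  uint32_t bits = (exponent << 23) | fraction;
  if (rest > half)
    bits += 1;         // above halfway: round up
  else if (rest == half)
    bits += bits & 1;  // tie: round to even
  // A carry out of the fraction bumps the exponent, which is the right result.

  float result;
  std::memcpy(&result, &bits, sizeof result);
  return result;
}
```

When the input has already been masked with 0xff, as in this test, the result is the same as converting the low byte directly, which is the point behind the v_cvt_f32_ubyte0 remark.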


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107187/new/

https://reviews.llvm.org/D107187


