[PATCH] D123835: AMDGPU/SDAG: Refine the fold to v_mad_[iu]64_[iu]32

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 15 10:20:16 PDT 2022


rampitec added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/mad_64_32.ll:535-539
+; CI-NEXT:    v_mad_i64_i32 v[0:1], s[4:5], v0, v1, 0
+; CI-NEXT:    v_add_i32_e32 v2, vcc, v0, v2
+; CI-NEXT:    v_addc_u32_e32 v3, vcc, v1, v3, vcc
+; CI-NEXT:    v_add_i32_e32 v0, vcc, v0, v4
+; CI-NEXT:    v_addc_u32_e32 v1, vcc, v1, v5, vcc
----------------
arsenm wrote:
> This is a regression? It looks to be the same cycle count for more code size
Actually, since gfx90a v_mad_u64/i64 is full rate, it is even more cycles in that case.
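
For readers following the thread, here is a minimal sketch of the two forms being compared. It is a Python model of the arithmetic only, not of instruction timing or the actual DAG combine. The helper names (`mad_i64_i32`, `add64_via_carry`, `folded`, `split`) are made up for illustration, and the "folded" form assumes each use would get its own mad with the addend folded in, which the quoted CHECK lines do not show.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sext32(x):
    """Interpret the low 32 bits of x as a signed 32-bit integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mad_i64_i32(a, b, c):
    """Model of v_mad_i64_i32: sext(a) * sext(b) + c, wrapped to 64 bits."""
    return (sext32(a) * sext32(b) + c) & MASK64


def add64_via_carry(x, y):
    """Model of a v_add_i32 / v_addc_u32 pair: 64-bit add done as two
    32-bit halves with the carry passed from the low half to the high half."""
    lo = (x & MASK32) + (y & MASK32)
    carry = lo >> 32
    hi = ((x >> 32) & MASK32) + ((y >> 32) & MASK32) + carry
    return ((hi & MASK32) << 32) | (lo & MASK32)


def folded(a, b, c1, c2):
    """Hypothetical folded form: one mad per use, addend folded in (2 instructions)."""
    return mad_i64_i32(a, b, c1), mad_i64_i32(a, b, c2)


def split(a, b, c1, c2):
    """Form in the quoted CHECK lines: one mad with a zero addend, then an
    add/addc pair per use (5 instructions)."""
    product = mad_i64_i32(a, b, 0)
    return add64_via_carry(product, c1), add64_via_carry(product, c2)
```

Both functions return the same values for any inputs, so the difference under discussion is purely instruction count and issue rate.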


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123835/new/

https://reviews.llvm.org/D123835
