[PATCH] D134433: [AMDGPU][GISel] Enable Matching of V2S16 G_BUILD_VECTOR

Pierre van Houtryve via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 29 09:13:41 PDT 2022


Pierre-vh added a comment.

In D134433#3824171 <https://reviews.llvm.org/D134433#3824171>, @Petar.Avramovic wrote:

> You don't want `(<2 x s16>)` but only `s16` `G_FMAXNUM_IEEE`when lowering larger vectors, also you remove `G_BUILD_VECTOR_TRUNC` but use `G_BUILD_VECTOR` instead.
> Can you point me to the test case that requires those changes? (You didn't change legalization rules for other `<2 x s16>` instructions).
> To be more precise what D134354 <https://reviews.llvm.org/D134354> requires from Legalizer (and pre/post legalizer combiner). Maybe there is simpler way to achieve it.

D134354 <https://reviews.llvm.org/D134354> requires G_BUILD_VECTOR V2S16 to be legal, we tried other options first that didn't involve legalizer changes but we discussed with @arsenm and concluded that legalizer changes were needed to support G_BUILD_VECTOR V2S16, which is what this diff is doing.
The G_FMINNUM change seems to be needed else the following:

  ; GISEL-GFX900-LABEL: v_mad_mix_v3f32_clamp_postcvt:
  ; GISEL-GFX900:       ; %bb.0:
  ; GISEL-GFX900-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
  ; GISEL-GFX900-NEXT:    v_mad_mixlo_f16 v6, v0, v2, v4 op_sel_hi:[1,1,1] clamp
  ; GISEL-GFX900-NEXT:    v_mad_mixhi_f16 v6, v0, v2, v4 op_sel:[1,1,1] op_sel_hi:[1,1,1] clamp
  ; GISEL-GFX900-NEXT:    v_mad_mixlo_f16 v1, v1, v3, v5 op_sel_hi:[1,1,1] clamp
  ; GISEL-GFX900-NEXT:    v_mov_b32_e32 v0, v6
  ; GISEL-GFX900-NEXT:    s_setpc_b64 s[30:31]

becomes

  ; GISEL-GFX900-LABEL: v_mad_mix_v3f32_clamp_postcvt:
  ; GISEL-GFX900:       ; %bb.0:
  ; GISEL-GFX900-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
  ; GISEL-GFX900-NEXT:    v_mad_mixlo_f16 v6, v0, v2, v4 op_sel_hi:[1,1,1]
  ; GISEL-GFX900-NEXT:    v_mad_mixhi_f16 v6, v0, v2, v4 op_sel:[1,1,1] op_sel_hi:[1,1,1]
  ; GISEL-GFX900-NEXT:    v_pk_max_f16 v0, v6, 0
  ; GISEL-GFX900-NEXT:    v_lshrrev_b32_e32 v2, 16, v0
  ; GISEL-GFX900-NEXT:    v_and_b32_e32 v0, 0xffff, v0
  ; GISEL-GFX900-NEXT:    v_mad_mixlo_f16 v1, v1, v3, v5 op_sel_hi:[1,1,1]
  ; GISEL-GFX900-NEXT:    v_lshl_or_b32 v0, v2, 16, v0
  ; GISEL-GFX900-NEXT:    v_pk_max_f16 v1, v1, v1
  ; GISEL-GFX900-NEXT:    v_pk_max_f16 v0, v0, v0
  ; GISEL-GFX900-NEXT:    v_pk_max_f16 v1, v1, 0
  ; GISEL-GFX900-NEXT:    v_pk_min_f16 v0, v0, 1.0 op_sel_hi:[1,0]
  ; GISEL-GFX900-NEXT:    v_pk_max_f16 v1, v1, v1
  ; GISEL-GFX900-NEXT:    v_lshrrev_b32_e32 v2, 16, v0
  ; GISEL-GFX900-NEXT:    v_and_b32_e32 v0, 0xffff, v0
  ; GISEL-GFX900-NEXT:    v_pk_min_f16 v1, v1, 1.0
  ; GISEL-GFX900-NEXT:    v_lshl_or_b32 v0, v2, 16, v0
  ; GISEL-GFX900-NEXT:    s_setpc_b64 s[30:31]
  `

If it's really a blocker ( @arsenm ?) I can take another look at it but it seems like leaving it in causes a lot of trouble


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134433/new/

https://reviews.llvm.org/D134433



More information about the llvm-commits mailing list