[PATCH] D134354: [AMDGPU][GlobalISel] Support mad/fma_mix selection

Pierre van Houtryve via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 19 02:11:23 PDT 2022


Pierre-vh added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUGISel.td:156
 
+def gi_vop3_mad_mix_mods :
+    GIComplexOperandMatcher<s64, "selectVOP3PMadMixMods">,
----------------
foad wrote:
> Is this required? I am confused about whether you are now matching VOP3PMadMixMods in TableGen patterns, or only doing it from the C++ code in selectG_FMA.
I'm doing it exclusively in C++.
This is needed to tell the GlobalISelEmitter that, when it sees "VOP3PMadMixMods" it needs to use the "selectVOP3PMadMixMods" function from AMDGPUInstructionSelector.
Without that, it wouldn't import patterns that use it.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:749
+                   .addReg(Src0);
+    if (!constrainSelectedInstRegOperands(*MIB, TII, TRI, RBI))
+      return false;
----------------
foad wrote:
> Is this related to the current patch? Could it be split out?
It's required. D136235 


================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:2752
+  (v2f16 (DivergentBinFrag<build_vector> (f16 undef), (f16 VS_32:$src1))),
   (v2f16 (V_LSHLREV_B32_e64 (i32 16), SReg_32:$src1))
 >;
----------------
foad wrote:
> Pierre-vh wrote:
> > foad wrote:
> > > This looks weird. Why would we every want to generate V_LSHLREV_B32 with an sgpr operand?
> > I just changed the pattern matching to also allow VGPR operands (which, IIRC, will be copied to the right regbank in any case). Not sure why the pattern uses a SGPR output though.
> > 
> > I did this change because it helped codegen in one test go from:
> > ```
> > v_mad_mix_f32 v0, v0, v1, v2 op_sel_hi:[1,1,1] clamp 
> > v_cvt_f16_f32_e32 v0, v0 
> > v_and_b32_e32 v1, 0xffff, v0 
> > v_lshl_or_b32 v0, v0, 16, v1 
> > ```
> > to
> > ```
> > v_mad_mix_f32 v0, v0, v1, v2 op_sel_hi:[1,1,1] clamp
> > v_cvt_f16_f32_sdwa v0, v0 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD
> > ```
> > 
> > Perhaps it'd be more adequate to duplicate the pattern and have another one that uses VReg input/output?
> Maybe split this into a separate patch and we can discuss it there? It doesn't seem to be directly related to selecting the mad_mix instruction.
v_mad_mixhi_f16_f16lo_f16lo_f16lo_undeflo_clamp_precvt regressed due to splitting the change into D136236


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134354/new/

https://reviews.llvm.org/D134354



More information about the llvm-commits mailing list