[PATCH] D37325: [AMDGPU] Use v_pk_max_f16 for fcanonicalize

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 6 11:00:56 PDT 2017


rampitec added inline comments.


================
Comment at: lib/Target/AMDGPU/SIInstructions.td:1289
+  (fcanonicalize (v2f16 (VOP3PMods v2f16:$src, i32:$src_mods))),
+  (V_PK_MUL_F16 SRCMODS.OP_SEL_1, (i32 CONST.V2FP16_ONE), $src_mods, $src, DSTCLAMP.NONE)
+>;
----------------
arsenm wrote:
> rampitec wrote:
> > arsenm wrote:
> > > rampitec wrote:
> > > > rampitec wrote:
> > > > > arsenm wrote:
> > > > > > This won't work. For now it's probably easier to just throw an S_MOV_B32 of the constant. This won't encode correctly as a direct immediate because you need to manipulate op_sel
> > > > > It was here before, this is the old code and the fix does not belong to the current change.
> > > > For some reason S_MOV_B32 does not work here. It is really a separate issue and has to be addressed separately.
> > > Why doesn't it work? What about V_MOV_B32? The constant needs to be materialized in some way.
> > > 
> > > Actually I think setting this to be FP16 zero is fine, as long as you remove the SRCMODS.OP_SEL_1 from it.
> > As I said this is really a subject for the separate patch.
> It's not, because what is here is incorrect and will not work.
It is really different, one is fix the other is optimization.
Please see D37522.


https://reviews.llvm.org/D37325





More information about the llvm-commits mailing list