[PATCH] D37325: [AMDGPU] Use v_pk_max_f16 for fcanonicalize

Wed Sep 6 08:30:18 PDT 2017

arsenm added inline comments.

================
Comment at: lib/Target/AMDGPU/SIInstructions.td:1289
+  (fcanonicalize (v2f16 (VOP3PMods v2f16:$src, i32:$src_mods))),
+  (V_PK_MUL_F16 SRCMODS.OP_SEL_1, (i32 CONST.V2FP16_ONE), $src_mods, $src, DSTCLAMP.NONE)
+>;
----------------
rampitec wrote:
> rampitec wrote:
> > arsenm wrote:
> > > This won't work. For now it's probably easier to just use an S_MOV_B32 of the constant. This won't encode correctly as a direct immediate because you need to manipulate op_sel.
> > It was here before; this is the old code, and the fix does not belong in the current change.
> For some reason S_MOV_B32 does not work here. It is really a separate issue and has to be addressed separately.
Why doesn't it work? What about V_MOV_B32? The constant needs to be materialized in some way.

Actually I think setting this to be FP16 zero is fine, as long as you remove the SRCMODS.OP_SEL_1 from it.
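
To illustrate the encoding concern, here is a minimal sketch of how a packed v2f16 constant relates to per-lane half selection. It is a simplified model, not the backend's code: `packV2F16` and `selectHalf` are hypothetical helpers, and it assumes that an inline f16 immediate fills only the low 16 bits of the operand and that the high-lane select bit reads bits 31:16.

```cpp
#include <cstdint>

// IEEE 754 binary16 encoding of 1.0: sign 0, biased exponent 15, mantissa 0.
constexpr uint16_t kHalfOne = 0x3C00;

// Hypothetical helper: pack two f16 bit patterns into one 32-bit value,
// with `lo` in bits 15:0 and `hi` in bits 31:16.
constexpr uint32_t packV2F16(uint16_t lo, uint16_t hi) {
  return static_cast<uint32_t>(lo) | (static_cast<uint32_t>(hi) << 16);
}

// Hypothetical helper: model an op_sel-style select that picks which half
// of a 32-bit source feeds a given lane of a packed instruction.
constexpr uint16_t selectHalf(uint32_t reg, bool selectHigh) {
  return static_cast<uint16_t>(selectHigh ? (reg >> 16) : (reg & 0xFFFFu));
}

// A fully materialized constant carries 1.0 in both halves.
constexpr uint32_t kV2HalfOne = packV2F16(kHalfOne, kHalfOne);
static_assert(kV2HalfOne == 0x3C003C00u, "1.0 in both halves");

// Assumption: a direct immediate only populates the low half.
constexpr uint32_t kImmLowOnly = packV2F16(kHalfOne, 0);

// Reading the high half of the low-only immediate yields 0.0, not 1.0.
static_assert(selectHalf(kImmLowOnly, true) == 0x0000, "high half is zero");
static_assert(selectHalf(kV2HalfOne, true) == kHalfOne, "high half is 1.0");
```

Under these assumptions, a constant materialized into a register (for example with a move) gives both lanes 1.0, while a direct immediate would need the high-lane select cleared so that both lanes read the low half.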

https://reviews.llvm.org/D37325