[PATCH] D37325: [AMDGPU] Use v_pm_max_f16 for fcanonicalize
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 6 09:04:28 PDT 2017
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/SIInstructions.td:1289
+ (fcanonicalize (v2f16 (VOP3PMods v2f16:$src, i32:$src_mods))),
+ (V_PK_MUL_F16 SRCMODS.OP_SEL_1, (i32 CONST.V2FP16_ONE), $src_mods, $src, DSTCLAMP.NONE)
+>;
----------------
rampitec wrote:
> arsenm wrote:
> > rampitec wrote:
> > > rampitec wrote:
> > > > arsenm wrote:
> > > > > This won't work. For now it's probably easier to just throw an S_MOV_B32 of the constant. This won't encode correctly as a direct immediate because you need to manipulate op_sel
> > > > It was here before, this is the old code and the fix does not belong to the current change.
> > > For some reason S_MOV_B32 does not work here. It is really a separate issue and has to be addressed separately.
> > Why doesn't it work? What about V_MOV_B32? The constant needs to be materialized in some way.
> >
> > Actually I think setting this to be FP16 zero is fine, as long as you remove the SRCMODS.OP_SEL_1 from it.
> As I said this is really a subject for the separate patch.
It's not, because what is here is incorrect and will not work.
https://reviews.llvm.org/D37325
More information about the llvm-commits
mailing list