[PATCH] D28496: [AMDGPU] Implement f16 fcopysign
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 9 15:32:06 PST 2017
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:313-314
setOperationAction(ISD::FDIV, MVT::f16, Custom);
+ if (!Subtarget->hasBFI())
+ setOperationAction(ISD::FCOPYSIGN, MVT::f16, Expand);
----------------
This shouldn't be necessary. hasBFI() will only be false for R600 targets, and the default action for illegal types is Expand.
================
Comment at: lib/Target/AMDGPU/SIInstructions.td:680-683
def : Pat <
+ (fcopysign f16:$src0, f16:$src1),
+ (V_BFI_B32 (S_MOV_B32 (i32 0x00007fff)), $src0, $src1)
+>;
----------------
It is possible to have mismatched FP types for src0 and src1. If you can come up with a test case that combines this with FP casts, you might need to handle that too.
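
To illustrate what the pattern computes and why mismatched types matter, here is a minimal sketch in plain C++ that models the bit manipulation on raw IEEE half bit patterns. It assumes V_BFI_B32 computes (S0 & S1) | (~S0 & S2); the helper names bfi, copysignF16Bits and copysignF16MagF32Sign are hypothetical and are not part of the patch. The f32-sign variant only shows that the sign bit sits at bit 31 rather than bit 15, so the same mask cannot be reused without a shift; it is not a claim about how the backend should lower that case.

```cpp
#include <cstdint>

// Bitfield insert, modelling the semantics assumed here for V_BFI_B32:
// take bits from `a` where `mask` is 1 and from `b` where `mask` is 0.
constexpr uint32_t bfi(uint32_t mask, uint32_t a, uint32_t b) {
  return (mask & a) | (~mask & b);
}

// f16 copysign on raw bit patterns (hypothetical helper).
// Mask 0x00007fff keeps exponent and mantissa from `mag` and takes
// bit 15 (the f16 sign bit) from `sign`. Bits above 15 are discarded.
constexpr uint16_t copysignF16Bits(uint16_t mag, uint16_t sign) {
  return static_cast<uint16_t>(bfi(0x00007fffu, mag, sign));
}

// Mismatched case (hypothetical helper): f16 magnitude, f32 sign.
// The f32 sign bit is bit 31, so it is shifted down to bit 15 first.
constexpr uint16_t copysignF16MagF32Sign(uint16_t mag, uint32_t signF32) {
  return static_cast<uint16_t>(bfi(0x00007fffu, mag, signF32 >> 16));
}

// 0x3C00 is 1.0 and 0xBC00 is -1.0 in IEEE half; 0x8000 is -0.0.
static_assert(copysignF16Bits(0x3C00, 0x8000) == 0xBC00,
              "1.0 with the sign of -0.0 is -1.0");
// 0xBF800000 is -1.0f in IEEE single.
static_assert(copysignF16MagF32Sign(0x3C00, 0xBF800000u) == 0xBC00,
              "1.0 with the sign of -1.0f is -1.0");
```

Feeding an f32 sign operand straight into copysignF16Bits would read bit 15 of the f32 mantissa instead of its sign, which is the kind of bug a mixed-type test would catch.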
================
Comment at: test/CodeGen/AMDGPU/fcopysign.f16.ll:1
+; RUN: llc -march=amdgcn -mcpu=SI -verify-machineinstrs < %s | FileCheck -check-prefix=SI -check-prefix=GCN -check-prefix=FUNC %s
+; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck -check-prefix=VI -check-prefix=GCN -check-prefix=FUNC %s
----------------
Don't use -mcpu=SI.
https://reviews.llvm.org/D28496