[PATCH] D55163: AMDGPU: Add optimization patterns to combine fp32->fp16 conversions
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Dec 16 22:32:36 PST 2018
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp:2045
+
+ // no constant handling unlike SelectHi16Elt() due to lack of need
+
----------------
Capitalize and punctuate
================
Comment at: lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp:2080-2081
+bool AMDGPUDAGToDAGISel::SelectCvtRtzF16F32(SDValue In, SDValue &Src) const {
+ ConstantSDNode *CI = dyn_cast<ConstantSDNode>(In);
+ ConstantFPSDNode *CF = dyn_cast<ConstantFPSDNode>(In);
+ if (CI || CF) {
----------------
I think you should need only one or the other here, not both. Either way, you could move this out of the function (and I think there's a wrapper for this somewhere already)
================
Comment at: lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp:2091
+ bool LostInfo;
+ APFloat FVal = APFloat(APFloatBase::IEEEhalf(), APInt(16, Val));
+ FVal.convert(APFloatBase::IEEEsingle(), APFloatBase::rmNearestTiesToAway, &LostInfo);
----------------
You can use the constructor directly without the extra = APFloat
================
Comment at: lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp:2125
+
+bool AMDGPUDAGToDAGISel::SelectCvtRtzF16F32Impl(SDValue In, SDValue &Src, bool hi) const {
+ In = stripBitcast(In);
----------------
Capitalize hi
================
Comment at: lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp:2154-2159
+ unsigned shiftOpcode = hi ? ISD::SHL : ISD::SRL;
+ int shiftOperand = hi ? 0 : 1;
+ uint32_t andMask = hi ? 0xffff0000u : 0xffffu;
+ int andOperand = hi ? 1 : 0;
+
+ if (In.getOpcode() == ISD::AND) {
----------------
computeKnownBits?
================
Comment at: lib/Target/AMDGPU/SIInstructions.td:1597
+let SubtargetPredicate = isGCN in {
+
----------------
Should use isVI, or maybe these should be distinguished by GCN3Encoding? Needs a comment for why these are separated
================
Comment at: test/CodeGen/AMDGPU/cvt_pkrtz_f16_f32_combine.ll:23
+; ALL-LABEL: {{^}}packed_convert_high:
+; ALL: v_cvt_pkrtz_f16_f32{{(_e32|_e64)?}} v{{[0-9]*}}, v0, v1
+; ALL-NOT: v_cvt_pkrtz_f16_f32
----------------
Probably should explicitly test the different encodings for the different sub targets
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D55163/new/
https://reviews.llvm.org/D55163
More information about the llvm-commits
mailing list