[PATCH] D36856: [AMDGPU] Use v_max_f* for fcanonicalize

Fri Aug 25 09:12:43 PDT 2017

arsenm added inline comments.

================
Comment at: lib/Target/AMDGPU/AMDGPUInstructions.td:45-47
+def FP16Denormals : Predicate<"Subtarget->hasFP16Denormals()">;
+def FP32Denormals : Predicate<"Subtarget->hasFP32Denormals()">;
+def FP64Denormals : Predicate<"Subtarget->hasFP64Denormals()">;
----------------
How / why this change?

================
Comment at: lib/Target/AMDGPU/SIInstructions.td:1280-1285
+let Predicates = [FP16Denormals], AddedComplexity = 1 in {
+def : Pat<
+  (fcanonicalize (f16 (VOP3Mods f16:$src, i32:$src_mods))),
+  (V_MAX_F16_e64 $src_mods, $src, $src_mods, $src, 0, 0)
+>;
+}
----------------
I think it would be clearer to have let Predicates = [NoFP16Denormals] rather than relying on AddedComplexity to prefer one pattern over the other

================
Comment at: test/CodeGen/AMDGPU/fcanonicalize-denorms.ll:8
+declare i32 @llvm.amdgcn.workitem.id.x() #0
+
+; GCN-LABEL:  {{^}}test_canonicalize_value_f64:
----------------
Can you merge this with fcanonicalize.ll? That one avoids multiple run lines by using the attributes on the different functions

https://reviews.llvm.org/D36856