[PATCH] D19076: AMDGPU/SI: Re-implement the lowering for 32-bit floating point division

Mon May 9 10:21:12 PDT 2016

arsenm added inline comments.

================
Comment at: test/CodeGen/AMDGPU/fdiv.ll:83-88
@@ +82,8 @@
+; SI_DAG: v_div_fixup_f32
+define void @fdiv_f32(float addrspace(1)* %out, float %a, float %b) {
+entry:
+  %0 = fdiv float %a, %b
+  store float %0, float addrspace(1)* %out
+  ret void
+}
+
----------------
Can you add a test where the fast math flag is missing from the instruction, but globally enabled?
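
Something along these lines, as a rough sketch only. The function name is hypothetical, and it assumes "globally enabled" means the "unsafe-fp-math" function attribute (it could equally mean passing -enable-unsafe-fp-math to llc in a RUN line). The expected instructions are deliberately left out, since they depend on what the new lowering selects:

```llvm
; Hypothetical test: the fdiv carries no fast-math flag, but unsafe FP
; math is enabled for the whole function through an attribute.
; CHECK lines omitted on purpose; fill in once the output is known.
define void @fdiv_f32_unsafe_attr(float addrspace(1)* %out, float %a, float %b) #0 {
entry:
  %0 = fdiv float %a, %b
  store float %0, float addrspace(1)* %out
  ret void
}

attributes #0 = { "unsafe-fp-math"="true" }
```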

================
Comment at: test/CodeGen/AMDGPU/fdiv.ll:129-134
@@ +128,8 @@
+; SI-DAG: v_rcp_f32
+; SI_DAG: v_fma_f32
+; SI_DAG: v_fma_f32
+; SI_DAG: v_mul_f32
+; SI_DAG: v_fma_f32
+; SI_DAG: v_fma_f32
+; SI_DAG: v_fma_f32
+; SI_DAG: v_div_fmas_f32
----------------
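A lot of these check prefixes are broken: they are written with _ (SI_DAG) instead of - (SI-DAG), so FileCheck does not treat them as check lines at all. Also note that repeating the same instruction multiple times with -DAG does not behave as expected: each -DAG pattern can match the same line of output, so the repeated lines do not verify how many times the instruction is emitted.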
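
One possible shape for these checks, as a sketch: use ordered SI: lines for the repeated instructions so each one has to match a later line than the previous one. This assumes the instructions are emitted in the order listed in the patch, which has not been verified here:

```llvm
; Ordered checks: every SI: line must match after the previous match,
; so three v_fma_f32 lines require three separate instructions.
; SI: v_rcp_f32
; SI: v_fma_f32
; SI: v_fma_f32
; SI: v_mul_f32
; SI: v_fma_f32
; SI: v_fma_f32
; SI: v_fma_f32
; SI: v_div_fmas_f32
```

If the scheduler is free to reorder some of these, only the independent ones should be switched back to SI-DAG, and then each distinct instruction should appear at most once per -DAG group.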

http://reviews.llvm.org/D19076