[PATCH] fix an invisible bug when combining repeated FP divisors
Sanjay Patel
spatel at rotateright.com
Wed May 20 13:44:45 PDT 2015
Hi nicholas, hfinkel, ab, andreadb,
This patch fixes bugs that were exposed by r237046 ( http://reviews.llvm.org/rL237046 ):
1. When replacing a division node, it's not enough to RAUW. We should call CombineTo() to delete dead nodes and combine again.
2. Because we are changing the DAG, we can't return an empty SDValue after the transform. As the code comments say:
// Visitation implementation - Implement dag node combining for different
// node types. The semantics are as follows:
// Return Value:
// SDValue.getNode() == 0 - No change was made
// SDValue.getNode() == N - N was replaced, is dead and has been handled.
// otherwise - N should be replaced by the returned Operand.
The new test case shows no difference with or without this patch, but it will crash if we re-apply r237046. If anyone sees a way to expose the bug without FMF, I'd be happy to modify the test case.
http://reviews.llvm.org/D9893
Files:
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
test/CodeGen/X86/fdiv-combine.ll
Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp
===================================================================
--- lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -8317,10 +8317,14 @@
if (Dividend != FPOne) {
SDValue NewNode = DAG.getNode(ISD::FMUL, SDLoc(U), VT, Dividend,
Reciprocal);
- DAG.ReplaceAllUsesWith(U, NewNode.getNode());
+ CombineTo(U, NewNode);
+ } else if (U != Reciprocal.getNode()) {
+ // In the absence of fast-math-flags, this user node is always the
+ // same node as Reciprocal, but with FMF they may be different nodes.
+ CombineTo(U, Reciprocal);
}
}
- return SDValue();
+ return SDValue(N, 0); // N was replaced.
}
}
Index: test/CodeGen/X86/fdiv-combine.ll
===================================================================
--- test/CodeGen/X86/fdiv-combine.ll
+++ test/CodeGen/X86/fdiv-combine.ll
@@ -27,5 +27,22 @@
ret float %div2
}
+; If the reciprocal is already calculated, we should not
+; generate an extra multiplication by 1.0.
+
+define double @div3_arcp(double %x, double %y, double %z) #0 {
+; CHECK-LABEL: div3_arcp:
+; CHECK: # BB#0:
+; CHECK-NEXT: movsd{{.*#+}} xmm2 = mem[0],zero
+; CHECK-NEXT: divsd %xmm1, %xmm2
+; CHECK-NEXT: mulsd %xmm2, %xmm0
+; CHECK-NEXT: addsd %xmm2, %xmm0
+; CHECK-NEXT: retq
+ %div1 = fdiv fast double 1.0, %y
+ %div2 = fdiv fast double %x, %y
+ %ret = fadd fast double %div2, %div1
+ ret double %ret
+}
+
; FIXME: If the backend understands 'arcp', then this attribute is unnecessary.
attributes #0 = { "unsafe-fp-math"="true" }
EMAIL PREFERENCES
http://reviews.llvm.org/settings/panel/emailpreferences/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D9893.26173.patch
Type: text/x-patch
Size: 1729 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150520/c5cc52f0/attachment.bin>
More information about the llvm-commits
mailing list