[PATCH] Add a fence elimination pass

Robin Morisset robin.morisset at normalesup.org
Wed Nov 12 04:44:11 PST 2014


Hi,

To dberlin: I looked at this article, but it explains clearly that min-cut is so expensive for PRE because it must be repeated for each computation in the function (of which there can be tens of thousands in a very large function) and must look at a potentially huge graph. In comparison, we only run it twice: once for hwsync and once for lwsync. Furthermore, because the graph stops at any memory access (and not just at a use/kill of some very specific computation, as in PRE), I expect each of these min-cut runs to be quite cheap. I have not had time to benchmark the compile-time cost of this pass (deadline tomorrow for PLDI...), but in summary I expect it to be small, even for large functions full of fences.
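
For readers unfamiliar with the min-cut step discussed above, here is a minimal, generic sketch of computing the weight of a minimum s-t cut with Edmonds-Karp max-flow. It is not the code from the patch: the function name minCutWeight, the dense capacity matrix, and the idea of using block-frequency-like numbers as edge capacities are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <climits>
#include <queue>
#include <vector>

// Hypothetical helper (not from the patch): returns the weight of a minimum
// s-t cut, which equals the maximum s-t flow. `cap[u][v]` is the capacity of
// edge u->v (0 if absent). Assumes s != t.
long long minCutWeight(std::vector<std::vector<long long>> cap, int s, int t) {
  const int n = static_cast<int>(cap.size());
  long long total = 0;
  while (true) {
    // BFS for a shortest augmenting path in the residual graph.
    std::vector<int> parent(n, -1);
    parent[s] = s;
    std::queue<int> work;
    work.push(s);
    while (!work.empty() && parent[t] == -1) {
      int u = work.front();
      work.pop();
      for (int v = 0; v < n; ++v) {
        if (parent[v] == -1 && cap[u][v] > 0) {
          parent[v] = u;
          work.push(v);
        }
      }
    }
    if (parent[t] == -1)
      break; // No augmenting path left: the flow is maximal.

    // Find the bottleneck capacity along the path, then push that much flow.
    long long bottleneck = LLONG_MAX;
    for (int v = t; v != s; v = parent[v])
      bottleneck = std::min(bottleneck, cap[parent[v]][v]);
    for (int v = t; v != s; v = parent[v]) {
      cap[parent[v]][v] -= bottleneck; // Use up forward capacity.
      cap[v][parent[v]] += bottleneck; // Allow the flow to be undone later.
    }
    total += bottleneck;
  }
  return total;
}
```

As an illustration, take a four-node graph with edges 0->1 (capacity 100), 1->2 (1), 1->3 (1) and 2->3 (100). Calling minCutWeight(cap, 0, 3) on it returns 2: cutting the two cheap edges out of node 1 is better than cutting the single expensive edge into it. The cost of one such run depends only on the size of the graph it is given, which is why running it twice on small graphs should stay cheap.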

Thanks for the comments.

================
Comment at: lib/Target/PowerPC/PPCTargetMachine.cpp:184
@@ +183,3 @@
+    // FIXME: breaks a bunch of brittle tests
+    // addPass(createCFGSimplificationPass());
+  }
----------------
hfinkel wrote:
> If doing this is generally a good thing, then we should always do it (and on what target would that not be true?).
> 
> Otherwise, the CFG simplification pass is mostly a wrapper around the SimplifyCFG utility function, and perhaps it should just be called directly from the FencesPRE pass?
> 
I agree it might be a good thing to run it anyway on all targets, but some tests (at least) on Power contain conditional jumps based on undef, and SimplifyCFG makes a complete mess of them.

The cleanup is needed mostly because the pass requires BreakCriticalEdges (I have not yet found a way to break critical edges on demand while preserving BlockFrequencyInfo), so calling SimplifyCFG directly from FencesPRE would not solve the issue.
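
For context, a critical edge is a CFG edge whose source block has several successors and whose destination block has several predecessors. The sketch below finds such edges on a plain adjacency list. It is only an illustration: findCriticalEdges and the integer block numbering are hypothetical, it does not use LLVM's own data structures, and it ignores subtleties such as duplicate edges to the same block.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not LLVM API): succs[b] lists the successors of block b.
// Returns every edge (from, to) where `from` has at least two successors and
// `to` has at least two predecessors.
std::vector<std::pair<int, int>>
findCriticalEdges(const std::vector<std::vector<int>> &succs) {
  // Count incoming edges for every block.
  std::vector<int> predCount(succs.size(), 0);
  for (const auto &out : succs)
    for (int to : out)
      ++predCount[to];

  std::vector<std::pair<int, int>> critical;
  for (std::size_t from = 0; from < succs.size(); ++from) {
    if (succs[from].size() < 2)
      continue; // A single-successor block cannot start a critical edge.
    for (int to : succs[from])
      if (predCount[to] >= 2)
        critical.emplace_back(static_cast<int>(from), to);
  }
  return critical;
}
```

For a CFG where block 0 branches to blocks 1 and 2, and block 1 falls through to block 2, the only critical edge is 0->2. Splitting it means inserting a new empty block on that edge, and such extra blocks are the kind of thing a later CFG cleanup would fold away again when they end up unused.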

http://reviews.llvm.org/D5758





