[PATCH] Add a fence elimination pass

JF Bastien jfb at chromium.org
Thu Oct 16 10:26:02 PDT 2014


In the description, could you add a link to the LLVM-dev discussion for this?

What's missing to enable ARM, x86, and other architectures? Could you also use this pass on non-platform-specific fences (C++11 style fences instead)?

It would be nice to have more complex examples with bigger CFG, and with memory accesses (not just atomic ones).

As we discussed yesterday: do you think that it would be easy to generate dot graphs too, to make is easier to see what's going on?

Do you have a feel for the runtime of this pass on large codebases? I understand that there are overheads due to the required passes, but I want to make sure that you try profiling just this pass, to make sure there aren't glaring inefficiencies added.

================
Comment at: include/llvm/Transforms/Scalar.h:44
@@ -30,1 +43,3 @@
+        std::function<bool(const Instruction&)> isWeakerFence
+            = [](const Instruction &I){return false;});
 
----------------
I'm not a fan of default arguments, I think you shouldn't have them.

Can you add a description of what stronger/weaker mean, as well as FenceArgs?

`std::function` isn't in common use in the code base, can you see what similar functions use instead?

Overall you're trying to create a partial ordering for target-specific intrinsics for fences, and then manually adding this pass many times to a backend. Could you instead have the backend specify the ordering, and add the pass only once? The pass would handle each fence type that it's made aware of, in the proper strong->weak ordering.

================
Comment at: lib/Target/PowerPC/PPCTargetMachine.cpp:20
@@ -18,1 +19,3 @@
+#include "llvm/IR/Intrinsics.h"
+#include "llvm/IR/IntrinsicInst.h"
 #include "llvm/MC/MCStreamer.h"
----------------
Keep these sorted.

================
Comment at: lib/Transforms/Scalar/FencesPRE.cpp:52
@@ +51,3 @@
+
+STATISTIC(NumFencesDetected, "Number of fences pre-FencesPRE");
+STATISTIC(NumFencesInserted, "Number of fences added by FencesPRE");
----------------
"before FencesPRE" is less odd. Pre-PRE sounds weird :)

================
Comment at: lib/Transforms/Scalar/FencesPRE.cpp:58
@@ +57,3 @@
+  class FencesPRE : public FunctionPass {
+    public:
+    static char ID;
----------------
`clang-format` the entire file.

================
Comment at: lib/Transforms/Scalar/FencesPRE.cpp:94
@@ +93,3 @@
+
+    typedef struct FlowGraphNode {
+      Instruction * I;
----------------
You don't need to `typedef struct A {} A;` in C++, just `struct A {};` is enough.

================
Comment at: lib/Transforms/Scalar/FencesPRE.cpp:112
@@ +111,3 @@
+      // is not in the FlowGraph map (since the non-duplicate already is).
+      std::unique_ptr<struct FlowGraphNode> DupNode;
+      FlowGraphNode() : I(nullptr), FenceBeforeIt(nullptr), Excess(0),
----------------
You don't need `struct` here (C versus C++).

================
Comment at: lib/Transforms/Scalar/FencesPRE.cpp:115
@@ +114,3 @@
+                        Height(0), Mark(false), Succs(), Next(Succs.begin()),
+                        DupNode() {}
+      FlowGraphNode(Instruction * Inst) : I(Inst), FenceBeforeIt(nullptr),
----------------
You can use a delegating constructor to ` FlowGraphNode(Instruction * Inst)`, though it doesn't look like this constructor is used, so you could mark it as `= delete` instead.

================
Comment at: lib/Transforms/Scalar/FencesPRE.cpp:170
@@ +169,3 @@
+    if (!FenceIntrinsicID)
+      return false;
+    BFI = &getAnalysis<BlockFrequencyInfo>();
----------------
Does this ever happen? It seems like it's an error and should be an `assert`.

================
Comment at: lib/Transforms/Scalar/FencesPRE.cpp:179
@@ +178,3 @@
+    SmallVector<Edge, 1> Cut;
+    SmallPtrSet<Instruction*, 1> Fences;
+
----------------
Why 1 in all cases? That seems pretty low, especially since this is all stack allocated  it's pretty cheap to start with 4 or 8, though real-world numbers are better.

================
Comment at: lib/Transforms/Scalar/FencesPRE.cpp:209
@@ +208,3 @@
+
+    for (auto NPair : Cut) {
+      assert(NPair.first->I->getParent() == NPair.second->I->getParent());
----------------
What is this loop doing? Putting it in its own function could make things easier to understand.

================
Comment at: lib/Transforms/Scalar/FencesPRE.cpp:427
@@ +426,3 @@
+void FencesPRE::relabel(Flow &Flow, FlowGraphNode &U) const {
+  auto minHeight = UINT32_MAX;
+  for (auto &VWithCap : U.Succs) {
----------------
I'd avoid the macros here and in other places. You have a `typedef` for the types, so it would be better to use `std::numeric_limits`.

http://reviews.llvm.org/D5758






More information about the llvm-commits mailing list