[PATCH] D55851: Implement basic loop fusion pass

Thu Dec 20 09:26:53 PST 2018

kbarton marked 2 inline comments as done.
kbarton added a comment.

Responded to two of the comments. Most of the others will be addressed in the next revision, which I should hopefully be uploading soon.

================
Comment at: llvm/lib/Transforms/Scalar/LoopFuse.cpp:142-143
+
+  FusionCandidate(Loop *L, const DominatorTree *DT,
+                  const PostDominatorTree *PDT)
+      : Preheader(L->getLoopPreheader()), Header(L->getHeader()),
----------------
Meinersbur wrote:
> [suggestion] IMHO it is good practice to not have much logic in object constructors. Reasons:
> 1. When exceptions are thrown, the object is partially initialized
> 2. With class polymorphism, calling virtual functions will call the non-overidden method.
> 3. It is not possible to return anything, in this case, an error code.
> 
> 1. and 2. do not apply to this code. Point 3. is handled using the `Valid` flag.
> 
> Nonetheless, I think having lots of logic in constructor is a bad pattern. Instead, I suggest to have something return nullptr or `llvm::ErrorOr`.
I understand your points here. I did not realize issue #2 (although I agree it doesn't apply here). 

However, if I understand your suggestion, I would need to add an initialize method (or something similar) to do the logic that is currently in the constructor and return a success/fail code. The problem is that before anything can be done with the object, you need to ensure that it has been initialized. This can be accomplished by adding a boolean flag (isInitialized) to the class and ensuring that all accessors check that before doing anything with it. However, I think that is more complicated then leaving the initialization logic in the constructor.

That said, I'm not particularly tied to either approach. I can change the implementation if you feel strongly. Or, if there is an alternative that I'm not seeing, I'm happy to do that instead. 

================
Comment at: llvm/lib/Transforms/Scalar/LoopFuse.cpp:276
+// dominates FC1 and FC1 post-dominates FC0.
+using FusionCandidateSet = std::set<FusionCandidate, FusionCandidateCompare>;
+using FusionCandidateCollection = SmallVector<FusionCandidateSet, 4>;
----------------
Meinersbur wrote:
> [remark] Is there a reason to use `std::set`? Usually, the LLVM hash-based sets (`DenseSet`, `SmallPtrSet`, etc) are more performant.
This is a good question, and I should have added comments here to articulate why I eventually settled on std::set.

Ultimately, I was looking for a data structure that would keep the fusion candidates sorted automatically, and not invalidate the iterators as the set changes. In this implementation, either DenseSet or SmallPtrSet would be fine, but we have additional patches to extend fusion to move intervening code around and split up loops to create new opportunities. When this is done, the contents of the FusionCandidateSet will change, and having stable iterators simplifies the logic quite a bit (in my opinion). Similarly, always ensuring that the set is ordered properly is essential for this implementation, and that seems to be done reasonably well with std::set. 

I will add some comments in my next revision to indicate my line of reasoning here. 

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55851/new/

https://reviews.llvm.org/D55851