[PATCH] D24716: [Polly] DeLICM/DePRE (WIP)

Tobias Grosser via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 5 21:30:20 PST 2016


grosser added a comment.

Hi Michael,

thank you for this patch. It is now really nicely documented and very easy to read. Just the size is still something I am struggeling with. Below you find a list of local comments that I noted when doing a first pass over the patch. Also, I promised to suggest new parts that can be extracted out of this patch.

I also tried to understand more of the core algorithm. For this I tried to make makeValInst less agressive by applying the following patch:

       // If the definition/write is conditional, the previous write may "shine
       // through" on conditions we cannot determine. Again, return the unknown
       // value.
  -    if (!IsCertain)
  +    if (true)
         return give(isl_map_from_domain(DomainUse.take()));
  ``
  
  Interestingly all test cases fail, and gemm_complete.ll takes about 20seconds before it also fails. Is this expected? Did I correctly implement the test case you suggested on our last phone call? Do you have any idea how would a test case would look like that still passes after this change?



================
Comment at: include/polly/DeLICM.h:177
+///
+/// An element is unused at an timepoint when the element will be overwritten in
+/// the future, but it is no read in between. Another way to express this: the
----------------
a timepoint


================
Comment at: include/polly/DeLICM.h:181
+/// infinitely into the past if there no read before. Such unused elements can
+/// be overwritten by any value without changing the Scop's semantics. An
+/// example:
----------------
the scop's semantics


================
Comment at: include/polly/DeLICM.h:54
+/// timepoint 10 by Overwrite[]. Values not defined in the map have no known
+/// definition. This includes the statements instance timepoints themselves,
+/// because reads in those timepoints could either read the old or the new
----------------
"statements instance" -> "statement instance"


================
Comment at: include/polly/DeLICM.h:55
+/// definition. This includes the statements instance timepoints themselves,
+/// because reads in those timepoints could either read the old or the new
+/// value, defined only by the statement itself. But this can be changed by @p
----------------
AT these timepoints?


================
Comment at: include/polly/DeLICM.h:96
+/// Note: Lifetimes are expressed in terms of the preceding write. Hence, reads
+/// before the first read cannot expressed by this function.
+///
----------------
BE expressed


================
Comment at: include/polly/ScopInfo.h:1037
+  /// @param Oneline Print a more dense representation without line-breaks.
+  void print(raw_ostream &OS, bool Oneline = false) const;
 
----------------
This seems to be unused. Can we exclude it for the moment?


================
Comment at: include/polly/ScopInfo.h:1051
+
+llvm::raw_ostream &operator<<(llvm::raw_ostream &OS, const MemoryAccess *MA);
+
----------------
This seems unused. Can we exclude it for the moment?


================
Comment at: lib/Analysis/ScopInfo.cpp:314
+    case MK_ExitPHI:
+      OS << " MK_Array";
+      return;
----------------
Why do you call MK_ExitPHI "MK_Array"? This seems counterintuitive.

Also, the very information you print is already available as suffix in the array name. Adding this to the printing seems redundant. I am aware that we currently do not add _exitphi to the naming, but by adding this to the array name as well we would remove this inconsistency without printing redundant information.


================
Comment at: lib/Support/GICHelper.cpp:222
 DEFINE_ISLPTR(aff)
+DEFINE_ISLPTR(multi_aff)
 DEFINE_ISLPTR(pw_aff)
----------------
I believe you can just add these two definitions as [NFC} stating that they are needed for an out-of-tree patch that is in progress of upstreaming.


================
Comment at: lib/Support/GICHelper.cpp:282
+}
+
 void polly::foreachElt(NonowningIslPtr<isl_union_pw_aff> UPwAff,
----------------
These functions seem to be independent of the patch set. With the new unit-test infrastructure you added, it seems easy to test them separately. E.g. we could just iterate over the object, add the individual elements to a vector, combine them again with 'union', and check if the resulting objects are identical?

If we already add unit tests, these changes could easily be committed independently.

[Adding code like this to Polly feels generally a little bit unsatisfying, as this could be expressed a lot better with C++ bindings. This is one of the reasons I really want to get the bindings soon, as writing code/tests for code that is soon to be replaced is somehow unsatisfying. As the bindings still will take some time, it is probably OK to still upstream this.]


================
Comment at: lib/Support/GICHelper.cpp:180-182
+  replace(str, "+", ")");
+  replace(str, "%", "");
+  replace(str, "@", "");
----------------
grosser wrote:
> Meinersbur wrote:
> > Adding the type to ISL names causes `%` and `@` to be in the middle of the string. `+` appears in `llvm::Value`s of floating-point constants.
> As types don't seem to be needed, this change can be dropped from this patch, right?
nothing (phabricator does not allow me to delete this comment)


================
Comment at: lib/Transform/DeLICM.cpp:2580
+  /// zones, but we can only chose one. This is a greedy algorithm, therefore
+  /// the first processed will claim it.
+  void greedyCollapse() {
----------------
nothing (phabricator does not seem to allow me to delete this comment)


================
Comment at: lib/Transform/DeLICM.cpp:379
+}
+#endif
+/// Determine how many dimensions the scatter space of @p Schedule has.
----------------
Deadcode?


================
Comment at: lib/Transform/DeLICM.cpp:558
+// one element.
+// { Scatter[] -> Domain[] }
+
----------------
Dead comment?


================
Comment at: lib/Transform/DeLICM.cpp:1774
+        return false;
+    }
+
----------------
Not sure why the check if a scop can be DeLICM-ed is part of computeCommon. It seems orthogonal to the actual computations here. Would it make sense to extract this to isCompatible(Scop &S) and hoist this call to a higher level of the call tree? I would expect this for example to be called even before we start with building up any of the data structures needed in DeLICM, which means before computeZone() and the DefUse.compute call()?


================
Comment at: lib/Transform/DeLICM.cpp:1791
+      }
+    }
+
----------------
Any reason you do not use scop->getReads(), scop->getWrites(), scop->getMayWrites()?

One might claim that your code prevents us from scanning the scop four times for different kinds of accesses, but this is certainly not a large performance difference (if measurable at all
given the later more costly computations). If we want to add such a specialization, I would prefer to add this after the initial delicm implementation as a well-benchmarked and consequently justified performance optimization.


================
Comment at: lib/Transform/DeLICM.cpp:1907
+
+  /// Determine whether to knowledges are conflicting each other.
+  ///
----------------
two


================
Comment at: lib/Transform/DeLICM.cpp:1931
+      // generator would need to reload the scalar after the scop, but it
+      // doesn't have the information to where it is mapped to. Only the
+      // MemoryAccesses have that information, not the ScopArrayInfo.
----------------
does not


================
Comment at: lib/Transform/DeLICM.cpp:1976
+  /// @return { DomainDef[] -> DomainUse[] }, { DomainDef[] -> Zone[] }
+  ///         First element is the this of uses for each definition.
+  ///         The second is the lifetime of each definition.
----------------
the >this< of uses? Do you mean 'set'?


================
Comment at: lib/Transform/DeLICM.cpp:2246
+        // incoming values (+ maybe another incoming edge from an unrelated
+        // block. Since we cannot directly represent it as a single
+        // llvm::Value from multiple exiting block, it is represented using
----------------
block). (missing ')')


================
Comment at: lib/Transform/DeLICM.cpp:2378
+  /// @param SAI         The ScopArrayInfo representing the scalar's memory to
+  /// map.
+  /// @param ReadTarget  { DomainRead[] -> Element[] }
----------------
maybe indent 'map' here?


================
Comment at: lib/Transform/DeLICM.cpp:2417
+  ///
+  /// Start trying mapping of scalars that are used in the same statement as the
+  /// store. For every successful mapping, try to also map scalars of the
----------------
trying to map scalars


================
Comment at: lib/Transform/DeLICM.cpp:2422
+  ///
+  /// There is currently no preference in which order scalars are being tried.
+  /// Ideally, we would direct it towards a load instruction of the same array
----------------
are tried


================
Comment at: lib/Transform/DeLICM.cpp:2502
+      // Try to map MK_Value scalars.
+      if (SAI->isValueKind() && tryMapValue(SAI, EltTarget)) {
+        auto *DefAcc = DefUse.getValueDef(SAI);
----------------
It seems we can continue if tryMapValue() returns false, no need to later check if SAI is PHIKind. This also makes very clear this code path has to be taken for all SAI == ValueKind and nothing later needs to be verified.

```
      if (SAI->isValueKind()) {                                                  
        if (!tryMapValue(SAI, EltTarget))                                        
          continue;  
```

The same holds below.


================
Comment at: lib/Transform/DeLICM.cpp:17
+// The algorithms here work on the scatter space - the image space of the
+// schedule returned by Scop::getSchedule(). We call an element in that space an
+// "timepoint". Timepoints are lexicographically ordered such that we can
----------------
 >a< timepoint


================
Comment at: lib/Transform/DeLICM.cpp:25
+// they
+// do not contain the extrema. Using ISL rational sets to express these would be
+// overkill. We also cannot store them as the integer timepoints they contain;
----------------
Line wrapping broken?


================
Comment at: test/DeLICM/gemm_complete.ll:2
+; RUN: opt %loadPolly -basicaa -loop-rotate -licm -polly-delicm -analyze < %s | FileCheck %s
+; 
+; dgemm kernel
----------------
This file has a lot of trailing whitespaces.


================
Comment at: test/DeLICM/incomplete_phi.ll:2
+; RUN: opt %loadPolly -polly-flatten-schedule -polly-delicm -polly-scops -analyze < %s | FileCheck %s
+;
+target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
----------------
This also has a lot of trailing whitespaces. Please also check the other test files.


================
Comment at: unittests/CMakeLists.txt:19
   endif()
-  target_link_libraries(${test_name} Polly LLVMCore LLVMSupport)
+  target_link_libraries(${test_name} Polly LLVMCore LLVMSupport LLVMipo)
 endfunction()
----------------
Why is this change needed? It works for me even without.


https://reviews.llvm.org/D24716





More information about the llvm-commits mailing list