[PATCH] D15722: [WIP][Polly] SSA Codegen

Tobias Grosser via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 28 06:56:50 PDT 2016


grosser added a comment.

Hi Johannes,

thank you again for having looked into the SSA codegen. I just spend the day looking through this patch. It contains indeed some very nice ideas and shows that it is probably possible to directly generate SSA without going through memory.

While playing with your patch, I implemented the earlier proposed approach of using LLVM's Mem2Reg utility from inside our code generation (F1855864: 0001-IslNodeBuilder-Promote-Mem2Reg-when-finalizing-SCoP.patch <http://reviews.llvm.org/F1855864>). With an almost trivial change it seems to achieve the same goals while additionally guaranteeing a minimal set of PHI nodes and coming with a minimal risk of regressions.

I tried to find an argument why we want to go for this 500 line patch, which still needs polishing, a review of the polished version, and also brings the risk of introducing regression:

Going through the discussion I found two arguments for this patch:

1. The new patch exposed some bugs at compile time, that we missed earlier on
2. The new patch is smaller and easier to understand than what we have today

I agree that 1) may hold for some cases.

Regarding 2) at least in terms of code size there is not yet too much difference. Only looking at include/ lib/ I see 497 insertions(+), 634 deletions(-). Which is 136 instructions less for the new solution. On the other side, this patch drops a larger 120 line comment describing in detail the previous approach. In addition, the removal of getNewScalarValue that was proposed in this patch has already been incorporated in trunk. Hence, I expect that a rebased version of this patch that adds additional documentation and possibly even code to remove unnecessary PHI nodes is at least not shorter in code size.

Regarding the complexity of the approach, I personally believe that the current approach is as easy to understand as the new approach. In parts that may be because I having written a lot of documentation for the old code and I miss a concise description of the new approach with examples that illustrate the corner cases. Some areas such as non-affine-regions are probably easier to understand with the new code, as finding the right spot for the scalar stores is not easy. However, the new approach becomes possibly more difficult when we want to guarantee that a minimal set of PHI nodes is generated (which we probably want as we have a solution that can do this automatically).

To get a  better understanding of this patch, I would really need a clean version of this patch where existing review comments are addressed, the remaining Windows/SIMD bugs resolved, the PHI node minimization implemented and some overview documentation added. In the optimal case we can make a convincing statement why the new approach is linear in compile time and results in a minimal set of PHI nodes as we get it when using Mem2Reg.

Considering this trivial patch that I posted, do you (and the other reviewers) see sufficient benefits in this larger patch now that it would be worth spending several days on rebasing and reviewing the larger patch and risking the introduction of regressions?

I personally would prefer to focus currently more on your latest assumption tracking changes and give you feedback on those. Hence, I would suggest to start off with (F1855864: 0001-IslNodeBuilder-Promote-Mem2Reg-when-finalizing-SCoP.patch <http://reviews.llvm.org/F1855864>) and use it to introduce all the interesting test case changes that come with this patch. They should match almost 1:1 and would be non-controversial and clearly beneficial.

>From this baseline we can then -- when time allows and the need arises -- pickup this patch. If we only introduce it for stylistic issues, the already changed test cases will ensure that IR changes such as the vectorization issue that has been overlooked stand out nicely. However, as I personally could see additional benefits coming from this patch (e.g. because we want to run cost-models on partially generated IR), it might even make sense to introduce this change when in the future non-memory-scalars become necessary even during code generation.

If you guys strongly believe we should go for the new approach now, I can probably take another day to check it throughly for other corner cases that may cause troubles before you start rebasing. (This is not because I don't trust you, but this is a major change on a piece of code which we spent a significant amount of time to get stable, so I believe we need to be vary careful to not regress here).


================
Comment at: lib/CodeGen/BlockGenerators.cpp:458
@@ -457,3 +405,3 @@
-  return ScalarValue;
+  EscapeMap[Inst] = std::move(EscapeUsers);
 }
 
----------------
This patch nicely pointed out that getNewScalarValue is unnecessary and its uses can be replaced with getNewValue. This simplification was performed on Jan 26 in https://llvm.org/svn/llvm-project/polly/trunk@258799, such that future versions of this patch will not need to perform this simplification any more.

================
Comment at: lib/CodeGen/BlockGenerators.cpp:1007
@@ -1006,3 @@
-  }
-}
-
----------------
Why is the last function dropped? This is a sanity check that works on the ScopInfo and is unrelated to how we generate scalar code. It verifies that no scalar write statements are in this ScopStmt, which is likely to cause trouble both for the to-memory scalar codegen as well as SSA codegen. Consequently, I think it makes sense to keep this check.

================
Comment at: lib/CodeGen/BlockGenerators.cpp:1038
@@ -1037,3 +931,1 @@
-  generateScalarVectorLoads(Stmt, VectorBlockMap);
-
   for (Instruction &Inst : *BB)
----------------
By dropping generateScalarVectorLoads this patch changes behavior. Originally the code splatted all scalar values into a vector-value. After this change, no vector values are generated. This is likely the reason test/Isl/CodeGen/simple_vec_stride_one.ll changes.

================
Comment at: test/Isl/CodeGen/phi_loop_carried_float_5.ll:5
@@ +4,3 @@
+;
+; XFAIL: *
+;
----------------
This test case is incomplete. Without a comment it is unclear what exactly is tested here. (The test case seems to be interesting, though)

================
Comment at: test/Isl/CodeGen/phi_scalar_simple_1.ll:61
@@ -49,1 +60,3 @@
+; CHECK:          %x.addr.1.polly.lc = phi i32 [ %x.addr.0.polly.lc, %polly.loop_preheader7 ], [ %p_add.merge, %polly.merge15 ]
+
 for.cond1:                                        ; preds = %for.inc, %for.body
----------------
These check lines are misleading as the current patch is generating more (unnecessary) PHI instructions here. In the optimal case we would not even generate them. As long as we generate them, we should list all of them with CHECK-NEXT and add a TODO that some of these PHIs are unnecessary and expected to be deleted later.

================
Comment at: test/Isl/CodeGen/phi_scalar_simple_2.ll:16
@@ -15,8 +15,3 @@
 ; CHECK-LABEL: entry:
-; CHECK-DAG:     %x.addr.2.s2a = alloca i32
-; CHECK-DAG:     %x.addr.2.phiops = alloca i32
-; CHECK-DAG:     %x.addr.1.s2a = alloca i32
-; CHECK-DAG:     %x.addr.1.phiops = alloca i32
-; CHECK-DAG:     %x.addr.0.s2a = alloca i32
-; CHECK-DAG:     %x.addr.0.phiops = alloca i32
+; CHECK-NOT:     alloca
   %tmp = sext i32 %N to i64
----------------
If we are dropping here almost all CHECK lines, this test case becomes useless. Either we should make an argument that it does indeed not test anything interesting and just delete it or we should add CHECK lines similar to the ones added in phi_scalar_simple_1.ll.

If we make the argument this test case is redundant, it should be deleted in a separate commit.

================
Comment at: test/Isl/CodeGen/scalar-store-from-same-bb.ll:14
@@ -9,1 +13,3 @@
+; CHECK: polly.stmt.loop:
+; CHECK:   %p_sum = add i64 %N, 1
 
----------------
Nice!

================
Comment at: test/Isl/CodeGen/simple_vec_call.ll:35
@@ -36,1 +34,3 @@
+; CHECK: %3 = insertelement <4 x float> %2, float [[RES4]], i32 3
+; CHECK:  store <4 x float> %3
 ; CHECK: attributes [[NUW]] = { nounwind }
----------------
This is a trivial change, but mostly unrelated to the proposed patch. I would prefer to introduce REGEXP matches ahead of time to keep the review and patch focused. I did this in r267875 so this change won't show up in a possible future version of this patch.

================
Comment at: test/Isl/CodeGen/simple_vec_stride_one.ll:8
@@ +7,3 @@
+; CHECK:   store double %val_p_scalar_, double* %scevgep10, align 8, !alias.scope !0, !noalias !2
+; CHECK:   store double %val_p_scalar_, double* %scevgep11, align 8, !alias.scope !0, !noalias !2
+
----------------
Instead of a vector store, we now generate four scalar stores. This is a regression that seems to be introduced accidentally in this patch.

================
Comment at: test/Isl/CodeGen/srem-in-other-bb.ll:13
@@ -13,4 +12,3 @@
 ; CHECK:      polly.stmt.bb3:
-; CHECK:        %tmp.s2a.reload = load i64, i64* %tmp.s2a
-; CHECK:        %p_tmp3 = getelementptr inbounds float, float* %A, i64 %tmp.s2a.reload
+; CHECK:        %p_tmp3 = getelementptr inbounds float, float* %A, i64 %p_tmp
 
----------------
Trivial, but nice!

================
Comment at: test/ScopInfo/out-of-scop-use-in-region-entry-phi-node-nonaffine-subregion.ll:1
@@ -1,2 +1,2 @@
 ; RUN: opt %loadPolly -polly-codegen -S < %s | FileCheck %s
 ;
----------------
The file mode change is unrelated and should be a separate commit.  This has already been fixed in r266473.


http://reviews.llvm.org/D15722





More information about the llvm-commits mailing list