[PATCH] D16522: BlockGenerators: Replace getNewScalarValue with getNewValue

Mon Jan 25 03:49:11 PST 2016

grosser added a comment.

Hi Michael,

thank you for this fast review.


================
Comment at: lib/CodeGen/BlockGenerators.cpp:412
@@ +411,3 @@
+    if (MA->isScalarKind() && GlobalMap.count(MA->getBaseAddr()))
+      continue;
+
----------------
Meinersbur wrote:
> This is fixing a symptom only. The MemoryAccess shouldn't exist in the first place if we are going to ignore it.
> 
> In fact, invariant loads do not need any MemoryAccesses inserted at all. One can just use the loaded llvm::Value as it is inserted before the generated code. In case the load is conditional, inserting a phi with an undef on one side is simple.
Right.

However, I am not sure that dropping the memory access is the solution we want, as this would mean we do not represent this read-access in our polyhedral access functions (http://llvm.org/PR25107). This causes trouble in case we want to know the precise set of data-locations read e.g. for kernel outlining or other things (even though it currently works with our OpenMP code generation). I think the right solution is to not add the InvariantLoads to GlobalMap, but to instead model them as normal read/write memory accesses (which are code-generated accordingly).

As this is a little bit more involved and probably also requires some discussion, I would prefer to discuss and address this issue independently. Regarding this patch, I could just leave out the change above and add a FIXME saying that we currently generate redundant memory accesses as we
directly read data from the invariant-load-motioned register.






================
Comment at: lib/CodeGen/BlockGenerators.cpp:1222
@@ -1262,2 +1221,3 @@
 
-    Val = getNewScalarValue(Val, R, Stmt, LTS, *LocalBBMap);
+    Loop *L = getLoopForInst(ScalarInst);
+    Val = getNewValue(Stmt, Val, *LocalBBMap, LTS, L);
----------------
Meinersbur wrote:
> This is wrong; we don't need the loop where the value is defined but where it is used. The difference comes in to play when the value is defined in the loop, but we write its scalar value after the loop, thus use the value after the last loop iteration.
Is this not what is happening? ScalarInst comes from getAccessInstruction(), which is defined to return:

```
  /// For memory accesses of kind MK_Value the access instruction of a load      
  /// access is the instruction that uses the load. The access instruction of    
  /// a write access is the instruction that defines the llvm::Value.
```

Now, the original code was slightly different. It was using: getLoopForInst(ScalarValueInst) where ScalarValueInst was the definition of the ScalarValueInst.

So it seems the behavior was changed (maybe needs a test case?), but the result should be more correct? In case I got it wrong, could you suggest what value to use otherwise.

(In case we drop the MA->getAccessInstruction() mapping for scalar values, we can probably just use the loop around the entry block.)

================
Comment at: test/Isl/CodeGen/invariant_load_scalar_escape_alloca_sharing.ll:14
@@ -13,3 +13,3 @@
 ; CHECK:       %ncol.load = load i32, i32* @ncol
 ; CHECK-NEXT:  store i32 %ncol.load, i32* %tmp0.preload.s2a
 ;
----------------
Meinersbur wrote:
> This store became useless as well, didn't it?
Right. It is also redundant.


http://reviews.llvm.org/D16522