[polly] scop detection: could we stop iterating over uses?
Tobias Grosser
tobias at grosser.es
Tue Jun 4 22:54:24 PDT 2013
On 06/03/2013 09:22 AM, Sebastian Pop wrote:
> Hi,
>
> I see that we call hasScalarDependency over every instruction in a region, and
> that iterates over all the uses of the instruction. Here is the code of that
> function, with a nice "dirty hack" comment in it:
>
> bool ScopDetection::hasScalarDependency(Instruction &Inst,
> Region &RefRegion) const {
> for (Instruction::use_iterator UI = Inst.use_begin(), UE = Inst.use_end();
> UI != UE; ++UI)
> if (Instruction *Use = dyn_cast<Instruction>(*UI))
> if (!RefRegion.contains(Use->getParent())) {
> // DirtyHack 1: PHINode user outside the Scop is not allow, if this
> // PHINode is induction variable, the scalar to array transform may
> // break it and introduce a non-indvar PHINode, which is not allow in
> // Scop.
> // This can be fix by:
> // Introduce a IndependentBlockPrepare pass, which translate all
> // PHINodes not in Scop to array.
> // The IndependentBlockPrepare pass can also split the entry block of
> // the function to hold the alloca instruction created by scalar to
> // array. and split the exit block of the Scop so the new create load
> // instruction for escape users will not break other Scops.
> if (isa<PHINode>(Use))
> return true;
> }
>
> return false;
> }
>
> My question is how do we remove all this code: it is very expensive!
I just checked and there is no test case in the polly test cases that
would fail if we remove this function. However, I was able to create a
test case that is similar to the description in the comment.
From a first view I do not really see why PHI nodes need to be handled
in a special way. In fact, when removing this function as well as the
two asserts in the IndependentBlocks pass, the attached test case is
correctly code generated. I can see that there may be some problems with
invalidating later scops when translating out of SSA, but I do not see
how this is specific to PHI nodes. So from my point of view, I would
be OK with removing this code (and add the relevant test cases to ensure
we do the right thing) given such a patch passes the LLVM test suite
without regressions.
However, maybe ether has some more insights. I remember he was the one
adding this code in, but I don't remember the exact reasoning behind this.
Cheers,
Tobi
-------------- next part --------------
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128"
target triple = "x86_64-unknown-linux-gnu"
define void @f(i64* %A, i64 %N, i64 %M) nounwind {
entry:
fence seq_cst
br label %for.i
for.i:
%indvar = phi i64 [ 0, %entry ], [ %indvar.next, %for.i ]
%scevgep = getelementptr i64* %A, i64 %indvar
store i64 %indvar, i64* %scevgep
%indvar.next = add nsw i64 %indvar, 1
%exitcond = icmp eq i64 %indvar.next, %N
br i1 %exitcond, label %next, label %for.i
next:
fence seq_cst
br label %for.j
for.j:
%indvar.j = phi i64 [ %indvar, %next ], [ %indvar.j.next, %for.j ]
%scevgep.j = getelementptr i64* %A, i64 %indvar.j
store i64 %indvar.j, i64* %scevgep.j
fence seq_cst
%indvar.j.next = add nsw i64 %indvar.j, 1
%exitcond.j = icmp eq i64 %indvar.j.next, %M
br i1 %exitcond.j, label %return, label %for.j
return:
fence seq_cst
ret void
}
More information about the llvm-commits
mailing list