<div dir="ltr">Hi all,<div><br></div><div style>One of the reasons the Livermore Loops couldn't be vectorized is that they use global structures to hold the arrays. Today, I'm investigating why that is so and how to fix it.</div>
<div style><br></div><div style>My investigation brought me to LoopVectorizationLegality::canVectorizeMemory():</div><div style><br></div><div style><div> if (WriteObjects.count(*it)) {</div><div> DEBUG(dbgs() << "LV: Found a possible read/write reorder:"</div>
<div> << **it <<"\n");</div><div> return false;</div><div> }</div><div><br></div><div style>In the first pass, it registers the underlying objects of all writes; then it does the same for reads, and if a read's underlying object was already registered as written, it's treated as a conflict.</div>
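<div style><br></div><div style>A minimal sketch of that two-pass check, written as a toy model rather than the actual LLVM code. The Access struct and canVectorizeMemoryToy() are hypothetical names made up for illustration; each access is reduced to the name of its underlying object.</div>

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a memory access: only the name of its
// underlying object and whether it is a write are tracked.
struct Access {
    std::string underlyingObject;
    bool isWrite;
};

// Toy version of the two-pass idea: first collect every object that is
// written, then reject the loop if any read touches one of them.
bool canVectorizeMemoryToy(const std::vector<Access> &accesses) {
    std::set<std::string> writeObjects;
    for (const Access &a : accesses)
        if (a.isWrite)
            writeObjects.insert(a.underlyingObject);

    for (const Access &a : accesses)
        if (!a.isWrite && writeObjects.count(a.underlyingObject))
            return false; // possible read/write reorder
    return true;
}
```

<div style>With two reads and one write that all resolve to "Foo", this toy check returns false; with three distinct object names it returns true.</div>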
<div style><br></div><div style>However, the reads are from Foo.bl / Foo.cl and the write is to Foo.al, so why is GetUnderlyingObjects() returning the same objects/pointers?</div><div style><br></div><div style>A quick look revealed the problem:</div>
<div style><br></div><div style>llvm::GetUnderlyingObject(Value *V, const DataLayout *TD, unsigned MaxLookup) yields:<br></div><div style><br></div><div style>-> GEPOperator *GEP = dyn_cast<GEPOperator>(V)<br></div>
<div style>-> V = GEP->getPointerOperand();</div><div style>-> GlobalAlias *GA = dyn_cast<GlobalAlias>(V)<br></div><div style>-> V = GA->getAliasee();</div><div style>return V;</div><div style><br></div>
<div style>In this case, V ends up referring to the whole structure (@Foo), not to the field being accessed. It seems to me that simply taking the GEP's pointer operand is too simplistic. Either GetUnderlyingObject() should keep track of the indices so it can return the correct object, or GetUnderlyingObjects() should add a special case for it (as it does with selects and phi nodes).</div>
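<div style><br></div><div style>To illustrate the difference, here is a toy model (not LLVM code) that compares a key which drops the indices with one that keeps the struct-field index. ToyGEP, stripToBase(), keepField() and hasReadWriteConflict() are all hypothetical names, and the model assumes the field index is a constant and the access stays inside that field.</div>

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a GEP into a global struct: a base object plus
// the constant index of the struct field being addressed.
struct ToyGEP {
    std::string base; // e.g. "Foo"
    unsigned field;   // 0 = al, 1 = bl, 2 = cl
    bool isWrite;
};

using ObjectKey = std::pair<std::string, unsigned>;

// Field-insensitive key: drop the indices, keep only the base pointer.
ObjectKey stripToBase(const ToyGEP &g) { return {g.base, 0u}; }

// Field-sensitive key: keep the constant struct-field index as well.
ObjectKey keepField(const ToyGEP &g) { return {g.base, g.field}; }

// Same write-then-read check as before, parameterised on the key.
template <typename KeyFn>
bool hasReadWriteConflict(const std::vector<ToyGEP> &accesses, KeyFn key) {
    std::set<ObjectKey> writes;
    for (const ToyGEP &g : accesses)
        if (g.isWrite)
            writes.insert(key(g));
    for (const ToyGEP &g : accesses)
        if (!g.isWrite && writes.count(key(g)))
            return true;
    return false;
}
```

<div style>For reads of fields 1 and 2 and a write to field 0 of "Foo", stripToBase reports a conflict while keepField does not. A real fix would also have to deal with non-constant indices and aliases, which this sketch ignores.</div>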
<div style><br></div><div style>Does that make sense?</div><div style><br></div><div style>cheers,</div><div style>--renato</div><div style><br></div><div style>PS:</div><div style><br></div><div style>A simplified version of the IR:</div>
<div style><br></div><div style><div>%struct.anon = type { [256 x i64], [256 x i64], [256 x i64] }</div><div><div><br></div><div>@Foo = common global %struct.anon zeroinitializer, align 8<br></div></div><div><br></div><div>
...</div><div><br></div><div><div><div> %arrayidx = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 1, i32 %idxprom</div><div> %0 = load i64* %arrayidx, align 8</div><div> %arrayidx2 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 2, i32 %idxprom</div>
<div> %1 = load i64* %arrayidx2, align 8</div><div> %mul = mul nsw i64 %1, %0</div><div> %arrayidx4 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 0, i32 %idxprom</div><div> store i64 %mul, i64* %arrayidx4, align 8</div>
</div></div><div><br></div></div></div></div>
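<div style><br></div><div style>For reference, a loop of roughly this shape would be expected to produce IR like the above. This is a reconstruction under assumptions, not the actual Livermore source: the field names come from the discussion (Foo.al, Foo.bl, Foo.cl), while kernel() and its parameter n are hypothetical.</div>

```cpp
#include <cstddef>
#include <cstdint>

// Assumed layout: three 256-element 64-bit arrays in one global struct,
// matching %struct.anon = { [256 x i64], [256 x i64], [256 x i64] }.
struct {
    std::int64_t al[256];
    std::int64_t bl[256];
    std::int64_t cl[256];
} Foo;

// Hypothetical kernel: reads fields 1 and 2, writes field 0.
// The caller must keep n <= 256.
void kernel(std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        Foo.al[i] = Foo.bl[i] * Foo.cl[i];
}
```

<div style>The three arrays never overlap, which is why treating every access as touching "Foo" as a whole is overly conservative here.</div>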