[llvm-commits] [PATCH] BasicBlock Autovectorization Pass
Tobias Grosser
tobias at grosser.es
Thu Nov 17 11:11:46 PST 2011
On 11/17/2011 10:01 AM, Hal Finkel wrote:
> On Thu, 2011-11-17 at 13:57 +0100, Tobias Grosser wrote:
>> On 11/17/2011 12:38 AM, Hal Finkel wrote:
>>> Tobias, et al.,
>>>
>>> Attached is my autovectorization pass.
>>
>> Very nice. Will you be at the developer summit? Maybe we could discuss
>> the integration there?
>
> Unfortunately, not this year.
>
>>
>> Here is a first review of the source code.
>
> Thanks! This is great! There are a few inline comments, but for the most
> part I'll make the changes as suggested.
>>> + if (const SCEVConstant *ConstOffSCEV =
>>> + dyn_cast<SCEVConstant>(RelOffSCEV)) {
>>> + ConstantInt *IntOff = ConstOffSCEV->getValue();
>>> + int64_t off = IntOff->getSExtValue();
>>> +
>>> + Type *VTy = cast<PointerType>(Iptr->getType())->getElementType();
>>
>> You should assert that the types of both pointers are identical.
>
> This is done implicitly in all calling code. This, however, should be
> documented as a precondition of the function.
Yes. And just put an assert in the function that checks this. This does
not cost a lot and may save someone who did not read the comment.
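Something like this at the top of the function (IPtr/JPtr below are only
placeholders for whatever the two pointer parameters are actually called in
your patch):

  // Make the documented precondition explicit; the check compiles away in
  // release builds and catches callers that did not read the comment.
  assert(IPtr->getType() == JPtr->getType() &&
         "Pointer operands must have identical types");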
>> Also you should document that this function only works for vector types.
>> I actually expected it to just work for scalar types. Vector types look
>> a little bit weird here, and I am actually not even sure if it is correct
>> for vector types.
>
> What do you mean by working only for vector types? I don't think that's
> true. As I recall, it seemed to work fine for both loads and stores
> using both scalar pointers and vector pointers.
Oh sorry. I thought getElementType() returns the element type of a vector.
However, it returns the type pointed to by the pointer.
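So, just to check my understanding, the line in question behaves like this
(concrete type filled in by me for illustration):

  // If Iptr has type i32*, then VTy is i32: the pointee type,
  // not the element type of some vector.
  Type *VTy = cast<PointerType>(Iptr->getType())->getElementType();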
>>> + int64_t VTy_tss = (int64_t) TD.getTypeStoreSize(VTy);
>>> +
>>> + if (off == VTy_tss) {
>>> + return 1;
>>> + } else if (-off == VTy_tss) {
>>> + return -1;
>>> + }
>> Braces not needed.
>>
>>> + }
>> Did you think of using SE.getSizeOfExpr()?
>
> Yes. ;)
>
> I did not for two reasons. First, if we have target data, which we need
> for load/store vectorization anyway, then it is unnecessary because the
> sizeof gets constant folded. Thus, that division is just an integer
> division in the relevant cases anyway.
True.
>
> Second, correct me if I'm wrong, but I would need a signed division, not
> an unsigned division.
At least I cannot convince myself that UDiv is always correct. Just
leave it like this for now.
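For reference, the variant I was thinking of would have looked roughly like
this (only a sketch, reusing the names from your patch):

  // Alternative: divide the relative-offset SCEV by sizeof(VTy).
  // ScalarEvolution only provides an unsigned division, so a negative
  // offset (accesses in decreasing address order) is exactly the case
  // I cannot convince myself is handled correctly.
  const SCEV *SizeSCEV = SE.getSizeOfExpr(VTy);
  const SCEV *Quotient = SE.getUDivExpr(RelOffSCEV, SizeSCEV);

With target data available, your explicit comparison against
getTypeStoreSize() is simpler anyway.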
>>> + if (FastDep) {
>>> + // Note: For this heuristic to be effective, independent operations
>>> + // must tend to be intermixed. This is likely to be true from some
>>> + // kinds of loop unrolling (but not the generic LLVM pass),
>>> + // but otherwise may require some kind of reordering pass.
>>
>> What does it mean that independent operations must be intermixed? Can
>> you give an example?
>
> It means that if you group variables into independent groups, say A and
> B, such that things in B don't use results from A and vice versa, and
> then have: A B A B A B A B... then FastDep will be able to, if the
> instructions are otherwise compatible, vectorize to make <A B> <A B>...
> On the other hand, if you have A A A A B B B B, then you need a full
> search and dependency analysis to determine that you can move some of
> the Bs ahead of the preceding As in order to vectorize them.
Basically this means FastDep would be sufficient if you do grouped
unrolling?
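To make sure I picture the ordering correctly, in C terms it would be
something like this (arrays and the unroll factor are made up by me):

  // Interleaved ("grouped") unroll: every statement of iteration i (A) is
  // immediately followed by its counterpart from iteration i+1 (B), so a
  // linear scan like FastDep can pair adjacent, independent instructions.
  void intermixed(float *x, float *y, const float *a, const float *b, int n) {
    for (int i = 0; i + 1 < n; i += 2) {
      x[i] = a[i] + b[i];   x[i + 1] = a[i + 1] + b[i + 1];
      y[i] = a[i] - b[i];   y[i + 1] = a[i + 1] - b[i + 1];
    }
  }

  // Generic unrolling concatenates whole iteration bodies (A A B B), so the
  // matching instruction is far away and a full search plus dependency
  // analysis is needed to move it next to its partner.
  void grouped(float *x, float *y, const float *a, const float *b, int n) {
    for (int i = 0; i + 1 < n; i += 2) {
      x[i] = a[i] + b[i];
      y[i] = a[i] - b[i];
      x[i + 1] = a[i + 1] + b[i + 1];
      y[i + 1] = a[i + 1] - b[i + 1];
    }
  }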
> Will do, and thanks again. I'll send an updated patch soon.
Great!
Tobi