[llvm-commits] [PATCH] BasicBlock Autovectorization Pass

Thu Nov 17 11:30:35 PST 2011

On Thu, 2011-11-17 at 11:11 -0800, Tobias Grosser wrote:
> On 11/17/2011 10:01 AM, Hal Finkel wrote:
> > On Thu, 2011-11-17 at 13:57 +0100, Tobias Grosser wrote:
> >> On 11/17/2011 12:38 AM, Hal Finkel wrote:
> >>> Tobias, et al.,
> >>>
> >>> Attached is the my autovectorization pass.
> >>
> >> Very nice. Will you be at the developer summit? Maybe we could discuss
> >> the integration there?
> >
> > Unfortunately, not this year.
> >
> >>
> >> Here a first review of the source code.
> >
> > Thanks! This is great! There are a few inline comments; but for the most
> > part I'll makes the changes as suggested.
> 
> >>> +      if (const SCEVConstant *ConstOffSCEV =
> >>> +            dyn_cast<SCEVConstant>(RelOffSCEV)) {
> >>> +        ConstantInt *IntOff = ConstOffSCEV->getValue();
> >>> +        int64_t off = IntOff->getSExtValue();
> >>> +
> >>> +        Type *VTy = cast<PointerType>(Iptr->getType())->getElementType();
> >>
> >> You should assert that the types of both pointers are indentical.
> >
> > This is done implicitly in all calling code. This, however, should be
> > documented as a precondition of the function.
> 
> Yes. And just put an assert in the function that checks this. This does 
> not cost a lot and may save someone who did not read the comment.

Good suggestion.

> 
> >> Also you should document that this function only works for vector types.
> >> I actually expected it to just work for scalar types. Vector types look
> >> a little bit wired here and I am actually not even sure if it is correct
> >> for vector types.
> >
> > What do you mean by working only for vector types? I don't think that's
> > true. As I recall, it seemed to work find for both loads and stores
> > using both scalar pointers and vector pointers.
> 
> Oh sorry. I thought getElementType() returns the element of a vector. 
> However, it returns the type pointed to by the pointer.
> 
> >>> +        int64_t VTy_tss = (int64_t) TD.getTypeStoreSize(VTy);
> >>> +
> >>> +        if (off == VTy_tss) {
> >>> +          return 1;
> >>> +        } else if (-off == VTy_tss) {
> >>> +          return -1;
> >>> +        }
> >> Braces not needed.
> >>
> >>> +      }
> >> Did you think of using SE.getSizeOfExpr()?
> >
> > Yes. ;)
> >
> > I did not for two reasons. First, if we have target data, which we need
> > for load/store vectorization anyway, then it is unnecessary because the
> > sizeof gets constant folded. Thus, that division is just an integer
> > division in the relevant cases anyway.
> True.
> 
> >
> > Second, correct me if I'm wrong, but I would need a signed division, not
> > an unsigned division.
> 
> At least I cannot convince myself that UDiv is always correct. Just 
> leave it like this now.
> 
> >>> +        if (FastDep) {
> >>> +          // Note: For this heuristic to be effective, independent operations
> >>> +          // must tend to be intermixed. This is likely to be true from some
> >>> +          // kinds of loop unrolling (but not the generic LLVM pass),
> >>> +          // but otherwise may require some kind of reordering pass.
> >>
> >> What does it mean that independent operations must be intermixed? Can
> >> you give an example?
> >
> > It means that if you group variables into independent groups, say A and
> > B, such that things in B don't use results from A and vice-verse, and
> > then have: A B A B A B A B... then FastDep will be able to, if the
> > instruction are otherwise compatible, vectorize to make<A B>  <A B>...
> > On the other hand, if you have A A A A B B B B, then you need a full
> > search and dependency analysis to determine that you can move some of
> > the Bs ahead of the preceding As in order to vectorize them.
> 
> Basically this means FastDep would be sufficient, if you do grouped 
> unrolling?

Yes, I believe this is correct.

 -Hal

> 
> > Will do, and thanks again. I'll send an updated patch soon.
> 
> Great!
> 
> Tobi

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory