[llvm-commits] [PATCH] BasicBlock Autovectorization Pass
Hal Finkel
hfinkel at anl.gov
Thu Nov 17 11:30:35 PST 2011
On Thu, 2011-11-17 at 11:11 -0800, Tobias Grosser wrote:
> On 11/17/2011 10:01 AM, Hal Finkel wrote:
> > On Thu, 2011-11-17 at 13:57 +0100, Tobias Grosser wrote:
> >> On 11/17/2011 12:38 AM, Hal Finkel wrote:
> >>> Tobias, et al.,
> >>>
> >>> Attached is the my autovectorization pass.
> >>
> >> Very nice. Will you be at the developer summit? Maybe we could discuss
> >> the integration there?
> >
> > Unfortunately, not this year.
> >
> >>
> >> Here a first review of the source code.
> >
> > Thanks! This is great! There are a few inline comments; but for the most
> > part I'll makes the changes as suggested.
>
> >>> + if (const SCEVConstant *ConstOffSCEV =
> >>> + dyn_cast<SCEVConstant>(RelOffSCEV)) {
> >>> + ConstantInt *IntOff = ConstOffSCEV->getValue();
> >>> + int64_t off = IntOff->getSExtValue();
> >>> +
> >>> + Type *VTy = cast<PointerType>(Iptr->getType())->getElementType();
> >>
> >> You should assert that the types of both pointers are indentical.
> >
> > This is done implicitly in all calling code. This, however, should be
> > documented as a precondition of the function.
>
> Yes. And just put an assert in the function that checks this. This does
> not cost a lot and may save someone who did not read the comment.
Good suggestion.
>
> >> Also you should document that this function only works for vector types.
> >> I actually expected it to just work for scalar types. Vector types look
> >> a little bit wired here and I am actually not even sure if it is correct
> >> for vector types.
> >
> > What do you mean by working only for vector types? I don't think that's
> > true. As I recall, it seemed to work find for both loads and stores
> > using both scalar pointers and vector pointers.
>
> Oh sorry. I thought getElementType() returns the element of a vector.
> However, it returns the type pointed to by the pointer.
>
> >>> + int64_t VTy_tss = (int64_t) TD.getTypeStoreSize(VTy);
> >>> +
> >>> + if (off == VTy_tss) {
> >>> + return 1;
> >>> + } else if (-off == VTy_tss) {
> >>> + return -1;
> >>> + }
> >> Braces not needed.
> >>
> >>> + }
> >> Did you think of using SE.getSizeOfExpr()?
> >
> > Yes. ;)
> >
> > I did not for two reasons. First, if we have target data, which we need
> > for load/store vectorization anyway, then it is unnecessary because the
> > sizeof gets constant folded. Thus, that division is just an integer
> > division in the relevant cases anyway.
> True.
>
> >
> > Second, correct me if I'm wrong, but I would need a signed division, not
> > an unsigned division.
>
> At least I cannot convince myself that UDiv is always correct. Just
> leave it like this now.
>
> >>> + if (FastDep) {
> >>> + // Note: For this heuristic to be effective, independent operations
> >>> + // must tend to be intermixed. This is likely to be true from some
> >>> + // kinds of loop unrolling (but not the generic LLVM pass),
> >>> + // but otherwise may require some kind of reordering pass.
> >>
> >> What does it mean that independent operations must be intermixed? Can
> >> you give an example?
> >
> > It means that if you group variables into independent groups, say A and
> > B, such that things in B don't use results from A and vice-verse, and
> > then have: A B A B A B A B... then FastDep will be able to, if the
> > instruction are otherwise compatible, vectorize to make<A B> <A B>...
> > On the other hand, if you have A A A A B B B B, then you need a full
> > search and dependency analysis to determine that you can move some of
> > the Bs ahead of the preceding As in order to vectorize them.
>
> Basically this means FastDep would be sufficient, if you do grouped
> unrolling?
Yes, I believe this is correct.
-Hal
>
> > Will do, and thanks again. I'll send an updated patch soon.
>
> Great!
>
> Tobi
--
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
More information about the llvm-commits
mailing list