[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

Tue Nov 15 15:38:58 PST 2011

Tobias,

I've attached the latest version of my autovectorization patch. I was
able to add support for using the ScalarEvolution analysis for
load/store pairing (thanks for your help!). This led to a modest
performance increase and a modest compile-time increase. This version
also has a cutoff, as you suggested. The default value is set high
(4000 instructions between pairs) because setting it lower led to
regressions in both run time and compile time.
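
A minimal sketch of how a pair-search cutoff of this kind can work. This
is not the code from the attached patch: Access, findCandidatePairs and
MaxDistance are hypothetical names, and each memory access is modelled
as a (base, byte offset, size) triple rather than as LLVM IR.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a load/store in a basic block; not an LLVM type.
struct Access {
  int Base;            // id of the underlying base pointer
  std::int64_t Offset; // byte offset from that base
  std::int64_t Size;   // size in bytes of the accessed element
};

// Returns index pairs (I, J) with I < J whose accesses are adjacent in
// memory and at most MaxDistance positions apart in the block.
// Each access joins at most one pair.
std::vector<std::pair<std::size_t, std::size_t>>
findCandidatePairs(const std::vector<Access> &Block, std::size_t MaxDistance) {
  std::vector<std::pair<std::size_t, std::size_t>> Pairs;
  std::vector<bool> Used(Block.size(), false);
  for (std::size_t I = 0; I < Block.size(); ++I) {
    if (Used[I])
      continue;
    // The cutoff bounds the inner scan, which keeps the search from
    // growing quadratically with the size of the block.
    for (std::size_t J = I + 1; J < Block.size() && J - I <= MaxDistance; ++J) {
      if (Used[J])
        continue;
      const Access &A = Block[I], &B = Block[J];
      bool SameShape = A.Base == B.Base && A.Size == B.Size;
      std::int64_t Delta = B.Offset - A.Offset;
      if (SameShape && (Delta == A.Size || Delta == -A.Size)) {
        Pairs.emplace_back(I, J);
        Used[I] = Used[J] = true;
        break;
      }
    }
  }
  return Pairs;
}
```

A lower MaxDistance shortens the scan but can miss pairs that are
separated by unrelated instructions, which is one plausible reason a
low cutoff would cost run-time performance.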

At this point there seem to be no unexpected test-suite compile
failures. Some of the tests, however, do seem to compute the wrong
output. I've yet to determine whether this is due to bad instruction
fusing or some error in a later stage. I'll try running the failing
tests in the interpreter and/or on another platform.

Thanks again,
Hal

On Sat, 2011-11-12 at 00:37 +0100, Tobias Grosser wrote:
> On 11/12/2011 12:11 AM, Hal Finkel wrote:
> > On Fri, 2011-11-11 at 23:55 +0100, Tobias Grosser wrote:
> >> On 11/11/2011 11:36 PM, Hal Finkel wrote:
> >>> On Thu, 2011-11-10 at 23:07 +0100, Tobias Grosser wrote:
> >>>> On 11/08/2011 11:29 PM, Hal Finkel wrote:
> >>>>
> >>>> Talking about this, I looked again into ScalarEvolution.
> >>>>
> >>>> To analyze a load, you would do:
> >>>>
> >>>> LoadInst *Load = ...
> >>>> Value *Pointer = Load->getPointerOperand();
> >>>> const SCEV *PointerSCEV = SE->getSCEV(Pointer);
> >>>> const SCEVUnknown *PointerBase =
> >>>>   dyn_cast<SCEVUnknown>(SE->getPointerBase(PointerSCEV));
> >>>>
> >>>> if (!PointerBase)
> >>>>   return; // Analysis failed
> >>>>
> >>>> const Value *BaseValue = PointerBase->getValue();
> >>>>
> >>>> You get the offset between two load addresses with
> >>>> SE->getMinusSCEV(). The size of an element is
> >>>> SE->getSizeOfExpr().
> >>>>
> >>>
> >>> The AliasAnalysis class has a set of interfaces that can be used
> >>> to preserve the analysis even when some things are changed. Does
> >>> ScalarEvolution have a similar capability?
> >>
> >> You can state that your pass preserves ScalarEvolution. In this
> >> case all analysis results are by default preserved and it is your
> >> job to invalidate the scalar evolution for the loops/values where
> >> it needs to be recalculated.
> >>
> >> The relevant functions are
> >>
> >> ScalarEvolution::forgetValue(Value *)
> >
> > Since the vectorization pass is currently just a basic-block pass, I
> > think that I should need only forgetValue, right? I suppose that I
> > would call that on all of the values that are fused.
> 
> You call it on all the values/instructions that are removed and for
> those where the result calculated has changed.
> 
> > Also, using getPointerBase to get the base pointer seems simple
> > enough, but how should I use getMinusSCEV to get the offset?
> 
> This should give you the offset from the base pointer:
> 
> LoadInst *Load = ...
> Value *Pointer = Load->getPointerOperand();
> const SCEV *PointerSCEV = SE->getSCEV(Pointer);
> const SCEVUnknown *PointerBase =
>   dyn_cast<SCEVUnknown>(SE->getPointerBase(PointerSCEV));
> 
> if (!PointerBase)
>   return; // Analysis failed
> 
> // getMinusSCEV takes two SCEVs, so pass PointerSCEV, not the Value.
> const SCEV *OffsetFromBase = SE->getMinusSCEV(PointerSCEV, PointerBase);
> 
> > Should I call it on each load pointer and its base pointer, or
> > between the two load pointers once I know that they share the same
> > base?
> 
> That depends on what you want.
> 
> > And once I do that, how do I get the offset (if known)? I see the
> > get[Uns|S]ignedRange functions, but if there is a way to directly get
> > a constant value, then that would be more straightforward.
> 
> I assume you want to know whether two load addresses are identical,
> have stride one (an offset of +1), or differ in some more complicated way.
> 
> What might work is the following (entirely untested):
> 
> LoadInst *LoadOne = ...
> LoadInst *LoadTwo = ...
> 
> Value *PointerOne = LoadOne->getPointer();
> Value *PointerTwo = LoadTwo->getPointer();
> 
> const SCEV *PointerOneSCEV = SE->getSCEV(PointerOne);
> const SCEV *PointerTwoSCEV = SE->getSCEV(PointerTwo);
> 
> // If this is a trivial offset we get something like 1*sizeof(long)
> const SCEV *Offset = SE->getMinusSCEV(PointerOneSCEV, PointerTwoSCEV);
> 
> // Now we divide it by the element size. The type of a load is the
> // type of the loaded element.
> Type *AllocTy = LoadOne->getType();
> const SCEV *TypeOfSCEV = SE->getSizeOfExpr(AllocTy);
> const SCEV *OffsetInElements = SE->getUDivExpr(Offset, TypeOfSCEV);
> 
> if (const SCEVConstant *IntOffsetSCEV
> 	= dyn_cast<SCEVConstant>(OffsetInElements)) {
>    ConstantInt *IntOffset = IntOffsetSCEV->getValue();
>    return IntOffset;
> } else {
>    return nullptr; // This seems to be a complicated offset
> }
> 
> Let me know if this or something similar worked.
> Tobi
> 
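
The offset arithmetic discussed in the quoted message can be sketched
outside LLVM with plain integers. This is only an illustration under
stated assumptions: offsetInElements is a hypothetical helper, not an
LLVM API, and it assumes both byte offsets are already known constants
relative to the same base pointer. It uses signed division; since
getUDivExpr builds an unsigned division, a negative byte difference
(second load at the higher address) would presumably need separate
handling in the SCEV version.

```cpp
#include <cstdint>

// Hypothetical helper, not part of LLVM. Computes the distance between
// two accesses in units of ElementSize bytes.
// Returns false if ElementSize is not positive or the byte difference
// is not a whole number of elements; otherwise stores the signed
// element distance (One minus Two) in Result and returns true.
bool offsetInElements(std::int64_t ByteOffsetOne, std::int64_t ByteOffsetTwo,
                      std::int64_t ElementSize, std::int64_t &Result) {
  if (ElementSize <= 0)
    return false;
  std::int64_t Delta = ByteOffsetOne - ByteOffsetTwo;
  if (Delta % ElementSize != 0)
    return false; // accesses overlap partially or are misaligned
  Result = Delta / ElementSize;
  return true;
}
```

A result of +1 or -1 means the two accesses are adjacent, 0 means they
hit the same address, and anything else is a larger stride.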

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm_bb_vectorize-20111115-2.diff
Type: text/x-patch
Size: 81446 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20111115/7d99b311/attachment.bin>