[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

Hal Finkel hfinkel at anl.gov
Wed Nov 16 15:38:59 PST 2011


Tobias, et al.,

Attached is my autovectorization pass. I've fixed a bug that appears
when using -bb-vectorize-aligned-only, fixed some 80-col violations,
etc. At least on x86_64, all but a few test cases pass, and all of the
remaining failures look like instruction-selection bugs. For example:

MultiSource/Applications/ClamAV - fails to compile shared_sha256.c with
an error: error in backend: Cannot select: 0x4fbcb40: v2i64 =
X86ISD::MOVLPD 0x4149e00, 0x418d930 [ID=596]

SingleSource/Benchmarks/BenchmarkGame/n-body - crashes; incorrectly
generates an aligned vector load for an unaligned load (so passing
-bb-vectorize-aligned-only will make it work for now)
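
To make the n-body failure concrete, here is a minimal standalone sketch
(not code from the patch) of the kind of check an aligned-only mode
implies: two adjacent scalar loads may be fused into an aligned vector
load only when the alignment known for the first load is a multiple of
the fused vector's size. The ScalarLoad struct and the functions
areAdjacent and canFuseAsAlignedLoad are hypothetical names used only
for illustration; the actual pass works on LLVM IR, not on this model.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a scalar load: byte offset from a common base,
// element size in bytes, and the alignment (in bytes) known for its address.
struct ScalarLoad {
  int64_t OffsetBytes;
  uint32_t SizeBytes;
  uint32_t KnownAlign;
};

// True if Second reads the element immediately following First.
bool areAdjacent(const ScalarLoad &First, const ScalarLoad &Second) {
  return First.SizeBytes == Second.SizeBytes &&
         Second.OffsetBytes == First.OffsetBytes + First.SizeBytes;
}

// True if the pair may be fused into one *aligned* two-element vector load.
// The fused load starts at First's address, so First's known alignment
// must be a multiple of the vector size.
bool canFuseAsAlignedLoad(const ScalarLoad &First, const ScalarLoad &Second) {
  assert(First.SizeBytes > 0 && "element size must be nonzero");
  if (!areAdjacent(First, Second))
    return false;
  uint32_t VectorBytes = 2 * First.SizeBytes;
  return First.KnownAlign % VectorBytes == 0;
}
```

With 8-byte elements, a first load known to be only 8-byte aligned is
rejected by this check, which is the case where emitting an aligned
16-byte load would be wrong.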

 -Hal

On Tue, 2011-11-15 at 17:38 -0600, Hal Finkel wrote:
> Tobias,
> 
> I've attached the latest version of my autovectorization patch. I was
> able to add support for using the ScalarEvolution analysis for
> load/store pairing (thanks for your help!). This led to a modest
> performance increase and a modest compile-time increase. This version
> also has a cutoff as you suggested (although the default value is set
> high (4000 instructions between pairs) because setting it lower led to
> performance regressions in both run time and compile time).
> 
> At this point there seem to be no unexpected test-suite compile
> failures. Some of the tests, however, do seem to compute the wrong
> output. I've yet to determine whether this is due to bad instruction
> fusing or some error in a later stage. I'll try running the failing
> tests in the interpreter and/or on another platform.
> 
> Thanks again,
> Hal
> 
> On Sat, 2011-11-12 at 00:37 +0100, Tobias Grosser wrote:
> > On 11/12/2011 12:11 AM, Hal Finkel wrote:
> > > On Fri, 2011-11-11 at 23:55 +0100, Tobias Grosser wrote:
> > >> On 11/11/2011 11:36 PM, Hal Finkel wrote:
> > >>> On Thu, 2011-11-10 at 23:07 +0100, Tobias Grosser wrote:
> > >>>> On 11/08/2011 11:29 PM, Hal Finkel wrote:
> > >>>>
> > >>>> Talking about this, I looked again into ScalarEvolution.
> > >>>>
> > >>>> To analyze a load, you would do:
> > >>>>
> > >>>> LoadInst *Load = ...
> > >>>> Value *Pointer = Load->getPointerOperand();
> > >>>> const SCEV *PointerSCEV = SE->getSCEV(Pointer);
> > >>>> const SCEVUnknown *PointerBase =
> > >>>>     dyn_cast<SCEVUnknown>(SE->getPointerBase(PointerSCEV));
> > >>>>
> > >>>> if (!PointerBase) return 'Analysis failed'
> > >>>>
> > >>>> const Value *BaseValue = PointerBase->getValue();
> > >>>>
> > >>>> You get the offset between two load addresses with
> > >>>> SE->getMinusSCEV(). The size of an element is
> > >>>> SE->getSizeOfExpr().
> > >>>>
> > >>>
> > >>> The AliasAnalysis class has a set of interfaces that can be used
> > >>> to preserve the analysis even when some things are changed. Does
> > >>> ScalarEvolution have a similar capability?
> > >>
> > >> You can state that your pass preserves ScalarEvolution. In this
> > >> case all analysis results are by default preserved and it is your
> > >> job to invalidate the scalar evolution for the loops/values where
> > >> it needs to be recalculated.
> > >>
> > >> The relevant functions are
> > >>
> > >> ScalarEvolution::forgetValue(Value *)
> > >
> > > Since the vectorization pass is currently just a basic-block pass, I
> > > think that I should need only forgetValue, right? I suppose that I
> > > would call that on all of the values that are fused.
> > 
> > You call it on all the values/instructions that are removed and for
> > those where the result calculated has changed.
> > 
> > > Also, using getPointerBase to get the base pointer seems simple
> > > enough, but how should I use getMinusSCEV to get the offset.
> > 
> > This should give you the offset from the base pointer:
> > 
> > LoadInst *Load = ...
> > Value *Pointer = Load->getPointerOperand();
> > const SCEV *PointerSCEV = SE->getSCEV(Pointer);
> > const SCEVUnknown *PointerBase =
> >     dyn_cast<SCEVUnknown>(SE->getPointerBase(PointerSCEV));
> > 
> > if (!PointerBase)
> >    return 'Analysis failed'
> > 
> > const SCEV *OffsetFromBase = SE->getMinusSCEV(PointerSCEV, PointerBase);
> > 
> > > Should I call it on each load pointer and its base pointer, or
> > > between the two load pointers once I know that they share the same
> > > base.
> > That depends on what you want.
> > 
> > > And once I do that, how do I get the offset (if known). I see the
> > > get[Uns|S]ignedRange functions, but if there is a way to directly get
> > > a constant value, then that would be more straightforward.
> > 
> > I assume you want to know whether two load addresses are identical,
> > have stride one (an offset of +1), or are related in some more
> > complicated way.
> > 
> > What might work is the following (entirely untested):
> > 
> > LoadInst *LoadOne = ...
> > LoadInst *LoadTwo = ...
> > 
> > Value *PointerOne = LoadOne->getPointerOperand();
> > Value *PointerTwo = LoadTwo->getPointerOperand();
> > 
> > const SCEV *PointerOneSCEV = SE->getSCEV(PointerOne);
> > const SCEV *PointerTwoSCEV = SE->getSCEV(PointerTwo);
> > 
> > // If this is a trivial offset we get something like 1*sizeof(long)
> > const SCEV *Offset = SE->getMinusSCEV(PointerOneSCEV, PointerTwoSCEV);
> > 
> > // Now we divide it by the element size
> > Type *AllocTy = LoadOne->getType();
> > const SCEV *TypeOfSCEV = SE->getSizeOfExpr(AllocTy);
> > const SCEV *OffsetInElements = SE->getUDivExpr(Offset, TypeOfSCEV);
> > 
> > if (const SCEVConstant *IntOffsetSCEV
> > 	= dyn_cast<SCEVConstant>(OffsetInElements)) {
> >    ConstantInt *IntOffset = IntOffsetSCEV->getValue();
> >    return IntOffset;
> > } else {
> >    return "This seems to be a complicated offset";
> > }
> > 
> > Let me know if this or something similar worked.
> > Tobi
> > 
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
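
The offset computation discussed in the quoted thread can be modeled
without LLVM. The sketch below assumes both addresses have already been
reduced to a shared base plus a constant byte offset (roughly what a
constant getMinusSCEV result would represent) and reports the distance
in elements only when the byte difference is an exact multiple of the
element size. Address and offsetInElements are hypothetical names, and
the signed division here is an assumption that differs from the
unsigned getUDivExpr in the quoted code; it is chosen so that negative
offsets are handled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an analyzed address: an identifier for the
// underlying base pointer plus a constant byte offset from that base.
struct Address {
  int BaseId;
  int64_t OffsetBytes;
};

// Computes (One - Two) in units of ElemBytes. Returns false when the two
// addresses have different bases or when the byte difference is not a
// whole number of elements; Result is written only on success.
bool offsetInElements(const Address &One, const Address &Two,
                      int64_t ElemBytes, int64_t &Result) {
  assert(ElemBytes > 0 && "element size must be positive");
  if (One.BaseId != Two.BaseId)
    return false;
  int64_t DiffBytes = One.OffsetBytes - Two.OffsetBytes;
  if (DiffBytes % ElemBytes != 0)
    return false;
  Result = DiffBytes / ElemBytes;
  return true;
}
```

A result of +1 or -1 corresponds to the stride-one case that makes two
loads candidates for pairing; a failure corresponds to the "complicated
offset" branch in the quoted code.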

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm_bb_vectorize-20111116-2.diff
Type: text/x-patch
Size: 84103 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20111116/3e048403/attachment.bin>

