[llvm-commits] [PATCH] BasicBlock Autovectorization Pass

Hal Finkel hfinkel at anl.gov
Wed Nov 16 15:50:23 PST 2011


Please review this patch (attached to previous e-mail) and let me know
if there is any objection to me committing it at this point.

Thanks again,
Hal

On Wed, 2011-11-16 at 17:39 -0600, Hal Finkel wrote:
> Tobias, et al.,
> 
> Attached is my autovectorization pass. I've fixed a bug that appears
> when using -bb-vectorize-aligned-only, fixed some 80-col violations,
> etc. At least on x86_64, all but a few test cases pass, and all of
> the remaining failures look like instruction-selection bugs. For example:
> 
> MultiSource/Applications/ClamAV - fails to compile shared_sha256.c with
> an error: error in backend: Cannot select: 0x4fbcb40: v2i64 =
> X86ISD::MOVLPD 0x4149e00, 0x418d930 [ID=596]
> 
> SingleSource/Benchmarks/BenchmarkGame/n-body - crashes; incorrectly
> generates an aligned vector load for an unaligned load (so passing
> -bb-vectorize-aligned-only will make it work for now)
> 
>  -Hal
> 
> On Tue, 2011-11-15 at 17:38 -0600, Hal Finkel wrote:
> > Tobias,
> > 
> > I've attached the latest version of my autovectorization patch. I was
> > able to add support for using the ScalarEvolution analysis for
> > load/store pairing (thanks for your help!). This led to a modest
> > performance increase and a modest compile-time increase. This version
> > also has a cutoff, as you suggested, although the default value is
> > set high (4000 instructions between pairs) because setting it lower
> > led to performance regressions in both run time and compile time.
> > 
> > At this point there seem to be no unexpected test-suite compile
> > failures. Some of the tests, however, do seem to compute the wrong
> > output. I've yet to determine whether this is due to bad instruction
> > fusing or some error in a later stage. I'll try running the failing
> > tests in the interpreter and/or on another platform.
> > 
> > Thanks again,
> > Hal
> > 
> > On Sat, 2011-11-12 at 00:37 +0100, Tobias Grosser wrote:
> > > On 11/12/2011 12:11 AM, Hal Finkel wrote:
> > > > On Fri, 2011-11-11 at 23:55 +0100, Tobias Grosser wrote:
> > > >> On 11/11/2011 11:36 PM, Hal Finkel wrote:
> > > >>> On Thu, 2011-11-10 at 23:07 +0100, Tobias Grosser wrote:
> > > > >>>> On 11/08/2011 11:29 PM, Hal Finkel wrote:
> > > > >>>> Talking about this I looked again into ScalarEvolution.
> > > >>>>
> > > >>>> To analyze a load, you would do:
> > > >>>>
> > > > >>>> LoadInst *Load = ...
> > > > >>>> Value *Pointer = Load->getPointerOperand();
> > > > >>>> const SCEV *PointerSCEV = SE->getSCEV(Pointer);
> > > > >>>> const SCEVUnknown *PointerBase =
> > > > >>>>   dyn_cast<SCEVUnknown>(SE->getPointerBase(PointerSCEV));
> > > > >>>>
> > > > >>>> if (!PointerBase)
> > > > >>>>   return 'Analysis failed'
> > > > >>>>
> > > > >>>> const Value *BaseValue = PointerBase->getValue();
> > > >>>>
> > > >>>> You get the offset between two load addresses with
> > > >>>> SE->getMinusSCEV(). The size of an element is
> > > >>>> SE->getSizeOfExpr().
> > > >>>>
> > > >>>
> > > >>> The AliasAnalysis class has a set of interfaces that can be used
> > > >>> to preserve the analysis even when some things are changed. Does
> > > >>> ScalarEvolution have a similar capability?
> > > >>
> > > >> You can state that your pass preserves ScalarEvolution. In this
> > > >> case all analysis results are by default preserved and it is your
> > > >> job to invalidate the scalar evolution for the loops/values where
> > > >> it needs to be recalculated.
> > > >>
> > > >> The relevant functions are
> > > >>
> > > >> ScalarEvolution::forgetValue(Value *)
> > > >
> > > > Since the vectorization pass is currently just a basic-block pass, I
> > > > think that I should need only forgetValue, right? I suppose that I
> > > > would call that on all of the values that are fused.
> > > 
> > > You call it on all the values/instructions that are removed and for
> > > those where the result calculated has changed.
> > > 
> > > > Also, using getPointerBase to get the base pointer seems simple
> > > > > enough, but how should I use getMinusSCEV to get the offset?
> > > 
> > > This should give you the offset from the base pointer:
> > > 
> > > > LoadInst *Load = ...
> > > > Value *Pointer = Load->getPointerOperand();
> > > > const SCEV *PointerSCEV = SE->getSCEV(Pointer);
> > > > const SCEVUnknown *PointerBase =
> > > >    dyn_cast<SCEVUnknown>(SE->getPointerBase(PointerSCEV));
> > > > 
> > > > if (!PointerBase)
> > > >    return 'Analysis failed'
> > > > 
> > > > const SCEV *OffsetFromBase = SE->getMinusSCEV(PointerSCEV, PointerBase);
> > > 
> > > > > Should I call it on each load pointer and its base pointer, or
> > > > > between the two load pointers once I know that they share the same
> > > > > base?
> > > > That depends on what you want.
> > > 
> > > > > And once I do that, how do I get the offset (if known)? I see the
> > > > get[Uns|S]ignedRange functions, but if there is a way to directly get
> > > > a constant value, then that would be more straightforward.
> > > 
> > > > I assume you want to know if two load addresses are either identical,
> > > > have stride one (have an offset of +1) or some other more complicated stuff.
> > > 
> > > What might work is the following (entirely untested):
> > > 
> > > LoadInst *LoadOne = ...
> > > LoadInst *LoadTwo = ...
> > > 
> > > > Value *PointerOne = LoadOne->getPointerOperand();
> > > > Value *PointerTwo = LoadTwo->getPointerOperand();
> > > > 
> > > > const SCEV *PointerOneSCEV = SE->getSCEV(PointerOne);
> > > > const SCEV *PointerTwoSCEV = SE->getSCEV(PointerTwo);
> > > > 
> > > > // If this is a trivial offset we get something like 1*sizeof(long)
> > > > const SCEV *Offset = SE->getMinusSCEV(PointerOneSCEV, PointerTwoSCEV);
> > > > 
> > > > // Now we divide it by the element size (the type of the loaded value)
> > > > Type *AllocTy = LoadOne->getType();
> > > > const SCEV *TypeOfSCEV = SE->getSizeOfExpr(AllocTy);
> > > > const SCEV *OffsetInElements = SE->getUDivExpr(Offset, TypeOfSCEV);
> > > > 
> > > > if (const SCEVConstant *IntOffsetSCEV
> > > >       = dyn_cast<SCEVConstant>(OffsetInElements)) {
> > > >    ConstantInt *IntOffset = IntOffsetSCEV->getValue();
> > > >    return IntOffset;
> > > > } else {
> > > >    return "This seems to be a complicated offset";
> > > > }
> > > > 
> > > 
> > > Let me know if this or something similar worked.
> > > Tobi
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > 
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
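
The pairing test discussed in the quoted thread (same base pointer,
constant offset of exactly one element) can be illustrated without LLVM.
What follows is a minimal standalone C++ sketch, not code from the patch.
The Address struct and the functions offsetInElements and isConsecutive
are hypothetical stand-ins for the information that getPointerBase and
getMinusSCEV would provide. The sketch also uses signed division so that
a negative distance stays meaningful; that is an assumption of this
example, not what the quoted getUDivExpr call does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for what ScalarEvolution reports about an address:
// an opaque base object plus a constant byte offset from that base.
struct Address {
  int Base;            // identifies the underlying base pointer
  std::int64_t Offset; // byte offset from the base
};

// Computes (B - A) in units of ElemSize bytes and stores it in Result.
// Returns false when the distance is not a known whole number of elements.
bool offsetInElements(const Address &A, const Address &B,
                      std::int64_t ElemSize, std::int64_t &Result) {
  assert(ElemSize > 0 && "element size must be positive");
  if (A.Base != B.Base)
    return false; // different bases: no constant distance is known
  std::int64_t Bytes = B.Offset - A.Offset;
  if (Bytes % ElemSize != 0)
    return false; // not a whole number of elements apart
  Result = Bytes / ElemSize;
  return true;
}

// Two accesses are candidates for fusing into one vector access when the
// second one starts exactly one element after the first.
bool isConsecutive(const Address &First, const Address &Second,
                   std::int64_t ElemSize) {
  std::int64_t N = 0;
  return offsetInElements(First, Second, ElemSize, N) && N == 1;
}
```

For example, with 8-byte elements, two addresses at byte offsets 16 and
24 from the same base are consecutive in that order. In the reverse order
the element offset is -1, so isConsecutive rejects the pair.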

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory



