[PATCH] Bug 21233 - Catch more cases in SLP vectorizer
Sanjin Sijaric
ssijaric at codeaurora.org
Thu Oct 23 17:19:35 PDT 2014
Hi Arnold,
Thanks for looking into this.
The attached test case is reduced from an internal application, and it needs our internal post-increment pass to transform it into the IR that ends up needing more SCEV in SLP to get vectorized.
Ana mentioned that the post-increment pass is going to get upstreamed, which will then trigger this behavior. I'll come up with a small C test case in the meantime that will expose it.
Thanks,
Sanjin
-----Original Message-----
From: Arnold Schwaighofer [mailto:aschwaighofer at apple.com]
Sent: Wednesday, October 22, 2014 1:13 PM
To: Sanjin Sijaric
Cc: llvm commits
Subject: Re: [PATCH] Bug 21233 - Catch more cases in SLP vectorizer
Sanjin can you post the code that leads to this IR?
> On Oct 22, 2014, at 12:21 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
>
> SCEV-AA does not pre-compute it just uses SCEV once you query it. At this point you pay for the cost of computing SCEV of the two pointers.
>
> SCEV is a lazy analysis you pay per query.
>
> After your patch we will see more SCEV queries, Nadav at one point actively tried to reduce the number of SCEV queries (hence the extra code in isConsecutive).
>
>
> bool BoUpSLP::isConsecutiveAccess(Value *A, Value *B) { Value *PtrA =
> getPointerOperand(A); Value *PtrB = getPointerOperand(B); unsigned
> ASA = getAddressSpaceOperand(A); unsigned ASB =
> getAddressSpaceOperand(B);
>
> // Check that the address spaces match and that the pointers are valid.
> if (!PtrA || !PtrB || (ASA != ASB))
> return false;
>
> // Make sure that A and B are different pointers of the same type.
> if (PtrA == PtrB || PtrA->getType() != PtrB->getType())
> return false;
>
> unsigned PtrBitWidth = DL->getPointerSizeInBits(ASA); Type *Ty =
> cast<PointerType>(PtrA->getType())->getElementType();
> APInt Size(PtrBitWidth, DL->getTypeStoreSize(Ty));
>
> APInt OffsetA(PtrBitWidth, 0), OffsetB(PtrBitWidth, 0); PtrA =
> PtrA->stripAndAccumulateInBoundsConstantOffsets(*DL, OffsetA); PtrB =
> PtrB->stripAndAccumulateInBoundsConstantOffsets(*DL, OffsetB);
>
> APInt OffsetDelta = OffsetB - OffsetA;
>
> // Check if they are based on the same pointer. That makes the
> offsets // sufficient.
> if (PtrA == PtrB)
> return OffsetDelta == Size;
>
> // Compute the necessary base pointer delta to have the necessary
> final delta // equal to the size.
> APInt BaseDelta = Size - OffsetDelta;
>
> // Otherwise compute the distance with SCEV between the base pointers.
> const SCEV *PtrSCEVA = SE->getSCEV(PtrA); const SCEV *PtrSCEVB =
> SE->getSCEV(PtrB); const SCEV *C = SE->getConstant(BaseDelta); const
> SCEV *X = SE->getAddExpr(PtrSCEVA, C); return X == PtrSCEVB; }
>
>
> The question is what is the overhead of using SCEV more aggressively?
>
>
>> On Oct 22, 2014, at 11:44 AM, Sanjin Sijaric <ssijaric at codeaurora.org> wrote:
>>
>> Hi Andy,
>>
>> I get the following with the additional pass:
>>
>> Unnamed pass: implement Pass::getPassName()
>> Combine redundant instructions
>> Unnamed pass: implement Pass::getPassName()
>> Scalar Evolution Analysis <-----
>> ScalarEvolution-based Alias Analysis <----
>> SLP Vectorizer <----
>> Unnamed pass: implement Pass::getPassName()
>> Simplify the CFG
>>
>> Without the pass:
>>
>> Combine redundant instructions
>> Unnamed pass: implement Pass::getPassName()
>> Scalar Evolution Analysis <------
>> SLP Vectorizer <-----
>> Unnamed pass: implement Pass::getPassName()
>> Simplify the CFG
>>
>> SCEV-AA preserves ScalarEvolution, so it won't get recomputed by the SLP vectorizer again. Scev-aa should only get used from within the SLP vectorizer via "alias(...)" function, so it shouldn't result in too much overhead.
>>
>> Thanks,
>> Sanjin
>>
>> -----Original Message-----
>> From: Andrew Trick [mailto:atrick at apple.com]
>> Sent: Wednesday, October 22, 2014 10:09 AM
>> To: Sanjin Sijaric
>> Cc: Hal Finkel; mcrosier at codeaurora.org; llvm commits; Arnold
>> Schwaighofer
>> Subject: Re: [PATCH] Bug 21233 - Catch more cases in SLP vectorizer
>>
>> Hi Sanjin,
>>
>> Can you look at -debug-pass=Structure before and after your pass? Are there any differences. My understanding is that SCEV already runs before SLP vectorizer and this patch only enables it’s use in alias analysis, but I may be wrong about that. I also would like to think that the SLP vectorizer will only ask for SCEV/AA for a subset of memory ops. Will this change cause SCEV to be computed for all addresses?
>>
>> What I’m getting at here is, could this be a compile time issue?
>>
>> -Andy
>>
>>> On Oct 22, 2014, at 8:49 AM, Sanjin Sijaric <ssijaric at codeaurora.org> wrote:
>>>
>>> Hi Chad and Hal,
>>>
>>> Thanks for catching this, Chad. It was not intentional. The new patch is attached.
>>>
>>> Sanjin
>>>
>>> -----Original Message-----
>>> From: Hal Finkel [mailto:hfinkel at anl.gov]
>>> Sent: Wednesday, October 22, 2014 8:08 AM
>>> To: mcrosier at codeaurora.org
>>> Cc: llvm-commits at cs.uiuc.edu; Sanjin Sijaric; Andrew Trick; Arnold
>>> Schwaighofer
>>> Subject: Re: [PATCH] Bug 21233 - Catch more cases in SLP vectorizer
>>>
>>> ----- Original Message -----
>>>> From: "Chad Rosier" <mcrosier at codeaurora.org>
>>>> To: "Sanjin Sijaric" <ssijaric at codeaurora.org>
>>>> Cc: llvm-commits at cs.uiuc.edu
>>>> Sent: Wednesday, October 22, 2014 9:57:36 AM
>>>> Subject: Re: [PATCH] Bug 21233 - Catch more cases in SLP vectorizer
>>>>
>>>> Hi Sanjin,
>>>>
>>>>> The attached patch addresses bug #21233. I didn't see a way to
>>>>> vectorize the attached test case without using SCEV-AA along with
>>>>> a small change in SLP vectorizer to retrieve the base pointer
>>>>> using SCEV analysis.
>>>>
>>>> + if (UseSCEVAAForSLP)
>>>> + MPM.add(createScalarEvolutionAliasAnalysisPass());
>>>> + MPM.add(createSLPVectorizerPass()); // Vectorize parallel
>>>> scalar
>>>> chains.
>>>>
>>>> This looks to be add a call to the SLP vectorizer pass. Is this
>>>> change indented? I imagine you just want to add the SCEV-AA pass
>>>> before the existing SLP vectorizer pass a few lines above this change.
>>>>
>>>>> Can someone please have a look? Is it safe to use SCEV-AA, or
>>>>> should I pursue a different approach?
>>>>
>>>> I'm not the right person to comment on the use of SCEV-AA in the
>>>> SLP vectorizer. Perhaps, Arnold, Hal, or someone else could
>>>> provide their opinion.
>>>
>>> We'd obviously not been using SCEV-AA because the current pass manager does not allow it to be preserved. As a result, I'm not sure how well tested it is (it is also pretty simple, so I'm not super concerned, only somewhat concerned). My thought has been that, once we have the new pass manager, we can do a cost-benefit analysis on using SCEV-AA more generally (especially since doing so might allow us to strip down BasicAA somewhat). I'm not opposed to gaining some more experience with SCEV-AA by using it in a targeted place like this.
>>>
>>> Andy, do you have an opinion on this? Arnold?
>>>
>>> -Hal
>>>
>>>>
>>>> Chad
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>
>>>
>>> --
>>> Hal Finkel
>>> Assistant Computational Scientist
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>> <PATCH>
>>
>>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
More information about the llvm-commits
mailing list