[LLVMdev] How to broaden the SLP vectorizer's search

Frank Winter fwinter at jlab.org
Fri Aug 8 12:16:49 PDT 2014


Hi Nadav,

Increasing the number of instructions per bucket indeed results in a
completely vectorized version of the given small function. For a
medium-sized function I had to increase the bucket size to 8192 to
achieve full vectorization.

If I then try this setup on one of my larger functions (containing one
huge basic block), the O(n^2) algorithm you were talking about hits me
hard: after 10 minutes inside the SLP vectorizer I killed the process
by hand.
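
Back-of-the-envelope: with buckets of size c the pairwise work is
roughly (n/c) * c^2 = n*c comparisons instead of n^2, so going from 16
to 8192 multiplies the per-store work by about 512, and no bucketing at
all makes it quadratic in the block size. A throwaway program (nothing
to do with the actual LLVM code, just the arithmetic) to get a feel for
the numbers:

#include <algorithm>
#include <cstdint>
#include <iostream>

// Comparisons done when n stores are processed in buckets of size c,
// with each bucket searched for consecutive pairs quadratically.
// c == n models "no bucketing at all".
static uint64_t pairwiseWork(uint64_t n, uint64_t c) {
  uint64_t work = 0;
  for (uint64_t i = 0; i < n; i += c) {
    uint64_t len = std::min(c, n - i);
    work += len * len;
  }
  return work;
}

int main() {
  const uint64_t n = 100000; // stores in one huge basic block
  std::cout << "bucket 16:    " << pairwiseWork(n, 16) << "\n";
  std::cout << "bucket 128:   " << pairwiseWork(n, 128) << "\n";
  std::cout << "bucket 8192:  " << pairwiseWork(n, 8192) << "\n";
  std::cout << "no bucketing: " << pairwiseWork(n, n) << "\n";
  return 0;
}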

I remember the documentation describing SLP as a quite versatile
vectorizer. Since I only need to vectorize the loads, stores, and
arithmetic of a single basic block and nothing else, are there any
parts of SLP that I could deactivate in order to reduce the time needed
for optimization?

Thanks,
Frank


On 08/08/2014 01:13 PM, Nadav Rotem wrote:
> Hi Frank,
>
> Thanks for working on this. Please look at vectorizeStoreChains. In 
> this function we process all of the stores in the function in buckets 
> of 16 elements, because finding chains of consecutive stores is 
> implemented using an O(n^2) algorithm. You can try increasing this 
> threshold to 128 and see if it helps.
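>
> Roughly (paraphrasing, so the exact names and details may not match
> the checkout you are on), the relevant piece of vectorizeStoreChains
> looks like this, and the 16 is the constant to bump:
>
>   // Process the stores in chunks of 16.
>   for (unsigned CI = 0, CE = Stores.size(); CI < CE; CI += 16) {
>     unsigned Len = std::min<unsigned>(CE - CI, 16);
>     Changed |= vectorizeStores(makeArrayRef(&Stores[CI], Len),
>                                -SLPCostThreshold, R);
>   }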
>
> I also agree with Renato and Chad that adding a flag to tell the 
> SLP-vectorizer to put more effort (compile time) into the problem is a 
> good idea.
>
> Thanks,
> Nadav
>
>
>
>> On Aug 8, 2014, at 8:27 AM, Frank Winter <fwinter at jlab.org> wrote:
>>
>> I changed the max recursion depth to 36, and then tried 1000 (from 
>> the original value of 12), but it did not improve SLP's optimization 
>> capabilities on my input function. For example, the attached function 
>> is (by design) perfectly vectorizable into 4-wide single-precision 
>> SIMD load/add/store. The SLP vectorizer just does nothing with it.
>>
>> I ran
>>
>> opt -default-data-layout="e-m:e-i64:64-f80:128-n8:16:32:64-S128" 
>> -basicaa -slp-vectorizer -S < mod_vec_p_vec.ll
>>
>> with RecursionMaxDepth = 12, 36, and 1000.
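>>
>> For reference, the constant I'm editing is the hard-coded limit near
>> the top of lib/Transforms/Vectorize/SLPVectorizer.cpp, which in my
>> checkout reads roughly:
>>
>>   // Caps the recursion depth used when building the vectorization tree.
>>   static const unsigned RecursionMaxDepth = 12;
>>
>> I simply changed the 12 in place for each experiment.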
>>
>> Thanks,
>> Frank
>>
>>
>> On 08/07/2014 12:57 PM, Renato Golin wrote:
>>> On 7 August 2014 17:33, Chad Rosier <mcrosier at codeaurora.org> wrote:
>>>> You might consider filing a bug (llvm.org/bugs) requesting a flag,
>>>> but I don't know if the code owners want to expose such a flag.
>>> I'm not sure exposing raw access to that limit is a good idea, as
>>> there are no guarantees that it'll stay the same. But a flag turning
>>> on some "aggressive" behaviour in SLP, which would then change that
>>> threshold (and maybe some others), would be a good flag to have, I
>>> think.
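>>>
>>> Sketching the idea (flag name and wiring made up on the spot,
>>> untested), it could be as simple as:
>>>
>>>   #include "llvm/Support/CommandLine.h"
>>>
>>>   static cl::opt<bool> SLPAggressive(
>>>       "slp-aggressive", cl::init(false), cl::Hidden,
>>>       cl::desc("Spend more compile time searching for SLP "
>>>                "vectorization opportunities"));
>>>
>>>   // ...and then pick the internal limits based on it, e.g.:
>>>   unsigned StoreChunkSize = SLPAggressive ? 128 : 16;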
>>>
>>> This could maybe be a way to deprecate the BB vectorizer faster than
>>> we otherwise would. But that would depend on all the missing BB
>>> vectorizer features being implemented in SLP.
>>>
>>> An item in bugzilla seems appropriate.
>>>
>>> cheers,
>>> --renato
>>
>>
>> <mod_vec_p_vec.ll>
>

