[LLVMdev] How to broaden the SLP vectorizer's search
Frank Winter
fwinter at jlab.org
Fri Aug 8 12:16:49 PDT 2014
Hi Nadav,
increasing the number of instructions per bucket indeed results in a
completely vectorized version of the given small function. For a
medium-size function I had to increase the bucket size to 8192 to
achieve full vectorization.
If I then try this setup on one of my larger functions (containing one
huge basic block) it seems that the O(n^2) algorithm you were talking
about hits me hard here. After 10min in SLP I sent the termination
signal by hand.
I remember the documentation saying that SLP is a quite versatile
vectorizer. I was wondering if I only need to vectorize the
load/store/arithmetic a single basic block and nothing else, are there
any parts in SLP that I could deactivate in order to reduce the time
needed for optimization?
Thanks,
Frank
On 08/08/2014 01:13 PM, Nadav Rotem wrote:
> Hi Frank,
>
> Thanks for working on this. Please look at vectorizeStoreChains. In
> this function we process all of the stores in the function in buckets
> of 16 elements because constructing consecutive stores is implemented
> using an O(n^2) algorithm. You can try to increase this threshold to
> 128 and see if it helps.
>
> I also agree with Renato and Chad that adding a flag to tell the
> SLP-vectorizer to put more effort (compile time) into the problem is a
> good idea.
>
> Thanks,
> Nadav
>
>
>
>> On Aug 8, 2014, at 8:27 AM, Frank Winter <fwinter at jlab.org
>> <mailto:fwinter at jlab.org>> wrote:
>>
>> I changed the max. recursion depth to 36, and tried then 1000 (from
>> the original value of 12) and it did not improve SLP's optimization
>> capabilities on my input function. For example, the attached function
>> is (by design) perfectly vectorizable into 4-packed single precision
>> SIMD load/add/store. The SLP vectorizer just does nothing with it.
>>
>> I ran
>>
>> opt -default-data-layout="e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>> -basicaa -slp-vectorizer -S < mod_vec_p_vec.ll
>>
>> with RecursionMaxDepth = 12, 36, and 1000.
>>
>> Thanks,
>> Frank
>>
>>
>> On 08/07/2014 12:57 PM, Renato Golin wrote:
>>> On 7 August 2014 17:33, Chad Rosier <mcrosier at codeaurora.org
>>> <mailto:mcrosier at codeaurora.org>> wrote:
>>>> You might consider filing a bug (llvm.org/bugs
>>>> <http://llvm.org/bugs>) requesting a flag, but I
>>>> don't know if the code owners want to expose such a flag.
>>> I'm not sure that's a good idea as a raw access to that limit, as
>>> there are no guarantees that it'll stay the same. But maybe a flag
>>> turning some "aggressive" behaviour from SLP that would then change
>>> that threshold (and maybe some others) would be a good flag to have, I
>>> think.
>>>
>>> This could maybe be a way to deprecate the BB vectorizer faster than
>>> we would otherwise. But that would depend on all missing BB features
>>> to be implemented in SLP.
>>>
>>> An item in bugzilla seems appropriate.
>>>
>>> cheers,
>>> --renato
>>
>>
>> <mod_vec_p_vec.ll>_______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu
>> <mailto:LLVMdev at cs.uiuc.edu>http://llvm.cs.uiuc.edu
>> <http://llvm.cs.uiuc.edu/>
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140808/33765da4/attachment.html>
More information about the llvm-dev
mailing list