[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

Fri Dec 30 01:09:39 PST 2011

On 12/29/2011 06:32 PM, Hal Finkel wrote:
> On Thu, 2011-12-29 at 15:00 +0100, Tobias Grosser wrote:
>> On 12/14/2011 01:25 AM, Hal Finkel wrote:
>> One thing that I would still like to have is a test case where
>> bb-vectorize-search-limit is needed to avoid exponential compile time
>> growth and another test case that is not optimized, if
>> bb-vectorize-search-limit is set to a value less than 4000. I think
>> those cases are very valuable to understand the performance behavior of
>> this code.
>
> Good idea, I'll add these test cases.
>
>> Especially, as I am not yet sure why we need a value as high
>> as 4000.
>
> I am not exactly sure why that turned out to be the best number, but
> I'll try this again in combination with my load/store reordering patch
> and see if such a large value still seems best.

The reason I am surprised by this value is that I believe partial loop
unrolling would not yield basic blocks of this size; code size limits
should prevent that. However, loop unrolling seems to be the major
reason why two accesses to adjacent memory may be placed far apart.
Without loop unrolling, at a distance of 4000 instructions the fact that
two instructions access adjacent memory locations seems to be completely
random, and the probability that the following instructions perform the
same calculations seems low. Also, I believe that at 4000 the compile
time should already be significantly higher.
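
To make the concern concrete, here is a toy sketch of a windowed search
for adjacent memory accesses. This is not the code from the patch: the
function name, the model of a basic block as a plain list of byte
addresses, and the fixed access width are all assumptions made for
illustration. It only shows how the search limit bounds both the pairs
that can be found and the number of comparisons performed.

```python
def candidate_pairs(addresses, search_limit, width=4):
    """Toy model (hypothetical, not the pass's real algorithm).

    addresses: byte address touched by each instruction, in program order.
    search_limit: how many following instructions to inspect per instruction.
    width: assumed access size; two accesses are "adjacent" if they differ
           by exactly this many bytes.
    Returns (pairs, comparisons).
    """
    pairs = []
    comparisons = 0
    for i, addr in enumerate(addresses):
        # Look at most search_limit instructions ahead of instruction i.
        last = min(i + 1 + search_limit, len(addresses))
        for j in range(i + 1, last):
            comparisons += 1
            if addresses[j] - addr == width:
                pairs.append((i, j))
    return pairs, comparisons


# Two adjacent accesses (addresses 0 and 4) separated by three unrelated ones.
block = [0, 100, 200, 300, 4]
pairs_small, cmp_small = candidate_pairs(block, search_limit=2)
pairs_large, cmp_large = candidate_pairs(block, search_limit=4)
```

In this toy block the pair is only found once the limit covers the gap
between the two accesses, and the larger limit costs more comparisons.
What I would like to see is a real input where that gap is anywhere
near 4000.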

Since it seems my intuition is wrong, I am very eager to see and
understand an example where a search limit of 4000 is really needed.

Cheers
Tobi