[LLVMdev] Trip count and Loop Vectorizer

Arnold Schwaighofer aschwaighofer at apple.com
Fri Sep 27 12:56:20 PDT 2013


We have a framework for simple heuristics in place:

1.) We vectorize loops with unknown trip count.
2.) You can find our partial unroll heuristic in selectUnrollFactor. It probably needs tuning.

  // We unroll the loop in order to expose ILP and reduce the loop overhead.
  // There are many micro-architectural considerations that we can't predict
  // at this level. For example frontend pressure (on decode or fetch) due to
  // code size, or the number and capabilities of the execution ports.
  //
  // We use the following heuristics to select the unroll factor:
  // 1. If the code has reductions then we unroll in order to break the cross
  // iteration dependency.
  // 2. If the loop is really small then we unroll in order to reduce the loop
  // overhead. 

   <<< This is the heuristic that works against your example.
    
  // 3. We don't unroll if we think that we will spill registers to memory due
  // to the increased register pressure.
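To make the interaction of the three rules concrete, here is a minimal sketch of how such a selection could be structured. It is not the code in selectUnrollFactor: the structs, field names, thresholds, and the function pickUnrollFactor are all hypothetical and exist only for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical summary of a loop; none of these names come from LLVM.
struct LoopSummary {
  unsigned Cost;        // estimated cost of one loop iteration
  bool HasReductions;   // loop carries a reduction (e.g. a running sum)
  unsigned MaxLiveRegs; // registers live at the same time in one iteration
};

// Hypothetical description of the target.
struct TargetSummary {
  unsigned NumRegs;       // registers available to the loop
  unsigned MaxUnroll;     // upper bound on the unroll factor
  unsigned SmallLoopCost; // loops cheaper than this count as "really small"
};

// Returns an unroll factor >= 1 following the three rules in the comment.
unsigned pickUnrollFactor(const LoopSummary &L, const TargetSummary &T) {
  assert(T.MaxUnroll >= 1 && "target must allow at least a factor of 1");

  // Rule 3: do not unroll so far that we expect to spill registers.
  unsigned ByRegs = L.MaxLiveRegs ? T.NumRegs / L.MaxLiveRegs : T.MaxUnroll;
  unsigned Limit = std::max(1u, std::min(ByRegs, T.MaxUnroll));

  // Rule 1: reductions benefit from unrolling because independent partial
  // results break the cross-iteration dependency.
  if (L.HasReductions)
    return Limit;

  // Rule 2: really small loops are unrolled to reduce loop overhead.
  // Note that the trip count is not consulted here at all.
  if (L.Cost < T.SmallLoopCost)
    return Limit;

  return 1;
}
```

In this sketch a cheap loop body is unrolled whether or not the loop runs many iterations, since rule 2 looks only at the per-iteration cost. That is the kind of behaviour the note on heuristic 2 points at; taking an inferred upper bound on the trip count into account would be an additional input to such a function.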




On Sep 27, 2013, at 2:21 PM, Murali, Sriram <sriram.murali at intel.com> wrote:

> Hey Arnold,
> I have run into this situation many times while benchmarking.
> I think it is best if this is addressed using a simple heuristic. For that, we need to identify the loop cost and decide if it makes sense to completely unroll the loop, or partially unroll. I am unsure of the optimal way to implement this though.
> 
> I want to run it by the list to get any ideas floating around :)
> Thanks
> Sriram
> 
> -----Original Message-----
> From: Arnold Schwaighofer [mailto:aschwaighofer at apple.com] 
> Sent: Friday, September 27, 2013 1:54 PM
> To: Murali, Sriram
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] Trip count and Loop Vectorizer
> 
> 
> On Sep 27, 2013, at 12:47 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> 
>> so you could infer that n must be smaller than 8 (because you know the range of the other dimension). The question is how often does such an example occur, where this is possible, to make such an effort justifiable?
> smaller than or equal to, of course ;)



