[LLVMdev] LLVM Loop Vectorizer

Sat Oct 6 10:45:05 PDT 2012

I'd like to see any LLVM IR pass that can potentially use target info be split into two parts. There is the main transformation and target-neutral analysis portion, which should not access the target library. Then there is a separate analysis portion that interfaces with target info. The target-interfacing portions of all the transformation modules should live in a separate directory, i.e. a separate TargetAnalysis abstraction layer.
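
A minimal sketch of what such a split might look like, in C++. Every name here (TargetVectorInfo, NoVectorTarget, Simd128Target, chooseVectorizationFactor) is hypothetical and made up for illustration; none of it is an existing LLVM interface. The target-neutral side only sees an abstract query interface, and the implementations of that interface are the part that would live in the separate directory.

```cpp
#include <cassert>

// Hypothetical target-facing interface. Under the proposed split, concrete
// implementations of this would live in the separate TargetAnalysis layer.
struct TargetVectorInfo {
  virtual ~TargetVectorInfo() = default;
  // Width in bits of the widest vector register; 0 means no vector unit.
  virtual unsigned vectorRegisterBits() const = 0;
};

// Conservative default for when nothing is known about the target.
struct NoVectorTarget final : TargetVectorInfo {
  unsigned vectorRegisterBits() const override { return 0; }
};

// Stand-in for a target with 128-bit vector registers (assumed values).
struct Simd128Target final : TargetVectorInfo {
  unsigned vectorRegisterBits() const override { return 128; }
};

// Target-neutral side: picks a vectorization factor using only the abstract
// interface, so it never touches a target library directly.
unsigned chooseVectorizationFactor(const TargetVectorInfo &TI,
                                   unsigned elementBits) {
  assert(elementBits > 0 && "element width must be non-zero");
  unsigned lanes = TI.vectorRegisterBits() / elementBits;
  return lanes > 1 ? lanes : 1; // 1 means "leave the loop scalar"
}
```

With this shape, a transformation could be built and tested against NoVectorTarget alone, and a real target would plug in by supplying its own implementation of the interface.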

Evan

On Oct 5, 2012, at 2:52 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> 
> 
> ----- Original Message -----
>> From: "Andrew Trick" <atrick at apple.com>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: "Nadav Rotem" <nrotem at apple.com>, "llvmdev at cs.uiuc.edu Mailing List" <llvmdev at cs.uiuc.edu>
>> Sent: Friday, October 5, 2012 4:27:11 PM
>> Subject: Re: [LLVMdev] LLVM Loop Vectorizer
>> 
>> 
>> On Oct 5, 2012, at 1:47 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>> I don't really understand where you want to draw the line. Should
>>> the inliner get target-specific input?
>> 
>> Inlining always does a canonical transformation. It can take whatever
>> target data is available at its level for heuristics, but that
>> doesn't make it a target lowering pass.
> 
> Agreed. I've recorded some additional thoughts below.
> 
>> 
>> Similarly, full unrolling is a canonical transformation that may use
>> target-specific heuristics. Contrast that with partial unrolling or
>> vectorization, which are anti-canonical transformations.
>> 
>>> InstCombine?
>> 
>> I think there is too much temptation currently to use the canonical
>> InstCombine pass to facilitate instruction selection. It should only
>> facilitate downstream IR analysis and simplification.
>> 
>> I see no problem conceptually running an anti-canonical InstCombine
>> as part of codegen that makes use of target hooks. I think this
>> would make ISEL problems easier to deal with, and will eventually be
>> necessary anyway to clean up after other target lowering passes.
>> Obviously not a perfect solution, but better than doing everything
>> in one CodeGenPrepare pass.
> 
> 1. We should not have code to canonicalize target-specific intrinsics inside InstCombine. These should be handled somehow via callbacks into the targets.
> 2. InstCombine currently makes decisions regarding canonical forms that it shouldn't. For example, it currently does not form shuffle masks that don't already appear, because of a concern over increasing register pressure. There should be target-specific input into these decisions, because on some targets some shuffle masks have a very low cost regardless of whether they already appear.
> 
>> 
>>> How about Polly? I think that the answer to all of these questions
>>> is probably, at some level, yes.
>> 
>> There is always the option of splitting a loop optimization problem
>> into an early, canonical run to aid analysis, followed by a late
>> target lowering run to optimize codegen. That's only a problem when
>> the canonicalization can badly pessimize the code in a way that
>> loses information or is hard to recover from.
> 
> For something like Polly to do a good job, as I understand it, it really should have access to some target-specific data. This is needed for vectorization, and also for understanding the memory hierarchy (I know we don't have this now, but I think we will at some point, specifically because of this use case).
> 
> -Hal
> 
>> 
>> -Andy
> 
> -- 
> Hal Finkel
> Postdoctoral Appointee
> Leadership Computing Facility
> Argonne National Laboratory