[PATCH] Allow BB duplication threshold to be adjusted through JumpThreading's ctor

Mon Sep 29 21:10:38 PDT 2014

On Mon, 29 Sep, 2014 at 5:10 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
>>  From: "Michael Liao" <michael.liao at intel.com>
>>  To: "michael liao" <michael.liao at intel.com>, nrotem at apple.com, hfinkel at anl.gov
>>  Cc: spatel at rotateright.com, llvm-commits at cs.uiuc.edu
>>  Sent: Monday, September 29, 2014 6:34:36 PM
>>  Subject: Re: [PATCH] Allow BB duplication threshold to be adjusted through JumpThreading's ctor
>>  
>>  Hi Hal
>>  
>>  Yeah, "noduplicate" could prevent duplication of barrier calls, but
>>  this patch wants to address a potential issue on processors with
>>  divergent control flow, commonly found in GPUs, e.g. AMD/NVIDIA ones.
>>  The scenario is that, if a BB is duplicated to exploit more jump
>>  threading, targets with divergent CF may execute more instructions
>>  if the condition is a divergent one.
>>  
>>  As for updating that threshold from TTI: yes, if we are interested in
>>  that case, I could come up with another patch that considers both TTI
>>  and a user-specified threshold.
> 
> I suppose that I don't understand what you mean by "if we are 
> interested." Generally speaking, ctor parameters are useful only for 
> clients who are not using the standard optimization pipeline, and 
> we'd like the standard optimization pipeline to generally work well 
> for a wide range of targets. Thus, a TTI interface is preferred.

OK, I will add another patch with TTI support.
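
As a rough sketch of how the two sources could be combined (an explicit ctor argument taking precedence over a target-provided value, which in turn takes precedence over a built-in default), something like the following might work. All names here are hypothetical: `TargetInfoSketch` is only a stand-in for a target hook and is not an actual TTI method, and the default value of 6 is assumed for illustration.

```cpp
// Hypothetical stand-in for a target query; NOT the real TTI interface.
struct TargetInfoSketch {
  // Target-preferred BB duplication threshold, or -1 for "no preference".
  int PreferredBBDupThreshold = -1;
};

// Assumed built-in default, for illustration only.
constexpr unsigned DefaultBBDupThreshold = 6;

// Resolve the effective threshold:
//   1. an explicit ctor argument (>= 0) wins,
//   2. otherwise a target preference (>= 0) is used,
//   3. otherwise fall back to the built-in default.
inline unsigned resolveBBDupThreshold(int CtorThreshold,
                                      const TargetInfoSketch &TI) {
  if (CtorThreshold >= 0)
    return static_cast<unsigned>(CtorThreshold);
  if (TI.PreferredBBDupThreshold >= 0)
    return static_cast<unsigned>(TI.PreferredBBDupThreshold);
  return DefaultBBDupThreshold;
}
```

With this ordering, clients outside the standard pipeline can still force a value through the ctor, while the standard pipeline passes -1 and picks up whatever the target reports.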

> 
> 
> From a cost modeling perspective, how can you tell whether the 
> instruction duplication will be worthwhile? Can this be something 
> like 2*(instruction costs) <= (branch cost)?

To be honest, I have no concrete answer, as the instruction cost may 
change significantly after merging two BBs, which is not fully 
considered in the current cost model. For example, inst-fold may kick 
in after duplicating that BB and fold all of its instructions. A better 
place to address this is probably a similar pass in the backend with a 
detailed target model. So far, this patch only allows coarse control of 
that threshold.
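
For reference, the inequality suggested above could be written as the small predicate below. This is only a sketch under stated assumptions: the function name and the abstract cost units are hypothetical, and it is not part of the patch. It also has the limitation described above, since it compares costs measured before duplication and cannot see any folding that happens afterwards.

```cpp
// Hypothetical helper, not part of the patch or of JumpThreading.
// Costs are in abstract units supplied by some cost model.
//
// Duplicating a block means its instructions exist twice, so the
// heuristic asks whether two copies of the block still cost no more
// than the branch that threading would remove.
inline bool isDuplicationWorthwhile(unsigned BlockInstrCost,
                                    unsigned BranchCost) {
  // Widen before multiplying so a large cost cannot overflow.
  return 2ull * BlockInstrCost <= BranchCost;
}
```

Because the estimate is static, a block that would be completely folded away after duplication is still charged its full pre-duplication cost here.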

Yours
- Michael

> 
> 
> Thanks again,
> Hal
> 
>>  
>>  Yours
>>  - Michael
>>  
>>  http://reviews.llvm.org/D5444
>>  
>>  
>>  
> 
> -- 
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory