[PATCH] D12882: [SimplifyCFG] do not speculate fdiv by default; it's expensive (PR24818)

Thu Sep 17 08:20:32 PDT 2015

spatel added a comment.

In http://reviews.llvm.org/D12882#247604, @llvm-commits wrote:

> Can you give some context on specifically which speculative 
>  optimizations you're hoping to avoid here?  Are you only worried about 
>  if-then-else-to-select from SimplifyCFG?  What about cases like LICM?  
>  Do you think those should hoist or not?

I was only considering the if-then case in SimplifyCFG. I'm not familiar with LICM, but I don't see it using TTI. Maybe that's just an oversight though?

> My general feeling is that we *should* speculate these instructions if 

>  is legal and we have reason to predict it as being profitable (e.g. in 

>  LICM).  We do not currently have a global scheduler (or even a global 

>  schedule fixer-uper beyond some hacks in CodeGenPrep), but I've been 

>  thinking about this for other reasons.  In cases where the only use is 

>  down a particular path, adding a "fixup" which undoes hoisting seems 

>  plausible and reasonable.

> 

> For this particular case, it really seems like this should be handled 

>  via a select-to-control flow conversion for expensive operations done 

>  late in the pipeline.  IMO, selects is a better intermediate form for 

>  optimization, and we should undo the transformation late when heading 

>  into the backend.  In fact, we already have the framework for this in 

>  CodeGenPrep::OptimizeSelectInst.  Have you considered tweaking that instead?

I was hoping there was some later hook in place where we could do this, but I hadn't looked further yet, so I'm glad to see that CGP has it.

But if that is the right spot to do (undo) this transform, that implies that using TTI here in SimplifyCFG was the wrong choice, doesn't it? Ie, the TTI cost model cites this exact case as a reason for its existence:

  TCC_Expensive = 4 ///< The cost of a 'div' instruction on x86.

If we're not going to honor the TTI cost model here in SimplifyCFG, we should go back to whatever logic was used before r228826. (Note: I'm not advocating one way or the other...yet; I'm just looking for the right answer.)

This is another way of asking the general question I've run into several times lately ( PR24818 <https://llvm.org/bugs/show_bug.cgi?id=24818>, PR24766 <https://llvm.org/bugs/show_bug.cgi?id=24766>, PR22723 <https://llvm.org/bugs/show_bug.cgi?id=22723>, PR24743 <https://llvm.org/bugs/show_bug.cgi?id=24743> ):
What defines the "canonical" or "optimal" IR? Less instructions? Less basic blocks? Less register pressure? Is it art or science?

http://reviews.llvm.org/D12882