[PATCH][InstCombiner] Expose opportunities to merge subtract and comparison

Duncan Sands duncan.sands at gmail.com
Tue Sep 3 07:12:00 PDT 2013


Hi Quentin,

On 27/08/13 21:34, Quentin Colombet wrote:
> Hi,
>
> Here is a patch to expose more subtract and comparison fusion opportunities.
> I am not sure the approach I have taken is the best one, so let me know if you
> think I should proceed differently.
>
> Thanks for the feedback/reviews.
>
> ** Context **
>
> Several architectures use the same instruction to perform both a comparison
> and a subtract. Our current instruction selection framework does not allow
> considering different basic blocks to expose such fusion opportunities; as
> things stand, these instructions are “merged” by CSE at the MI level.
>
>
> ** Proposed Solution **
>
> To increase the likelihood that CSE applies in such situations, the proposed
> patch reorders the operands of the comparison so that they match the order of
> the most frequent subtract.

Mightn't this be expensive if there is a long use list?  This worries me, as
instcombine is run many times.  How about doing it in CodeGenPrepare, which
is only run once (also, this transform is squarely aimed at codegen, so that
seems a more natural place for it)?

Ciao, Duncan.

> E.g.,
>
> icmp A, B
> sub B, A
>
> B is used as the first operand of a sub whose second operand is A; the
> comparison should match this order for CSE to come into play.
>
> This is done during instcombine.
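
For concreteness, here is a minimal sketch of that reordering (not the actual
patch, and it uses present-day C++ API names such as Value::users(), so treat
the details as assumptions):

  // Sketch only: swap the operands of a compare when the subtracts around it
  // mostly use the opposite order, so that later CSE can pair cmp and sub.
  #include "llvm/IR/InstrTypes.h"
  #include "llvm/IR/Instructions.h"
  #include "llvm/Support/Casting.h"
  using namespace llvm;

  static void reorderCmpToMatchSubs(ICmpInst &Cmp) {
    Value *LHS = Cmp.getOperand(0), *RHS = Cmp.getOperand(1);
    unsigned SameOrder = 0, SwappedOrder = 0;
    // Any sub involving both values is a user of LHS, so one scan suffices.
    for (User *U : LHS->users()) {
      auto *BO = dyn_cast<BinaryOperator>(U);
      if (!BO || BO->getOpcode() != Instruction::Sub)
        continue;
      if (BO->getOperand(0) == LHS && BO->getOperand(1) == RHS)
        ++SameOrder;          // sub LHS, RHS
      else if (BO->getOperand(0) == RHS && BO->getOperand(1) == LHS)
        ++SwappedOrder;       // sub RHS, LHS
    }
    // icmp pred A, B is equivalent to icmp swapped(pred) B, A, so the swap is
    // always legal; do it only when the swapped order is the more frequent one.
    if (SwappedOrder > SameOrder)
      Cmp.swapOperands();
  }

The use-list walk is the part whose cost the reply above worries about when an
operand has many users.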
>
>
> ** Motivating Example **
>
> The attached scratch.cc shows two functions that do exactly the same thing
> (the order of the operands in the ‘if’ and the condition are swapped) but
> produce code of different quality.
> Function PASS has its comparison and subtract merged; function FAIL does not.
>
> To reproduce:
> clang -O3 -arch x86_64 scratch.cc -S -o -
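
The attachment is not reproduced here; a pair of functions along the lines
described might look like the following (a hypothetical stand-in, not the
actual scratch.cc):

  // Hypothetical stand-in for the attached scratch.cc; both functions compute
  // the same value and differ only in how the comparison is written.

  // "PASS"-like case: the comparison and the subtract both see (b, a) in the
  // same order, so the backend can reuse the flags set by the subtract.
  unsigned pass(unsigned a, unsigned b) {
    if (b > a)
      return b - a;
    return 0;
  }

  // "FAIL"-like case: the comparison uses (a, b) while the subtract uses
  // (b, a); without reordering the icmp, CSE never sees a common expression.
  unsigned fail(unsigned a, unsigned b) {
    if (a < b)
      return b - a;
    return 0;
  }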
>
>
> ** Notes **
>
> 1. Why LLVM IR for this “low-level” optimization?
> As already stated, several architectures expose such opportunities, so I
> thought it best to do this as a target-independent optimization, and LLVM IR
> makes more sense for that. Doing it at the MI level would require several
> additional target hooks or a separate pass for each target.
>
> 2. What about the canonical form?
> The optimization is only performed when both operands have the same
> complexity. We might want to break the complexity assumption (operands
> ordered from most to least complex), but I am not sure it would bring new
> opportunities.
> Thus, assuming we want to do this transformation at the LLVM IR level, is it
> desirable to break that assumption, and if so, in which pass?
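
A rough illustration of the “same complexity” condition (an example, not taken
from the patch):

  // Both compare operands are plain arguments, i.e. they have equal
  // "complexity", so the existing canonical ordering does not constrain them
  // and the proposed reordering is free to pick either order.
  bool sameComplexity(unsigned a, unsigned b) { return b > a; }

  // One operand is a constant; instcombine already canonicalizes the constant
  // to the right-hand side, so reordering here would fight the canonical form.
  bool mixedComplexity(unsigned a) { return 42 > a; }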
>
> Cheers,
> -Quentin
>



