[PATCH] D45173: [InstCombine] Recognize idioms for ctpop and ctlz

Wed Apr 4 13:15:59 PDT 2018

spatel added a comment.

In https://reviews.llvm.org/D45173#1057039, @kparzysz wrote:

> In https://reviews.llvm.org/D45173#1056890, @spatel wrote:
>
> > I don't know if there's an actual strategy. There's no formal definition of 'canonical IR' AFAIK, so we continue to simplify code via peepholes in instcombine. Anything downstream of that has to adjust to those changes. I've dealt with that many times as an interaction between instcombine and DAG combine.
>
>
> This isn't sustainable in the long run.  Recognizing complex computations and replacing them with short equivalents (such as intrinsics that targets may provide efficient implementations of) is arguably better than only doing peephole optimizations, and yet the current model makes it really difficult to write such code.

There was some discussion about optimal graph rewriting, but I don't know if there's any work/progress on that yet.

Until then, I think this patch shows the alternatives under the current model: we add code to instcombine and coordinate this pass with instcombine's preferred form, we increase the pattern-matching complexity here, or we acknowledge that it's impossible to match all the variants and let it slide.

FWIW, here are reductions of the patterns that we could transform in instcombine, but I suspect we don't want to add such narrow transforms there. It's probably better to keep the specialized pattern matching cost and complexity here:

  ; https://rise4fun.com/Alive/0ej

  Name: 2_bit_sum
    %v0 = and i32 %x, 1431655765 ; 0x55555555
    %v1 = lshr i32 %x, 1
    %v2 = and i32 %v1, 1431655765
    %v3 = add i32 %v0, %v2
  =>
    %v1 = lshr i32 %x, 1
    %v2 = and i32 %v1, 1431655765
    %v3 = sub i32 %x, %v2

  ; https://rise4fun.com/Alive/ly5

  Name: shift_add_to_mul
    %s1 = and i32 %x, 252645135 ; 0x0f0f0f0f
    %s2 = lshr i32 %s1, 16
    %s3 = add i32 %s1, %s2
    %s4 = lshr i32 %s3, 8
    %s5 = add i32 %s3, %s4
    %r = and i32 %s5, 63 ; 0x3f
  =>
    %s1 = and i32 %x, 252645135 ; 0x0f0f0f0f
    %m1 = mul i32 %s1, 16843009 ; 0x01010101
    %r = lshr i32 %m1, 24
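
For anyone who wants to poke at these two reductions outside of Alive, here is a rough Python sketch that models the i32 arithmetic with explicit 32-bit masking and compares both sides on a few edge values plus random inputs. The function names are made up for illustration and are not part of the patch, and sampling like this is only a sanity check; the Alive links above are the actual proofs.

```python
import random

MASK32 = 0xFFFFFFFF  # emulate i32 wraparound on Python's unbounded ints


def two_bit_sum_before(x):
    # Source side of 2_bit_sum: add the even and odd bits of each 2-bit field.
    v0 = x & 0x55555555
    v2 = (x >> 1) & 0x55555555
    return (v0 + v2) & MASK32


def two_bit_sum_after(x):
    # Target side of 2_bit_sum: subtract the shifted-and-masked odd bits from x.
    v2 = (x >> 1) & 0x55555555
    return (x - v2) & MASK32


def byte_sum_before(x):
    # Source side of shift_add_to_mul: fold the four nibble counts with shifts and adds.
    s1 = x & 0x0F0F0F0F
    s3 = (s1 + (s1 >> 16)) & MASK32
    s5 = (s3 + (s3 >> 8)) & MASK32
    return s5 & 63


def byte_sum_after(x):
    # Target side of shift_add_to_mul: the multiply accumulates all four bytes
    # into the top byte, which the shift then extracts.
    s1 = x & 0x0F0F0F0F
    m1 = (s1 * 0x01010101) & MASK32
    return m1 >> 24


def check(samples=100_000, seed=0):
    # Hypothetical helper: compare both sides on edge cases and random 32-bit values.
    rng = random.Random(seed)
    edge = [0, 1, 0x55555555, 0xAAAAAAAA, 0x0F0F0F0F, 0x80000000, MASK32]
    for x in edge + [rng.getrandbits(32) for _ in range(samples)]:
        assert two_bit_sum_before(x) == two_bit_sum_after(x)
        assert byte_sum_before(x) == byte_sum_after(x)
    return True
```

The second equivalence relies on the 0x0f0f0f0f mask: each byte of %s1 is at most 15, so the four-byte sum is at most 60 and no partial sum can carry into a neighboring byte. That is also why the 0x3f mask on the source side loses nothing.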

Repository:
  rL LLVM

https://reviews.llvm.org/D45173