[PATCH] D150316: [AArch64][InstCombine] Don't scalarize for bitselet instructions

Thu May 11 08:06:12 PDT 2023

dmgreen added a comment.

Hmm. I suspect that people will not like something target-specific in InstCombine. I'm a little surprised to see InstCombine trying to do scalarization without any form of cost model, though. I would expect `extract(binop(a,b))` to generally be better than `binop(extract(a),extract(b))` if the extracts were expensive, and more so if the binop was cheaper on the vector side. It would be better if the extracts were optimized further, though.
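To make the two shapes concrete, here is a minimal C++ sketch. It is only a model of the semantics, not LLVM code: `V4`, `extractOfBinop`, `binopOfExtracts` and `formsAgree` are names made up for illustration, with `std::array` standing in for a `v4i32`. Both forms compute the same value; the difference is that the first needs one extract and the second needs two.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a v4i32 value.
using V4 = std::array<uint32_t, 4>;

// extract(binop(a, b)): one lane-wise vector op, then a single extract.
template <typename Op>
uint32_t extractOfBinop(const V4 &a, const V4 &b, std::size_t lane, Op op) {
  V4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = op(a[i], b[i]);
  return r[lane];
}

// binop(extract(a), extract(b)): two extracts, then one scalar op.
template <typename Op>
uint32_t binopOfExtracts(const V4 &a, const V4 &b, std::size_t lane, Op op) {
  return op(a[lane], b[lane]);
}

// Returns true if both forms agree on every lane.
template <typename Op>
bool formsAgree(const V4 &a, const V4 &b, Op op) {
  for (std::size_t lane = 0; lane < a.size(); ++lane)
    if (extractOfBinop(a, b, lane, op) != binopOfExtracts(a, b, lane, op))
      return false;
  return true;
}
```

Since the results are identical, which form is preferable comes down purely to the relative cost of the extracts and the binop on the target.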

In the example in https://gcc.godbolt.org/z/YPe3TK79P there are variable lane extracts, which makes them even more expensive than usual. If you are interested in that case specifically then it might make sense to prevent this optimization for variable lane extracts, as they are often expensive.

Otherwise we might need to fix this in the backend, perhaps by adding an AArch64 DAG combine for converting these into vector operations with a single extract.

      t12: i32 = extract_vector_elt t4, t11
        t13: i32 = extract_vector_elt t6, t11
      t15: i32 = xor t13, Constant:i32<-1>
    t16: i32 = and t12, t15
      t10: v4i32 = and t2, t6
    t17: i32 = extract_vector_elt t10, t11
  t18: i32 = or t16, t17
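As a rough illustration of what such a combine would need to prove, the sketch below models the DAG above in plain C++ and compares it with a fully vector bitselect followed by one extract. This is a semantic model under my reading of the dump, not backend code: `V4`, `partlyScalarized` and `vectorBitselect` are hypothetical names, and the local variables simply reuse the node names `t2`..`t18`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a v4i32 value.
using V4 = std::array<uint32_t, 4>;

// Mirrors the DAG node by node: three extracts around a half-scalarized select.
uint32_t partlyScalarized(const V4 &t2, const V4 &t4, const V4 &t6,
                          std::size_t t11) {
  uint32_t t12 = t4[t11];           // extract_vector_elt t4, t11
  uint32_t t13 = t6[t11];           // extract_vector_elt t6, t11
  uint32_t t15 = t13 ^ 0xFFFFFFFFu; // xor t13, -1
  uint32_t t16 = t12 & t15;         // and t12, t15
  V4 t10{};
  for (std::size_t i = 0; i < t10.size(); ++i)
    t10[i] = t2[i] & t6[i];         // v4i32 and t2, t6
  uint32_t t17 = t10[t11];          // extract_vector_elt t10, t11
  return t16 | t17;                 // or t16, t17
}

// The same computation kept as a vector bitselect with a single extract.
uint32_t vectorBitselect(const V4 &t2, const V4 &t4, const V4 &t6,
                         std::size_t lane) {
  V4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = (t2[i] & t6[i]) | (t4[i] & ~t6[i]);
  return r[lane];
}
```

The two functions return the same value for every lane, so the rewrite is purely a question of profitability: the vector form needs one extract instead of three.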

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150316/new/

https://reviews.llvm.org/D150316