[llvm] [LLVM] Add `llvm.masked.compress` intrinsic (PR #92289)

Fri Jun 14 01:35:54 PDT 2024

lawben wrote:

Mhm, yes. I think this is a tricky one. In that case, we probably want to add a passthrough value to the call. If users add a passthrough, this could either be:
  - _undef_: if the user does not care (as this is potentially more efficient)
  - _all zeros_: this should be easy to detect if it's all constants and we can use this information to select the correct instruction if there is one (e.g., in SVE or AVX512).
  - _the source vector_ or _any other vector_: then we need to maintain the input. I'm not yet sure how best to handle this in the fallback implementation. For AVX512, there are explicit instructions for this; for SVE/RISC-V, we need to generate a selection mask similar to the RISC-V [code example](https://discourse.llvm.org/t/new-intrinsic-for-masked-vector-compress-without-store/78663/7?u=lawben) in the discussion thread.
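
To make the three cases concrete, here is a rough scalar model of what I have in mind. This is only a sketch under assumptions: the function name `masked_compress` is made up for illustration, `None` stands in for `undef`, and I'm assuming the unselected tail lanes take the passthru value at the same lane index (as AVX512 merge-masking would do). It builds the selection mask explicitly, which is roughly what a fallback would need.

```python
from typing import List, Optional, Sequence


def masked_compress(
    vec: Sequence[int],
    mask: Sequence[bool],
    passthru: Optional[Sequence[int]] = None,
) -> List[Optional[int]]:
    """Hypothetical scalar model of a compress with a passthru operand.

    Assumption: lanes past the compressed elements take passthru[i] at the
    same lane index i. passthru=None models undef (tail lanes unspecified).
    """
    if len(vec) != len(mask):
        raise ValueError("vec and mask must have the same length")
    if passthru is not None and len(passthru) != len(vec):
        raise ValueError("passthru must have the same length as vec")

    # Step 1: pack the selected elements to the front, preserving order.
    selected = [v for v, m in zip(vec, mask) if m]
    count = len(selected)

    # Step 2: selection mask that is true for the first `count` lanes.
    select = [i < count for i in range(len(vec))]

    # Step 3: blend the compressed elements with the passthru lanes.
    fill = passthru if passthru is not None else [None] * len(vec)
    return [selected[i] if select[i] else fill[i] for i in range(len(vec))]


vec = [1, 2, 3, 4]
mask = [True, False, True, False]
undef_tail = masked_compress(vec, mask)                # passthru = undef
zero_tail = masked_compress(vec, mask, [0, 0, 0, 0])   # passthru = all zeros
kept_tail = masked_compress(vec, mask, [9, 8, 7, 6])   # passthru = other vector
```

With an all-zeros passthru the blend in step 3 is what a zeroing instruction would give for free, and with `undef` it can be skipped entirely.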

If people agree that we want to expose this, then I'll add a `passthru` operand to `@llvm.masked.compress`. I think @efriedma-quic is correct and we want to expose this behavior, because adding it later is tricky. Having the option to let `passthru` be `undef` also gives us the potentially faster path.

https://github.com/llvm/llvm-project/pull/92289