[llvm] [ScalarizeMaskedMemIntr] Don't use a scalar mask on GPUs (PR #104842)

Wed Aug 21 01:31:00 PDT 2024

jayfoad wrote:

> @jayfoad I don't have any examples of better codegen for CPUs - and I'm not the person who wrote that initial comment, maybe they know.
> 
> But I've definitely seen a bunch of `v_and`s in the ISA that didn't need to be there

Not sure which comment you are referring to or who wrote it.

It's hard for me to tell whether this patch makes sense without some insight into _why_ the bitcast to `iN` helps CPUs and hurts GPUs. (If bitcasting `<N x i1>` to `iN` is helpful, then it seems like it should be done somewhere generic so that it benefits all code, not just the code generated by this pass.)
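
For concreteness, the two ways of testing a mask lane being compared here can be modelled in plain C++ roughly as follows. This is only a sketch of my understanding, not the pass's actual code: all names (`kNumLanes`, `packMask`, `maskedLoadExtract`, `maskedLoadBitTest`) are hypothetical, the lane count is fixed at 4, and lane 0 is assumed to map to the least significant bit.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical lane count, chosen for illustration only.
constexpr std::size_t kNumLanes = 4;
using Mask = std::array<bool, kNumLanes>;
using Vec = std::array<int, kNumLanes>;

// Models bitcasting <4 x i1> to an integer: lane i becomes bit i.
// Assumes lane 0 is the least significant bit.
std::uint8_t packMask(const Mask &mask) {
  unsigned bits = 0;
  for (std::size_t i = 0; i < kNumLanes; ++i)
    if (mask[i])
      bits |= 1u << i;
  return static_cast<std::uint8_t>(bits);
}

// Vector-mask form: each lane's predicate is read straight from the
// vector, analogous to an extractelement per lane.
Vec maskedLoadExtract(const int *ptr, const Mask &mask, const Vec &passthru) {
  Vec result = passthru;
  for (std::size_t i = 0; i < kNumLanes; ++i)
    if (mask[i])
      result[i] = ptr[i];
  return result;
}

// Scalar-mask form: the mask is packed once, then each lane's predicate
// is an AND with a one-bit constant followed by a compare against zero.
Vec maskedLoadBitTest(const int *ptr, const Mask &mask, const Vec &passthru) {
  const std::uint8_t scalarMask = packMask(mask);
  Vec result = passthru;
  for (std::size_t i = 0; i < kNumLanes; ++i)
    if ((scalarMask & (1u << i)) != 0)
      result[i] = ptr[i];
  return result;
}
```

Both functions compute the same result; the only difference is how each lane's predicate is obtained. The per-lane AND in the second form is presumably where the extra `v_and`s would come from.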

But this is all just a drive-by comment, since I'm not familiar with ScalarizeMaskedMemIntr in the first place.

https://github.com/llvm/llvm-project/pull/104842