[llvm] [ScalarizeMaskedMemIntr] Don't use a scalar mask on GPUs (PR #104842)

Tue Aug 20 01:00:12 PDT 2024

jayfoad wrote:

> ScalarizeMaskedMemIntr contains an optimization where the mask is bitcast into an iN and then bit-tests with powers of two are used to determine whether to load/store/... or not.

I don't understand why this would have been a good idea in the first place. Do you have an example of how it makes codegen better for CPUs? And an example of worse codegen for GPUs?
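
For context, a rough C++ model of the two strategies under discussion; it is a sketch for illustration, not the pass's actual code. The first function mirrors the quoted description: the `<N x i1>` mask is packed into one integer and lane `i` is tested with `bits & (1 << i)`. The second tests each mask lane directly. The names `packMask`, `maskedLoadScalarMask`, and `maskedLoadPerLane` are hypothetical, the vector width is fixed at 4, and lane 0 is assumed to map to the least significant bit (little-endian layout).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 4;  // assumed vector width for this sketch

// Models "bitcast <4 x i1> to i4": lane i becomes bit i (little-endian assumption).
std::uint8_t packMask(const std::array<bool, kLanes>& mask) {
    std::uint8_t bits = 0;
    for (std::size_t i = 0; i < kLanes; ++i) {
        if (mask[i]) {
            bits |= static_cast<std::uint8_t>(1u << i);
        }
    }
    return bits;
}

// Strategy 1: one scalar mask, then a bit-test against a power of two per lane.
std::array<int, kLanes> maskedLoadScalarMask(const int* ptr,
                                             const std::array<bool, kLanes>& mask,
                                             const std::array<int, kLanes>& passthru) {
    std::array<int, kLanes> result = passthru;
    const std::uint8_t bits = packMask(mask);
    for (std::size_t i = 0; i < kLanes; ++i) {
        if (bits & (1u << i)) {  // conditional load for lane i
            result[i] = ptr[i];
        }
    }
    return result;
}

// Strategy 2: test each mask lane directly (models a per-lane extractelement).
std::array<int, kLanes> maskedLoadPerLane(const int* ptr,
                                          const std::array<bool, kLanes>& mask,
                                          const std::array<int, kLanes>& passthru) {
    std::array<int, kLanes> result = passthru;
    for (std::size_t i = 0; i < kLanes; ++i) {
        if (mask[i]) {
            result[i] = ptr[i];
        }
    }
    return result;
}

// Both strategies must yield identical lanes; only the generated code differs.
void checkStrategiesAgree(const int* ptr,
                          const std::array<bool, kLanes>& mask,
                          const std::array<int, kLanes>& passthru) {
    assert(maskedLoadScalarMask(ptr, mask, passthru) ==
           maskedLoadPerLane(ptr, mask, passthru));
}
```

Both functions compute the same result, so any difference would show up only in the instructions each target emits for the mask tests, which is what the requested CPU and GPU examples would demonstrate.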

https://github.com/llvm/llvm-project/pull/104842