[all-commits] [llvm/llvm-project] 17857d: [X86] Generate `kmov` for masking integers (#120593)

Abhishek Kaushik via All-commits all-commits at lists.llvm.org
Mon Mar 3 07:05:31 PST 2025


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 17857d92416da5997262318a6f62fccad9c5a156
      https://github.com/llvm/llvm-project/commit/17857d92416da5997262318a6f62fccad9c5a156
  Author: Abhishek Kaushik <abhishek.kaushik at intel.com>
  Date:   2025-03-03 (Mon, 03 Mar 2025)

  Changed paths:
    M llvm/lib/Target/X86/X86ISelLowering.cpp
    A llvm/test/CodeGen/X86/kmov.ll
    M llvm/test/CodeGen/X86/pr78897.ll

  Log Message:
  -----------
  [X86] Generate `kmov` for masking integers (#120593)

When an integer is used as a bit mask, the LLVM IR looks something
like this:
```
%1 = and <16 x i32> %.splat, <i32 1, i32 2, i32 4, i32 8, i32 16, i32 32, i32 64, i32 128, i32 256, i32 512, i32 1024, i32 2048, i32 4096, i32 8192, i32 16384, i32 32768>
%cmp1 = icmp ne <16 x i32> %1, zeroinitializer
```
where `.splat` is a vector containing the mask in all lanes. The assembly
generated for this looks like:
```
vpbroadcastd    %ecx, %zmm0
vptestmd        .LCPI0_0(%rip), %zmm0, %k1
```
where `.LCPI0_0` is a constant table of powers of 2.
Instead of doing this, we can move the relevant bits directly to a
`k` register using a `kmov` instruction:
```
kmovw   %ecx, %k1
```
This is faster and also reduces code size, since it drops both the
broadcast and the load from the constant table.
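
The equivalence behind this rewrite can be checked with a scalar model. The
following is only an illustrative sketch, not code from the patch: the function
names are hypothetical, and each function models one of the two instruction
sequences above lane by lane in plain C rather than with real AVX-512
instructions.
```c
#include <assert.h>
#include <stdint.h>

/* Model of the original sequence: AND every lane of the splat with
   1 << i, compare against zero, and pack the 16 results into a mask. */
uint16_t mask_via_and_cmp(uint32_t x) {
    uint16_t k = 0;
    for (int i = 0; i < 16; ++i) {
        uint32_t lane = x & (UINT32_C(1) << i); /* and <16 x i32> %.splat, <1, 2, 4, ...> */
        if (lane != 0)                          /* icmp ne ..., zeroinitializer */
            k |= (uint16_t)(1u << i);
    }
    return k;
}

/* Model of kmovw: copy the low 16 bits of the integer into the mask. */
uint16_t mask_via_kmovw(uint32_t x) {
    return (uint16_t)(x & 0xFFFFu);
}

/* Hypothetical self-check: both models agree for every 16-bit pattern,
   with and without extra bits set above bit 15. */
void check_equivalence(void) {
    for (uint32_t x = 0; x <= 0xFFFFu; ++x) {
        assert(mask_via_and_cmp(x) == mask_via_kmovw(x));
        assert(mask_via_and_cmp(x | 0xABCD0000u) == mask_via_kmovw(x));
    }
}
```
Lane `i` of the compare is true exactly when bit `i` of the integer is set, so
the packed compare result is just the low 16 bits of the integer, which is what
`kmovw` copies into the `k` register.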
