[all-commits] [llvm/llvm-project] 90e989: [X86] Handle BSF/BSR "zero-input pass through" beh...

Simon Pilgrim via All-commits all-commits at lists.llvm.org
Thu Jan 23 05:00:21 PST 2025


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 90e9895a9373b3d83eefe15b34d2dc83c7bcc88f
      https://github.com/llvm/llvm-project/commit/90e9895a9373b3d83eefe15b34d2dc83c7bcc88f
  Author: Simon Pilgrim <llvm-dev at redking.me.uk>
  Date:   2025-01-23 (Thu, 23 Jan 2025)

  Changed paths:
    M llvm/lib/Target/X86/X86ISelLowering.cpp
    M llvm/lib/Target/X86/X86InstrCompiler.td
    M llvm/lib/Target/X86/X86InstrFragments.td
    M llvm/lib/Target/X86/X86InstrInfo.cpp
    M llvm/lib/Target/X86/X86InstrMisc.td
    M llvm/lib/Target/X86/X86Subtarget.h
    M llvm/test/CodeGen/X86/bit_ceil.ll
    M llvm/test/CodeGen/X86/combine-or.ll
    M llvm/test/CodeGen/X86/ctlo.ll
    M llvm/test/CodeGen/X86/ctlz.ll
    M llvm/test/CodeGen/X86/cttz.ll
    M llvm/test/CodeGen/X86/known-never-zero.ll
    M llvm/test/CodeGen/X86/pr89877.ll
    M llvm/test/CodeGen/X86/pr90847.ll
    M llvm/test/CodeGen/X86/pr92569.ll
    M llvm/test/CodeGen/X86/scheduler-backtracking.ll
    M llvm/test/TableGen/x86-fold-tables.inc
    M llvm/test/tools/llvm-mca/X86/BtVer2/clear-super-register-1.s

  Log Message:
  -----------
  [X86] Handle BSF/BSR "zero-input pass through" behaviour (#123623)

Intel docs have been updated to be similar to AMD and now describe
BSF/BSR as not changing the destination register if the input value was
zero, which allows us to support CTTZ/CTLZ zero-input cases by setting
the destination to support a NumBits result (BSR is a bit messy as it
has to be XOR'd to create a CTLZ result). VIA/Zhaoxin x86_64 CPUs have also
been confirmed to match this behaviour.

This patch adjusts the X86ISD::BSF/BSR nodes to take a "pass through"
argument for zero-input cases, by default this is set to UNDEF to match
existing behaviour, but it can be set to a suitable value if supported.

There are still some limits to this - its only supported for x86_64
capable processors (and I've only enabled it for x86_64 codegen), and
Intel CPUs sometimes zero the upper 32-bits of a pass through register
when used for BSR32/BSF32 with a zero source value (i.e. the whole
64bits may not get passed through).

Fixes #122004



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list