[all-commits] [llvm/llvm-project] 86203b: [NVPTX] Use PRMT more widely, and improve folding ...

Alex MacLean via All-commits all-commits at lists.llvm.org
Sun Jul 13 15:07:14 PDT 2025


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 86203b6b33e49cc1a8ce6d7d69e7df4970d8f7bd
      https://github.com/llvm/llvm-project/commit/86203b6b33e49cc1a8ce6d7d69e7df4970d8f7bd
  Author: Alex MacLean <amaclean at nvidia.com>
  Date:   2025-07-13 (Sun, 13 Jul 2025)

  Changed paths:
    M llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
    M llvm/lib/Target/NVPTX/NVPTXISelLowering.h
    M llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
    M llvm/test/CodeGen/NVPTX/LoadStoreVectorizer.ll
    M llvm/test/CodeGen/NVPTX/extractelement.ll
    M llvm/test/CodeGen/NVPTX/i8x4-instructions.ll
    M llvm/test/CodeGen/NVPTX/ldg-invariant-256.ll
    M llvm/test/CodeGen/NVPTX/ldg-invariant.ll
    M llvm/test/CodeGen/NVPTX/load-store-vectors.ll
    M llvm/test/CodeGen/NVPTX/sext-setcc.ll

  Log Message:
  -----------
  [NVPTX] Use PRMT more widely, and improve folding around this instruction (#148261)

Replace uses of BFE with PRMT when lowering v4i8 vectors. This generally
leads to equivalent or better SASS (https://cuda.godbolt.org/z/M75W6f8xd)
and reduces the number of target-specific operations we need to
represent. Also implement KnownBits tracking for PRMT, allowing
elimination of redundant AND instructions when lowering various i8
operations.
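
As a rough illustration of why a byte permute can stand in for a bit-field extract here, the sketch below models PTX prmt.b32 in its default mode in plain C++ and uses it to pull one zero-extended byte out of a packed v4i8 value. The function names (prmt, extractByteViaPrmt, extractByteViaShiftMask) and the selector 0x4440 | idx are assumptions made for this example; they are not taken from the patch and need not match the selectors the backend actually emits.

```cpp
#include <cassert>
#include <cstdint>

// Software model of PTX prmt.b32, default mode (hypothetical helper).
// Bytes 0-3 of the source pool come from `a`, bytes 4-7 from `b`.
// Each selector nibble picks one pool byte for the matching result byte;
// if the nibble's top bit is set, the byte's sign bit is replicated instead.
inline uint32_t prmt(uint32_t a, uint32_t b, uint32_t selector) {
  const uint64_t pool = (static_cast<uint64_t>(b) << 32) | a;
  uint32_t result = 0;
  for (int i = 0; i < 4; ++i) {
    const uint32_t nibble = (selector >> (4 * i)) & 0xF;
    uint32_t byte = static_cast<uint32_t>(pool >> (8 * (nibble & 0x7))) & 0xFF;
    if (nibble & 0x8)
      byte = (byte & 0x80) ? 0xFF : 0x00; // sign-replicate mode
    result |= byte << (8 * i);
  }
  return result;
}

// Zero-extending extract of element `idx` (0..3) from a packed v4i8.
// Result bytes 1-3 select pool byte 4, i.e. byte 0 of the zero operand.
inline uint32_t extractByteViaPrmt(uint32_t vec, unsigned idx) {
  return prmt(vec, 0u, 0x4440u | idx);
}

// The shift-and-mask form that a bit-field extract would compute.
inline uint32_t extractByteViaShiftMask(uint32_t vec, unsigned idx) {
  return (vec >> (8 * idx)) & 0xFFu;
}

// Small self-check: both forms agree, and masking the permute result
// with 0xFF changes nothing because the upper 24 bits are already zero.
inline void selfCheck() {
  const uint32_t vec = 0xAABBCCDDu;
  for (unsigned idx = 0; idx < 4; ++idx) {
    const uint32_t viaPrmt = extractByteViaPrmt(vec, idx);
    assert(viaPrmt == extractByteViaShiftMask(vec, idx));
    assert((viaPrmt & 0xFFu) == viaPrmt);
  }
}
```

Because three of the four result bytes are selected from a constant zero operand, the upper 24 bits of the result are known to be zero regardless of the input. That is the kind of fact a KnownBits analysis can derive from the selector, and it is what makes a following AND with 0xFF redundant.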
