[all-commits] [llvm/llvm-project] d1e1ed: PeepholeOpt: Do not add subregister indexes to reg...

Thu Jan 30 05:39:36 PST 2025

  Branch: refs/heads/users/arsenm/peephole-opt-do-not-introduce-subregister-uses-on-reg-sequence
  Home:   https://github.com/llvm/llvm-project
  Commit: d1e1edd807382d02698506b9854479d5921655c6
      https://github.com/llvm/llvm-project/commit/d1e1edd807382d02698506b9854479d5921655c6
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2025-01-30 (Thu, 30 Jan 2025)

  Changed paths:
    M llvm/lib/CodeGen/PeepholeOptimizer.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll
    M llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll
    M llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
    M llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
    M llvm/test/CodeGen/AMDGPU/div-rem-by-constant-64.ll
    M llvm/test/CodeGen/AMDGPU/div_v2i128.ll
    M llvm/test/CodeGen/AMDGPU/fptoi.i128.ll
    M llvm/test/CodeGen/AMDGPU/fptrunc.ll
    M llvm/test/CodeGen/AMDGPU/idot2.ll
    M llvm/test/CodeGen/AMDGPU/idot4u.ll
    M llvm/test/CodeGen/AMDGPU/integer-mad-patterns.ll
    M llvm/test/CodeGen/AMDGPU/llvm.mulo.ll
    M llvm/test/CodeGen/AMDGPU/load-constant-i1.ll
    M llvm/test/CodeGen/AMDGPU/load-constant-i32.ll
    M llvm/test/CodeGen/AMDGPU/load-global-i16.ll
    M llvm/test/CodeGen/AMDGPU/load-global-i32.ll
    M llvm/test/CodeGen/AMDGPU/lower-buffer-fat-pointers-nontemporal-metadata.ll
    M llvm/test/CodeGen/AMDGPU/mad_64_32.ll
    M llvm/test/CodeGen/AMDGPU/move-to-valu-atomicrmw-system.ll
    M llvm/test/CodeGen/AMDGPU/mul.ll
    M llvm/test/CodeGen/AMDGPU/rem_i128.ll
    M llvm/test/CodeGen/AMDGPU/sdiv.ll
    M llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll
    M llvm/test/CodeGen/AMDGPU/spill-vgpr.ll
    M llvm/test/CodeGen/AMDGPU/sra.ll
    M llvm/test/CodeGen/AMDGPU/udiv.ll
    M llvm/test/CodeGen/AMDGPU/v_cndmask.ll

  Log Message:
  -----------
  PeepholeOpt: Do not add subregister indexes to reg_sequence operands

Given the rest of the pass just gives up when it needs to compose
subregisters, folding a subregister extract directly into a reg_sequence
is counterproductive. Later fold attempts in the function will give up
on the subregister operand, preventing looking up through the reg_sequence.

It may still be profitable to do these folds if we start handling
the composes. There are some test regressions, but this mostly
looks better.

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications