[all-commits] [llvm/llvm-project] 27139b: [SME] Stop RA from coalescing COPY instructions th...

Sander de Smalen via All-commits all-commits at lists.llvm.org
Wed Jan 31 11:43:58 PST 2024


  Branch: refs/heads/release/18.x
  Home:   https://github.com/llvm/llvm-project
  Commit: 27139bceb27d0b551e6e9d18fb91c703cbc3d7b8
      https://github.com/llvm/llvm-project/commit/27139bceb27d0b551e6e9d18fb91c703cbc3d7b8
  Author: Sander de Smalen <sander.desmalen at arm.com>
  Date:   2024-01-31 (Wed, 31 Jan 2024)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
    M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
    M llvm/lib/Target/AArch64/AArch64ISelLowering.h
    M llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
    M llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
    M llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll
    A llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll
    M llvm/test/CodeGen/AArch64/sme-streaming-body.ll
    M llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll
    M llvm/test/CodeGen/AArch64/sme-streaming-interface.ll
    M llvm/test/CodeGen/AArch64/sme-streaming-mode-changing-call-disable-stackslot-scavenging.ll

  Log Message:
  -----------
  [SME] Stop RA from coalescing COPY instructions that transcend beyond smstart/smstop. (#78294)

This patch introduces a 'COALESCER_BARRIER' which is a pseudo node that
expands to a 'nop', but which stops the register allocator from coalescing
a COPY node when its use/def crosses a SMSTART or SMSTOP instruction.

For example:

    %0:fpr64 = COPY killed $d0
    undef %2.dsub:zpr = COPY %0       // <- Do not coalesce this COPY
    ADJCALLSTACKDOWN 0, 0
    MSRpstatesvcrImm1 1, 0, csr_aarch64_smstartstop, implicit-def dead $d0
    $d0 = COPY killed %0
    BL @use_f64, csr_aarch64_aapcs

If the COPY would be coalesced, that would lead to:

    $d0 = COPY killed %0

being replaced by:

    $d0 = COPY killed %2.dsub

which means the whole ZPR reg would be live up to the call, causing the
MSRpstatesvcrImm1 (smstop) to spill/reload the ZPR register:

    str     q0, [sp]   // 16-byte Folded Spill
    smstop  sm
    ldr     z0, [sp]   // 16-byte Folded Reload
    bl      use_f64

which would be incorrect for two reasons:
1. The program may load more data than it has allocated.
2. If there are other SVE objects on the stack, the compiler might use
   the 'mul vl' addressing modes to access the spill location.
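
The two reasons above come down to simple arithmetic, sketched below in
C++. This is an illustration only, not code from the patch: the helper
names are hypothetical, and it assumes the architectural SVE vector
lengths (128 to 2048 bits, in multiples of 128) and that a 'mul vl'
offset is the immediate scaled by the vector length in bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; none of these exist in LLVM.

// `str q0, [sp]` always writes 16 bytes: a Q register is 128 bits wide.
constexpr std::uint64_t kQRegSpillBytes = 16;

// `ldr z0, [sp]` reads one full Z register, i.e. VL/8 bytes.
constexpr std::uint64_t zRegReloadBytes(std::uint64_t vlBits) {
  return vlBits / 8;
}

// Reason 1: bytes read past the end of the 16-byte spill slot.
constexpr std::uint64_t reloadOverrunBytes(std::uint64_t vlBits) {
  return zRegReloadBytes(vlBits) - kQRegSpillBytes;
}

// Reason 2: byte offset addressed by `[sp, #imm, mul vl]`.
// The offset scales with the vector length, so it is not a fixed
// distance from sp the way `[sp, #8]` is.
constexpr std::int64_t mulVLOffsetBytes(std::int64_t imm,
                                        std::uint64_t vlBits) {
  return imm * static_cast<std::int64_t>(vlBits / 8);
}

// Small self-check covering a few vector lengths.
inline void selfCheck() {
  assert(reloadOverrunBytes(128) == 0);   // minimum VL: reload fits the slot
  assert(reloadOverrunBytes(512) == 48);  // 64-byte reload, 16-byte slot
  assert(mulVLOffsetBytes(1, 128) == 16);
  assert(mulVLOffsetBytes(1, 512) == 64); // same immediate, different address
}
```

With a 128-bit vector length the mismatch is invisible, which is why
this kind of bug may only show up on hardware with wider vectors.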

By disabling the coalescing, we get the desired results:

    str     d0, [sp, #8]  // 8-byte Folded Spill
    smstop  sm
    ldr     d0, [sp, #8]  // 8-byte Folded Reload
    bl      use_f64

(cherry picked from commit dd736661826e215ac70ff3a4a4ccd75bda0c5ccd)



