[all-commits] [llvm/llvm-project] 9cb8f4: [ARM] Add a tail-predication loop predicate register

David Green via All-commits all-commits at lists.llvm.org
Thu Sep 2 05:43:21 PDT 2021


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 9cb8f4d1ad651e2be41985ccb3d6efe0953a8101
      https://github.com/llvm/llvm-project/commit/9cb8f4d1ad651e2be41985ccb3d6efe0953a8101
  Author: David Green <david.green at arm.com>
  Date:   2021-09-02 (Thu, 02 Sep 2021)

  Changed paths:
    M llvm/lib/CodeGen/MachineVerifier.cpp
    M llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
    M llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp
    M llvm/lib/Target/ARM/ARMISelLowering.cpp
    M llvm/lib/Target/ARM/ARMInstrCDE.td
    M llvm/lib/Target/ARM/ARMInstrFormats.td
    M llvm/lib/Target/ARM/ARMInstrMVE.td
    M llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp
    M llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
    M llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
    M llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp
    M llvm/lib/Target/ARM/MVETPAndVPTOptimisationsPass.cpp
    M llvm/test/CodeGen/ARM/machine-outliner-unoutlinable.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/add_reduce.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/begin-vpt-without-inst.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/cmplx_cong.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/count_dominates_start.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/ctlz-non-zeros.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/disjoint-vcmp.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/dont-ignore-vctp.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/dont-remove-loop-update.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/emptyblock.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/extract-element.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/incorrect-sub-16.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/incorrect-sub-32.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/incorrect-sub-8.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/inloop-vpnot-1.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/inloop-vpnot-2.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/inloop-vpnot-3.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/inloop-vpsel-1.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/inloop-vpsel-2.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/invariant-qreg.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/it-block-chain-store.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/it-block-chain.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/it-block-itercount.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/it-block-mov.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/it-block-random.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/iv-two-vcmp-reordered.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/iv-two-vcmp.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/iv-vcmp.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/livereg-no-loop-def.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/lstp-insertion-position.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/matrix.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/mov-after-dlstp.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/mov-lr-terminator.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/move-def-before-start.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/move-start-after-def.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/multi-block-cond-iter-count.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/multi-cond-iter-count.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/multiple-do-loops.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/no-vpsel-liveout.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/non-masked-load.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/non-masked-store.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/predicated-invariant.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/predicated-liveout.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/reductions-vpt-liveout.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/remove-elem-moves.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/safe-retaining.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/skip-debug.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/skip-vpt-debug.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/subreg-liveness.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/unpredicated-max.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/unrolled-and-vector.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/unsafe-retaining.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vaddv.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vcmp-vpst-combination-across-blocks.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp-add-operand-liveout.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp-in-vpt-2.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp-in-vpt.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp-subi3.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp-subri.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp-subri12.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vctp16-reduce.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vector_spill_in_loop.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vmaxmin_vpred_r.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vmldava_in_vpt.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/vpt-blocks.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/wls-search-pred.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/wlstp.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/wrong-liveout-lsr-shift.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/wrong-vctp-opcode-liveout.mir
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/wrong-vctp-operand-liveout.mir
    M llvm/test/CodeGen/Thumb2/mve-gatherscatter-mmo.ll
    M llvm/test/CodeGen/Thumb2/mve-postinc-distribute.mir
    M llvm/test/CodeGen/Thumb2/mve-stacksplot.mir
    M llvm/test/CodeGen/Thumb2/mve-tp-loop.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-2-blocks-1-pred.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-2-blocks-2-preds.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-2-blocks-ctrl-flow.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-2-blocks-non-consecutive-ins.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-2-blocks.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-3-blocks-kill-vpr.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-block-1-ins.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-block-2-ins.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-block-4-ins.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-block-debug.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-block-elses.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-block-fold-vcmp.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-block-kill.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-block-optnone.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-nots.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-optimisations.mir
    M llvm/test/CodeGen/Thumb2/mve-vpt-preuse.mir
    M llvm/test/CodeGen/Thumb2/mve-wls-block-placement.mir
    M llvm/test/CodeGen/Thumb2/phi_prevent_copy.mir

  Log Message:
  -----------
  [ARM] Add a tail-predication loop predicate register

The semantics of tail predication loops means that the value of LR as an
instruction is executed determines the predicate. In other words:

mov r3, #3
DLSTP lr, r3        // Start tail predication, lr==3
VADD.s32 q0, q1, q2 // Lanes 0,1 and 2 are updated in q0.
mov lr, #1
VADD.s32 q0, q1, q2 // Only first lane is updated.

This means that the value of lr cannot be spilled and re-used in tail
predication regions without potentially altering the behaviour of the
program. More lanes than required could be stored, for example, and in
the case of a gather those lanes might not have been setup, leading to
alignment exceptions.

This patch adds a new lr predicate operand to MVE instructions in order
to keep a reference to the lr that they use as a tail predicate. It will
usually hold the zeroreg meaning not predicated, being set to the LR phi
value in the MVETPAndVPTOptimisationsPass. This will prevent it from
being spilled anywhere that it needs to be used.

A lot of tests needed updating.

Differential Revision: https://reviews.llvm.org/D107638




More information about the All-commits mailing list