[PATCH] D121208: [AArch64][SME] Split up SME features. (alternative approach)

Sander de Smalen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 8 04:40:28 PST 2022


sdesmalen created this revision.
sdesmalen added reviewers: rsandifo-arm, paulwalker-arm, c-rhodes, kmclaughlin, david-arm.
Herald added subscribers: ctetreau, arphaman, hiraditya, kristof.beyls.
Herald added a project: All.
sdesmalen requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

This patch models SME by adding features for the different PSTATE.SM and
PSTATE.ZA states, and implementing these as features that are independent of
SME, SVE(2) and NEON. This approach comes from the observation that setting
PSTATE.SM=1 invalidates instructions that are only valid when PSTATE.SM=0.
The same holds for setting PSTATE.SM=0, which invalidates SME instructions.
PSTATE.SM can therefore be considered a subtractive feature, rather than an
additive feature.

This patch adds the following:

- `+pstate-sm0` is set by default for all subtargets. NEON/SVE/SVE2 instructions that are only valid when PSTATE.SM=0 are predicated with HasPSTATESM0. Unlike at runtime, the compiler is allowed to have both `+pstate-sm0` and `+pstate-sm1` set, as this makes both sets of instructions available to the compiler. It is up to the compiler (or the compiler user, in the case of inline asm) to guarantee that PSTATE.SM is sufficiently guarded/honoured.
- `+pstate-sm1` is optional and enables all instructions valid under PSTATE.SM=1.
- `+pstate-za1` is optional and enables all instructions valid under PSTATE.ZA=1.
- `+sme`, `+sme-i64` and `+sme-f64` are the (only) user-visible flags to enable SME support. They set `+pstate-sm1` and `+pstate-za1` by default.
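
As a rough illustration of how the flags above combine, here is a minimal
Python sketch. It is a simplified model, not LLVM's actual feature handling:
the names `DEFAULT_FEATURES`, `IMPLIED` and `resolve_features` are hypothetical,
and it assumes that `-feature` removes only the named feature.

```python
# Simplified model of the feature flags described in this patch summary.
# This is NOT LLVM's SubtargetFeatures logic; all names here are hypothetical.

# `pstate-sm0` is set by default for all subtargets.
DEFAULT_FEATURES = frozenset({"pstate-sm0"})

# The user-visible SME flags set `pstate-sm1` and `pstate-za1` by default.
IMPLIED = {
    "sme": ("pstate-sm1", "pstate-za1"),
    "sme-i64": ("pstate-sm1", "pstate-za1"),
    "sme-f64": ("pstate-sm1", "pstate-za1"),
}


def resolve_features(mattr):
    """Expand an -mattr style string such as '+sme,-pstate-sm1' into a set."""
    features = set(DEFAULT_FEATURES)
    for item in (part.strip() for part in mattr.split(",")):
        if not item:
            continue
        sign, name = item[0], item[1:]
        if sign == "+":
            features.add(name)
            features.update(IMPLIED.get(name, ()))
        elif sign == "-":
            # Assumption: only the named feature is removed.
            features.discard(name)
        else:
            raise ValueError("expected '+' or '-' prefix in %r" % item)
    return features
```

In this model, `resolve_features("+sme")` contains both `pstate-sm0` and
`pstate-sm1`, which mirrors the point that the compiler may have both set at
the same time.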

The set of streaming-compatible NEON/SVE/SVE2 instructions is predicated on
neither HasPSTATESM0 nor HasPSTATESM1, and can therefore be
expressed with e.g. `-mattr=+sve2,-pstate-sm0[,-pstate-sm1]`.

The set of streaming-agnostic SME instructions is not predicated on
HasPSTATESM1 and can be expressed with `-mattr=+sme,-pstate-sm1`.
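
The predication scheme can be sketched as a subset check over feature sets.
The following Python is only an illustration under stated assumptions: the
instruction-class names and the `REQUIRES` table are hypothetical stand-ins for
the HasPSTATESM0/HasPSTATESM1 predicates, not the actual TableGen definitions
in this patch.

```python
# Hypothetical instruction classes and the features each one is predicated on.
# This mirrors the description in the summary, not the real .td files.
REQUIRES = {
    "non-streaming SVE2": {"sve2", "pstate-sm0"},
    "streaming-compatible SVE2": {"sve2"},
    "streaming-only SME": {"sme", "pstate-sm1"},
    "streaming-agnostic SME": {"sme"},
}


def available(features):
    """Return the instruction classes whose predicates are all satisfied."""
    return {name for name, needed in REQUIRES.items() if needed <= features}
```

With the feature set corresponding to `-mattr=+sve2,-pstate-sm0`, i.e.
`{"sve2"}`, only the streaming-compatible SVE2 class remains. With the set
corresponding to `-mattr=+sme,-pstate-sm1`, only the streaming-agnostic SME
class remains.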

This is an alternative to the approach proposed in D120261 <https://reviews.llvm.org/D120261>, and follows from
an offline conversation with @rsandifo-arm.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D121208

Files:
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
  llvm/lib/Target/AArch64/AArch64InstrFormats.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/lib/Target/AArch64/AArch64SchedA64FX.td
  llvm/lib/Target/AArch64/AArch64Subtarget.cpp
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64SystemOperands.td
  llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
  llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCTargetDesc.cpp
  llvm/lib/Target/AArch64/SVEInstrFormats.td
  llvm/test/CodeGen/AArch64/sve-intrinsics-contiguous-prefetches.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-conversion.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-counting-bits.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-create-tuple.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-fp-converts.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-insert-extract-tuple.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-ldN-reg+imm-addr-mode.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-ldN-reg+reg-addr-mode.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-ldN-sret-reg+imm-addr-mode.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-ldN-sret-reg+reg-addr-mode.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-logical.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-pred-creation.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-pred-operations.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-pred-testing.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-reinterpret.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-reversal.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-sel.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-sqdec.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-sqinc.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-st1-addressing-mode-reg-imm.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-st1-addressing-mode-reg-reg.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-st1.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-stN-reg-imm-addr-mode.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-stN-reg-reg-addr-mode.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-stores.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-uqdec.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-uqinc.ll
  llvm/test/CodeGen/AArch64/sve-intrinsics-while.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-binary-narrowing-add-sub.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-binary-narrowing-shr.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-complex-dot.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-contiguous-conflict-detection.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-converts.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-int-binary-logarithm.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-widening-mul-acc.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-int-mul-lane.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-non-widening-pairwise-arith.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-polynomial-arithmetic.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-unary-narrowing.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-uniform-complex-arith.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-while.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-widening-complex-int-arith.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-widening-dsp.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-widening-pairwise-arith.ll
  llvm/test/MC/AArch64/SME/addha-u32.s
  llvm/test/MC/AArch64/SME/addha-u64.s
  llvm/test/MC/AArch64/SME/addva-u32.s
  llvm/test/MC/AArch64/SME/addva-u64.s
  llvm/test/MC/AArch64/SME/bfmopa.s
  llvm/test/MC/AArch64/SME/bfmops.s
  llvm/test/MC/AArch64/SME/directives-negative.s
  llvm/test/MC/AArch64/SME/feature.s
  llvm/test/MC/AArch64/SME/fmopa-fp64.s
  llvm/test/MC/AArch64/SME/fmopa.s
  llvm/test/MC/AArch64/SME/fmops-fp64.s
  llvm/test/MC/AArch64/SME/fmops.s
  llvm/test/MC/AArch64/SME/ld1b.s
  llvm/test/MC/AArch64/SME/ld1d.s
  llvm/test/MC/AArch64/SME/ld1h.s
  llvm/test/MC/AArch64/SME/ld1q.s
  llvm/test/MC/AArch64/SME/ld1w.s
  llvm/test/MC/AArch64/SME/ldr.s
  llvm/test/MC/AArch64/SME/mova.s
  llvm/test/MC/AArch64/SME/psel.s
  llvm/test/MC/AArch64/SME/revd.s
  llvm/test/MC/AArch64/SME/sclamp-diagnostics.s
  llvm/test/MC/AArch64/SME/sclamp.s
  llvm/test/MC/AArch64/SME/smopa-32.s
  llvm/test/MC/AArch64/SME/smopa-64.s
  llvm/test/MC/AArch64/SME/smops-32.s
  llvm/test/MC/AArch64/SME/smops-64.s
  llvm/test/MC/AArch64/SME/smstart.s
  llvm/test/MC/AArch64/SME/smstop.s
  llvm/test/MC/AArch64/SME/st1b.s
  llvm/test/MC/AArch64/SME/st1d.s
  llvm/test/MC/AArch64/SME/st1h.s
  llvm/test/MC/AArch64/SME/st1q.s
  llvm/test/MC/AArch64/SME/st1w.s
  llvm/test/MC/AArch64/SME/str.s
  llvm/test/MC/AArch64/SME/streaming-mode-neon-bf16.s
  llvm/test/MC/AArch64/SME/streaming-mode-neon-fp16.s
  llvm/test/MC/AArch64/SME/streaming-mode-neon.s
  llvm/test/MC/AArch64/SME/sumopa-32.s
  llvm/test/MC/AArch64/SME/sumopa-64.s
  llvm/test/MC/AArch64/SME/sumops-32.s
  llvm/test/MC/AArch64/SME/sumops-64.s
  llvm/test/MC/AArch64/SME/system-regs.s
  llvm/test/MC/AArch64/SME/uclamp-diagnostics.s
  llvm/test/MC/AArch64/SME/uclamp.s
  llvm/test/MC/AArch64/SME/umopa-32.s
  llvm/test/MC/AArch64/SME/umopa-64.s
  llvm/test/MC/AArch64/SME/umops-32.s
  llvm/test/MC/AArch64/SME/umops-64.s
  llvm/test/MC/AArch64/SME/usmopa-32.s
  llvm/test/MC/AArch64/SME/usmopa-64.s
  llvm/test/MC/AArch64/SME/usmops-32.s
  llvm/test/MC/AArch64/SME/usmops-64.s
  llvm/test/MC/AArch64/SME/zero.s
  llvm/test/MC/AArch64/SVE/abs.s
  llvm/test/MC/AArch64/SVE/add.s
  llvm/test/MC/AArch64/SVE/addpl.s
  llvm/test/MC/AArch64/SVE/addvl.s
  llvm/test/MC/AArch64/SVE/and.s
  llvm/test/MC/AArch64/SVE/ands.s
  llvm/test/MC/AArch64/SVE/andv.s
  llvm/test/MC/AArch64/SVE/asr.s
  llvm/test/MC/AArch64/SVE/asrd.s
  llvm/test/MC/AArch64/SVE/asrr.s
  llvm/test/MC/AArch64/SVE/bfcvt.s
  llvm/test/MC/AArch64/SVE/bfcvtnt.s
  llvm/test/MC/AArch64/SVE/bfdot.s
  llvm/test/MC/AArch64/SVE/bfmlal.s
  llvm/test/MC/AArch64/SVE/bic.s
  llvm/test/MC/AArch64/SVE/bics.s
  llvm/test/MC/AArch64/SVE/brka.s
  llvm/test/MC/AArch64/SVE/brkas.s
  llvm/test/MC/AArch64/SVE/brkb.s
  llvm/test/MC/AArch64/SVE/brkbs.s
  llvm/test/MC/AArch64/SVE/brkn.s
  llvm/test/MC/AArch64/SVE/brkns.s
  llvm/test/MC/AArch64/SVE/brkpa.s
  llvm/test/MC/AArch64/SVE/brkpas.s
  llvm/test/MC/AArch64/SVE/brkpb.s
  llvm/test/MC/AArch64/SVE/brkpbs.s
  llvm/test/MC/AArch64/SVE/clasta.s
  llvm/test/MC/AArch64/SVE/clastb.s
  llvm/test/MC/AArch64/SVE/cls.s
  llvm/test/MC/AArch64/SVE/clz.s
  llvm/test/MC/AArch64/SVE/cmpeq.s
  llvm/test/MC/AArch64/SVE/cmpge.s
  llvm/test/MC/AArch64/SVE/cmpgt.s
  llvm/test/MC/AArch64/SVE/cmphi.s
  llvm/test/MC/AArch64/SVE/cmphs.s
  llvm/test/MC/AArch64/SVE/cmple.s
  llvm/test/MC/AArch64/SVE/cmplo.s
  llvm/test/MC/AArch64/SVE/cmpls.s
  llvm/test/MC/AArch64/SVE/cmplt.s
  llvm/test/MC/AArch64/SVE/cmpne.s
  llvm/test/MC/AArch64/SVE/cnot.s
  llvm/test/MC/AArch64/SVE/cnt.s
  llvm/test/MC/AArch64/SVE/cntb.s
  llvm/test/MC/AArch64/SVE/cntd.s
  llvm/test/MC/AArch64/SVE/cnth.s
  llvm/test/MC/AArch64/SVE/cntp.s
  llvm/test/MC/AArch64/SVE/cntw.s
  llvm/test/MC/AArch64/SVE/compact.s
  llvm/test/MC/AArch64/SVE/cpy.s
  llvm/test/MC/AArch64/SVE/ctermeq.s
  llvm/test/MC/AArch64/SVE/ctermne.s
  llvm/test/MC/AArch64/SVE/decb.s
  llvm/test/MC/AArch64/SVE/decd.s
  llvm/test/MC/AArch64/SVE/dech.s
  llvm/test/MC/AArch64/SVE/decp.s
  llvm/test/MC/AArch64/SVE/decw.s
  llvm/test/MC/AArch64/SVE/dup.s
  llvm/test/MC/AArch64/SVE/dupm.s
  llvm/test/MC/AArch64/SVE/eon.s
  llvm/test/MC/AArch64/SVE/eor.s
  llvm/test/MC/AArch64/SVE/eors.s
  llvm/test/MC/AArch64/SVE/eorv.s
  llvm/test/MC/AArch64/SVE/ext.s
  llvm/test/MC/AArch64/SVE/fabd.s
  llvm/test/MC/AArch64/SVE/fabs.s
  llvm/test/MC/AArch64/SVE/facge.s
  llvm/test/MC/AArch64/SVE/facgt.s
  llvm/test/MC/AArch64/SVE/facle.s
  llvm/test/MC/AArch64/SVE/faclt.s
  llvm/test/MC/AArch64/SVE/fadd.s
  llvm/test/MC/AArch64/SVE/fadda.s
  llvm/test/MC/AArch64/SVE/faddv.s
  llvm/test/MC/AArch64/SVE/fcadd.s
  llvm/test/MC/AArch64/SVE/fcmeq.s
  llvm/test/MC/AArch64/SVE/fcmge.s
  llvm/test/MC/AArch64/SVE/fcmgt.s
  llvm/test/MC/AArch64/SVE/fcmla.s
  llvm/test/MC/AArch64/SVE/fcmle.s
  llvm/test/MC/AArch64/SVE/fcmlt.s
  llvm/test/MC/AArch64/SVE/fcmne.s
  llvm/test/MC/AArch64/SVE/fcmuo.s
  llvm/test/MC/AArch64/SVE/fcpy.s
  llvm/test/MC/AArch64/SVE/fcvt.s
  llvm/test/MC/AArch64/SVE/fcvtzs.s
  llvm/test/MC/AArch64/SVE/fcvtzu.s
  llvm/test/MC/AArch64/SVE/fdiv.s
  llvm/test/MC/AArch64/SVE/fdivr.s
  llvm/test/MC/AArch64/SVE/fdup.s
  llvm/test/MC/AArch64/SVE/fexpa.s
  llvm/test/MC/AArch64/SVE/fmad.s
  llvm/test/MC/AArch64/SVE/fmax.s
  llvm/test/MC/AArch64/SVE/fmaxnm.s
  llvm/test/MC/AArch64/SVE/fmaxnmv.s
  llvm/test/MC/AArch64/SVE/fmaxv.s
  llvm/test/MC/AArch64/SVE/fmin.s
  llvm/test/MC/AArch64/SVE/fminnm.s
  llvm/test/MC/AArch64/SVE/fminnmv.s
  llvm/test/MC/AArch64/SVE/fminv.s
  llvm/test/MC/AArch64/SVE/fmla.s
  llvm/test/MC/AArch64/SVE/fmls.s
  llvm/test/MC/AArch64/SVE/fmov.s
  llvm/test/MC/AArch64/SVE/fmsb.s
  llvm/test/MC/AArch64/SVE/fmul.s
  llvm/test/MC/AArch64/SVE/fmulx.s
  llvm/test/MC/AArch64/SVE/fneg.s
  llvm/test/MC/AArch64/SVE/fnmad.s
  llvm/test/MC/AArch64/SVE/fnmla.s
  llvm/test/MC/AArch64/SVE/fnmls.s
  llvm/test/MC/AArch64/SVE/fnmsb.s
  llvm/test/MC/AArch64/SVE/frecpe.s
  llvm/test/MC/AArch64/SVE/frecps.s
  llvm/test/MC/AArch64/SVE/frecpx.s
  llvm/test/MC/AArch64/SVE/frinta.s
  llvm/test/MC/AArch64/SVE/frinti.s
  llvm/test/MC/AArch64/SVE/frintm.s
  llvm/test/MC/AArch64/SVE/frintn.s
  llvm/test/MC/AArch64/SVE/frintp.s
  llvm/test/MC/AArch64/SVE/frintx.s
  llvm/test/MC/AArch64/SVE/frintz.s
  llvm/test/MC/AArch64/SVE/frsqrte.s
  llvm/test/MC/AArch64/SVE/frsqrts.s
  llvm/test/MC/AArch64/SVE/fscale.s
  llvm/test/MC/AArch64/SVE/fsqrt.s
  llvm/test/MC/AArch64/SVE/fsub.s
  llvm/test/MC/AArch64/SVE/fsubr.s
  llvm/test/MC/AArch64/SVE/ftsmul.s
  llvm/test/MC/AArch64/SVE/ftssel.s
  llvm/test/MC/AArch64/SVE/incb.s
  llvm/test/MC/AArch64/SVE/incd.s
  llvm/test/MC/AArch64/SVE/inch.s
  llvm/test/MC/AArch64/SVE/incp.s
  llvm/test/MC/AArch64/SVE/incw.s
  llvm/test/MC/AArch64/SVE/index.s
  llvm/test/MC/AArch64/SVE/insr.s
  llvm/test/MC/AArch64/SVE/lasta.s
  llvm/test/MC/AArch64/SVE/lastb.s
  llvm/test/MC/AArch64/SVE/ld1b-sve-only.s
  (414 more files...)


