[PATCH] D117405: [AArch64] CodeGen for Armv8.8/9.3 MOPS
Son Tuan Vu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Sat Jan 15 12:56:07 PST 2022
tyb0807 created this revision.
Herald added subscribers: hiraditya, kristof.beyls.
tyb0807 requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.
This implements codegen for the Armv8.8/9.3 Memory Operations extension
(MOPS). Any memcpy/memset/memmove intrinsic will always be emitted
as a series of three instructions which perform the operation.
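As a rough illustration of the source-level calls affected, the sketch below uses the three standard C library routines. The helper names (copy_block, shift_right, fill_block) are hypothetical and not part of the patch. The assumption is that a compiler containing this change, targeting a CPU with +mops, would turn each call into one three-instruction MOPS sequence; on any other target the code simply calls the C library as usual.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers: only memcpy, memmove and memset come from the
 * C library. Each call below is a candidate for MOPS lowering when the
 * target has +mops (assumption based on the patch description). */

/* Copy n bytes between buffers that do not overlap. */
void copy_block(unsigned char *dst, const unsigned char *src, size_t n) {
    memcpy(dst, src, n);
}

/* Shift the contents of buf towards higher addresses by `shift` bytes.
 * Source and destination overlap, so memmove is required. */
void shift_right(unsigned char *buf, size_t len, size_t shift) {
    assert(shift <= len);
    memmove(buf + shift, buf, len - shift);
}

/* Fill n bytes with a single value. */
void fill_block(unsigned char *dst, unsigned char value, size_t n) {
    memset(dst, value, n);
}
```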
In addition, this introduces a new ACLE intrinsic for memset with tagging (see
https://github.com/ARM-software/acle/pull/38).
void *__builtin_arm_mops_memset_tag(void *, int, size_t)
A corresponding LLVM intrinsic is introduced:
i8* llvm.aarch64.mops.memset.tag(i8*, i8, i64)
The types match llvm.memset but the return type is not void.
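A minimal sketch of how the new builtin might be called from C follows. The wrapper name fill_and_tag is hypothetical. The feature-test macros __ARM_FEATURE_MOPS and __ARM_FEATURE_MEMORY_TAGGING are assumed from ACLE and are not stated in this message. The builtin path also assumes that dst points at memory with tagging enabled and that the range respects the 16-byte MTE granule. When the macros are not defined, the sketch falls back to plain memset, which writes no tags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper around the builtin added by this patch. */
void *fill_and_tag(void *dst, int value, size_t n) {
    assert(dst != NULL);
#if defined(__ARM_FEATURE_MOPS) && defined(__ARM_FEATURE_MEMORY_TAGGING)
    /* Builtin from the patch description. Unlike llvm.memset, the
     * underlying intrinsic has a non-void return type, so the result
     * is passed straight back to the caller. */
    return __builtin_arm_mops_memset_tag(dst, value, n);
#else
    /* Fallback for targets without MOPS and MTE: an ordinary memset.
     * No allocation tags are written on this path. */
    return memset(dst, value, n);
#endif
}
```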
1. SelectionDAG:
- New target SDNodes are added: AArch64ISD::MOPS_MEMSET, etc. Each intrinsic is translated to one of these in SelectionDAGBuilder via EmitTargetCodeForMOPS.
- A custom lowering routine for INTRINSIC_W_CHAIN is added to handle llvm.aarch64.mops.memset.tag. This takes a separate path from the common intrinsics but ultimately ends up in the same EmitMOPS().
2. GlobalISel:
- AArch64LegalizerInfo will now handle the following generic opcodes if +mops is available, instead of legalising them by expanding to libcalls: G_BZERO, G_MEMCPY_INLINE, G_MEMCPY, G_MEMMOVE, G_MEMSET. The s8 value of memset is legalised to s64 to match the pseudos.
- AArch64O0PreLegalizerCombinerInfo will not combine any of the generic opcodes. This means that small or zero-sized memory operations will not be optimised out if +mops is present.
- AArch64InstructionSelector will select the above as new pseudo instructions: AArch64::MOPSMemory{Copy/Move/Set/SetTagging}. These are each expanded in the usual place to a series of three instructions (e.g. SETP/SETM/SETE) which must be emitted together. To prevent the scheduler from moving unrelated instructions between the parts of a MOPS sequence, each sequence is placed inside a Machine Instruction Bundle, so that late scheduler passes handle it as a single unit.
- Furthermore, this makes the following additions to the LegalizerInfo API:
+ Add a 3-type version of customForCartesianProduct.
+ Expose immIdx() so that immediates can be marked as checked. This is required for G_MEMCPY etc., which have an immediate operand $tailcall; debug builds of LLVM check that all immediates have been handled by the legalizer.
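The three-part expansion mentioned above (e.g. SETP/SETM/SETE) can be pictured with a toy software model. This is only an illustration, not the architectural behaviour and not code from the patch: all names are hypothetical, and the way work is divided between the steps (align to 16 bytes, then 16-byte chunks, then the tail) is an arbitrary assumption. What it does show is that the steps share running state (destination pointer and remaining count), which is why nothing unrelated should be scheduled between them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* State carried from one step to the next (hypothetical model; in the
 * real sequence this lives in registers). */
typedef struct {
    unsigned char *dst;
    size_t remaining;
} SetState;

/* Prologue: store single bytes until dst is 16-byte aligned. */
void set_prologue(SetState *s, unsigned char v) {
    while (s->remaining > 0 && ((uintptr_t)s->dst % 16) != 0) {
        *s->dst++ = v;
        s->remaining--;
    }
}

/* Main: store whole 16-byte chunks. */
void set_main(SetState *s, unsigned char v) {
    while (s->remaining >= 16) {
        for (size_t i = 0; i < 16; i++)
            s->dst[i] = v;
        s->dst += 16;
        s->remaining -= 16;
    }
}

/* Epilogue: store whatever is left. */
void set_epilogue(SetState *s, unsigned char v) {
    while (s->remaining > 0) {
        *s->dst++ = v;
        s->remaining--;
    }
}

/* The three steps must run back to back on the same state. */
void *model_memset(void *dst, unsigned char v, size_t n) {
    assert(dst != NULL || n == 0);
    SetState s = { dst, n };
    set_prologue(&s, v);
    set_main(&s, v);
    set_epilogue(&s, v);
    return dst;
}
```

If another operation modified the shared state between two of the steps, the later steps would write to the wrong place or for the wrong length, which is the kind of interference the bundling is meant to rule out.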
Patch by Tomas Matheson and Lucas Prates.
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D117405
Files:
clang/include/clang/Basic/BuiltinsAArch64.def
clang/lib/CodeGen/CGBuiltin.cpp
clang/lib/Headers/arm_acle.h
clang/test/CodeGen/aarch64-mops.c
llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
llvm/include/llvm/IR/IntrinsicsAArch64.td
llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/lib/Target/AArch64/AArch64ISelLowering.h
llvm/lib/Target/AArch64/AArch64InstrInfo.td
llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp
llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.h
llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.h
llvm/lib/Target/AArch64/GISel/AArch64O0PreLegalizerCombiner.cpp
llvm/test/CodeGen/AArch64/aarch64-mops-consecutive.ll
llvm/test/CodeGen/AArch64/aarch64-mops-mte.ll
llvm/test/CodeGen/AArch64/aarch64-mops.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D117405.400316.patch
Type: text/x-patch
Size: 109833 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20220115/d7e5147b/attachment-0001.bin>