[llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.
Lang Hames via llvm-dev
llvm-dev at lists.llvm.org
Tue Aug 18 18:04:48 PDT 2015
I'd like to float two changes to the llvm.memcpy / llvm.memmove intrinsics.
(1) Add an i1 <mayPerfectlyAlias> argument to the llvm.memcpy intrinsic.
When set to '1' (the auto-upgrade default), this argument would indicate
that the source and destination arguments may perfectly alias (otherwise
they must not alias at all - memcpy prohibits partial overlap). While the C
standard says that memcpy's arguments can't alias at all, perfect aliasing
works in practice, and clang currently relies on this behavior: it emits
llvm.memcpys for aggregate copies, despite the possibility of
Going forward, llvm.memcpy calls emitted for aggregate copies would have
mayPerfectlyAlias set to '1'. Other uses of llvm.memcpy (including
lowerings from memcpy calls) would have mayPerfectlyAlias set to '0'.
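
To make the aggregate-copy case concrete, here is a minimal C sketch of the kind of source that can hand llvm.memcpy two perfectly aliasing pointers. The type `Blob` and both function names are hypothetical, chosen only for illustration; whether a given assignment is actually lowered to llvm.memcpy depends on the aggregate and on the compiler.

```c
/* Hypothetical aggregate type, used only for illustration. */
struct Blob {
    int data[16];
};

/* An aggregate copy. Per the discussion above, clang may lower this
 * assignment to llvm.memcpy. Nothing stops a caller from passing the
 * same pointer twice, in which case source and destination alias
 * perfectly. C permits assignment between exactly overlapping objects
 * of the same type, so this is valid source code. */
void copy_blob(struct Blob *dst, const struct Blob *src) {
    *dst = *src;
}

/* Returns 1 if a self-copy leaves the value intact, 0 otherwise. */
int self_copy_preserves_value(void) {
    struct Blob b;
    int i;

    for (i = 0; i < 16; ++i)
        b.data[i] = i * 3;

    copy_blob(&b, &b); /* dst == src: the perfect-aliasing case */

    for (i = 0; i < 16; ++i)
        if (b.data[i] != i * 3)
            return 0;
    return 1;
}
```

Under the proposal, the copy inside `copy_blob` is the sort of llvm.memcpy that would keep mayPerfectlyAlias set to '1', since the compiler cannot rule out `dst == src`.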
This change is motivated by poor optimization for small memcpys on targets
with strict alignment requirements. When a user writes a small, unaligned
memcpy we may transform it into an unaligned load/store pair in instcombine
(see InstCombine::SimplifyMemTransfer), which is then broken up into an
unwieldy series of smaller loads and stores during legalization. I have a
fix for this issue which tags the pointers for unaligned load/store pairs
with noalias metadata, allowing CodeGen to produce better code during
legalization, but it's not safe to apply while clang is emitting memcpys
with pointers that may perfectly alias. If the 'mayPerfectlyAlias' flag
were introduced, I could inspect that and add the noalias tag only if
mayPerfectlyAlias is '0'.
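
For reference, the pattern in question looks roughly like the following C sketch. The function name is hypothetical. The point is only that the size is small and constant and that neither pointer is known to be aligned, which is the shape that may be turned into a single unaligned load/store pair.

```c
#include <string.h>

/* A small, fixed-size copy between byte pointers of unknown alignment.
 * This is an explicit memcpy call, so the C rule applies: dst and src
 * must not overlap. Under the proposal, a lowering of this call would
 * carry mayPerfectlyAlias = '0', which is what would make a noalias
 * tag on the resulting load/store pair safe. */
void copy4_unaligned(unsigned char *dst, const unsigned char *src) {
    memcpy(dst, src, 4);
}
```

A caller might use it as `copy4_unaligned(out + 1, in + 3)`, where the offsets guarantee that neither address can be assumed to be 4-byte aligned.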
Note: We could also achieve the desired effect by adding a new intrinsic
(llvm.structcpy?) with semantics that match the current llvm.memcpy ones
(i.e. perfect-aliasing or non-aliasing, but no partial), and then reclaim
llvm.memcpy for non-aliasing pointers only. I floated this idea with David
Majnemer on IRC and he suggested that adding a flag to llvm.memcpy might be
less disruptive and easier to maintain - thanks for the suggestion David!
(2) Allow different source and destination alignments on both llvm.memcpy /
Since I'm talking about changes to llvm.memcpy anyway, a few people asked
me to float this one. Having separate alignments for the source and
destination pointers may allow us to generate better code when one of the
pointers has a higher alignment.
The auto-upgrade for this would be to set both source and destination
alignment to the original 'align' value.
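
The two auto-upgrade rules described in this mail can be modeled together in a few lines of C. This is only a sketch of the mapping, not LLVM code: both struct types, their field names, and the function name are hypothetical, and the pointer and length arguments are left out because the upgrade would not change them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the relevant arguments of an existing call:
 * a single shared alignment plus the volatile flag. */
struct OldMemCpyArgs {
    uint32_t align;
    bool is_volatile;
};

/* Hypothetical model of the arguments after both proposed changes. */
struct NewMemCpyArgs {
    uint32_t dst_align;
    uint32_t src_align;
    bool is_volatile;
    bool may_perfectly_alias;
};

/* Auto-upgrade as described in the text:
 *  - both alignments take the original 'align' value;
 *  - mayPerfectlyAlias defaults to '1', the conservative choice that
 *    preserves today's behavior. */
struct NewMemCpyArgs upgrade_memcpy_args(struct OldMemCpyArgs old) {
    struct NewMemCpyArgs upgraded;
    upgraded.dst_align = old.align;
    upgraded.src_align = old.align;
    upgraded.is_volatile = old.is_volatile;
    upgraded.may_perfectly_alias = true;
    return upgraded;
}
```

For llvm.memmove only the alignment part would apply, since change (1) is proposed for llvm.memcpy alone.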