[llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.

Chandler Carruth via llvm-dev llvm-dev at lists.llvm.org
Mon Sep 7 19:41:34 PDT 2015


On Mon, Sep 7, 2015 at 7:31 PM Lang Hames <lhames at gmail.com> wrote:

> Hi Hal,
>
> > If you attach noalias metadata to the memcpy call, it will apply to both
> the source and destination; we don't have a way to differentiate. It might
> be true that if you attach both noalias and alias.scope metadata to the
> call, then querying the call against itself will return NoModRef, but
> that's really hacky (and, in part, wrong, because the destination still
> aliases with itself).
>
> Sorry it took me a while to get back to this, and thanks for the
> explanation. I had misremembered how noalias metadata worked, and was
> imagining we could tag the pointers themselves as non-aliasing (along the
> lines of the noalias parameter attribute).
>
>
> > I agree. Chatting with Chandler offline, he suggested that it might be
> better to have Clang emit a pointer-equality check and branch around the
> memcpy when the pointers are equal. This might be faster than the
> self-copies anyway, and we might often be able to statically prove the
> result of the comparison. I think this is worth experimenting with.
>
> That does sound interesting. So if we did this, the idea is that we could
> then teach the alias analysis passes about memcpy? I think that's what we'd
> need in order to attach the noalias metadata to the loads/stores in
> instcombine.
> Do you have any intuition for how much code we'd break if we did that
> (from other people abusing memcpy as we have)? Or whether it would improve
> our alias analysis? I have no idea, but I'm happy to take a shot at an
> initial implementation and run some experiments.
>

FWIW, we could auto-upgrade old IR with the same test and branch that Clang
would emit. I'm not 100% certain of the best way to detect the old IR, but we
do have options to ensure old bitcode continues to function.
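
To make the test-and-branch concrete, here is a minimal C sketch of what a
guarded aggregate copy could look like at the source level. The helper name
assign_S is hypothetical and only for illustration (the struct S is borrowed
from Lang's example further down the thread); the actual change would be in
the IR that Clang emits, not in user code.

```c
#include <assert.h>
#include <string.h>

typedef struct { unsigned X[8]; } S;

/* Hypothetical model of a guarded lowering of `*dst = *src`:
   compare the pointers first and branch around the memcpy when equal. */
static void assign_S(S *dst, const S *src) {
    assert(dst && src);
    if (dst != src) {
        /* Aggregate assignment rules out partial overlap, so once the
           pointers differ the two objects cannot overlap at all. */
        memcpy(dst, src, sizeof *dst);
    }
}
```

With the self-copy filtered out up front, the memcpy that remains only ever
sees operands that do not overlap.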


>
> Cheers,
> Lang.
>
> On Fri, Aug 21, 2015 at 2:57 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
>> ----- Original Message -----
>> > From: "Lang Hames" <lhames at gmail.com>
>> > To: "Hal Finkel" <hfinkel at anl.gov>
>> > Cc: "Mehdi Amini" <mehdi.amini at apple.com>, "LLVM Developers Mailing
>> List" <llvm-dev at lists.llvm.org>, "Philip Reames"
>> > <listmail at philipreames.com>, "Peter Cooper" <peter_cooper at apple.com>,
>> "Gerolf Hoflehner" <ghoflehner at apple.com>
>> > Sent: Friday, August 21, 2015 1:02:18 AM
>> > Subject: Re: [llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove
>> intrinsics.
>> >
>> > Hi Hal
>> >
>> > > By this I assume you mean some new 'nooverlap' metadata? I don't
>> > > think we have any existing metadata with the correct semantics.
>> >
>> >
>> > I was thinking we could just use the existing noalias metadata.
>> > Implicitly, the current llvm.memcpy semantics are "src and dst
>> > overlap perfectly or not at all" (perhaps we should update the docs
>> > to reflect this if we plan to rely on it?
>>
>> If we're going to do that, we certainly should.
>>
>> >). Attaching noalias
>> > metadata to the source and destination would capture the extra
>> > information that the pointers really do not overlap, when we can
>> > figure that out (e.g. when lowering a libc memcpy call).
>>
>> If you attach noalias metadata to the memcpy call, it will apply to both
>> the source and destination; we don't have a way to differentiate. It might
>> be true that if you attach both noalias and alias.scope metadata to the
>> call, then querying the call against itself will return NoModRef, but
>> that's really hacky (and, in part, wrong, because the destination still
>> aliases with itself).
>>
>> >
>> > It does seem odd that we would rely on the documented behaviour of
>> > libc memcpy (dst/src should not alias at all) to attach the noalias
>> > metadata, while simultaneously relying on the undocumented behaviour
>> > of libc memcpy (perfect aliasing works in practice) to lower to
>> > llvm.memcpy for struct copies. The clang struct copy code should
>> > probably carry a warning: Do what we say, not what we do.
>> >
>>
>> I agree. Chatting with Chandler offline, he suggested that it might be
>> better to have Clang emit a pointer-equality check and branch around the
>> memcpy when the pointers are equal. This might be faster than the
>> self-copies anyway, and we might often be able to statically prove the
>> result of the comparison. I think this is worth experimenting with.
>>
>> Thanks again,
>> Hal
>>
>> >
>> > Cheers,
>> > Lang.
>> >
>> >
>> >
>> >
>> >
>> > On Thu, Aug 20, 2015 at 4:17 PM, Hal Finkel < hfinkel at anl.gov >
>> > wrote:
>> >
>> >
>> > ----- Original Message -----
>> > > From: "Lang Hames" < lhames at gmail.com >
>> > > To: "Gerolf Hoflehner" < ghoflehner at apple.com >
>> > > Cc: "Mehdi Amini" < mehdi.amini at apple.com >, "LLVM Developers
>> > > Mailing List" < llvm-dev at lists.llvm.org >, "Hal Finkel"
>> > > < hfinkel at anl.gov >, "Philip Reames" < listmail at philipreames.com >,
>> > > "Peter Cooper" < peter_cooper at apple.com >
>> > > Sent: Thursday, August 20, 2015 4:26:20 PM
>> > > Subject: Re: [llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove
>> > > intrinsics.
>> > >
>> > >
>> > > Pete - That patch sounds great!
>> > >
>> > >
>> > > Philip, Hal, Mehdi, Gerolf - Thanks very much for the feedback.
>> > >
>> > >
>> > > So how about this:
>> > > (1) We drop llvm.memcpy's alignment argument and use Pete's
>> > > alignment-via-metadata patch (whatever version of it passes
>> > > review).
>> > > (2) llvm.memcpy retains its current semantics, but we teach clang,
>> > > SimplifyLibCalls, etc. to add noalias metadata where we know it's
>> > > safe.
>> >
>> > By this I assume you mean some new 'nooverlap' metadata? I don't
>> > think we have any existing metadata with the correct semantics.
>> >
>> > >
>> > > Dropping the alignment argument will still change the signature of
>> > > llvm.memcpy / llvm.memmove, so I guess there's one other issue worth
>> > > discussing: Should we also split 'isVolatile' into 'isSrcVolatile'
>> > > and 'isDstVolatile'?
>> >
>> > Yes. We should be able to specify all relevant properties of the
>> > source and destination separately. I see no reason not to do this.
>> >
>> > -Hal
>> >
>> >
>> >
>> > > Nobody has asked for this as far as I know,
>> > > but I believe it would improve codegen in some cases. E.g.:
>> > >
>> > >
>> > > typedef struct {
>> > >   unsigned X[8];
>> > > } S;
>> > >
>> > >
>> > > unsigned foo(volatile S* s) {
>> > >   S t = *s;
>> > >   return t.X[4];
>> > > }
>> > >
>> > > If the frontend lowers the struct copy to a volatile memcpy we'll
>> > > have to copy the whole struct before reading part of 't'. If we
>> > > could mark only the source as volatile then we could discard the
>> > > stores to 't'.
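
As a rough illustration of the source-only-volatile idea, the following C
sketch hand-writes what an optimizer might be allowed to produce if only the
source were volatile: every element of *s is still loaded, but nothing is
stored to a temporary. The function name and the element-wise access
granularity are assumptions; how a volatile aggregate copy is split into
individual accesses is not specified in this thread.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { unsigned X[8]; } S;

/* Hypothetical hand-optimized form of foo() when only the source is volatile. */
unsigned foo_src_volatile(volatile S *s) {
    assert(s != NULL);
    unsigned kept = 0;
    for (size_t i = 0; i < 8; ++i) {
        unsigned v = s->X[i];  /* volatile load: must still be performed */
        if (i == 4)
            kept = v;          /* the only value that is actually used */
    }
    return kept;               /* no stores to a temporary copy were needed */
}
```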
>> > >
>> > > Again - nobody has asked for this, but if there's interest, now
>> > > would be a good time to look at it.
>> > >
>> > > Cheers,
>> > > Lang.
>> > >
>> > >
>> > >
>> > >
>> > > On Wed, Aug 19, 2015 at 1:56 PM, Gerolf Hoflehner via llvm-dev <
>> > > llvm-dev at lists.llvm.org > wrote:
>> > >
>> > > On Aug 19, 2015, at 12:54 PM, Mehdi Amini via llvm-dev <
>> > > llvm-dev at lists.llvm.org > wrote:
>> > >
>> > >
>> > >
>> > >
>> > > On Aug 19, 2015, at 12:01 PM, Hal Finkel via llvm-dev <
>> > > llvm-dev at lists.llvm.org > wrote:
>> > >
>> > > ----- Original Message -----
>> > >
>> > >
>> > > From: "Philip Reames via llvm-dev" < llvm-dev at lists.llvm.org >
>> > > To: "Pete Cooper" < peter_cooper at apple.com >, "Lang Hames" <
>> > > lhames at gmail.com >
>> > > Cc: "LLVM Developers Mailing List" < llvm-dev at lists.llvm.org >
>> > > Sent: Wednesday, August 19, 2015 12:14:19 PM
>> > > Subject: Re: [llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove
>> > > intrinsics.
>> > >
>> > > On 08/19/2015 09:35 AM, Pete Cooper via llvm-dev wrote:
>> > >
>> > >
>> > > Hey Lang
>> > >
>> > >
>> > > On Aug 18, 2015, at 6:04 PM, Lang Hames via llvm-dev
>> > > < llvm-dev at lists.llvm.org > wrote:
>> > >
>> > > Hi All,
>> > >
>> > > I'd like to float two changes to the llvm.memcpy / llvm.memmove
>> > > intrinsics.
>> > >
>> > >
>> > > (1) Add an i1 <mayPerfectlyAlias> argument to the llvm.memcpy
>> > > intrinsic.
>> > >
>> > > When set to '1' (the auto-upgrade default), this argument would
>> > > indicate that the source and destination arguments may perfectly
>> > > alias (otherwise they must not alias at all - memcpy prohibits
>> > > partial overlap). While the C standard says that memcpy's
>> > > arguments can't alias at all, perfect aliasing works in practice,
>> > > and clang currently relies on this behavior: it emits
>> > > llvm.memcpys for aggregate copies, despite the possibility of
>> > > self-assignment.
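
The "perfectly alias or not at all" contract described above can be written
down as a small C predicate. This is only a sketch: the names operands_ok and
checked_copy are hypothetical, and comparing addresses through uintptr_t
assumes a flat address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the n-byte ranges at dst and src are identical or disjoint,
   and 0 if they partially overlap. */
static int operands_ok(const void *dst, const void *src, size_t n) {
    uintptr_t d = (uintptr_t)dst, s = (uintptr_t)src;
    if (d == s || n == 0)
        return 1;
    return d < s ? s - d >= n : d - s >= n;
}

/* Copies n bytes, checking the contract first and skipping the self-copy. */
static void checked_copy(void *dst, const void *src, size_t n) {
    assert(operands_ok(dst, src, n));
    if (dst != src)
        memcpy(dst, src, n);
}
```

A flag value of '0' would correspond to the stronger promise that the first
branch (d == s) never fires.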
>> > >
>> > > Going forward, llvm.memcpy calls emitted for aggregate copies
>> > > would have mayPerfectlyAlias set to '1'. Other uses of
>> > > llvm.memcpy (including lowerings from memcpy calls) would have
>> > > mayPerfectlyAlias set to '0'.
>> > >
>> > > This change is motivated by poor optimization for small memcpys on
>> > > targets with strict alignment requirements. When a user writes a
>> > > small, unaligned memcpy we may transform it into an unaligned
>> > > load/store pair in instcombine (See
>> > > InstCombine::SimplifyMemTransfer), which is then broken up into
>> > > an unwieldy series of smaller loads and stores during
>> > > legalization. I have a fix for this issue which tags the pointers
>> > > for unaligned load/store pairs with noalias metadata allowing
>> > > CodeGen to produce better code during legalization, but it's not
>> > > safe to apply while clang is emitting memcpys with pointers that
>> > > may perfectly alias. If the 'mayPerfectlyAlias' flag were
>> > > introduced, I could inspect that and add the noalias tag only if
>> > > mayPerfectlyAlias is '0'.
>> > >
>> > > Note: We could also achieve the desired effect by adding a new
>> > > intrinsic (llvm.structcpy?) with semantics that match the current
>> > > llvm.memcpy ones (i.e. perfect-aliasing or non-aliasing, but no
>> > > partial), and then reclaim llvm.memcpy for non-aliasing pointers
>> > > only. I floated this idea with David Majnemer on IRC and he
>> > > suggested that adding a flag to llvm.memcpy might be less
>> > > disruptive and easier to maintain - thanks for the suggestion
>> > > David!
>> > > Given there's a semantically conservative interpretation and a more
>> > > optimistic one, this really sounds like a case for metadata, not
>> > > another argument to the function. Our memcpy could keep its current
>> > > semantics, and we could add a piece of metadata which says none of
>> > > the arguments to the call alias.
>> > >
>> > > We could add some "memcpy-allows-self-copies" metadata, and have
>> > > Clang tag its associated aggregate copies with it. That would also
>> > > work.
>> > >
>> > > Isn’t that introducing instruction-wise “correctness”-related
>> > > metadata? Shouldn’t it be the opposite for correctness, i.e.
>> > > “memcpy-disallows-self-copies” (correctness in the sense that
>> > > dropping the metadata does not break anything)?
>> > >
>> > >
>> > > Actually, can't we already get this interpretation by marking both
>> > > argument pointers as noalias? Doesn't that require that they don't
>> > > overlap at all? I think we just need the ability to specify noalias
>> > > at the callsite for each argument. I don't know if that's been
>> > > tried, but it should work in theory. There are some issues with
>> > > control dependence of call site attributes though that we'd need to
>> > > watch out for/fix.
>> > >
>> > > But that's not quite what we want. We want to say: These can't
>> > > alias, unless they're exactly equal. noalias means that the pointer
>> > > does not alias at all, nor do any derived pointers, and obviously
>> > > the lack of it says nothing.
>> > >
>> > > This way we can still make aliasing assumptions if we can prove
>> > > that src != destination, which is often easier than proving things
>> > > accounting for overlaps.
>> > >
>> > > Is this limited to the memcpy case or are there other use-cases so
>> > > that it would be worth having another attribute than noalias that
>> > > would carry this semantic (“nooverlap”)?
>> > >
>> > >
>> > > I was wondering about that, too. This looks like information either
>> > > the user has or the compiler could derive. Would it be best to
>> > > condense the properties into *alias* and *align* attributes that
>> > > are also user visible?
>> > >
>> > >
>> > >
>> > >
>> > > —
>> > >
>> > >
>> > > Mehdi
>> > >
>> > > (2) Allow different source and destination alignments on both
>> > > llvm.memcpy / llvm.memmove.
>> > >
>> > > Since I'm talking about changes to llvm.memcpy anyway, a few
>> > > people asked me to float this one. Having separate alignments for
>> > > the source and destination pointers may allow us to generate
>> > > better code when one of the pointers has a higher alignment.
>> > >
>> > > The auto-upgrade for this would be to set both source and
>> > > destination alignment to the original 'align' value.
>> > > FWIW, I have a patch for this lying around. I can dig it up. I
>> > > use alignment attributes to do it as there’s no need for alignment
>> > > to be its own argument any more.
>> > > This would be a nice cleanup in general. +1
>> > >
>> > > I agree, this sounds useful.
>> > >
>> > > -Hal
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > Cheers,
>> > > Pete
>> > >
>> > >
>> > >
>> > >
>> > > Any thoughts?
>> > >
>> > > Cheers,
>> > > Lang.
>> > >
>> > > _______________________________________________
>> > > LLVM Developers mailing list
>> > > llvm-dev at lists.llvm.org
>> > >
>> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> > >
>> > >
>> > > --
>> > > Hal Finkel
>> > > Assistant Computational Scientist
>> > > Leadership Computing Facility
>> > > Argonne National Laboratory
>> > >
>> > >
>> > >
>> >
>> > --
>> > Hal Finkel
>> > Assistant Computational Scientist
>> > Leadership Computing Facility
>> > Argonne National Laboratory
>> >
>> >
>>
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>
>