[llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.

Thu Aug 20 16:17:16 PDT 2015

----- Original Message -----
> From: "Lang Hames" <lhames at gmail.com>
> To: "Gerolf Hoflehner" <ghoflehner at apple.com>
> Cc: "Mehdi Amini" <mehdi.amini at apple.com>, "LLVM Developers Mailing List" <llvm-dev at lists.llvm.org>, "Hal Finkel"
> <hfinkel at anl.gov>, "Philip Reames" <listmail at philipreames.com>, "Peter Cooper" <peter_cooper at apple.com>
> Sent: Thursday, August 20, 2015 4:26:20 PM
> Subject: Re: [llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.
> 
> 
> Pete - That patch sounds great!
> 
> 
> Philip, Hal, Mehdi, Gerolf - Thanks very much for the feedback.
> 
> 
> So how about this:
> (1) We drop llvm.memcpy's alignment argument and use Pete's
> alignment-via-metadata patch (whatever version of it passes review).
> (2) llvm.memcpy retains its current semantics, but we teach clang,
> SimplifyLibCalls, etc. to add noalias metadata where we know it's
> safe.

By this I assume you mean some new 'nooverlap' metadata? I don't think we have any existing metadata with the correct semantics.

> 
> Dropping the alignment argument will still change the signature of
> llvm.memcpy / llvm.memmove, so I guess there's one other issue worth
> discussing: Should we also split 'isVolatile' into 'isSrcVolatile'
> and 'isDstVolatile' ?

Yes. We should be able to specify all relevant properties of the source and destination separately. I see no reason not to do this.

 -Hal

> Nobody has asked for this as far as I know,
> but I believe it would improve codegen in some cases. E.g.:
> 
> 
> typedef struct {
>   unsigned X[8];
> } S;
> 
> 
> unsigned foo(volatile S* s) {
>   S t = *s;
>   return t.X[4];
> }
> 
> If the frontend lowers the struct copy to a volatile memcpy we'll
> have to copy the whole struct before reading part of 't'. If we
> could mark only the source as volatile then we could discard the
> stores to 't'.
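As a rough illustration of the point above, the following C sketch shows by hand what a copy with only the source marked volatile would permit: every element of the source is still read, but nothing has to be stored to 't'. The function name foo_src_volatile is hypothetical, and this is only a sketch of the intended effect, not what any frontend or optimizer currently emits.

```c
typedef struct {
  unsigned X[8];
} S;

/* Hypothetical hand-written equivalent of foo() when only the source of the
   copy is volatile. Each element of *s is read exactly once, so the volatile
   accesses are preserved, but no temporary struct is ever written. */
unsigned foo_src_volatile(volatile S *s) {
  unsigned kept = 0;
  for (int i = 0; i < 8; ++i) {
    unsigned v = s->X[i]; /* volatile load: must not be removed */
    if (i == 4)
      kept = v;           /* the only value that has to survive */
  }
  return kept;
}
```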
> 
> Again - nobody has asked for this, but if there's interest now would
> be a good time to look at it.
> 
> Cheers,
> Lang.
> 
> 
> 
> 
> On Wed, Aug 19, 2015 at 1:56 PM, Gerolf Hoflehner via llvm-dev <
> llvm-dev at lists.llvm.org > wrote:
> 
> On Aug 19, 2015, at 12:54 PM, Mehdi Amini via llvm-dev <
> llvm-dev at lists.llvm.org > wrote:
> 
> 
> 
> 
> On Aug 19, 2015, at 12:01 PM, Hal Finkel via llvm-dev <
> llvm-dev at lists.llvm.org > wrote:
> 
> ----- Original Message -----
> 
> 
> From: "Philip Reames via llvm-dev" < llvm-dev at lists.llvm.org >
> To: "Pete Cooper" < peter_cooper at apple.com >, "Lang Hames" <
> lhames at gmail.com >
> Cc: "LLVM Developers Mailing List" < llvm-dev at lists.llvm.org >
> Sent: Wednesday, August 19, 2015 12:14:19 PM
> Subject: Re: [llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove
> intrinsics.
> 
> On 08/19/2015 09:35 AM, Pete Cooper via llvm-dev wrote:
> 
> 
> Hey Lang
> 
> 
> On Aug 18, 2015, at 6:04 PM, Lang Hames via llvm-dev
> < llvm-dev at lists.llvm.org > wrote:
> 
> Hi All,
> 
> I'd like to float two changes to the llvm.memcpy / llvm.memmove
> intrinsics.
> 
> 
> (1) Add an i1 <mayPerfectlyAlias> argument to the llvm.memcpy
> intrinsic.
> 
> When set to '1' (the auto-upgrade default), this argument would
> indicate that the source and destination arguments may perfectly
> alias (otherwise they must not alias at all - memcpy prohibits
> partial overlap). While the C standard says that memcpy's
> arguments can't alias at all, perfect aliasing works in practice,
> and clang currently relies on this behavior: it emits
> llvm.memcpys for aggregate copies, despite the possibility of
> self-assignment.
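The contract described above can be sketched as a small C predicate. The helper copy_operands_ok and its parameter names are hypothetical (not an LLVM API), and the pointer comparison assumes a flat address space where converting pointers to uintptr_t preserves their ordering.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true when an n-byte copy from src to dst
   respects the proposed contract. Partial overlap is never allowed; exact
   equality is allowed only when may_perfectly_alias is set. */
bool copy_operands_ok(const void *dst, const void *src, size_t n,
                      bool may_perfectly_alias) {
  uintptr_t d = (uintptr_t)dst;
  uintptr_t s = (uintptr_t)src;
  if (n == 0)
    return true;                /* an empty copy touches no memory */
  if (d == s)
    return may_perfectly_alias; /* self-copy, e.g. from a self-assignment */
  if (d < s)
    return s - d >= n;          /* dst range ends before src begins */
  return d - s >= n;            /* src range ends before dst begins */
}
```

An aggregate self-assignment such as '*p = *p' corresponds to the d == s case, which is why copies emitted for aggregates would need the flag set to '1'.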
> 
> Going forward, llvm.memcpy calls emitted for aggregate copies
> would have mayPerfectlyAlias set to '1'. Other uses of
> llvm.memcpy (including lowerings from memcpy calls) would have
> mayPerfectlyAlias set to '0'.
> 
> This change is motivated by poor optimization for small memcpys on
> targets with strict alignment requirements. When a user writes a
> small, unaligned memcpy we may transform it into an unaligned
> load/store pair in instcombine (See
> InstCombine::SimplifyMemTransfer), which is then broken up into
> an unwieldy series of smaller loads and stores during
> legalization. I have a fix for this issue which tags the pointers
> for unaligned load/store pairs with noalias metadata allowing
> CodeGen to produce better code during legalization, but it's not
> safe to apply while clang is emitting memcpys with pointers that
> may perfectly alias. If the 'mayPerfectlyAlias' flag were
> introduced, I could inspect that and add the noalias tag only if
> mayPerfectlyAlias is '0'.
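A C-level sketch of the difference, using 'restrict' as a stand-in for the noalias tag. Both function names are hypothetical and the code only illustrates the ordering constraint; it is not the actual output of legalization.

```c
#include <stdint.h>

/* Without aliasing information: all four byte loads happen before any
   store, so four values are live at once. This ordering is correct for any
   overlap between src and dst. */
void copy4_loads_then_stores(uint8_t *dst, const uint8_t *src) {
  uint8_t b0 = src[0], b1 = src[1], b2 = src[2], b3 = src[3];
  dst[0] = b0;
  dst[1] = b1;
  dst[2] = b2;
  dst[3] = b3;
}

/* With a no-alias guarantee ('restrict' here): each byte can be stored as
   soon as it is loaded. Calling this with overlapping ranges, including
   dst == src, violates the restrict contract, which is why the tag cannot
   be applied while the pointers may perfectly alias. */
void copy4_interleaved(uint8_t *restrict dst, const uint8_t *restrict src) {
  dst[0] = src[0];
  dst[1] = src[1];
  dst[2] = src[2];
  dst[3] = src[3];
}
```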
> 
> Note: We could also achieve the desired effect by adding a new
> intrinsic (llvm.structcpy?) with semantics that match the current
> llvm.memcpy ones (i.e. perfect-aliasing or non-aliasing, but no
> partial), and then reclaim llvm.memcpy for non-aliasing pointers
> only. I floated this idea with David Majnemer on IRC and he
> suggested that adding a flag to llvm.memcpy might be less
> disruptive and easier to maintain - thanks for the suggestion
> David!
> Given there's a semantically conservative interpretation and a more
> optimistic one, this really sounds like a case for metadata, not
> another argument to the function. Our memcpy could keep its current
> semantics, and we could add a piece of metadata which says none of
> the arguments to the call alias.
> 
> We could add some "memcpy-allows-self-copies" metadata, and have
> Clang tag its associated aggregate copies with it. That would also
> work.
> 
> Isn’t this introducing instruction-wise, “correctness”-related
> metadata? For correctness, shouldn’t it be the opposite, i.e.
> “memcpy-disallows-self-copies”?
> (Correctness in the sense that dropping the metadata does not break
> anything.)
> 
> Actually, can't we already get this interpretation by marking both
> argument pointers as noalias? Doesn't that require that they don't
> overlap at all? I think we just need the ability to specify noalias
> at the callsite for each argument. I don't know if that's been tried,
> but it should work in theory. There are some issues with control
> dependence of call site attributes, though, that we'd need to watch
> out for/fix.
> 
> But that's not quite what we want. We want to say: These can't alias,
> unless they're exactly equal. noalias means that the pointer does not
> alias at all, nor do any derived pointers, and obviously the lack of
> it says nothing.
> 
> This way we can still make aliasing assumptions if we can prove that
> src != destination, which is often easier than proving things
> accounting for overlaps.
> 
> Is this limited to the memcpy case, or are there other use-cases, so
> that it would be worth having another attribute than noalias that
> would carry this semantic (“nooverlap”)?
> 
> 
> I was wondering about that, too. This looks like information either
> the user has or the compiler could derive. Would it be best to
> condense the properties into *alias* and *align* attributes that are
> also user visible?
> 
> 
> 
> 
> —
> 
> 
> Mehdi
> 
> (2) Allow different source and destination alignments on both
> llvm.memcpy / llvm.memmove.
> 
> Since I'm talking about changes to llvm.memcpy anyway, a few
> people asked me to float this one. Having separate alignments for
> the source and destination pointers may allow us to generate
> better code when one of the pointers has a higher alignment.
> 
> The auto-upgrade for this would be to set both source and
> destination alignment to the original 'align' value.
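The auto-upgrade rule above amounts to duplicating the single alignment value. A minimal C sketch follows, with hypothetical record types (OldMemTransfer, NewMemTransfer) that only model the alignment information and are not LLVM data structures.

```c
/* Hypothetical model of the alignment carried by an existing call:
   one value that applies to both pointers. */
typedef struct {
  unsigned align;
} OldMemTransfer;

/* Hypothetical model of the proposed form: one value per pointer. */
typedef struct {
  unsigned dst_align;
  unsigned src_align;
} NewMemTransfer;

/* Auto-upgrade: both alignments start out equal to the original 'align',
   so an upgraded call claims nothing more than the old one did. */
NewMemTransfer upgrade_alignment(OldMemTransfer old) {
  NewMemTransfer upgraded = {old.align, old.align};
  return upgraded;
}
```

After the upgrade, a pass that learns more about one pointer could raise that pointer's alignment without touching the other.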
> FWIW, I have a patch for this lying around. I can dig it up. I
> use alignment attributes to do it as there’s no need for alignment
> to be its own argument any more.
> This would be a nice cleanup in general. +1
> 
> I agree, this sounds useful.
> 
> -Hal
> 
> 
> 
> 
> 
> 
> Cheers,
> Pete
> 
> 
> 
> 
> Any thoughts?
> 
> Cheers,
> Lang.
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.llvm.org_cgi-2Dbin_mailman_listinfo_llvm-2Ddev&d=BQIGaQ&c=eEvniauFctOgLOKGJOplqw&r=03tkj3107244TlY4t3_hEgkDY-UG6gKwwK0wOUS3qjM&m=Js9_JWwnnCSoMnHhNlCr8sySTkjrVAbkaLqUP-49_x8&s=fAOxwvp7OA1L-OJfpwmZClRuD_eqxcJWA9p2bZ2-zz0&e=
> 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory