[llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.

Wed Aug 19 13:17:30 PDT 2015

----- Original Message -----
> From: "Mehdi Amini" <mehdi.amini at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Philip Reames" <listmail at philipreames.com>, "LLVM Developers Mailing List" <llvm-dev at lists.llvm.org>
> Sent: Wednesday, August 19, 2015 2:54:56 PM
> Subject: Re: [llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.
> 
> 
> > On Aug 19, 2015, at 12:01 PM, Hal Finkel via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> > 
> > ----- Original Message -----
> >> From: "Philip Reames via llvm-dev" <llvm-dev at lists.llvm.org>
> >> To: "Pete Cooper" <peter_cooper at apple.com>, "Lang Hames"
> >> <lhames at gmail.com>
> >> Cc: "LLVM Developers Mailing List" <llvm-dev at lists.llvm.org>
> >> Sent: Wednesday, August 19, 2015 12:14:19 PM
> >> Subject: Re: [llvm-dev] [RFC] Generalize llvm.memcpy /
> >> llvm.memmove intrinsics.
> >> 
> >> On 08/19/2015 09:35 AM, Pete Cooper via llvm-dev wrote:
> >>> Hey Lang
> >>>> On Aug 18, 2015, at 6:04 PM, Lang Hames via llvm-dev
> >>>> <llvm-dev at lists.llvm.org> wrote:
> >>>> 
> >>>> Hi All,
> >>>> 
> >>>> I'd like to float two changes to the llvm.memcpy / llvm.memmove
> >>>> intrinsics.
> >>>> 
> >>>> 
> >>>> (1) Add an i1 <mayPerfectlyAlias> argument to the llvm.memcpy
> >>>> intrinsic.
> >>>> 
> >>>> When set to '1' (the auto-upgrade default), this argument would
> >>>> indicate that the source and destination arguments may perfectly
> >>>> alias (otherwise they must not alias at all - memcpy prohibits
> >>>> partial overlap). While the C standard says that memcpy's
> >>>> arguments can't alias at all, perfect aliasing works in
> >>>> practice, and clang currently relies on this behavior: it
> >>>> emits llvm.memcpys for aggregate copies, despite the
> >>>> possibility of self-assignment.
> >>>> 
> >>>> Going forward, llvm.memcpy calls emitted for aggregate copies
> >>>> would have mayPerfectlyAlias set to '1'. Other uses of
> >>>> llvm.memcpy (including lowerings from memcpy calls) would have
> >>>> mayPerfectlyAlias set to '0'.
> >>>> 
> >>>> This change is motivated by poor optimization for small memcpys
> >>>> on targets with strict alignment requirements. When a user
> >>>> writes a small, unaligned memcpy we may transform it into an
> >>>> unaligned load/store pair in instcombine (see
> >>>> InstCombine::SimplifyMemTransfer), which is then broken up into
> >>>> an unwieldy series of smaller loads and stores during
> >>>> legalization. I have a fix for this issue which tags the
> >>>> pointers for unaligned load/store pairs with noalias metadata,
> >>>> allowing CodeGen to produce better code during legalization, but
> >>>> it's not safe to apply while clang is emitting memcpys with
> >>>> pointers that may perfectly alias. If the 'mayPerfectlyAlias'
> >>>> flag were introduced, I could inspect that and add the noalias
> >>>> tag only if mayPerfectlyAlias is '0'.
> >>>> 
> >>>> Note: We could also achieve the desired effect by adding a new
> >>>> intrinsic (llvm.structcpy?) with semantics that match the
> >>>> current llvm.memcpy ones (i.e. perfect-aliasing or non-aliasing,
> >>>> but no partial overlap), and then reclaim llvm.memcpy for
> >>>> non-aliasing pointers only. I floated this idea with David
> >>>> Majnemer on IRC and he suggested that adding a flag to
> >>>> llvm.memcpy might be less disruptive and easier to maintain -
> >>>> thanks for the suggestion, David!
> >> Given there's a semantically conservative interpretation and a
> >> more optimistic one, this really sounds like a case for metadata,
> >> not another argument to the function.  Our memcpy could keep its
> >> current semantics, and we could add a piece of metadata which
> >> says none of the arguments to the call alias.
> > 
> > We could add some "memcpy-allows-self-copies" metadata, and have
> > Clang tag its associated aggregate copies with it. That would also
> > work.
> 
> Isn’t that introducing instruction-level metadata that affects
> correctness? Shouldn’t it be the opposite for correctness, i.e.
> “memcpy-disallows-self-copies”?
> (Correctness in the sense that dropping the metadata does not break
> anything.)
> 

Indeed, you're correct.
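
To make the point concrete, here is a minimal C sketch of why the
metadata has to be the "disallows" form. Everything in it is a
hypothetical model for illustration (CopySite, may_assume_no_alias,
drop_metadata and copy_big are made-up names, not LLVM data structures
or APIs). It assumes only what was said above: aggregate copies can be
self-assignments, and a pass is always allowed to drop metadata.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a memcpy call site; none of these names exist
 * in LLVM. */
struct CopySite {
    void *dst;
    const void *src;
    size_t len;
    bool disallows_self_copies; /* models the proposed, droppable metadata */
};

/* The optimistic (no-alias) assumption is only allowed when the
 * metadata is present. Without it, dst == src must be assumed possible. */
static bool may_assume_no_alias(const struct CopySite *site) {
    return site->disallows_self_copies;
}

/* Models a pass that discards metadata it does not understand. The
 * result falls back to the conservative interpretation. */
static struct CopySite drop_metadata(struct CopySite site) {
    site.disallows_self_copies = false;
    return site;
}

struct Big { int data[64]; };

/* An aggregate copy of the kind a frontend may lower to a memcpy.
 * dst and src may be the same object (self-assignment), which C
 * permits for assignment when the overlap is exact. */
static void copy_big(struct Big *dst, const struct Big *src) {
    assert(dst != NULL && src != NULL);
    *dst = *src;
}
```

With this shape, a call site produced from something like
copy_big(&a, &a) simply carries no metadata, and a site that loses its
metadata is treated the same way, so nothing breaks. The
"allows-self-copies" form would invert that: losing the tag would
silently license the no-alias assumption.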

 -Hal

> 
> 
> > 
> >> 
> >> Actually, can't we already get this interpretation by marking both
> >> argument pointers as noalias?  Doesn't that require that they
> >> don't overlap at all?  I think we just need the ability to specify
> >> noalias at the callsite for each argument.  I don't know if that's
> >> been tried, but it should work in theory.  There are some issues
> >> with control dependence of call site attributes though that we'd
> >> need to watch out for/fix.
> > 
> > But that's not quite what we want. We want to say: These can't
> > alias, unless they're exactly equal. noalias means that the
> > pointer does not alias at all, nor do any derived pointers, and
> > obviously the lack of it says nothing.
> > 
> > This way we can still make aliasing assumptions if we can prove
> > that src != destination, which is often easier than proving things
> > accounting for overlaps.
> 
> Is this limited to the memcpy case or are there other use-cases so
> that it would be worth having another attribute besides noalias that
> would carry this semantic (“nooverlap”)?
> 
> —
> Mehdi
> 
> 
> > 
> >>>> 
> >>>> 
> >>>> 
> >>>> (2) Allow different source and destination alignments on both
> >>>> llvm.memcpy / llvm.memmove.
> >>>> 
> >>>> Since I'm talking about changes to llvm.memcpy anyway, a few
> >>>> people asked me to float this one. Having separate alignments
> >>>> for the source and destination pointers may allow us to generate
> >>>> better code when one of the pointers has a higher alignment.
> >>>> 
> >>>> The auto-upgrade for this would be to set both source and
> >>>> destination alignment to the original 'align' value.
> >>> FWIW, I have a patch for this lying around.  I can dig it up.  I
> >>> use alignment attributes to do it as there’s no need for
> >>> alignment to be its own argument any more.
> >> This would be a nice cleanup in general.  +1
> > 
> > I agree, this sounds useful.
> > 
> > -Hal
> > 
> >>> 
> >>> Cheers,
> >>> Pete
> >>>> 
> >>>> 
> >>>> Any thoughts?
> >>>> 
> >>>> Cheers,
> >>>> Lang.
> >>>> 
> >>>> _______________________________________________
> >>>> LLVM Developers mailing list
> >>>> llvm-dev at lists.llvm.org
> >>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >> 
> >> 
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory