[llvm-dev] [RFC] Memory region declaration intrinsic

Roman Lebedev via llvm-dev llvm-dev at lists.llvm.org
Tue Dec 7 14:39:17 PST 2021


On Wed, Dec 8, 2021 at 12:57 AM Eli Friedman <efriedma at quicinc.com> wrote:
>
> How does this interact with "inbounds" markings on GEPs?  Can the result of a non-inbounds GEP point outside the specified region?
Uh, it's a trick question, isn't it? :)

Right now I would basically say that "inbounds" markings on GEPs is
orthogonal here, because this intrinsic simply "clamps" the underlying
"allocated object".
(and naturally, you can't unclamp it afterwards *in this def-use chain*)

I.e. if you go out of the specified bounds with `inbounds` you get
poison (and i think you can't go back from there?); but if you go
out of the specified bounds without `inbounds` it's not poison,
but dereferencing said pointer is still UB.

Let me know if this isn't a proper answer,
or if i have talked myself into a corner here :)

> I assume the input and result pointer alias?
The input pointer is directly returned, so they are the same,
so they are mustalias.

> Given that, I don't think the "nocapture" is right.
Hmm. I guess: https://alive2.llvm.org/ce/z/frx2aI
Okay.

> (There's a potential use-case for a similar intrinsic where the result doesn't alias the argument, but I guess that would be a different thing.)
Nothing immediately comes into mind.

> The usual concern with this sort of intrinsic is the work involved in teaching a bunch of pointer optimizations the meaning of the new intrinsic.
Yes, ignoring the implementation cost of the improvements that will be
made possible, there's going to be //some// implementation cost incurred
to de-pessimize the existing transformations, or, perhaps more importantly,
avoiding losing this intrinsic during transformations.
I acknowledge this worry.

> Encoding array bounds is probably important enough to be worth the extra work, though.
I think it'll solve a big knowledge propagation hole,
so i certainly hope so, yes.

> -Eli
Roman

> > -----Original Message-----
> > From: Roman Lebedev <lebedev.ri at gmail.com>
> > Sent: Tuesday, December 7, 2021 11:24 AM
> > To: llvm-dev at lists.llvm.org
> > Cc: Johannes Doerfert <jdoerfert at anl.gov>; Eli Friedman
> > <efriedma at quicinc.com>; Clement Courbet <courbet at google.com>; Nuno
> > Lopes <nunoplopes at sapo.pt>; Nikita Popov <nikita.ppv at gmail.com>; Peter
> > Collingbourne <peter at pcc.me.uk>; Philip Reames <listmail at philipreames.com>
> > Subject: [RFC] Memory region declaration intrinsic
> >
> > Hi all.
> >
> > Differential: https://reviews.llvm.org/D115274
> >
> > This is a follow-up to the "[llvm-dev] [RFC] Adding range metadata to
> > array subscripts.",
> > https://lists.llvm.org/pipermail/llvm-dev/2021-March/149390.html
> >
> > Problem statement:
> >
> > As per C 6.5.6p9 / http://eel.is/c++draft/expr.add#4, given
> > ```
> > struct S {
> >     int a[3];
> >     int b[3];
> >     int c[3];
> > };
> >
> > void bar(int*);
> >
> > void foo(S* s) {
> >   bar(&s.b[1]);
> > }
> > ```
> > even though the pointer the bar receives has 4 ints to the left of it
> > and 4 to the right of it, the only ints it can access are
> > one to the left and one to the right. I.e. it can not go outside of the S::b.
> >
> > But, there is currently no way to encode that knowledge into LLVM IR.
> > There's limited `inrange` thing  for constant expression GEP's,. since:
> > * https://reviews.llvm.org/D22793
> > * https://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html
> >
> > ... but it's limited to constant expressions. There were previous attempts at
> > removing that restriction, namely that RFC and my patch:
> > https://reviews.llvm.org/D114988, however implementation experience/review
> > pointed out a few design problems:
> > 1. Poor opaque pointers interop, it requires the GEP to be into a structure,
> >     so if it's a pure pointer computation, we suddenly can't preserve
> > the knowledge.
> > 2. While just adding a bit[s] to GEP instruction allows the
> > transformation to just ignore it
> >     if they aren't explicitly taught about it, which is fine from a
> > legality standpoint,
> >     it complicates it's preservation through transformation.
> > 3. While i'm not sure how useful it would be, it limits us to
> > statically-sized arrays.
> >
> > Instead of following through with that, let me propose a new design:
> >
> > <begin langref>
> > ```
> > .. _int_memory_region_decl:
> >
> > '``llvm.memory.region.decl``' Intrinsic
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> > Syntax:
> > """""""
> >
> > ::
> >
> >       declare i8* @llvm.memory.region.decl.p0i8(i8* nocapture readnone
> > returned <ptr>, i64 <begin_offset>, i64 <end_offset>) nofree nosync
> > nounwind readnone speculatable willreturn
> >
> > Overview:
> > """""""""
> >
> > The '``llvm.memory.region.decl``' intrinsic annotates memory region.
> >
> > Arguments:
> > """"""""""
> >
> > This is an overloaded intrinsic. The memory region can belong to any address
> > space. The first argument is a pointer into the memory region. The returned
> > pointer, which is the first argument, must belong to the same address space
> > as the argument. The second argument specifies the offset to the pointer (the
> > first argument) at which the memory region begins. The third argument specifies
> > the offset to the pointer (the first argument) at which the memory region ends.
> >
> > Semantics:
> > """"""""""
> >
> > The returned pointer, and, transitively, any pointer that is def-use based on
> > that pointer, points into the memory region ``[ptr+begin_offset,
> > ptr+end_offset)``,
> > or is a :ref:`poison value <poisonvalues>` otherwise.
> >
> > This intrinsic is intended to be an optimization hint, there are no correctness
> > concerns with completely ignoring and/or dropping it. The main use-case is
> > to be able to annotate array bounds in C family of languages,
> > which may allow alloca splitting, and better alias analysis.
> > ```
> > </end langref>
> >
> > Example:
> > ```
> > struct S {
> >   int a;
> >   int b[4];
> > };
> > int* get(S*s, int i) {
> >   return &s->b[i];
> > }
> > ```
> > is currently lowered into
> > ```
> > define dso_local nonnull i32* @_Z3getP1Si(%struct.S* readnone %s, i32
> > %i) local_unnamed_addr #0 {
> >  %idxprom = sext i32 %i to i64
> > %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
> > i32 1, i64 %idxprom
> >  ret i32* %arrayidx
> > }
> > ```
> > would instead be lowered into
> > ```
> > define dso_local nonnull i32* @_Z3getP1Si(%struct.S* readnone %s, i32
> > %i) local_unnamed_addr #0 {
> >  %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
> > i32 1, i64 0
> >  %arrayidx.bounded = call i32* @llvm.memory.region.decl.p0i32(i32*
> > %arrayidx, i64 0, i64 32)
> >  %idxprom = sext i32 %i to i64
> >  %arrayidx3 = getelementptr inbounds i32, i32* %arrayidx.bounded, i64
> > %idxprom
> >  ret i32* %arrayidx3
> > }
> > ```
> > Concretely, this tells us that %i u<= 4, which should be useful for
> > Alias Analysis
> > in less contrived snippets.
> >
> > The other motivational example, although still contrived:
> > ```
> > struct S {
> >  int a;
> >  int b[4];
> > };
> > int stuff(int i, int array_val, int j, int scalar_val) {
> >  S s;
> >  s.a = scalar_val;
> >  s.b[i] = array_val;
> >  return s.a;
> > }
> > ```
> > currently results in:
> > ```
> > define dso_local i32 @_Z5stuffiiii(i32 %i, i32 %array_val, i32 %j, i32
> > %scalar_val) local_unnamed_addr #0 {
> > entry:
> > %s = alloca %struct.S, align 4
> > %0 = bitcast %struct.S* %s to i8*
> > call void @llvm.lifetime.start.p0i8(i64 20, i8* nonnull %0) #2
> > %a = getelementptr inbounds %struct.S, %struct.S* %s, i64 0, i32 0
> > store i32 %scalar_val, i32* %a, align 4, !tbaa !3
> > %idxprom = sext i32 %i to i64
> > %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
> > i32 1, i64 %idxprom
> > store i32 %array_val, i32* %arrayidx, align 4, !tbaa !8
> > %1 = load i32, i32* %a, align 4, !tbaa !3
> > call void @llvm.lifetime.end.p0i8(i64 20, i8* nonnull %0) #2
> > ret i32 %1
> > }
> > ```
> > Notice the problem? `array_val` couldn't have been stored into `S::a`,
> > this particular example should optimize to just
> > ```
> > define dso_local i32 @_Z5stuffiiii(i32 %i, i32 %array_val, i32 %j, i32
> > %scalar_val) local_unnamed_addr #0 {
> >  ret i32 %scalar_val
> > }
> > ```
> >
> > The even bigger picture here is that SROA simply gives up in presence
> > of variable GEP's,
> > but if we annotate the extents of such a variable GEP, then, given
> > right circumstances,
> > we may be able to conclude that the alloca could be split up, and
> > certain parts be promoted.
> > That is the main motivation for me behind this.
> >
> > I think, this is sufficient information, but let me know if i should
> > address something else.
> >
> > Roman.


More information about the llvm-dev mailing list