[PATCH] D115274: [IR][RFC] Memory region declaration intrinsic

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 21 16:39:46 PDT 2022


lebedev.ri marked 2 inline comments as done.
lebedev.ri added inline comments.


================
Comment at: llvm/docs/LangRef.rst:20876
+to be able to annotate array bounds in C family of languages,
+which may allow alloca splitting, and better alias analysis.
+
----------------
arichardson wrote:
> lebedev.ri wrote:
> > arichardson wrote:
> > > lebedev.ri wrote:
> > > > arichardson wrote:
> > > > > Do you envision this being used for all sub-object pointer creations? If so it might need a flag to disable it since it might break some C patterns such as `container_of`.
> > > > > 
> > > > > According to https://godbolt.org/z/evTbejaMf the container_of macro results in an inbounds GEP, so with sufficient inlining things might break?
> > > > > 
> > > > > About three years ago I spent quite a lot of time enforcing sub-object bounds at runtime using CHERI. Almost all code works just fine, but there are things such as `container_of` that require opt-out annotations. I wrote about the incompatibilities that I found in Chapter 5 of https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-949.pdf. TL;DR: not many changes were needed, about 50 annotations across the entire FreeBSD source tree. Almost all annotations were due to `container_of` or emulation of C++ inheritance in C.
> > > > I'm not sure what you mean by "all sub-object pointer creations".
> > > > 
> > > > Roughly, front-ends should emit this intrinsic on some pointer with some bounds
> > > > iff they know that it would be UB to go *from that specific pointer* (i.e., following its def-use chain)
> > > > outside of the specified bounds.
> > > > 
> > > > The one case we know of is C arrays within structs.
> > > By sub-object pointer creations I mean something like `&obj->field`. You could conceivably treat that as declaring a new sub-object bounded to just `field`. E.g. something like this: https://godbolt.org/z/bM7j1bxqs
> > I think the question is slightly wrong. It's up to front-ends to decide when they can and can't emit this.
> > 
> > If you are asking about https://godbolt.org/z/adq5EWx17,
> > then as per previous conversations about this, I do believe that code to be well-defined and not UB.
> > 
> > IOW, I do **not** believe that, as per the current C/C++ standards wording, each data member of a struct
> > is its own sub-object from which you are not allowed to get to its neighbor objects.
> > But perhaps @aaron.ballman wants to correct me on this.
> > 
> Yes absolutely agreed that this is purely up to the frontend, I just assumed you had plans to update clang to emit the new intrinsic.
> 
> It has been a long time since I looked at the C standard with regard to subobjects, but if I recall correctly, you are right that this is not defined as being illegal. However, doesn't that also mean that you can access the member before/after an embedded array?
Right. Having a pointer to the array member of a struct isn't going to do anything.
The magic happens when you have a pointer to an *element* of the array member:
https://godbolt.org/z/cjE5bY4G4 <- manually crafted
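
To make the distinction above concrete, here is a minimal C sketch. It is an illustration only: `struct packet`, `read_after_via_member`, and `read_element` are hypothetical names that do not come from the patch or from the godbolt links, and the comments about where bounds could be declared follow the reading of the standard given in this thread, not anything Clang is known to emit today.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct: an array member with a neighbor on each side. */
struct packet {
    int before;
    int data[4];
    int after;
};

/* A minimal container_of in the usual C style: recover the enclosing
   struct from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Pointer to the array member as a whole. Per the discussion above,
   stepping from here to a neighboring member is assumed to be
   well-defined, so no bounds would be declared on this pointer. */
int read_after_via_member(int (*arr)[4])
{
    struct packet *p = container_of(arr, struct packet, data);
    return p->after;
}

/* Pointer to an element of the array member. Indexing outside
   data[0..3] from this pointer would be UB, so this is the kind of
   pointer a front-end could annotate with the array's bounds. */
int read_element(int *elem0, size_t idx)
{
    assert(idx < 4); /* stay inside the array the pointer was derived from */
    return elem0[idx];
}
```

With `struct packet p = { 1, {10, 20, 30, 40}, 2 };`, calling `read_after_via_member(&p.data)` reaches `p.after` through the member pointer, while `read_element(&p.data[0], 3)` stays within the array.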


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115274/new/

https://reviews.llvm.org/D115274


