[llvm] r176408 - recommit r172363 & r171325 (reverted in r172756)
Hal Finkel
hfinkel at anl.gov
Sun Mar 3 15:38:51 PST 2013
----- Original Message -----
> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
> To: "Nuno Lopes" <nunoplopes at sapo.pt>
> Cc: llvm-commits at cs.uiuc.edu
> Sent: Sunday, March 3, 2013 5:13:10 PM
> Subject: Re: [llvm] r176408 - recommit r172363 & r171325 (reverted in r172756)
>
>
> On Mar 3, 2013, at 4:17 PM, Nuno Lopes <nunoplopes at sapo.pt> wrote:
>
> >> I have to agree with Shuxin here. The LangRef says about the
> >> semantics of "llvm.objectsize.i32/i64":
> >>
> >> "The llvm.objectsize intrinsic is designed to provide information
> >> to the optimizers to determine at compile time whether a) an
> >> operation (like memcpy) will overflow a buffer that corresponds
> >> to an object, or b) that a runtime check for overflow isn’t
> >> necessary. An object in this context means an allocation of a
> >> specific class, structure, array, or other object."
> >>
> >> "The llvm.objectsize intrinsic is lowered to a constant
> >> representing the size of the object concerned. If the size cannot
> >> be determined at compile time,llvm.objectsize returns i32/i64 -1
> >> or 0 (depending on the min argument)."
> >>
> >> Emphasis: "An object in this context means an allocation of a
> >> specific class, structure, array, or other object."
> >>
> >> To me this can be clearly interpreted as a requirement to point to
> >> one "identified underlying object" in LLVM terms. A front end
> >> could now assume that if "llvm.objectsize.i32/i64" returns a
> >> value other than the failurer value it is guaranteed that the
> >> pointer value points to one underlying object and not two.
> >
> > Let me give an example:
> >
> > %p1 = malloc(42);
> > %p2 = malloc(42);
> > %p3 = phi(%p1, %p2);
> > %size = @llvm.objectsize(%p3);
> >
> > The patch I commited enables instcombine to replace %size with the
> > constant 42.
>
> Yes, and that is the problem. I claim that it is questionable whether
> the semantics of "llvm.objectsize" does allow you to do so.
>
> "The llvm.objectsize intrinsic is lowered to a constant representing
> the size of THE object concerned"
> - http://llvm.org/docs/LangRef.html#llvm-objectsize-intrinsic
>
> There is no mentioning of more than one object. I believe you could
> interpret this as a contract that llvm.objectsize will only return a
> valid size value for one identified underlying object.
>
> "An object in this context means AN allocation of A specific class,
> structure, array, or other object."
> - http://llvm.org/docs/LangRef.html#llvm-objectsize-intrinsic
>
> Again, no mentioning of a set of objects. To me this reads as a clear
> specification of what the object is. ONE and not a set.
>
> If you think that the semantics of "llvm.objectsize" should be
> different in that it allows for a set of underlying objects, again,
> I think we have to change the LangRef. We might be breaking out of
> tree code that interpreted the semantics as I did, though.
>
> What I am concerned with is external code that took my interpretation
> of what llvm.objectsize means:
>
> Imagine a code generator taking my view and generating the following
> code:
>
> %p1 = malloc(42);
> %p2 = malloc(42);
> %p3 = phi(%p1, %p2);
> %size = @llvm.objectsize(%p3, true);
>
> if (%size != 0) {
> free(%p1)
> } else {
> free(%p1)
> free(%p2)
> }
I'll comment from the sidelines that, while I can see this is a theoretical possibility, we should not use it as a justification for inhibiting a useful optimization. Nevertheless, this seems to be a solvable problem. For example, we could add a third parameter to the objectsize intrinsic indicating whether ambiguous object sizes are allowed, and we could auto-upgrade the two argument form to the three argument form such that the new argument defaults to false.
-Hal
>
> Now you have a dangling pointer p2. Or imagine code that initialize a
> class with valid values …
>
> > I fail to see why that transformation is incorrect. While it's
> > true that we don't know to which object %p3 will point to at run
> > time, we know that its size will be 42. Of course, if we had
> > %p2=malloc(3)', then we couldn't do this constant propagation.
> > For the analysis of objectsize, it's safe to look to a superset of
> > the objects pointed-to by the input pointer.
> >
>
> It is true that we can possibly determine an "common size of a
> pointer pointing to a set of underlying objects" which is not
> necessarily the same thing as "llvm.objectsize".
>
> The problem is with what the semantics of llvm.objectsize is. What
> does it return if the pointer could point to two objects? I believe
> that from the text in
> "http://llvm.org/docs/LangRef.html#llvm-objectsize-intrinsic" you
> cannot replace it with a constant if you have several objects or at
> least that it is a feasible interpretation of the text.
>
> >
> > Shuxin: I'm not breaking any assumption here. If you notice, we
> > already do a similar analysis for select instructions. Phi nodes
> > were not processed because nobody (I mean, me) implemented the
> > code.
>
> Then, existing code also breaks the semantics of llvm.objectsize and
> has to be fixed. At least if you take the view that my
> interpretation of llvm.objectsize is a feasible one.
>
> Again, there are two issues here:
>
> What is the semantics of "llvm.objectsize"?
>
> What size can we compute for a pointer that might point to a set of
> underlying objects.
>
> Those might be two separate things, depending on which view you take
> on the semantics of "llvm.objectsize".
>
>
> > I agree BasicAA could use more comments. I'll add a comment there.
> >
> > Nuno
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
More information about the llvm-commits
mailing list