[llvm-dev] analysis based on nonnull attribute

Fri Dec 16 12:26:53 PST 2016

And this is where things get complicated...

My gut reaction is that inferring nonnull across function boundaries is 
probably a good idea.  Let me now figure out how to explain why. :)
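
For concreteness, here is a minimal sketch of the kind of cross-function 
inference I have in mind, in the same IR syntax used elsewhere in this 
thread.  The names @use, @wrapper and @wrapper.inferred are hypothetical.  
The sketch assumes the call executes on every path through @wrapper and 
that a null argument at a nonnull call site is never valid.

```llvm
; Hypothetical external callee that requires a non-null pointer.
declare void @use(i32* nonnull)

; Before: nothing on @wrapper's signature says %x is non-null.
define void @wrapper(i32* %x) {
entry:
  ; This call is reached unconditionally on entry to @wrapper.
  call void @use(i32* nonnull %x)
  ret void
}

; After the inference (assumed result): the fact is recorded on the parameter.
; Shown under a second name only so both versions fit in one module.
define void @wrapper.inferred(i32* nonnull %x) {
entry:
  call void @use(i32* nonnull %x)
  ret void
}
```

Once the attribute is on the parameter, later passes can read the fact off 
the signature instead of rediscovering it from the body.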

Most of the optimizer is structured around single function 
optimization.  Within a single function, we typically don't cache 
analysis results within the IR.  Across function boundaries, we do (e.g. 
readonly, readnone, etc.).  Why the split?

I *think* that at a high level, this comes down to an issue of 
practicality.  We don't have a good mechanism for accessing module level 
analysis results within function transform passes.  If we did, maybe 
we'd have a different set of decisions here.  Given we don't, we've made 
the decision that function boundaries are a reasonable place to 
summarize analysis results.
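
As a rough illustration of what "summarizing at the function boundary" 
looks like, consider the sketch below.  The functions @get_len and @caller 
are made up for this example, and the syntax is the current (2016) 
typed-pointer form.  Whether any particular pass actually merges the two 
calls is an assumption that depends on the pass and the LLVM version.

```llvm
; Hypothetical callee.  The readonly and nounwind function attributes are
; the kind of facts that can be inferred once from the body and then cached
; on the function itself.
define i32 @get_len(i32* nocapture readonly %p) readonly nounwind {
  %v = load i32, i32* %p
  ret i32 %v
}

; A function pass working on @caller never has to look inside @get_len:
; the attributes on the callee are the summary.
define i32 @caller(i32* %p) {
  %a = call i32 @get_len(i32* %p)
  ; No store separates the two calls, so a pass relying only on the
  ; readonly summary may be able to reuse %a for %b.
  %b = call i32 @get_len(i32* %p)
  %s = add i32 %a, %b
  ret i32 %s
}
```

By contrast, a fact about a value inside @caller (say, a known range for 
%s) is normally recomputed on demand rather than written back into the IR.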

p.s. I'll freely admit I'm out on a bit of a limb here philosophy-wise.  
If other folks have alternate views, I'd be very interested in hearing 
them.

Philip

On 12/16/2016 12:03 PM, Sanjay Patel wrote:
> Based on the earlier comments in this thread and the existence of a 
> transform that adds 'nonnull' to callsite params, I have proposed a 
> patch to extend nonnull to a parent function:
> https://reviews.llvm.org/D27855
> ...but given today's comments about inferring the analysis rather than 
> making it part of the IR, this might be the wrong approach?
>
> On Fri, Dec 16, 2016 at 12:49 PM, Philip Reames via llvm-dev 
> <llvm-dev at lists.llvm.org> wrote:
>
>
>
>     On 12/16/2016 11:37 AM, Michael Kuperstein wrote:
>>     Calling an instruction a "source" is basically another way to say
>>     "we can't dataflow through this".
>>
>>     What I'm trying to say is that this is not really a property of
>>     the instruction type.
>>     I agree we should be adding annotations sparingly - that is, we
>>     should not annotate something we can infer. But that's a semantic
>>     property, so I don't really see why that means we should prohibit
>>     annotating certain instructions on the syntactic level.
>     I'm not opposed to this per se, but I see it as a slippery slope
>     argument.  One of the foundational design principles of LLVM is
>     that analysis results are inferred from the IR, not part of the
>     IR.  This decision is hugely important for stability and
>     extensibility of the framework.  If we ever got to the day where
>     we were putting !range on an add instruction as part of a
>     transform pass, that would clearly be several steps too far.
>>
>>     Admittedly, the only example I have in mind right now is the one
>>     under discussion above - if we have:
>>
>>     %p = select i1 %a, i8* %x, i8* %y
>>     call void @foo(i8* nonnull %p)
>>
>>     Then after inlining foo, we lose the non-null information for %p
>>     unless we annotate it - and we can't propagate it through the
>>     select. The same would happen for a phi,
>     Are there cases where we lose information by inlining this
>     example?  Yes.  Are they common?  I don't know.  In particular, if
>     foo contains an unconditional load from %p, we don't actually
>     lose any information by inlining.  Similarly, we can frequently
>     infer the non-null fact from another source.
>
>     Just to be clear, I want to spell out a distinction between having
>     metadata available for frontend annotation and having the
>     optimizer itself introduce metadata.  The former is a much easier
>     request because it implies a much smaller maintenance burden.  If
>     we screw something up (compile time say), then only the frontend
>     that cared bears the cost.  If we start having the optimizer
>     itself introduce metadata (or assumes, etc.), then the
>     justification has to be sufficient for *all* frontends and use
>     cases.  In practice, that's going to be a much higher bar to clear.
>
>
>>
>>     On Fri, Dec 16, 2016 at 11:25 AM, Philip Reames
>>     <listmail at philipreames.com> wrote:
>>
>>         The general idea to date has been only "sources" get
>>         annotations.  If there's something we fundamentally *can't*
>>         analyze through, that's where we annotate.  We try not to use
>>         annotations for places where we could have but didn't.
>>
>>         e.g. call metadata/attributes allow us to model external
>>         calls, load metadata allows us to model frontend knowledge of
>>         external memory locations, etc.
>>
>>
>>         On 12/16/2016 11:03 AM, Michael Kuperstein via llvm-dev wrote:
>>>         By the way, I've been wondering - why can we only attach
>>>         !nonnull and !range to loads (for both) and call/invoke (for
>>>         !range)?
>>>
>>>         I mean, those are all instructions you can't do dataflow
>>>         through - intraprocedurally, w/o memoryssa - but why only
>>>         these instructions? Why not allow annotating any pointer def
>>>         with !nonnull and any integer def with !range?
>>>         Sure, that's redundant w.r.t llvm.assume, but so are the
>>>         existing annotations.
>>>
>>>         On Wed, Dec 14, 2016 at 11:20 PM, Hal Finkel
>>>         <hfinkel at anl.gov> wrote:
>>>
>>>
>>>
>>>             ------------------------------------------------------------------------
>>>
>>>                 *From: *"Michael Kuperstein"
>>>                 <michael.kuperstein at gmail.com>
>>>                 *To: *"Hal Finkel" <hfinkel at anl.gov>
>>>                 *Cc: *"Sanjay Patel" <spatel at rotateright.com>,
>>>                 "llvm-dev" <llvm-dev at lists.llvm.org>, "Michael
>>>                 Kuperstein" <mkuper at google.com>
>>>                 *Sent: *Thursday, December 15, 2016 1:13:07 AM
>>>                 *Subject: *Re: [llvm-dev] analysis based on nonnull
>>>                 attribute
>>>
>>>                 I think what Sanjay is getting at is that it's not
>>>                 an integer, it's still a pointer - but it's not
>>>                 clear where information about non-nullness of the
>>>                 pointer should be propagated to.
>>>
>>>                 In this particular case, since the def of %x in the
>>>                 caller is also an argument, we could propagate it to
>>>                 the def directly, e.g.
>>>
>>>                 define i1 @foo(i32* nonnull %x) {
>>>                   ; inlined, still known to be nonnull
>>>                   %y.i = load i32, i32* %x
>>>
>>>                 And if the def of %x was a load, we could use
>>>                 !nonnull. But I'm not sure what we can do in the
>>>                 general case (say, %x = select...).
>>>                 The best I can think of is generating an llvm.assume
>>>                 for the condition.
>>>
>>>             True. In this case, the preferred thing would be to add
>>>             the nonnull attribute to the caller's parameter. Adding
>>>             llvm.assume is indeed a general solution.
>>>
>>>              -Hal
>>>
>>>
>>>                 Michael
>>>
>>>                 On 14 December 2016 at 14:05, Hal Finkel via
>>>                 llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>
>>>
>>>                     ------------------------------------------------------------------------
>>>
>>>                         *From: *"Sanjay Patel"
>>>                         <spatel at rotateright.com>
>>>                         *To: *"Hal Finkel" <hfinkel at anl.gov>
>>>                         *Cc: *"llvm-dev" <llvm-dev at lists.llvm.org>
>>>                         *Sent: *Wednesday, December 14, 2016 4:03:40 PM
>>>                         *Subject: *Re: [llvm-dev] analysis based on
>>>                         nonnull attribute
>>>
>>>
>>>
>>>
>>>                         On Wed, Dec 14, 2016 at 2:51 PM, Hal Finkel
>>>                         <hfinkel at anl.gov> wrote:
>>>
>>>
>>>
>>>                             ------------------------------------------------------------------------
>>>
>>>                                 *From: *"Sanjay Patel via llvm-dev"
>>>                                 <llvm-dev at lists.llvm.org>
>>>                                 *To: *"llvm-dev"
>>>                                 <llvm-dev at lists.llvm.org>
>>>                                 *Sent: *Wednesday, December 14, 2016
>>>                                 3:47:03 PM
>>>                                 *Subject: *[llvm-dev] analysis based
>>>                                 on nonnull attribute
>>>
>>>                                 Does the nonnull parameter attribute
>>>                                 give us information about subsequent
>>>                                 uses of that value outside of the
>>>                                 function that has the attribute?
>>>
>>>                             Yes. We're guaranteeing that we never
>>>                             pass a null value for the argument, so
>>>                             that information can be used to optimize
>>>                             the caller as well.
>>>
>>>
>>>                         Thanks! I don't know if that will actually
>>>                         solve our sub-optimal output for dyn_cast
>>>                         (!), but it might help...
>>>                         https://llvm.org/bugs/show_bug.cgi?id=28430
>>>
>>>
>>>
>>>                                 Example:
>>>
>>>                                 ; %x must be non-null in this function
>>>                                 define i1 @bar(i32* nonnull %x) {
>>>                                   %y = load i32, i32* %x
>>>                                   %z = icmp ugt i32 %y, 23
>>>                                   ret i1 %z
>>>                                 }
>>>
>>>                                 define i1 @foo(i32* %x) {
>>>                                   %d = call i1 @bar(i32* %x)
>>>                                   ; check if null after call that guarantees non-null?
>>>                                   %null_check = icmp eq i32* %x, null
>>>                                   br i1 %null_check, label %t, label %f
>>>                                 t:
>>>                                   ret i1 1
>>>                                 f:
>>>                                   ret i1 %d
>>>                                 }
>>>
>>>                                 $ opt -inline nonnull.ll -S
>>>                                 ...
>>>                                 define i1 @foo(i32* %x) {
>>>                                   ; inlined and non-null knowledge is lost?
>>>                                   %y.i = load i32, i32* %x
>>>
>>>                             It should be replaced by !nonnull
>>>                             metadata on the load. We might not be
>>>                             doing that today, however.
>>>
>>>
>>>                         We can't tag this load with !nonnull though
>>>                         because this isn't a load of the pointer?
>>>                         "The existence of the !nonnull metadata on
>>>                         the instruction tells the optimizer that the
>>>                         value loaded is known to never be null. This
>>>                         is analogous to the nonnull attribute on
>>>                         parameters and return values. This metadata
>>>                         can only be applied to loads of a pointer
>>>                         type."
>>>
>>>                     True, but we have range metadata for integers.
>>>
>>>                      -Hal
>>>
>>>
>>>
>>>
>>>
>>>
>>>                     -- 
>>>                     Hal Finkel
>>>                     Lead, Compiler Technology and Programming Languages
>>>                     Leadership Computing Facility
>>>                     Argonne National Laboratory
>>>
>>>                     _______________________________________________
>>>                     LLVM Developers mailing list
>>>                     llvm-dev at lists.llvm.org
>>>                     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>