[LLVMdev] Troubling promotion of return value to Integer ...

Fri May 16 13:39:08 PDT 2008

On Fri, 2008-05-16 at 13:00 -0700, Chris Lattner wrote:
> Ok, so these are basically "copies with assertions".  There are a  
> couple of similar proposals for things like this (for example, with  
> pointers you might want to make some assertion w.r.t. aliasing).
> 
> The downside of this sort of approach (vs attributes) is that it  
> increases the size of every caller of the function.  The advantage  
> is that it is more explicit, potentially more general, and doesn't  
> require new attributes for each size.  OTOH, some of these advantages  
> go away if this gets extended for other properties: we end up having a  
> copy with attributes on it :)

Agreed, with one caveat:

I don't understand how attributes work at this point, but if the
use-case is specific to each caller, the attribute will need to be
attached separately to each call point, and the difference in space for
such cases vs "copies with assertions" may not turn out to be very big.

A general observation I would make -- this is coming from other systems
that I have worked on and does not reflect any understanding of LLVM
internals -- is that annotation schemes are often a source of error in
programs. Perhaps I have gotten them wrong when I have attempted them,
but what I have found in the past is that I end up with a lot of code
where I have a primary switch-like construct on the main object type (in
this case I imagine a switch on IR instruction), followed by a bunch of
special cases that look for applicable annotations within each case.

When code is structured this way, I have observed two types of coding
failure become common, plus a third, structural problem:

  1. Failures of case analysis, where some annotation is applicable in
     several places or states, but is missing in some subset of these.
     These can be hard to locate.

  2. Failures of update, of the form "I caught 4 of the 5 places that
     needed this update."

  3. What one really ends up with, in effect, is a situation where the
     *real* instruction (from a semantic perspective) was the alleged
     instruction plus the semantically important annotation, but the
     representational encoding of the real instruction has been made
     awkward to deal with by splitting it into multiple substructures.

That is: annotation schemes are not robust to error, in the sense that
maintenance will, with high likelihood, introduce errors over time.
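
To make the failure mode concrete, here is a minimal sketch of the
structure I am describing. It is not LLVM code: the Op enum, the Inst
struct, the resultFitsIn8Bits flag and knownBits() are all hypothetical
names invented for illustration.

```cpp
#include <cassert>

// Hypothetical mini-IR (not LLVM's): an opcode plus a side annotation.
enum class Op { Copy, Add, Trunc };

struct Inst {
    Op op;
    bool resultFitsIn8Bits;  // annotation: semantically important
};

// Number of significant result bits a later pass may rely on.
int knownBits(const Inst &i) {
    switch (i.op) {
    case Op::Copy:
        return i.resultFitsIn8Bits ? 8 : 32;  // annotation consulted here
    case Op::Trunc:
        return i.resultFitsIn8Bits ? 8 : 32;  // ... and again here
    case Op::Add:
        return 32;  // annotation forgotten: compiles cleanly, no warning
    }
    assert(false && "unknown opcode");
    return 32;
}
```

Every opcode is covered, so the compiler has nothing to complain about,
yet the Add case silently drops the annotation. That is the "failure of
case analysis" from item 1 above.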

Because of this, I have come to feel in my own code that
operation-specific annotations are often better dealt with by new
operations that can be inserted directly into the primary switch. I can
always run multiple cases into the same switch block, and the compiler
will help me notice which ones I forgot.  I try to reserve use of
annotations and attributes for situations where:

  1. The annotation/attribute is advisory or non-semantic, or

  2. The annotation/attribute is almost universally applicable, and is
     therefore not likely to need independent consideration at
     multiple points in a long switch-like construct. Instead
     it gets dealt with at one place, either at top or bottom of
     the switch.
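
As a rough sketch of the alternative, again with hypothetical names
(Op, CopyFits8, Inst, the unreachable flag and significantBits() are
mine, not LLVM's), the asserted copy becomes an operation of its own,
and the one universally applicable flag is handled once ahead of the
switch:

```cpp
#include <cassert>

// Hypothetical mini-IR (not LLVM's). The "copy with assertion" is its
// own opcode rather than a flag hanging off Copy.
enum class Op { Copy, CopyFits8, Add };

struct Inst {
    Op op;
    bool unreachable;  // hypothetical flag, meaningful for every opcode alike
};

// Number of significant result bits a later pass may rely on.
int significantBits(const Inst &i) {
    // Universally applicable annotation: dealt with in one place, up top.
    if (i.unreachable)
        return 0;

    // No default label, so -Wswitch (part of -Wall in GCC and Clang)
    // warns when a newly added opcode is not handled here.
    switch (i.op) {
    case Op::CopyFits8:
        return 8;
    case Op::Copy:  // several cases can share one block
    case Op::Add:
        return 32;
    }
    assert(false && "unknown opcode");
    return 32;
}
```

Adding a fourth opcode to Op without touching this switch produces a
warning at compile time, which is the help from the compiler I was
referring to. The cost is a somewhat larger opcode space.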

It's a trade-off, and I see no hard and fast rule that can be extracted
here, but my personal bias is to err in favor of robustness under
maintenance rather than in favor of reduced memory consumption.

shap