[LLVMdev] The nsw story

Mon Dec 5 19:42:00 PST 2011

On Dec 5, 2011, at 4:44 PM, Paul Robinson wrote:

> (If this thread is becoming tiresome, let me know. This newbie is trying to
> understand some of what's going on; clearly you've thought about it way more
> than I have, and I can understand if you want to stop thinking about it!)
> 
> On Mon, Dec 5, 2011 at 2:22 PM, Dan Gohman <gohman at apple.com> wrote:
> On Dec 5, 2011, at 11:55 AM, Paul Robinson wrote:
> >
> > On Thu, Dec 1, 2011 at 9:37 AM, David A. Greene <greened at obbligato.org> wrote:
> > Dan Gohman <gohman at apple.com> writes:
> >
> > > Prohibiting poison values from propogating through memory would mean
> > > that the reg2mem pass would no longer be a semantics-preserving pass.
> >
> > Or it means you couldn't demote those values.
> >
> > If reg2mem is constructing spill slots, or address-not-taken locals, I could
> > see classifying those as okay-to-poison. You would not want statics or globals
> > or pointer-addressed memory to be poisonable, as those should be treated
> > as being observable locations.
> >
> > So, it's not really "all of memory" that could be poisoned, only these
> > compiler-generated things.
> 
> This would mean that it's not valid to do store-to-load propagation
> in the optimizer if the pointee might be a global variable. That
> would mean the loss of real optimizations.
> 
> And it would still mean that non-volatile store instructions suddenly
> can have side effects beyond just writing to their addressed memory.
> That would be surprising.
> 
> I think Dave also suggested making select instructions be
> observation points. That would mean that select instructions would
> have side effects. Again, that would be surprising.
> 
> Dan
> 
> Back in your first message on this thread, you did say:
> 
> >The poison would
> >propagate down the def-use graph until it reaches an instruction that
> >is capable of triggering the undefined behavior.
> 
> So, somebody somewhere has to be capable of triggering the undefined
> behavior. What instructions did you have in mind? If the store,
> select, and compare instructions aren't appropriate, which ones are?

A call to printf (I/O), a volatile memory operation, etc. These things
can never be speculated. They are essentially the anchors of the system.

> I don't think this next suggestion was on your list, so here goes:
> Drawing from the Itanium example, we could define a new 'check' 
> instruction, which would explicitly trigger the undefined behavior if its
> operand is poisoned, or produce a known-safe value otherwise. Yes, the
> IR would have to be sprinkled with these things; but only outputs of 'nsw'
> kinds of operations would subsequently need a 'check' to sanitize them.
> 
> So, if a potentially poison value propagates down the def-use graph
> until it reaches a store/select/compare, that instruction must be 
> preceded by a 'check' that will sanitize the value.
> 
> (Or maybe we should call it the 'foodtaster' instruction, which protects
> the important instructions from poison...)
> 
> Separating out the induce-undefined-behavior instruction means the
> existing semantics of store/select/compare are unaffected, the main
> consequences being that you can't move these guys past the 'check'
> that feeds them.  But that constraint would exist automatically anyway,
> right? because you couldn't move an instruction up prior to where its
> input values are defined.
> 
> Clever/optimal placement of 'check' becomes a topic of interest, but
> that's kind of the point of the exercise.

For example, suppose we want to convert the && to &, and the ?: to a
select, in this code:

if (a && (b ? (c + d) : e)) {

because we have a CPU architecture with poor branch prediction, or
because we want more ILP, or because of some other reason. Here's what the
LLVM IR for that might look like:

   %t0 = add nsw i32 %c, %d
   %t1 = select i1 %b, i32 %t0, i32 %e
   %t2 = icmp ne i32 %t1, 0
   %t3 = and i1 %a, %t2
   br i1 %t3, ...

The extra branching is gone, yay. But now we've put an add nsw out there
to be executed unconditionally. If we make the select an observation
point, we'd have introduced undefined behavior on a path that didn't
previously have it.

A foodtaster instruction doesn't really solve this problem, because
we'd have to put it between the add and the select, and it would be
just as problematic.

The current definition of nsw is an attempt to avoid arbitrary
limitations and keep the system as flexible as possible.

One could argue that aggressive speculation is a low-level optimization
better suited to CodeGen than the optimizer, as LLVM divides them, and
that perhaps the cost for providing this level of flexibility in the
optimizer is too high, but that's a different argument.

Dan