[llvm-dev] RFC: Killing undef and spreading poison

Fri Nov 4 10:34:47 PDT 2016

On Wed, Oct 19, 2016 at 4:52 AM Nuno Lopes <nuno.lopes at ist.utl.pt> wrote:

> > I'm still digesting the proposal in its entirety, but one point I want
> to call out here:
> >
> >> On Tue, Oct 18, 2016 at 1:39 PM Friedman, Eli via llvm-dev wrote:
> >> Instcombine currently transforms "memcpy(dst, src, 4)" to "load i32 src;
> >> store i32 dst".  I guess that's illegal under this model?  How about the
> >> related transform "memcpy(dst, src, 4)" to "load <4 x i8> src; store <4
> >> x i8> dst"?  (SROA also performs similar transforms.)
> >
> > I think it is really important that memcpy and load/store pair be
> equivalent. If this is in fact true about the proposal, I would want to
> really dig into why and whether anything could be done to make them
> equivalent.
>
> Memcpy does a byte-by-byte copy. So if one of the bits is poison then only
> the byte containing that bit becomes poison.
> Therefore, memcpy(x, y, 1) is equivalent to load i8.  But memcpy(x,y,4) is
> not equivalent to "load i32" since load makes the whole value poison if any
> of the bits is poison.
>

That is not the semantic model that LLVM uses today. Your proposal is
changing this, and it is an *incredibly* fundamental part of the optimizer.

LLVM, very fundamentally, models an integer, a vector of bytes, and a
vector of bits as absolutely identical to the underlying storage in memory.
Changing this would completely break SROA, various parts of mem2reg and
reg2mem, and all load widening.

I understand you have reasons here, but this is a much bigger change than
the undef vs. poison part. ;] I'm currently deeply concerned by this change
and would strongly prefer looking at alternatives.

> The alternative as given by Eli is to use "load <4 x i8>".  Since now we
> are loading 4 separate values, poison does not extend past the byte
> boundary.  When load is lowered, you should get exactly the same code as
> with "load i32", though.
> So the hope is that there's no diff at assembly level.
>

Note that vectors in LLVM are both quite limited and will have likely very
different representations in the assembly.

You're essentially conferring a *type* onto *memory*. That is not currently
LLVM's model. Instead, the types in LLVM are tied to *operations*.

As a consequence, when LLVM sees a vector type, it canonicalizes,
optimizes, and code generates to facilitate vector operations on values of
vector type. So for example, a vector of 32 i1s is likely to be put into a
vector mask register.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161104/d43cfd57/attachment.html>