[llvm-dev] LangRef semantics for shufflevector with undef mask is incorrect

Wed Nov 27 01:32:16 PST 2019

On 11/27/19 2:10 AM, Eli Friedman via llvm-dev wrote:

The shuffle mask of a shufflevector is special: it's required to be a constant in a specific form.  From LangRef: "The shuffle mask operand is required to be a constant vector with either constant integer or undef values."  So really, we can resolve this any way we want; "undef" in this context doesn't have to mean the same thing as "undef" in other contexts.  Formally, at the LangRef level, we can state that the shuffle mask is not an operand of a shufflevector; instead, it's not a value at all.  It's just a description of the shuffle, defined with a grammar similar to a vector constant.  Then we can talk about shuffle masks where an element is the string "undef", unrelated to the general notion of an undef value.

That is something that has been on my mind for a while now. You can ask the same why we use 'undef' for phi nodes. Eg it is legal to turn this:

   %x = phi i32 [ 0, A ], [ undef, B ]

into

   %x = phi i32 [ 0, A ], [ 1, B ]

which arguing by the intended semantics of phi nodes should be an illegal transformation but isn't in LLVM.

I think that we abuse the 'undef' (symbol) to mute instruction parameters whenever that parameter doesn't matter but we are shy of 'some' value handle to feed the operand slot.

IMHO for those cases, we need a proper '\bot' constant that denotes the absence of a concrete value as opposed to 'undef' (conceptually '\top'), which could be any value you'd like it to be.

- Simon

In that context, the existing LangRef description of the result of shufflevector can be interpreted to mean exactly what it says. This would imply it's forbidden to transform the shuffle mask "<2 x i32> <i32 undef, i32 0>" to "<2 x i32> <i32 1, i32 0>" if the input might contain poison. (And this is the same logic that led to https://reviews.llvm.org/D70246 .)

That said, long-term, we probably want to switch shufflevector to produce poison.

-Eli

-----Original Message-----
From: Nuno Lopes <nuno.lopes at ist.utl.pt><mailto:nuno.lopes at ist.utl.pt>
Sent: Tuesday, November 26, 2019 3:20 PM
To: LLVMdev <llvm-dev at lists.llvm.org><mailto:llvm-dev at lists.llvm.org>
Cc: spatel at rotateright.com<mailto:spatel at rotateright.com>; Eli Friedman <efriedma at quicinc.com><mailto:efriedma at quicinc.com>; Juneyoung
Lee <juneyoung.lee at sf.snu.ac.kr><mailto:juneyoung.lee at sf.snu.ac.kr>; zhengyang-liu at hotmail.com<mailto:zhengyang-liu at hotmail.com>; John Regehr
<regehr at cs.utah.edu><mailto:regehr at cs.utah.edu>
Subject: [EXT] LangRef semantics for shufflevector with undef mask is incorrect

Hi,

This is a follow up on a discussion around shufflevector with undef mask in
https://reviews.llvm.org/D70641 and
https://bugs.llvm.org/show_bug.cgi?id=43958.

The current semantics of shufflevector in
http://llvm.org/docs/LangRef.html#shufflevector-instruction states:
"If the shuffle mask is undef, the result vector is undef. If any element of
the mask operand is undef, that element of the result is undef."

We found this semantics to be problematic. TL;DR: instructions can't detect
if an operand is undef.
Long story:
Undef can be replaced with any value at any time. It's valid to replace
undef with 0 or 1 or anything else.

A possible sequence of optimizations with sufflevector could be as follows:
%v = shufflevector <2 x float> %x, <2 x float> undef, <2 x i32> <i32 undef,
i32 0>
->
%v = shufflevector <2 x float> %x, <2 x float> undef, <2 x i32> <i32 2, i32
0>
->
%v = <undef, %x[0]>

So this respects the semantics in LangRef: the mask is undef, so the
resulting element is undef.

However, there's an alternative sequence of optimizations:
%v = shufflevector <2 x float> %x, <2 x float> undef, <2 x i32> <i32 undef,
i32 0>
->
%v = shufflevector <2 x float> %x, <2 x float> undef, <2 x i32> <i32 1, i32
0>
->
%v = <%x[1], %x[0]>

So now it depends on what the value of %x[1] is. If it's poison, we obtain:
%v = <poison, %x[0]>

This result contradicts the semantics in LangRef, even though no individual
transformation we did above is wrong.
In summary, an instruction cannot detect undef and have special semantics
for it.

AFAICT, the only way to fix the semantics of shufflevector is to say that if
the mask is out-of-bounds, we yield poison. That's correct because there's
nothing worse than poison.
Since we can replace undef with an OOB index, an undef mask can safely yield
poison. Or it would yield one of the input elements, which is poison in the
worst case. So we get poison in both cases.

I guess the issue to make this semantics a reality is that we would need to
introduce a poison value (which is a good thing IMHO). Otherwise we can't
continue doing some of the folds we have today since we don't have a poison
constant to replace undef when folding.

Any comments/suggestions appreciated!

Thanks,
Nuno

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

 Click https://www.mailcontrol.com/sr/DTFn8g8lJpvGX2PQPOmvUqrlA1_9RTN2czBGo4QKb0koZUfpwhBgW61USUO2MCVrfSeJemAnwsVue2rhFpgg0Q==  to report this email as spam.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20191127/19ba1fee/attachment.html>