[llvm-dev] LangRef semantics for shufflevector with undef mask is incorrect
Nuno Lopes via llvm-dev
llvm-dev at lists.llvm.org
Wed Nov 27 04:31:47 PST 2019
Quoting Simon Moll via llvm-dev <llvm-dev at lists.llvm.org>:
> On 11/27/19 2:10 AM, Eli Friedman via llvm-dev wrote:
>
> The shuffle mask of a shufflevector is special: it's required to be
> a constant in a specific form. From LangRef: "The shuffle mask
> operand is required to be a constant vector with either constant
> integer or undef values." So really, we can resolve this any way we
> want; "undef" in this context doesn't have to mean the same thing as
> "undef" in other contexts. Formally, at the LangRef level, we can
> state that the shuffle mask is not an operand of a shufflevector;
> instead, it's not a value at all. It's just a description of the
> shuffle, defined with a grammar similar to a vector constant. Then
> we can talk about shuffle masks where an element is the string
> "undef", unrelated to the general notion of an undef value.
>
> That is something that has been on my mind for a while now. You can
> ask the same question about why we use 'undef' for phi nodes. E.g.,
> it is legal to turn this:
>
> %x = phi i32 [ 0, A ], [ undef, B ]
>
> into
>
> %x = phi i32 [ 0, A ], [ 1, B ]
>
> which arguing by the intended semantics of phi nodes should be an
> illegal transformation but isn't in LLVM.
>
> I think that we abuse the 'undef' (symbol) to mute instruction
> parameters whenever that parameter doesn't matter but we are shy of
> 'some' value handle to feed the operand slot.
>
> IMHO for those cases, we need a proper '\bot' constant that denotes
> the absence of a concrete value as opposed to 'undef' (conceptually
> '\top'), which could be any value you'd like it to be.
From a correctness perspective, it's fine to use undef in phi nodes.
But I agree that it is not ideal for, e.g., static analysis: since
undef can take any value, any static analysis you do must return \top
for a phi node with at least one undef input.
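To make the precision loss concrete, here is a minimal sketch of a
range analysis over a phi node. The interval domain, the names TOP,
join and phi_range, and the encoding of undef as the string "undef"
are all assumptions made for illustration; this is not LLVM's actual
analysis code.

```python
# Hypothetical unsigned 32-bit interval domain (illustration only).
TOP = (0, 2**32 - 1)  # "could be any value"


def join(a, b):
    """Smallest interval that covers both input intervals."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def phi_range(incoming):
    """Range of a phi node; an undef input is modelled as TOP."""
    result = None
    for value in incoming:
        rng = TOP if value == "undef" else value
        result = rng if result is None else join(result, rng)
    return result


# phi [ 0, A ], [ %v, B ] with %v known to be in [2, 3]
no_undef = phi_range([(0, 0), (2, 3)])

# phi [ 0, A ], [ undef, B ]: the undef input widens the result to TOP
with_undef = phi_range([(0, 0), "undef"])
```

With a single undef input the result is already TOP, no matter how
precise the other inputs are.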
Switching to a poison value doesn't fix the issue either, since poison
can still be refined by any value.
This discussion is a bit orthogonal to shufflevector, but consider
this example:
%x = phi [\bottom, A], [%v, B]
If we know that %v \in [2, 3], I assume you want to conclude that %x
\in [2, 3].
Now assume that %v is only computed in basic block B and not in A.
When you generate assembly, which value do you use when jumping from
A? It has to be a value that respects any static analysis you've done
for %v, so it has to be 2 or 3, and you can't use %v.
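The constraint can be sketched as follows, taking [2, 3] as the range
the analysis derives for %x. Treating \bottom as the identity of the
join is my assumption about what such a constant would mean, and
BOTTOM, join, phi_range and is_sound_materialization are hypothetical
names, not anything that exists in LLVM.

```python
# Hypothetical "no value" marker: the identity element of join.
BOTTOM = None


def join(a, b):
    """Interval join in which BOTTOM contributes nothing."""
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def phi_range(incoming):
    """Range of a phi node whose inputs are intervals or BOTTOM."""
    result = BOTTOM
    for rng in incoming:
        result = join(result, rng)
    return result


def is_sound_materialization(constant, claimed_range):
    """True if emitting `constant` on an edge respects the claimed range."""
    lo, hi = claimed_range
    return lo <= constant <= hi


# phi [ \bottom, A ], [ %v, B ] with %v in [2, 3]
x_range = phi_range([BOTTOM, (2, 3)])

# Candidate constants the backend could place on the edge from A.
zero_ok = is_sound_materialization(0, x_range)
two_ok = is_sound_materialization(2, x_range)
```

The analysis keeps its precision, but the backend loses the freedom
it has with undef: picking an arbitrary constant such as 0 on the edge
from A would contradict the range already claimed for %x.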
It's non-trivial. I'm not seeing an easy solution to this precision
problem. It's an interesting problem nevertheless.
Nuno