<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 10/3/2016 1:51 PM, Tom Stellard via
llvm-dev wrote:<br>
</div>
<blockquote cite="mid:20161003205138.GA26423@freedesktop.org"
type="cite">
<pre wrap="">Hi,
I've found a test case where SelectionDAG is doing an undefined behavior
optimization, and I need help determining whether or not this is legal.
Here is the example IR:
define void @test(&lt;4 x i8&gt; addrspace(1)* %out, float %a) {
  %uint8 = fptoui float %a to i8
  %vec = insertelement &lt;4 x i8&gt; &lt;i8 0, i8 0, i8 0, i8 0&gt;, i8 %uint8, i32 0
  store &lt;4 x i8&gt; %vec, &lt;4 x i8&gt; addrspace(1)* %out
  ret void
}
Since %vec is a 32-bit vector, a common way to implement this function on a target
with 32-bit registers would be to zero initialize a 32-bit register to hold
the initial vector and then 'mask' and 'or' the inserted value with the
initial vector. In AMDGPU assembly it would look something like:
v_mov_b32 v0, 0
v_cvt_u32_f32_e32 v1, s0
v_and_b32 v1, v1, 0x000000ff
v_or_b32 v0, v0, v1
The optimization the SelectionDAG does for us in this function, though, ends
up removing the mask operation. Which gives us:
v_mov_b32 v0, 0
v_cvt_u32_f32_e32 v1, s0
v_or_b32 v0, v0, v1
The reason the SelectionDAG is doing this is because it knows that the result
of %uint8 = fptoui float %a to i8 is undefined when the result uses more than
8-bits. So, it assumes that the result will only set the low 8-bits, because
anything else would be undefined behavior and the program would be broken.
This assumption is what causes it to remove the 'and' operation.
So effectively, what has happened here, is that by inserting the result of
an operation with undefined behavior into one lane of a vector, we have
overwritten all the other lanes of the vector.
Is this optimization legal? To me it seems wrong that undefined behavior
in one lane of a vector could affect another lane. However, given that LLVM IR
is SSA and we are technically creating a new vector and not modifying the old
one, then maybe it's OK. I'm just not sure.
Appreciate any insight people may have.
</pre>
</blockquote>
<br>
The way insertelement is defined, inserting an element never affects
the other elements of the vector ("Its element values are those of
<code>val</code>..."). So the question is whether you're triggering
undefined behavior in some other way. Looking at LangRef for fptoui,
it says "If the value cannot fit in <code>ty2</code>, the results are
undefined", i.e. the value is equivalent to the constant "undef".
Therefore, you should end up storing "&lt;4 x i8&gt; &lt;undef, 0, 0,
0&gt;", not "&lt;4 x i8&gt; undef".<br>
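<br>
To make the difference concrete, here is a rough Python sketch that
models the 4 x i8 vector as one packed 32-bit integer, as a target with
32-bit registers might hold it. The function names and the choice of 300
as the out-of-range value are assumptions invented for this
illustration; it is not what SelectionDAG actually emits.<br>
<pre>
```python
MASK32 = 0xFFFFFFFF  # model a 32-bit register


def insert_lane0_masked(vec32, value):
    """Insert into lane 0 of a packed 4 x i8; lanes 1..3 are preserved."""
    return ((vec32 & 0xFFFFFF00) | (value & 0xFF)) & MASK32


def insert_lane0_unmasked(vec32, value):
    """Same insert with the 'and' dropped; assumes lane 0 of vec32 is zero
    and is only correct if value really fits in 8 bits."""
    return (vec32 | value) & MASK32


def lanes(vec32):
    """Split a packed 32-bit word into its four i8 lanes, lane 0 first."""
    return [(vec32 >> (8 * i)) & 0xFF for i in range(4)]


in_range = 200      # fits in i8
out_of_range = 300  # 0x12C: assumed result of a 32-bit convert of 300.0

print(lanes(insert_lane0_masked(0, out_of_range)))    # lanes 1..3 stay zero
print(lanes(insert_lane0_unmasked(0, out_of_range)))  # lane 1 is clobbered
```
</pre>
With the mask, only lane 0 holds an unspecified value, which matches the
insertelement wording above. Without it, the extra bits of the converted
value leak into lane 1.<br>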
<br>
Note that there's a tradeoff here: saying that fptoui for
out-of-range values doesn't have undefined behavior allows us to
simplify control flow and hoist operations more aggressively.<br>
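<br>
A toy Python model of that tradeoff, under loudly stated assumptions:
the two fptoui helpers below are hypothetical stand-ins for the two
possible semantics (an exception stands in for undefined behavior, and
returning 0 stands in for "some unspecified value"). It only
illustrates why hoisting a conversion above a guard is safe in one model
and not in the other.<br>
<pre>
```python
class UndefinedBehavior(Exception):
    """Stands in for 'anything may happen' in the UB model."""


def fptoui_i8_ub(a):
    """Hypothetical model: out-of-range input is immediate UB."""
    if not (-1.0 < a < 256.0):
        raise UndefinedBehavior(a)
    return int(a)


def fptoui_i8_undef(a):
    """Hypothetical model: out-of-range input yields an unspecified i8."""
    if not (-1.0 < a < 256.0):
        return 0  # any 8-bit value would be an equally valid choice
    return int(a)


def guarded(a, in_range, convert):
    """Original shape: the conversion only runs on the guarded path."""
    if in_range:
        return convert(a)
    return 0


def hoisted(a, in_range, convert):
    """After hoisting: convert unconditionally, then select the result."""
    converted = convert(a)
    return converted if in_range else 0
```
</pre>
Under the "unspecified value" model, guarded and hoisted agree for every
input, so the transform is fine. Under the UB model, hoisted(300.0,
False, fptoui_i8_ub) hits the UB case that the guarded version never
reaches.<br>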
<br>
-Eli<br>
<br>
--
<pre class="moz-signature" cols="72">Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>
</body>
</html>