<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - Incorrect codegen for &quot;store &lt;2 x i48&gt;&quot; triggered by -fslp-vectorize-aggressive"
href="https://llvm.org/bugs/show_bug.cgi?id=28845">28845</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Incorrect codegen for "store <2 x i48>" triggered by -fslp-vectorize-aggressive
</td>
</tr>
<tr>
<th>Product</th>
<td>new-bugs
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>new bugs
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>babokin@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>elena.demikhovsky@intel.com, llvm-bugs@lists.llvm.org, Vsevolod.Livinskij@frtk.ru
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=16882" name="attach_16882" title="reproducer">attachment 16882</a> <a href="attachment.cgi?id=16882&action=edit" title="reproducer">[details]</a></span>
reproducer
The attached test case has a structure with a number of bit fields (ignore the
odd C definition of the structure; it's not important, but the LLVM IR
definition is).
LLVM IR structure definition:
%struct.struct_1 = type { [6 x i8], [6 x i8], i24 }
Initialization happens by read-modify-write of two 48-bit chunks; no magic
here.
define void @_Z4initv() local_unnamed_addr #0 {
entry:
%bf.load = load i48, i48* bitcast (%struct.struct_1* @s1 to i48*), align 8
%bf.clear = and i48 %bf.load, -8796091973633
%bf.set3 = or i48 %bf.clear, 7326889148416
store i48 %bf.set3, i48* bitcast (%struct.struct_1* @s1 to i48*), align 8
%bf.load4 = load i48, i48* bitcast ([6 x i8]* getelementptr inbounds (%struct.struct_1, %struct.struct_1* @s1, i64 0, i32 1) to i48*), align 2
%bf.clear5 = and i48 %bf.load4, -2198956146689
%bf.set6 = or i48 %bf.clear5, 822419128320
store i48 %bf.set6, i48* bitcast ([6 x i8]* getelementptr inbounds (%struct.struct_1, %struct.struct_1* @s1, i64 0, i32 1) to i48*), align 2
ret void
}
But when the test case is compiled with -fslp-vectorize-aggressive, this is
optimized to vector operations:
define void @_Z4initv() local_unnamed_addr #0 {
entry:
%bf.load = load <2 x i48>, <2 x i48>* bitcast (%struct.struct_1* @s1 to <2 x i48>*), align 8
%bf.clear = and <2 x i48> %bf.load, <i48 -8796091973633, i48 -2198956146689>
%bf.set3 = or <2 x i48> %bf.clear, <i48 7326889148416, i48 822419128320>
store <2 x i48> %bf.set3, <2 x i48>* bitcast (%struct.struct_1* @s1 to <2 x i48>*), align 8
ret void
}
This seems legal, but it leads to incorrect code generation. More specifically,
instead of two *consecutive* 48-bit stores, the stores happen with a 16-bit gap.
Good:
movl %ecx, s1(%rip)
movw %cx, s1+4(%rip)
movl %ecx, s1+6(%rip)
movw %cx, s1+10(%rip)
Bad:
movq %xmm1, s1(%rip)
movq %xmm0, s1+8(%rip)
So the problem is that <2 x i48> is interpreted in two different ways: the LLVM
IR assumes two consecutive 48-bit locations, while code generation assumes each
element is 64-bit aligned. My guess is that it's a codegen bug.
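The offset arithmetic behind the gap can be written down as a small sketch.
This is only a model of the byte offsets seen in the two assembly listings
above, not LLVM code; the names kStoreBytes, kWidenedStrideBytes and
lane_offset are made up for illustration, and the 8-byte stride is taken from
the movq destinations (s1 and s1+8).

```cpp
// Sketch only: models byte offsets, not LLVM's data structures.
// All identifiers below are hypothetical.

// Bytes covered by one i48 chunk: 48 bits / 8.
constexpr unsigned kStoreBytes = 6;
// Lane stride observed in the bad output (movq to s1 and s1+8).
constexpr unsigned kWidenedStrideBytes = 8;

// First byte touched by lane `lane` when lanes are `stride` bytes apart.
constexpr unsigned lane_offset(unsigned lane, unsigned stride) {
    return lane * stride;
}

// Scalar IR: the second chunk is field 1 of { [6 x i8], [6 x i8], i24 },
// which starts at byte 6 (matches "movl %ecx, s1+6(%rip)").
static_assert(lane_offset(1, kStoreBytes) == 6,
              "second i48 chunk starts at s1+6");

// Vector codegen as observed: the second lane lands at byte 8.
static_assert(lane_offset(1, kWidenedStrideBytes) == 8,
              "second lane is stored at s1+8");

// Difference between the intended and the actual start: 2 bytes = 16 bits.
static_assert(lane_offset(1, kWidenedStrideBytes)
                  - lane_offset(1, kStoreBytes) == 2,
              "16-bit gap between the two stores");

// The first 8-byte store ends at byte 8, past the start of the second
// chunk at byte 6, so it also overwrites the first 2 bytes of that chunk.
static_assert(lane_offset(0, kWidenedStrideBytes) + kWidenedStrideBytes
                  > lane_offset(1, kStoreBytes),
              "first movq overlaps the second chunk");
```

The checks run at compile time, so building the snippet with any C++11
compiler is enough to confirm the numbers.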
To reproduce:
<span class="quote">>clang++ func.cpp test.cpp -o out -fslp-vectorize-aggressive -O2</span ></pre>
</div>
</p>
</body>
</html>