[llvm-bugs] [Bug 28845] New: Incorrect codegen for "store <2 x i48>" triggered by -fslp-vectorize-aggressive
via llvm-bugs
llvm-bugs at lists.llvm.org
Thu Aug 4 08:03:13 PDT 2016
https://llvm.org/bugs/show_bug.cgi?id=28845
Bug ID: 28845
Summary: Incorrect codegen for "store <2 x i48>" triggered by
-fslp-vectorize-aggressive
Product: new-bugs
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: new bugs
Assignee: unassignedbugs at nondot.org
Reporter: babokin at gmail.com
CC: elena.demikhovsky at intel.com, llvm-bugs at lists.llvm.org,
Vsevolod.Livinskij at frtk.ru
Classification: Unclassified
Created attachment 16882
--> https://llvm.org/bugs/attachment.cgi?id=16882&action=edit
reproducer
The attached test case has a structure with a number of bit fields (ignore the
odd C definition of the structure; what matters is the LLVM IR
definition).
LLVM IR structure definition:
%struct.struct_1 = type { [6 x i8], [6 x i8], i24 }
Initialization is a read-modify-write of two 48-bit chunks; no magic
here.
define void @_Z4initv() local_unnamed_addr #0 {
entry:
%bf.load = load i48, i48* bitcast (%struct.struct_1* @s1 to i48*), align 8
%bf.clear = and i48 %bf.load, -8796091973633
%bf.set3 = or i48 %bf.clear, 7326889148416
store i48 %bf.set3, i48* bitcast (%struct.struct_1* @s1 to i48*), align 8
%bf.load4 = load i48, i48* bitcast ([6 x i8]* getelementptr inbounds (%struct.struct_1, %struct.struct_1* @s1, i64 0, i32 1) to i48*), align 2
%bf.clear5 = and i48 %bf.load4, -2198956146689
%bf.set6 = or i48 %bf.clear5, 822419128320
store i48 %bf.set6, i48* bitcast ([6 x i8]* getelementptr inbounds (%struct.struct_1, %struct.struct_1* @s1, i64 0, i32 1) to i48*), align 2
ret void
}
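As a sanity check on the constants in the scalar IR above, here is a minimal C++ sketch that decodes them. The helper names (asI48, fieldMask, rmw) are made up for illustration and are not part of the reproducer; the bit positions and field values are derived purely from the constants by arithmetic, not taken from the C source.

```cpp
#include <cassert>
#include <cstdint>

// All values are i48, so arithmetic is done modulo 2^48.
constexpr std::uint64_t kI48Mask = (1ULL << 48) - 1;

// Bit pattern of a (possibly negative) i48 constant, as an unsigned value.
constexpr std::uint64_t asI48(std::int64_t v) {
    return static_cast<std::uint64_t>(v) & kI48Mask;
}

// Mask with `width` one-bits starting at bit `shift`.
constexpr std::uint64_t fieldMask(unsigned shift, unsigned width) {
    return ((1ULL << width) - 1) << shift;
}

// First chunk: the 'and' clears bits 20..42 (23 bits), the 'or' sets 6987466 there.
static_assert(asI48(-8796091973633LL) == (~fieldMask(20, 23) & kI48Mask),
              "clear mask of first chunk");
static_assert(7326889148416ULL == (6987466ULL << 20), "set value of first chunk");

// Second chunk: the 'and' clears bits 26..40 (15 bits), the 'or' sets 12255 there.
static_assert(asI48(-2198956146689LL) == (~fieldMask(26, 15) & kI48Mask),
              "clear mask of second chunk");
static_assert(822419128320ULL == (12255ULL << 26), "set value of second chunk");

// The read-modify-write performed on each 48-bit chunk.
constexpr std::uint64_t rmw(std::uint64_t old, std::int64_t clear, std::uint64_t set) {
    return ((old & asI48(clear)) | set) & kI48Mask;
}
```

So each chunk update touches a single bit field and leaves the remaining bits of the 48-bit chunk as they were.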
But when the test case is compiled with -fslp-vectorize-aggressive, this is
optimized to vector operations:
define void @_Z4initv() local_unnamed_addr #0 {
entry:
%bf.load = load <2 x i48>, <2 x i48>* bitcast (%struct.struct_1* @s1 to <2 x i48>*), align 8
%bf.clear = and <2 x i48> %bf.load, <i48 -8796091973633, i48 -2198956146689>
%bf.set3 = or <2 x i48> %bf.clear, <i48 7326889148416, i48 822419128320>
store <2 x i48> %bf.set3, <2 x i48>* bitcast (%struct.struct_1* @s1 to <2 x i48>*), align 8
ret void
}
This seems legal, but it leads to incorrect code generation. More specifically,
instead of two *consecutive* 48-bit stores, the stores happen with a 16-bit gap.
Good:
movl %ecx, s1(%rip)
movw %cx, s1+4(%rip)
movl %ecx, s1+6(%rip)
movw %cx, s1+10(%rip)
Bad:
movq %xmm1, s1(%rip)
movq %xmm0, s1+8(%rip)
So the problem is that <2 x i48> has two different meanings: the LLVM IR
assumes consecutive 48-bit locations, while code generation places each
element on a 64-bit boundary. My guess is that it's a codegen bug.
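To make the mismatch concrete, here is a small C++ model of the two store patterns on a 16-byte buffer. It is only a sketch: storeLE, storePacked and storeWidened are hypothetical helper names, and the widened variant assumes each 64-bit lane holds the i48 value zero-extended, which the report does not state.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Buffer = std::array<std::uint8_t, 16>;

// Store the low `bytes` bytes of `value` at `offset`, little-endian as on x86.
inline void storeLE(Buffer& buf, std::size_t offset, std::uint64_t value,
                    std::size_t bytes) {
    for (std::size_t i = 0; i < bytes; ++i)
        buf[offset + i] = static_cast<std::uint8_t>(value >> (8 * i));
}

// Layout the scalar code relies on: element i starts at byte 6*i.
inline Buffer storePacked(std::uint64_t e0, std::uint64_t e1) {
    Buffer buf{};
    storeLE(buf, 0, e0, 6);
    storeLE(buf, 6, e1, 6);
    return buf;
}

// Layout of the "Bad" sequence: two 8-byte stores at offsets 0 and 8.
// Assumption: the upper 16 bits of each lane are zero.
inline Buffer storeWidened(std::uint64_t e0, std::uint64_t e1) {
    Buffer buf{};
    storeLE(buf, 0, e0, 8);
    storeLE(buf, 8, e1, 8);
    return buf;
}
```

In this model the second element lands at byte 8 instead of byte 6, bytes 6..7 are overwritten with lane padding, and the second store reaches bytes 12..15, which lie past the 12 bytes that the two 48-bit chunks occupy.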
To reproduce:
>clang++ func.cpp test.cpp -o out -fslp-vectorize-aggressive -O2