[llvm-bugs] [Bug 44544] New: Nested loop unroll bug on skylake avx512
via llvm-bugs
llvm-bugs at lists.llvm.org
Tue Jan 14 08:14:35 PST 2020
https://bugs.llvm.org/show_bug.cgi?id=44544
Bug ID: 44544
Summary: Nested loop unroll bug on skylake avx512
Product: clang
Version: unspecified
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: C++
Assignee: unassignedclangbugs at nondot.org
Reporter: jakobschwarz at yahoo.com
CC: blitzrakete at gmail.com, dgregor at apple.com,
erik.pilkington at gmail.com, llvm-bugs at lists.llvm.org,
richard-llvm at metafoo.co.uk
I think I found a bug in clang. I tested on local machines and on godbolt with
clang 7, 8 and 9. It only occurs with -O3 optimization combined with
-march=skylake-avx512. With GCC and the Intel compiler the code produces correct
results. Without the loop nesting, the example also works correctly with Clang.
The code should print only zeros to cout.
#include <cstdint>
#include <iostream>

int main(int argc, char *argv[])
{
    static constexpr uint32_t mult = 4u;
    static constexpr uint64_t MASK_H = 0x000000000000FFFFull;

    // Start with every bit set in all 64 words.
    uint64_t arr2[16][4];
    for(auto i=0; i<16; i++) for(auto j=0; j<4; j++) arr2[i][j] = ~uint64_t(0);
    uint64_t* mm =&arr2[0][0];

    // Each ID clears one 16-bit lane of word ID/mult.
    for(uint32_t zz=0; zz<16; zz++){
        // #pragma clang loop unroll(disable)
        for(uint32_t yy=0; yy<16; yy++){
            const uint32_t ID = yy+zz*16;
            const uint64_t mask = ~(MASK_H<<(ID%mult*16));
            mm[ID/mult] &= mask;
        }
    }

    for(auto i=0; i<16; i++) {
        for(auto j=0; j<4; j++) std::cout << arr2[i][j] << " ";
        std::cout << std::endl;
    }
    return 0;
}
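
To make the expected result easier to check, here is a minimal sketch of the same masking written as a single flat loop over all 256 IDs. The helper name count_nonzero_words and the use of std::array are my own assumptions for illustration and are not part of the original reproducer. Every word ID/4 is hit by four consecutive IDs, one per 16-bit lane, so all 64 words should end up zero.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not in the original report): applies the same
// per-ID mask as the nested loops, but in one flat loop, and returns how
// many of the 64 words are still nonzero afterwards.
std::size_t count_nonzero_words()
{
    constexpr std::uint32_t mult = 4u;
    constexpr std::uint64_t MASK_H = 0x000000000000FFFFull;

    // Same starting state as arr2 in the report: all bits set.
    std::array<std::uint64_t, 64> words;
    words.fill(~std::uint64_t(0));

    // ID runs over 0..255, i.e. yy + zz*16 for yy, zz in 0..15.
    for (std::uint32_t id = 0; id < 256; ++id) {
        // Clear the 16-bit lane selected by id % mult in word id / mult.
        const std::uint64_t mask = ~(MASK_H << (id % mult * 16));
        words[id / mult] &= mask;
    }

    std::size_t nonzero = 0;
    for (std::uint64_t w : words) {
        if (w != 0) ++nonzero;
    }
    return nonzero;
}
```

If this helper returns a nonzero count, some lanes were not cleared. Whether the flat form also triggers the miscompile under -O3 -march=skylake-avx512 is something I have not checked.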