[cfe-dev] Clang 4.0.1 C++ code generation issue (bug?)
via cfe-dev
cfe-dev at lists.llvm.org
Tue Jul 11 01:10:18 PDT 2017
Greetings Clangers,
SUMMARY:
I’m seeing a C++ code generation difference between clang 3.9.0 & 4.0.1 that results in unexpected behaviour in my code. With clang 4.0.1 the difference also shows up between -O1 & -O2/-O3.
(I’ve seen this issue for i386 & x86_64 on OSX, but it probably affects all OSes.)
DISCLAIMER:
I’m not entirely sure if this is a bug in clang or some kind of undefined behaviour.
(I can’t fully get my head around the class.union section of the C++ standard to determine its secret meaning.)
I don’t know if the problem is located in clang’s code or LLVM’s code. (But I’m reporting it here.)
Using -std=c++14 or -std=c++1z makes no difference.
HISTORY:
I recently upgraded to clang 4.0.1 (using http://releases.llvm.org/4.0.1/clang+llvm-4.0.1-x86_64-apple-darwin.tar.xz). Most things seem to be fine (~2MLOC & ~150 binaries) apart from one problem that only exhibits itself in a release build. (The problem exhibited as corrupted documents & crashes … during testing.) The original code has been in use for > 15 years & was previously compiled, successfully, using clang 3.9.0 & 3.6.1, gcc 4.2.1 & CodeWarrior for PPC, PPC-64, i386 & x86-64 (in various combinations).
CODE:
I have spent several days distilling, reducing & refining the code that exhibits the problem to the following:
// TestClang.cpp - Test clang 4.0.1 code generation issue

#include <cstdio>
#include <cstring>

struct B {
    char t;
    union { char c; int x; void* p; };
#ifndef FIX
    B& operator= (const B& rhs) { t = rhs.t; p = rhs.p; return *this; }
#endif
};

union U { union { int f; char s[8]; } n; B b; };

struct D {
    const int size;
    B e[8];

    __attribute__((noinline)) D (int count, const U objs[]) : size(count)
    {
        U tmp{ .b.t = 1, .b.x = 0x123400 }; // b.x set to help see problem
#pragma clang loop unroll(disable) // Shortens generated code
        for (auto* it = e; count--; ++it)
        {
            const U* val = objs++;
            if (val->n.s[0] > 32)
            {
                tmp.b.x = val->n.f; // <<<<< PROBLEM IS AROUND HERE <<<<<
                val = &tmp;
                // if (!size) std::puts(tmp.n.s); // Also fixes code
            }
            *it = val->b;
        }
    }
};

int main (int argc, const char* argv[])
{
    const char* args[] = { "one!", "two!" };
    int count = argc ? 2 : argc - 1; // Prevent over optimisation
    auto s = argc ? args : argv + 1;
    U us[8];
    for (int i = 0; i < count; ++i)
        std::strncpy(us[i].n.s, *s++, 8);
    const D dict(count, us);
    for (int i = 0; i < dict.size; ++i)
    {
        auto& n = dict.e[i];
        std::printf(" %u. '%.4s' [%u:$%08X]\n", i, &n.c, n.t, n.x);
    }
}

// end
TESTS:
When the following line is executed:
clang -arch i386 -O2 -Wall -std=c++14 -stdlib=libc++ TestClang.cpp && ./a.out
the output is:
0. '' [1:$00123400]
1. '' [1:$00123400]
The output should be:
0. 'one!' [1:$21656E6F]
1. 'two!' [1:$216F7774]
Adding the option `-DFIX` generates the expected output.
Removing the `-O2` option with clang 4.0.1 generates the expected output without `-DFIX`.
Uncommenting line 31 of TestClang.cpp (the `std::puts` line marked `// Also fixes code`) also generates the expected output with clang 4.0.1.
(Line 31 does nothing - other than tricking the optimiser.)
Using `-arch x86_64` and/or `-std=c++1z` with clang 4.0.1 makes no difference.
Using `-Weverything` provides no useful output! (I know C++14 != C++98.)
Using clang 3.9.0 instead of 4.0.1 generates the expected output with all the above variations.
The problem appears to be related to the B::operator= code and code (possibly loop) optimisation.
ASSEMBLER:
The relevant parts of the code (the `for` loop in D's constructor, lines 24-34 of TestClang.cpp) generate the following i386 code with clang 4.0.1.
The GOOD code (-DFIX):
LBB2_2:                         ## =>This Inner Loop Header: Depth=1
    decl  %eax
    cmpb  $33, (%edx)
    movl  %edx, %edi
    jl    LBB2_4
## BB#3:                        ## in Loop: Header=BB2_2 Depth=1
    movl  (%edx), %edi
    movl  %edi, -12(%ebp)
    movl  %esi, %edi
LBB2_4:                         ## in Loop: Header=BB2_2 Depth=1
    addl  $8, %edx
    movsd (%edi), %xmm0         ## xmm0 = mem[0],zero
    movsd %xmm0, (%ecx)
    addl  $8, %ecx
    testl %eax, %eax
    jne   LBB2_2
The BAD code (no -DFIX):
LBB2_2:                         ## =>This Inner Loop Header: Depth=1
    decl  %eax
    movzbl (%edx), %ebx
    cmpb  $33, %bl
    movl  %edx, %edi
    jl    LBB2_4
## BB#3:                        ## in Loop: Header=BB2_2 Depth=1
    movl  (%edx), %esi
    movb  $1, %bl
    leal  -24(%ebp), %edi
LBB2_4:                         ## in Loop: Header=BB2_2 Depth=1
    addl  $8, %edx
    movb  %bl, (%ecx)
    movl  4(%edi), %edi
    movl  %edi, 4(%ecx)
    addl  $8, %ecx
    testl %eax, %eax
    jne   LBB2_2
## BB#5:
    movl  %esi, -20(%ebp)
This last line looks problematic (if I’m reading it correctly).
It’s storing %esi only once although it is loaded each time round the loop at BB#3.
It appears to be generated by `tmp.b.x = val->n.f;` which has migrated outside the loop & after tmp is read.
QUESTIONS:
So my questions are:
Is this a bug in clang/LLVM?
Does anyone else see this?
Do clang 4.0.0 and/or clang 3.9.1 exhibit this problem?
Is this caused by a rare code combination or is it going to silently break lots of code?
Is this serious enough to warrant/require a clang 4.0.2?
Do you need any further info to help fix this?
Is someone willing to, please, fix it?
Thanks,
CHRIS