[LLVMbugs] [Bug 15830] New: Vectorizer produces incorrect results for regex code
bugzilla-daemon at llvm.org
Tue Apr 23 06:46:07 PDT 2013
http://llvm.org/bugs/show_bug.cgi?id=15830
Bug ID: 15830
Summary: Vectorizer produces incorrect results for regex code
Product: new-bugs
Version: trunk
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P
Component: new bugs
Assignee: unassignedbugs at nondot.org
Reporter: dimitry at andric.com
CC: llvmbugs at cs.uiuc.edu
Classification: Unclassified
Created attachment 10410
--> http://llvm.org/bugs/attachment.cgi?id=10410&action=edit
Testcase for regcomp() vectorization problem
Recently we upgraded clang in FreeBSD to trunk r178860. Afterwards, some
people reported problems with configure scripts, caused by sed failing to
replace certain strings, e.g. @CC@ and such. These people all had the
following settings in common:
- amd64 (x86_64) architecture
- core2 or higher CPUs, and using -march=native
- using -O3 or equivalent optimization flags
After some searching, this turned out to be a miscompilation of regcomp.c, the
libc source file containing part of the regular expression parsing logic. One of
its functions, computejumps(), computes a Boyer-Moore jump table:
static void
computejumps(struct parse *p, struct re_guts *g)
{
	int ch;
	int mindex;

	/* Avoid making errors worse */
	if (p->error != 0)
		return;

	g->charjump = (int*) malloc((NC + 1) * sizeof(int));
	if (g->charjump == NULL)	/* Not a fatal error */
		return;
	/* Adjust for signed chars, if necessary */
	g->charjump = &g->charjump[-(CHAR_MIN)];

	/* If the character does not exist in the pattern, the jump
	 * is equal to the number of characters in the pattern.
	 */
	for (ch = CHAR_MIN; ch < (CHAR_MAX + 1); ch++)
		g->charjump[ch] = g->mlen;

	/* If the character does exist, compute the jump that would
	 * take us to the last character in the pattern equal to it
	 * (notice that we match right to left, so that last character
	 * is the first one that would be matched).
	 */
	for (mindex = 0; mindex < g->mlen; mindex++)
		g->charjump[(int)g->must[mindex]] = g->mlen - mindex - 1;
}
When this function is inlined into the main regcomp() function at -O3, and the
vectorizer transforms the last for loop, the generated code is incorrect: the
resulting table sometimes has one faulty entry.
I have attached a sample testcase that shows the problem at runtime. If the
sample is compiled with -O2, or with -O3 -fno-vectorize, it runs without
printing anything. If it is compiled with -O3, it prints:
charjump[67] is 5 instead of 4
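For readers without access to the attachment, here is a minimal sketch of how
such a self-check could look. It is not the attached testcase: build_jumps(),
expected_jump() and check_jumps() are hypothetical helpers, the definition of NC
is an assumption modelled on the libc headers, and whether this exact code
triggers the miscompilation is untested. It repeats the two loops from
computejumps() and compares every entry against an independent right-to-left
scan of the pattern. Note that when a character repeats in the pattern, two
iterations of the last loop store to the same entry and the later store has to
win.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: number of distinct char values, as in the libc regex headers. */
#define NC (CHAR_MAX - CHAR_MIN + 1)

/* Hypothetical helper: the two loops of computejumps() on plain arguments. */
static int *
build_jumps(const char *must, int mlen)
{
	int ch, mindex;
	int *base, *charjump;

	base = malloc((NC + 1) * sizeof(int));
	if (base == NULL)
		return NULL;
	/* Adjust for signed chars, so that charjump[CHAR_MIN] is base[0]. */
	charjump = &base[-(CHAR_MIN)];

	for (ch = CHAR_MIN; ch < (CHAR_MAX + 1); ch++)
		charjump[ch] = mlen;
	for (mindex = 0; mindex < mlen; mindex++)
		charjump[(int)must[mindex]] = mlen - mindex - 1;
	return charjump;
}

/* Reference value: distance from the last occurrence of ch to the end. */
static int
expected_jump(const char *must, int mlen, int ch)
{
	int i;

	for (i = mlen - 1; i >= 0; i--)
		if ((int)must[i] == ch)
			return mlen - i - 1;
	return mlen;
}

/* Returns the number of wrong entries, or -1 if allocation failed. */
static int
check_jumps(const char *must)
{
	int mlen = (int)strlen(must);
	int *charjump = build_jumps(must, mlen);
	int ch, bad = 0;

	if (charjump == NULL)
		return -1;
	for (ch = CHAR_MIN; ch <= CHAR_MAX; ch++) {
		int want = expected_jump(must, mlen, ch);

		if (charjump[ch] != want) {
			printf("charjump[%d] is %d instead of %d\n",
			    ch, charjump[ch], want);
			bad++;
		}
	}
	free(&charjump[CHAR_MIN]);	/* undo the signed-char adjustment */
	return bad;
}
```

Calling check_jumps() from main() with a pattern containing repeated
characters, for example "@CC@", should print nothing and return 0 when the
table is built correctly.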