[PATCH] D90648: [SCEV] Fix nsw flags for GEP expressions
Jordan Rupprecht via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 25 13:39:45 PST 2020
rupprecht added a comment.
In D90648#2409773 <https://reviews.llvm.org/D90648#2409773>, @nikic wrote:
> In D90648#2409760 <https://reviews.llvm.org/D90648#2409760>, @davidxl wrote:
>
>> This patch does not have a new test case added to demonstrate what this patch is trying to fix or reference the bug number showing the bug end-to-end.
>
> This does not fix a specific end-to-end miscompile. It fixes incorrect analysis results that may result in a miscompile. See also D90708 <https://reviews.llvm.org/D90708> for a clarification of GEP inbounds semantics in LangRef.
We're actually seeing this *cause* a miscompile, AFAICT.
The method `my_strnxfrm_unicode_full_bin` from mysql 5.7 [1] has a loop like:
const uchar *const weightend = dst + (nweights*3);
const uchar *const end = (weightend < de) ? weightend : de;
while (dst < end-2)
{
*dst++= 0x00;
*dst++= 0x00;
*dst++= 0x20;
nweights--;
}
The failure we see is ASAN kill the process with heap-buffer-overflow when it runs into bad memory. However, the write is well past `end`, indicating the loop is not exiting when it should.
I dumped the IR for this method, and it looks like the beginning & end changed like so:
%add.ptr56 = getelementptr inbounds i8, i8* %cond, i64 -2, !dbg !1195 ; This is "end - 2"
...
br i1 %cmp57212, label %while.body, label %while.end, !dbg !1197
while.body: ; preds = %if.then48, %163
...
%incdec.ptr65 = getelementptr inbounds i8, i8* %dst.addr.4214, i64 3, !dbg !1202 ; This is the last "*dst++ = 0x20" expression
...
%cmp57 = icmp ult i8* %incdec.ptr65, %add.ptr56, !dbg !1196 ; Compare dst against end - 2
br i1 %cmp57, label %while.body, label %while.end, !dbg !1197, !llvm.loop !1205
After this patch, we see:
%add.ptr56 = getelementptr inbounds i8, i8* %cond, i64 -2, !dbg !1195
...
br i1 %cmp57212, label %while.body.preheader, label %while.end, !dbg !1197
while.body.preheader: ; preds = %if.then48
%scevgep228 = getelementptr i8, i8* %dst.addr.3, i64 3, !dbg !1197
br label %while.body, !dbg !1197
while.body: ; preds = %while.body.preheader, %163
...
%incdec.ptr65 = getelementptr inbounds i8, i8* %dst.addr.4214, i64 3, !dbg !1202
...
%cmp57229 = icmp ult i8* %scevgep228, %add.ptr56, !dbg !1197
br i1 %cmp57229, label %while.body, label %while.end, !dbg !1197, !llvm.loop !1205
If I'm reading the diff correctly, this patch adds a `while.body.preheader` BB for the SCEV GEP expression, and at the end compares that against `end - 2` to exit the loop. However, we jump back to the loop body, not the preheader, so the GEP expression is never updated. And hence, we end up with an infinite loop.
[1] https://github.com/mysql/mysql-server/blob/5.7/strings/ctype-utf8.c#L5217
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D90648/new/
https://reviews.llvm.org/D90648
More information about the llvm-commits
mailing list