[PATCH] D90648: [SCEV] Fix nsw flags for GEP expressions
Jordan Rupprecht via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 25 13:39:45 PST 2020
rupprecht added a comment.
In D90648#2409773 <https://reviews.llvm.org/D90648#2409773>, @nikic wrote:
> In D90648#2409760 <https://reviews.llvm.org/D90648#2409760>, @davidxl wrote:
>> This patch does not have a new test case added to demonstrate what this patch is trying to fix or reference the bug number showing the bug end-to-end.
> This does not fix a specific end-to-end miscompile. It fixes incorrect analysis results that may result in a miscompile. See also D90708 <https://reviews.llvm.org/D90708> for a clarification of GEP inbounds semantics in LangRef.
We're actually seeing this *cause* a miscompile, AFAICT.
The method `my_strnxfrm_unicode_full_bin` from mysql 5.7  has a loop like:
const uchar *const weightend = dst + (nweights*3);
const uchar *const end = (weightend < de) ? weightend : de;
while (dst < end-2)
The failure we see is ASAN kill the process with heap-buffer-overflow when it runs into bad memory. However, the write is well past `end`, indicating the loop is not exiting when it should.
I dumped the IR for this method, and it looks like the beginning & end changed like so:
%add.ptr56 = getelementptr inbounds i8, i8* %cond, i64 -2, !dbg !1195 ; This is "end - 2"
br i1 %cmp57212, label %while.body, label %while.end, !dbg !1197
while.body: ; preds = %if.then48, %163
%incdec.ptr65 = getelementptr inbounds i8, i8* %dst.addr.4214, i64 3, !dbg !1202 ; This is the last "*dst++ = 0x20" expression
%cmp57 = icmp ult i8* %incdec.ptr65, %add.ptr56, !dbg !1196 ; Compare dst against end - 2
br i1 %cmp57, label %while.body, label %while.end, !dbg !1197, !llvm.loop !1205
After this patch, we see:
%add.ptr56 = getelementptr inbounds i8, i8* %cond, i64 -2, !dbg !1195
br i1 %cmp57212, label %while.body.preheader, label %while.end, !dbg !1197
while.body.preheader: ; preds = %if.then48
%scevgep228 = getelementptr i8, i8* %dst.addr.3, i64 3, !dbg !1197
br label %while.body, !dbg !1197
while.body: ; preds = %while.body.preheader, %163
%incdec.ptr65 = getelementptr inbounds i8, i8* %dst.addr.4214, i64 3, !dbg !1202
%cmp57229 = icmp ult i8* %scevgep228, %add.ptr56, !dbg !1197
br i1 %cmp57229, label %while.body, label %while.end, !dbg !1197, !llvm.loop !1205
If I'm reading the diff correctly, this patch adds a `while.body.preheader` BB for the SCEV GEP expression, and at the end compares that against `end - 2` to exit the loop. However, we jump back to the loop body, not the preheader, so the GEP expression is never updated. And hence, we end up with an infinite loop.
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
More information about the llvm-commits