[llvm-commits] PATCH: Fix AddressSanitizer to emit basic blocks in the natural order for the CFG

Chandler Carruth chandlerc at gmail.com
Sun Jul 15 15:58:59 PDT 2012


On Tue, Jul 10, 2012 at 9:46 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:

>
> On Jul 10, 2012, at 9:21 AM, Chandler Carruth <chandlerc at gmail.com> wrote:
>
> On Tue, Jul 10, 2012 at 9:13 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>
>> Unfortunately, we can't bail out of register allocation.
>
> But is there a cheaper algorithm we could fall back to? I don't know the
> first thing about regalloc, so maybe this doesn't make sense. Just trying
> to get a feel for whether this is solvable within the backend, or we simply
> must not produce such inputs.
>
>
> It is not impossible to fall back to RAFast which doesn't compute liveness
> at all, and so won't have this problem. It's a giant hack that I'd rather
> not do, though.
>
> The fundamental problem is that LLVM doesn't have an IR optimizer, it has
> an IR canonicalizer. In this case, it's probably LICM hoisting a thousand
> GEPs out of a thousand-block loop, creating the quadratic problem.
> Normally, CodeGenPrepare would sink those GEPs again, but in this case they
> are used by PHIs which CGP won't touch.
>

As it happens, that's not it. Here is a snippet of the generated code:

=====
define void @bar(i32* %a) nounwind uwtable address_safety {
  tail call void @llvm.dbg.value(metadata !{i32* %a}, i64 0, metadata !13), !dbg !14
  %1 = tail call i32 (...)* @foo() nounwind, !dbg !15
  %2 = getelementptr inbounds i32* %a, i64 2877, !dbg !15
  %3 = ptrtoint i32* %2 to i64, !dbg !15
  %4 = lshr i64 %3, 3, !dbg !15
  %5 = or i64 %4, 17592186044416, !dbg !15
  %6 = inttoptr i64 %5 to i8*, !dbg !15
  %7 = load i8* %6, align 1, !dbg !15
  %8 = icmp eq i8 %7, 0, !dbg !15
  br i1 %8, label %15, label %9

; <label>:9                                       ; preds = %0
  %10 = and i64 %3, 7
  %11 = add i64 %10, 3
  %12 = trunc i64 %11 to i8
  %13 = icmp slt i8 %12, %7
  br i1 %13, label %15, label %14

; <label>:14                                      ; preds = %9
  call void @__asan_report_store4(i64 %3) noreturn nounwind, !dbg !15
  unreachable

; <label>:15                                      ; preds = %9, %0
  store i32 %1, i32* %2, align 4, !dbg !15, !tbaa !17
  %16 = tail call i32 (...)* @foo() nounwind, !dbg !20
  %17 = getelementptr inbounds i32* %a, i64 20955, !dbg !20
  %18 = ptrtoint i32* %17 to i64, !dbg !20
  %19 = lshr i64 %18, 3, !dbg !20
  %20 = or i64 %19, 17592186044416, !dbg !20
  %21 = inttoptr i64 %20 to i8*, !dbg !20
  %22 = load i8* %21, align 1, !dbg !20
  %23 = icmp eq i8 %22, 0, !dbg !20
  br i1 %23, label %30, label %24

; <label>:24                                      ; preds = %15
  %25 = and i64 %18, 7
  %26 = add i64 %25, 3
  %27 = trunc i64 %26 to i8
  %28 = icmp slt i8 %27, %22
  br i1 %28, label %30, label %29

; <label>:29                                      ; preds = %24
  call void @__asan_report_store4(i64 %18) noreturn nounwind, !dbg !20
  unreachable
=====

The pattern established here repeats.
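
To make the repeated pattern easier to read, here is a rough C model of what each instrumented store does in the IR above. It is only a sketch: the names shadow_address and store4_is_bad are made up for illustration, and the shadow byte is passed in as a parameter rather than loaded from real shadow memory. The constant is the one from the snippet (17592186044416 is 1 << 44).

```c
#include <assert.h>   /* only needed for the usage checks below */
#include <stdbool.h>
#include <stdint.h>

/* Offset OR'ed into the shifted address in the IR: 17592186044416 == 1 << 44. */
#define SHADOW_OFFSET (1ULL << 44)

/* Hypothetical helper: models the lshr-by-3 followed by the or. */
static uint64_t shadow_address(uint64_t addr) {
    return (addr >> 3) | SHADOW_OFFSET;
}

/* Hypothetical helper: models the two check blocks for a 4-byte store.
   `shadow` stands in for the i8 loaded from the shadow address.
   Returns true when the IR would reach __asan_report_store4. */
static bool store4_is_bad(uint64_t addr, int8_t shadow) {
    if (shadow == 0)
        return false;                        /* first branch: go straight to the store */
    int8_t last = (int8_t)((addr & 7) + 3);  /* and, add 3, trunc to i8 */
    return !(last < shadow);                 /* icmp slt true means the store is fine */
}
```

For example, assert(!store4_is_bad(0x1000, 0)) and assert(store4_is_bad(0x1002, 4)) both hold under this model. Every store gets its own copy of these two blocks plus the report call.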


> If you sink those GEPs, I expect the problem will go away.
>

I don't see any way to sink the GEPs here... We're not even using them in
the two instrumentation basic blocks, we're using the result of ptrtoint on
the gep.

My suggestion for a long-term fix is to avoid duplicating these two basic
blocks for every store, and instead have a single set of instrumentation
blocks that select the questionable value through a phi-node. The only
tricky thing is that we'll also need to pass down debug info somehow, as
the current scheme relies on the debug info being attached to each
particular call to the runtime library.
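
As a loose analogy in C (not how ASan is implemented, and not a worked-out design), the shared blocks behave like one routine that every store site funnels into. The addr parameter plays the role of the phi-selected value, and the site parameter is a hypothetical stand-in for whatever would carry the debug location. All names here (struct report, shared_check4, site) are invented for illustration.

```c
#include <assert.h>   /* only needed for the usage checks below */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of what the single shared report would see. */
struct report {
    bool     fired;  /* true once a bad store has been reported */
    uint64_t addr;   /* the address selected for this check (the "phi" value) */
    unsigned site;   /* hypothetical id standing in for the debug location */
};

/* One shared copy of the check for 4-byte stores, instead of one copy
   per store. `shadow` stands in for the byte loaded from shadow memory. */
static void shared_check4(uint64_t addr, int8_t shadow, unsigned site,
                          struct report *out) {
    if (shadow == 0)
        return;                              /* whole 8-byte granule is addressable */
    if ((int8_t)((addr & 7) + 3) < shadow)
        return;                              /* last accessed byte is still in range */
    out->fired = true;                       /* the single shared report */
    out->addr  = addr;
    out->site  = site;
}
```

Calling shared_check4(0x1002, 4, 2, &r) on a zero-initialized r sets r.fired and records addr 0x1002 and site 2. Note that a C call returns to its caller for free; shared blocks in IR would also need some way to branch back to the right store, which this sketch does not address.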


In the long term, I don't think ASan should be relying on the code motion
passes to clean up after it -- it should be emitting IR that is tuned for
the subsequent passes.