[LLVMdev] Incorrect loop optimization when building the Linux kernel

Sanjoy Das sanjoy at playingwithpointers.com
Sun Dec 7 21:09:25 PST 2014


LLVM seems to assume that two non-weak declarations can never point to
the same thing (so "__start_builtin_fw == __end_builtin_fw" is always
false).  In fact, foo in the following program constant folds to
return false (with a warning from clang):

struct F {};
extern struct F a[];
extern struct F b[];

int foo() {
  return a == b;
}

I don't know enough about the difference between different linkage
types to say if llvm's behavior is reasonable or not.


One place where this property is exploited (there are probably others):

(llvm/lib/IR/ConstantFold.cpp)

static ICmpInst::Predicate areGlobalsPotentiallyEqual(const GlobalValue *GV1,
                                                      const GlobalValue *GV2) {
  // Don't try to decide equality of aliases.
  if (!isa<GlobalAlias>(GV1) && !isa<GlobalAlias>(GV2))
    if (!GV1->hasExternalWeakLinkage() || !GV2->hasExternalWeakLinkage())
      return ICmpInst::ICMP_NE;
  return ICmpInst::BAD_ICMP_PREDICATE;
}


On Sun, Dec 7, 2014 at 8:21 PM, Chengyu Song <csong84 at gatech.edu> wrote:
> I was trying to build the Linux kernel with clang and observed a crash due to incorrect loop optimization:
>
> drivers/base/firmware_class.c
>
> extern struct builtin_fw __start_builtin_fw[];
> extern struct builtin_fw __end_builtin_fw[];
>
> static bool fw_get_builtin_firmware(struct firmware *fw, const char *name)
> {
>         struct builtin_fw *b_fw;
>         for (b_fw = __start_builtin_fw; b_fw != __end_builtin_fw; b_fw++) {
>                 if (strcmp(name, b_fw->name) == 0) {
>                         fw->size = b_fw->size;
>                         fw->data = b_fw->data;
>                         return true;
>                 }
>         }
>         return false;
> }
>
> When compiled with -O0, the comparison (b_fw != __end_builtin_fw) is executed before loop body, which is the expected behavior.
>
> But with -O1 optimization (opt -O1), the comparison is moved to the end of the loop, which causes the strcmp to use incorrect argument and finally crashes the kernel.
>
> define internal fastcc i1 @fw_get_builtin_firmware(%struct.firmware* nocapture %fw, i8* nocapture readonly %name) #0 {
>         tail call void @llvm.dbg.value(metadata !{%struct.firmware* %fw}, i64 0, metadata !3985, metadata !3750), !dbg !3986
>         tail call void @llvm.dbg.value(metadata !{i8* %name}, i64 0, metadata !3987, metadata !3750), !dbg !3988
>         tail call void @llvm.dbg.value(metadata !3839, i64 0, metadata !3989, metadata !3750), !dbg !3990
>         br label %1, !dbg !3991
>
> ; <label>:1                                       ; preds = %13, %0
>         %b_fw.02 = phi %struct.builtin_fw* [ getelementptr inbounds ([0 x %struct.builtin_fw]* @__start_builtin_fw, i64 0, i64 0), %0 ], [ %14, %13 ]
>         %2 = getelementptr inbounds %struct.builtin_fw* %b_fw.02, i64 0, i32 0, !dbg !3995
>         %3 = load i8** %2, align 8, !dbg !3995
>         %4 = tail call i32 @strcmp(i8* %name, i8* %3) #8, !dbg !3995
>         %5 = icmp eq i32 %4, 0, !dbg !3995
>         br i1 %5, label %6, label %13, !dbg !3995
>
> ; <label>:6                                       ; preds = %1
>         %b_fw.02.lcssa = phi %struct.builtin_fw* [ %b_fw.02, %1 ]
>         %7 = getelementptr inbounds %struct.builtin_fw* %b_fw.02.lcssa, i64 0, i32 2, !dbg !3999
>         %8 = load i64* %7, align 8, !dbg !3999
>         %9 = getelementptr inbounds %struct.firmware* %fw, i64 0, i32 0, !dbg !3999
>         store i64 %8, i64* %9, align 8, !dbg !3999
>         %10 = getelementptr inbounds %struct.builtin_fw* %b_fw.02.lcssa, i64 0, i32 1, !dbg !4001
>         %11 = load i8** %10, align 8, !dbg !4001
>         %12 = getelementptr inbounds %struct.firmware* %fw, i64 0, i32 1, !dbg !4001
>         store i8* %11, i8** %12, align 8, !dbg !4001
>         br label %.loopexit, !dbg !4002
>
> ; <label>:13                                      ; preds = %1
>         %14 = getelementptr %struct.builtin_fw* %b_fw.02, i64 1, !dbg !400
>         tail call void @llvm.dbg.value(metadata !{%struct.builtin_fw* %14}, i64 0, metadata !3989, metadata !3750), !dbg !3990
>         %15 = icmp eq %struct.builtin_fw* %14, getelementptr inbounds ([0 x %struct.builtin_fw]* @__end_builtin_fw, i64 0, i64 0), !dbg !3991
>         br i1 %15, label %.loopexit.loopexit, label %1, !dbg !3991
>
> .loopexit.loopexit:                               ; preds = %13
>         br label %.loopexit
>
> .loopexit:                                        ; preds = %.loopexit.loopexit, %6
>         %.0 = phi i1 [ true, %6 ], [ false, %.loopexit.loopexit ]
>         ret i1 %.0, !dbg !4004
> }
>
> Could anyone explain why this is unhappening and how to avoid it.
>
> Thanks,
> Chengyu
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



More information about the llvm-dev mailing list