[PATCH] D104266: [DFSan] Handle landingpad inst explicitly as zero shadow.
stephan.yichao.zhao via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 15 13:58:21 PDT 2021
stephan.yichao.zhao added inline comments.
================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:2567
+ // Do nothing.
+ // See https://github.com/google/sanitizers/issues/504
+ DFSF.setShadow(&LPI, DFSF.DFS.getZeroShadow(&LPI));
----------------
browneee wrote:
> stephan.yichao.zhao wrote:
> > This link does not explain this issue in details.
> > https://github.com/llvm/llvm-project/blame/dde9dcc24b23e0b8185cf7ce5072c0dc8ff086c9/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp shows that at the very beginning, MemorySanitizer has the comment here.
> >
> > From https://llvm.org/docs/ExceptionHandling.html#try-catch, we can see in practice the first element of the return value of a landing pad is an exception.
> > So this means it is possible to have data flows from an exception raise to here, but not 100% sure.
> > For MSan, it is only to ignore tracking this because at runtime all returned values can be assumed initialized. For DFSan, this may miss some flows.
> >
> > @eugenis: Do you still happen to remember how https://github.com/google/sanitizers/issues/504 affected MSan? I did not see how other sanitizers handle landing pads. Will DFSan be affected too?
> >
> > Not sure if https://github.com/google/sanitizers/issues/504 implies any
> Testing throwing data:
>
> ```
> int foo(int a) {
> throw a;
> }
>
> int bar(int a) {
> try {
> foo(a);
> } catch (int x) {
> return x;
> }
> return 0;
> }
> ```
>
> ```
> ; Function Attrs: noinline optnone uwtable
> define dso_local i32 @_Z3fooi(i32 %0) #0 {
> %2 = alloca i32, align 4
> store i32 %0, i32* %2, align 4
> %3 = call i8* @__cxa_allocate_exception(i64 4) #4 ; STORAGE FROM ABI
> %4 = bitcast i8* %3 to i32*
> %5 = load i32, i32* %2, align 4
> store i32 %5, i32* %4, align 16 ; STORE VALUE TO THROW
> call void @__cxa_throw(i8* %3, i8* bitcast (i8** @_ZTIi to i8*), i8* null) #5
> unreachable
> }
>
> declare dso_local i8* @__cxa_allocate_exception(i64)
>
> declare dso_local void @__cxa_throw(i8*, i8*, i8*)
>
> ; Demangled: "bar(int)"
> ; Function Attrs: noinline optnone uwtable
> define dso_local i32 @_Z3bari(i32 %0) #0 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
> %2 = alloca i32, align 4
> %3 = alloca i32, align 4
> %4 = alloca i8*, align 8
> %5 = alloca i32, align 4
> %6 = alloca i32, align 4
> store i32 %0, i32* %3, align 4
> %7 = load i32, i32* %3, align 4
> %8 = invoke i32 @_Z3fooi(i32 %7)
> to label %9 unwind label %10
>
> 9: ; preds = %1
> br label %24
>
> 10: ; preds = %1
> %11 = landingpad { i8*, i32 }
> catch i8* bitcast (i8** @_ZTIi to i8*)
> %12 = extractvalue { i8*, i32 } %11, 0
> store i8* %12, i8** %4, align 8
> %13 = extractvalue { i8*, i32 } %11, 1
> store i32 %13, i32* %5, align 4
> br label %14
>
> 14: ; preds = %10
> %15 = load i32, i32* %5, align 4
> %16 = call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIi to i8*)) #4
> %17 = icmp eq i32 %15, %16
> br i1 %17, label %18, label %27
>
> 18: ; preds = %14
> %19 = load i8*, i8** %4, align 8
> %20 = call i8* @__cxa_begin_catch(i8* %19) #4
> %21 = bitcast i8* %20 to i32*
> %22 = load i32, i32* %21, align 4 ; LOAD VALUE FROM THROW
> store i32 %22, i32* %6, align 4
> %23 = load i32, i32* %6, align 4
> store i32 %23, i32* %2, align 4
> call void @__cxa_end_catch() #4
> br label %25
>
> 24: ; preds = %9
> store i32 0, i32* %2, align 4
> br label %25
>
> 25: ; preds = %24, %18
> %26 = load i32, i32* %2, align 4
> ret i32 %26
>
> 27: ; preds = %14
> %28 = load i8*, i8** %4, align 8
> %29 = load i32, i32* %5, align 4
> %30 = insertvalue { i8*, i32 } undef, i8* %28, 0
> %31 = insertvalue { i8*, i32 } %30, i32 %29, 1
> resume { i8*, i32 } %31
> }
> ```
>
> It seems like the ABI will provide storage for data flows. DFSan does handle this flow correctly.
>
> As for:
> %12 = extractvalue { i8*, i32 } %11, 0
> %13 = extractvalue { i8*, i32 } %11, 1
>
> I'd expect the indirection in `i8* %12` would allow DFSan to have a shadow for any data at load(%12).
> `i32 %13` is a type ID, and should never be tainted.
>
> I'm thinking I should:
> - remove the out of date issue link from MSan and
> - add comments about not needing to track flows through landingpad.
Thank you for extracting this.
```
%12 = extractvalue { i8*, i32 } %11, 0
store i8* %12, i8** %4, align 8
...
%19 = load i8*, i8** %4, align 8
%20 = call i8* @__cxa_begin_catch(i8* %19) #4
%21 = bitcast i8* %20 to i32*
%22 = load i32, i32* %21, align 4 ; LOAD VALUE FROM THROW
store i32 %22, i32* %6, align 4
%23 = load i32, i32* %6, align 4
store i32 %23, i32* %2, align 4
...
br label %25
25: ; preds = %24, %18
%26 = load i32, i32* %2, align 4
ret i32 %26
```
>From the above, %12 eventually goes to %26 via __cxa_begin_catch.
If we build the above c code with dfsan and C++ exception enabled, does it have a link error saying dfs$__cxa_begin_catch cannot be found? If there is no such an error, I think our problem is that we do not have a correct wrapper of __cxa_begin_catch, so %19 may not go to %20 correctly, by default its return value will have 0 shadow. .../compiler-rt/lib/dfsan/done_abilist.txt does not have this.
https://llvm.org/docs/ExceptionHandling.html says "__cxa_begin_catch takes an exception structure reference as an argument and returns the value of the exception object." To support this, the wrapper needs to help load correct shadow and assign it to the shadow of what it ret points to.
But we cannot provide a wrapper to this builtin libcall yet, and can only wrap glibc calls.
So I think zero-out here is like a workaround before we can have a correct wrapper of __cxa_begin_catch.
Please add a comment to explain this as a TODO after we can wrapp libcall APIs.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D104266/new/
https://reviews.llvm.org/D104266
More information about the llvm-commits
mailing list