[clang] 166c8cc - [msan][CodeGen] Set noundef for C return value

Vitaly Buka via cfe-commits cfe-commits at lists.llvm.org
Mon Dec 5 22:58:46 PST 2022


Author: Vitaly Buka
Date: 2022-12-05T22:58:29-08:00
New Revision: 166c8cccde010ea01fdc20945a2a25fa1971cb73

URL: https://github.com/llvm/llvm-project/commit/166c8cccde010ea01fdc20945a2a25fa1971cb73
DIFF: https://github.com/llvm/llvm-project/commit/166c8cccde010ea01fdc20945a2a25fa1971cb73.diff

LOG: [msan][CodeGen] Set noundef for C return value

Msan needs noundef consistency between interface and implementation. If
we call C++ from C we can have noundef on C++ side, and no noundef on
caller C side, noundef implementation will not set TLS for return value,
no noundef caller will expect it. Then we have false reports in msan.

The workaround could be set TLS to zero even for noundef return values.
However if we do that always it will increase binary size by about 10%.
If we do that selectively we need to handle "address is taken"
functions, any non local functions, and probably all function which have
musttail callers. Which is still a lot.

The existing implementation of HasStrictReturn refers to C standard as
the reason not enforcing noundef. I believe it applies only to the case
when return statement is omitted. Testing on Google codebase I never see
such cases, however I've see tens of cases where C code returns actual
uninitialized variables, but we ignore that it because of "omitted
return" case.

So this patch will:
1. fix false-positives with TLS missmatch.
2. detect bugs returning uninitialized variables for C as well.
3. report "omitted return" cases stricter than C, which is already a
   warning and very likely a bug in a code anyway.

Reviewed By: kda

Differential Revision: https://reviews.llvm.org/D139296

Added: 
    

Modified: 
    clang/lib/CodeGen/CGCall.cpp
    clang/test/CodeGen/cleanup-destslot-simple.c
    clang/test/CodeGen/msan-param-retval.c

Removed: 
    


################################################################################
diff  --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 07606877d843f..b94bf401d1949 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1798,6 +1798,13 @@ bool CodeGenModule::MayDropFunctionReturn(const ASTContext &Context,
 
 static bool HasStrictReturn(const CodeGenModule &Module, QualType RetTy,
                             const Decl *TargetDecl) {
+  // As-is msan can not tolerate noundef mismatch between caller and
+  // implementation. Mismatch is possible for e.g. indirect calls from C-caller
+  // into C++. Such mismatches lead to confusing false reports. To avoid
+  // expensive workaround on msan we enforce initialization event in uncommon
+  // cases where it's allowed.
+  if (Module.getLangOpts().Sanitize.has(SanitizerKind::Memory))
+    return true;
   // C++ explicitly makes returning undefined values UB. C's rule only applies
   // to used values, so we never mark them noundef for now.
   if (!Module.getLangOpts().CPlusPlus)
@@ -1818,7 +1825,6 @@ static bool HasStrictReturn(const CodeGenModule &Module, QualType RetTy,
   // Try to respect what the programmer intended.
   return Module.getCodeGenOpts().StrictReturn ||
          !Module.MayDropFunctionReturn(Module.getContext(), RetTy) ||
-         Module.getLangOpts().Sanitize.has(SanitizerKind::Memory) ||
          Module.getLangOpts().Sanitize.has(SanitizerKind::Return);
 }
 

diff  --git a/clang/test/CodeGen/cleanup-destslot-simple.c b/clang/test/CodeGen/cleanup-destslot-simple.c
index 2a0e682b0ab21..075afbba18cdb 100644
--- a/clang/test/CodeGen/cleanup-destslot-simple.c
+++ b/clang/test/CodeGen/cleanup-destslot-simple.c
@@ -64,7 +64,12 @@
 // CHECK-MSAN-NEXT:    [[_MSLD1:%.*]] = load i32, ptr [[TMP11]], align 4, !dbg [[DBG20]]
 // CHECK-MSAN-NEXT:    call void @llvm.lifetime.end.p0(i64 8, ptr nonnull [[P]]), !dbg [[DBG22:![0-9]+]]
 // CHECK-MSAN-NEXT:    call void @llvm.lifetime.end.p0(i64 4, ptr nonnull [[X]]) #[[ATTR2]], !dbg [[DBG22]]
-// CHECK-MSAN-NEXT:    store i32 [[_MSLD1]], ptr @__msan_retval_tls, align 8, !dbg [[DBG23:![0-9]+]]
+// CHECK-MSAN-NEXT:    [[_MSCMP2_NOT:%.*]] = icmp eq i32 [[_MSLD1]], 0, !dbg [[DBG23:![0-9]+]]
+// CHECK-MSAN-NEXT:    br i1 [[_MSCMP2_NOT]], label [[TMP13:%.*]], label [[TMP12:%.*]], !dbg [[DBG23]], !prof [[PROF21]]
+// CHECK-MSAN:       12:
+// CHECK-MSAN-NEXT:    call void @__msan_warning_noreturn() #[[ATTR3]], !dbg [[DBG23]]
+// CHECK-MSAN-NEXT:    unreachable, !dbg [[DBG23]]
+// CHECK-MSAN:       13:
 // CHECK-MSAN-NEXT:    ret i32 [[TMP8]], !dbg [[DBG23]]
 //
 // CHECK-KMSAN-LABEL: @test(

diff  --git a/clang/test/CodeGen/msan-param-retval.c b/clang/test/CodeGen/msan-param-retval.c
index b003ae21ba82c..744d70620bdc4 100644
--- a/clang/test/CodeGen/msan-param-retval.c
+++ b/clang/test/CodeGen/msan-param-retval.c
@@ -23,13 +23,20 @@ int foo() {
   return 1;
 }
 
-// CHECK: define dso_local i32 @foo() #0 {
-// CHECK:   @__msan_retval_tls
+// CLEAN:   define dso_local i32 @foo() #0 {
+// NOUNDEF: define dso_local noundef i32 @foo() #0 {
+// CLEAN:        @__msan_retval_tls
+// NOUNDEF_ONLY: @__msan_retval_tls
+// EAGER-NOT:    @__msan_retval_tls
 // CHECK: }
 
 int noret() {
 }
 
-// CHECK: define dso_local i32 @noret() #0 {
+// CLEAN:   define dso_local i32 @noret() #0 {   
+// NOUNDEF: define dso_local noundef i32 @noret() #0 {
 // CHECK:   %retval = alloca
-// CHECK: }
\ No newline at end of file
+// CLEAN:        @__msan_retval_tls
+// NOUNDEF_ONLY: @__msan_retval_tls
+// EAGER-NOT:    @__msan_retval_tls
+// CHECK: }


        


More information about the cfe-commits mailing list