[PATCH] Implement function type checker for the undefined behavior sanitizer.

Richard Smith richard at metafoo.co.uk
Mon Oct 14 21:12:36 PDT 2013


  Clever! The approach only makes me feel a //little// uneasy =)

  It would be polite to issue a diagnostic if someone explicitly asks for this (not via `-fsanitize=undefined`) and we don't support it for their target.


================
Comment at: lib/CodeGen/CGExpr.cpp:3129
@@ +3128,3 @@
+      !dyn_cast_or_null<FunctionDecl>(TargetDecl)) {
+    if (llvm::Constant* PDSig =
+            CGM.getTargetCodeGenInfo().getUBSanFunctionSignature(CGM)) {
----------------
`*` on the right.

================
Comment at: lib/CodeGen/CGExpr.cpp:3128
@@ +3127,3 @@
+  if (getLangOpts().CPlusPlus && SanOpts->Function &&
+      !dyn_cast_or_null<FunctionDecl>(TargetDecl)) {
+    if (llvm::Constant* PDSig =
----------------
This is a bit hard to parse. `(!TargetDecl || !isa<FunctionDecl>(TargetDecl))` maybe?

================
Comment at: lib/CodeGen/CGExpr.cpp:3133-3134
@@ +3132,4 @@
+          CGM.GetAddrOfRTTIDescriptor(QualType(FnType, 0), /*ForEH=*/true);
+      llvm::Type *PDStructTyElems[] = {PDSig->getType(),
+                                       FTRTTIConst->getType()};
+      llvm::StructType *PDStructTy = llvm::StructType::get(
----------------
We usually format this as

    llvm::Type *PDStructTyElems[] = {
      PDSig->getType(),
      FTRTTIConst->getType()
    };

or similar.

================
Comment at: lib/CodeGen/CGExpr.cpp:3155-3156
@@ +3154,4 @@
+          Builder.CreateICmpEQ(CalleeRTTI, FTRTTIConst);
+      llvm::Constant *StaticData[] = {EmitCheckSourceLocation(CallLoc),
+                                      EmitCheckTypeDescriptor(CalleeType)};
+      EmitCheck(CalleeRTTIMatch,
----------------
Likewise.

================
Comment at: lib/CodeGen/CGExpr.cpp:3133
@@ +3132,3 @@
+          CGM.GetAddrOfRTTIDescriptor(QualType(FnType, 0), /*ForEH=*/true);
+      llvm::Type *PDStructTyElems[] = {PDSig->getType(),
+                                       FTRTTIConst->getType()};
----------------
Richard Smith wrote:
> We usually format this as
> 
>     llvm::Type *PDStructTyElems[] = {
>       PDSig->getType(),
>       FTRTTIConst->getType()
>     };
> 
> or similar.
Some indication of what "PD" stands for here would be useful.

================
Comment at: lib/CodeGen/CGExpr.cpp:3138-3142
@@ +3137,7 @@
+
+      llvm::Value *CalleePDStruct = Builder.CreateBitCast(
+          Callee, llvm::PointerType::getUnqual(PDStructTy));
+      llvm::Value *CalleeSigPtr =
+          Builder.CreateConstGEP2_32(CalleePDStruct, 0, 0);
+      llvm::Value *CalleeSig = Builder.CreateLoad(CalleeSigPtr);
+      llvm::Value *CalleeSigMatch = Builder.CreateICmpEQ(CalleeSig, PDSig);
----------------
What happens if the callee address is at the end of a page and doesn't have the extra data? Could this load trigger a segfault?

================
Comment at: lib/CodeGen/CodeGenFunction.cpp:531
@@ +530,3 @@
+            CGM.GetAddrOfRTTIDescriptor(FD->getType(), /*ForEH=*/true);
+        llvm::Constant *PDStructElems[] = {PDSig, FTRTTIConst};
+        llvm::Constant *PDStructConst =
----------------
Spaces around braces.

================
Comment at: lib/CodeGen/CodeGenFunction.cpp:523-524
@@ -521,1 +522,4 @@
 
+  // If we are checking function types, emit a function type signature as
+  // prefix data.
+  if (getLangOpts().CPlusPlus && SanOpts->Function) {
----------------
It'd be nice if we could only do this for address-taken functions. When we take the address of a function, we could emit `linkonce_odr` thunk with the extra data, and use that instead of the original address... except that would break address-of-function comparisons between instrumented and uninstrumented code. Can we provide an option to enable that behavior for the case where we're OK with an ABI change?

================
Comment at: lib/CodeGen/TargetInfo.cpp:607-608
@@ +606,4 @@
+                   (0x06 << 8) |  //           .+0x08
+                   ('F' << 16) |
+                   ('T' << 24);
+    return llvm::ConstantInt::get(CGM.Int32Ty, Sig);
----------------
What do 'F' and 'T' demangle as? An invalid instruction encoding would give me more confidence here.

================
Comment at: lib/CodeGen/CodeGenFunction.cpp:529-530
@@ +528,4 @@
+              CGM.getTargetCodeGenInfo().getUBSanFunctionSignature(CGM)) {
+        llvm::Constant *FTRTTIConst =
+            CGM.GetAddrOfRTTIDescriptor(FD->getType(), /*ForEH=*/true);
+        llvm::Constant *PDStructElems[] = {PDSig, FTRTTIConst};
----------------
Does this include the calling convention attributes?


http://llvm-reviews.chandlerc.com/D1338



More information about the cfe-commits mailing list