r313096 - [ubsan] Function Sanitizer: Don't require writable text segments

Peter Collingbourne via cfe-commits cfe-commits at lists.llvm.org
Fri Oct 13 16:06:14 PDT 2017


Here's a small reproducer.

struct A {
  virtual void f(...);
};

struct B : virtual A {
  virtual void b();
  virtual void f(...);
};

void B::f(...) {}

$ clang++ -fsanitize=function fsan.cpp -ffunction-sections -fdata-sections
-c -o /dev/null
fatal error: error in backend: Cannot represent a difference across sections

Looking at the IR I see this function definition:
define void @_ZTv0_n24_N1B1fEz(%struct.B* %this, ...) unnamed_addr #0 align
2 prologue <{ i32, i32 }> <{ i32 846595819, i32 trunc (i64 sub (i64
ptrtoint (i8** @0 to i64), i64 ptrtoint (void (%struct.B*, ...)* @_ZN1B1fEz
to i64)) to i32) }> {
which appears to cause the error.

I get basically the same IR if I use a Darwin target triple, so this isn't
a Linux-specific issue. (On Darwin we end up successfully creating an
object file, but the embedded offset in the text section will presumably be
incorrect.)

Note that we're emitting prologue data on virtual functions, which is
unnecessary because -fsanitize=function only checks indirect calls via
function pointers. So I imagine that one way to solve the problem would be
to turn off prologue data emission on non-virtual functions.

Peter

On Fri, Oct 13, 2017 at 3:06 PM, Vedant Kumar <vsk at apple.com> wrote:

>
> On Oct 13, 2017, at 2:52 PM, Eric Christopher <echristo at gmail.com> wrote:
>
>
>
> On Fri, Oct 13, 2017 at 2:50 PM Vedant Kumar <vsk at apple.com> wrote:
>
>> On Oct 13, 2017, at 1:44 PM, Eric Christopher <echristo at gmail.com> wrote:
>>
>>
>>
>> On Fri, Oct 13, 2017 at 1:42 PM Vedant Kumar <vsk at apple.com> wrote:
>>
>>> On Oct 13, 2017, at 1:39 PM, Vedant Kumar <vsk at apple.com> wrote:
>>>
>>> Hey Eric,
>>>
>>> I'm sorry for the breakage. I made sure to check the run-time tests in
>>> compiler-rt but we could have missing coverage there.
>>>
>>> The original version of this patch restricted the prologue data changes
>>> to Darwin only. We can switch back to that easily, just let me know.
>>>
>>>
>>> Actually I'll go ahead and work a patch up.
>>>
>>>
>> Appreciated :)
>>
>> Basically we were getting an error of:
>>
>> error: Cannot represent a difference across sections
>>
>> trying to compile things with the current code.
>>
>>
>> Oh I see.. well, we started using a difference between the address of a
>> function and the address of a global, so the error makes sense.
>>
>> I'd be interested in any factors that could narrow the problem down (e.g
>> using a specific linker, using -ffunction-sections, using data-sections,
>> etc). Basically I'm not sure why this would work on some Linux setups but
>> not others.
>>
>>
> Definitely using the latter two options and gold as a linker. I'll see
> what Han can come up with.
>
>
> Gotcha. Well, -ffunction-sections appears to be untested in
> compiler-rt/test/ubsan, at least.
>
> There's a test somewhere in there called function.cpp -- it would be great
> if we could cover the *-sections options there. I'm not sure whether that's
> what caused the failure, but the extra coverage couldn't hurt :). I would
> do it myself but I don't have a Linux machine to test on.
>
> vedant
>
>
>
>> While we figure that out here's a patch to limit the impact on non-Darwin
>> platforms:
>> https://reviews.llvm.org/D38903
>>
>
> *goes a looking*
>
> Thanks!
>
> -eric
>
>>
>> vedant
>>
>>
>> Thanks!
>>
>> -eric
>>
>>
>>> vedant
>>>
>>>
>>> vedant
>>>
>>>
>>> On Oct 13, 2017, at 1:33 PM, Eric Christopher <echristo at gmail.com>
>>> wrote:
>>>
>>> Hi Vedant,
>>>
>>> So this actually broke -fsanitize=function on linux. Han is working up a
>>> testcase for it, but letting you know for now that we'll probably need some
>>> change here.
>>>
>>> -eric
>>>
>>> On Tue, Sep 12, 2017 at 5:05 PM Vedant Kumar via cfe-commits <
>>> cfe-commits at lists.llvm.org> wrote:
>>>
>>>> Author: vedantk
>>>> Date: Tue Sep 12 17:04:35 2017
>>>> New Revision: 313096
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=313096&view=rev
>>>> Log:
>>>> [ubsan] Function Sanitizer: Don't require writable text segments
>>>>
>>>> This change will make it possible to use -fsanitize=function on Darwin
>>>> and
>>>> possibly on other platforms. It fixes an issue with the way RTTI is
>>>> stored into
>>>> function prologue data.
>>>>
>>>> On Darwin, addresses stored in prologue data can't require run-time
>>>> fixups and
>>>> must be PC-relative. Run-time fixups are undesirable because they
>>>> necessitate
>>>> writable text segments, which can lead to security issues. And absolute
>>>> addresses are undesirable because they break PIE mode.
>>>>
>>>> The fix is to create a private global which points to the RTTI, and
>>>> then to
>>>> encode a PC-relative reference to the global into prologue data.
>>>>
>>>> Differential Revision: https://reviews.llvm.org/D37597
>>>>
>>>> Modified:
>>>>     cfe/trunk/lib/CodeGen/CGExpr.cpp
>>>>     cfe/trunk/lib/CodeGen/CodeGenFunction.cpp
>>>>     cfe/trunk/lib/CodeGen/CodeGenFunction.h
>>>>     cfe/trunk/lib/CodeGen/TargetInfo.cpp
>>>>     cfe/trunk/test/CodeGenCXX/catch-undef-behavior.cpp
>>>>
>>>> Modified: cfe/trunk/lib/CodeGen/CGExpr.cpp
>>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/
>>>> CodeGen/CGExpr.cpp?rev=313096&r1=313095&r2=313096&view=diff
>>>> ============================================================
>>>> ==================
>>>> --- cfe/trunk/lib/CodeGen/CGExpr.cpp (original)
>>>> +++ cfe/trunk/lib/CodeGen/CGExpr.cpp Tue Sep 12 17:04:35 2017
>>>> @@ -4409,10 +4409,7 @@ RValue CodeGenFunction::EmitCall(QualTyp
>>>>        SanitizerScope SanScope(this);
>>>>        llvm::Constant *FTRTTIConst =
>>>>            CGM.GetAddrOfRTTIDescriptor(QualType(FnType, 0),
>>>> /*ForEH=*/true);
>>>> -      llvm::Type *PrefixStructTyElems[] = {
>>>> -        PrefixSig->getType(),
>>>> -        FTRTTIConst->getType()
>>>> -      };
>>>> +      llvm::Type *PrefixStructTyElems[] = {PrefixSig->getType(),
>>>> Int32Ty};
>>>>        llvm::StructType *PrefixStructTy = llvm::StructType::get(
>>>>            CGM.getLLVMContext(), PrefixStructTyElems,
>>>> /*isPacked=*/true);
>>>>
>>>> @@ -4433,8 +4430,10 @@ RValue CodeGenFunction::EmitCall(QualTyp
>>>>        EmitBlock(TypeCheck);
>>>>        llvm::Value *CalleeRTTIPtr =
>>>>            Builder.CreateConstGEP2_32(PrefixStructTy,
>>>> CalleePrefixStruct, 0, 1);
>>>> -      llvm::Value *CalleeRTTI =
>>>> +      llvm::Value *CalleeRTTIEncoded =
>>>>            Builder.CreateAlignedLoad(CalleeRTTIPtr, getPointerAlign());
>>>> +      llvm::Value *CalleeRTTI =
>>>> +          DecodeAddrUsedInPrologue(CalleePtr, CalleeRTTIEncoded);
>>>>        llvm::Value *CalleeRTTIMatch =
>>>>            Builder.CreateICmpEQ(CalleeRTTI, FTRTTIConst);
>>>>        llvm::Constant *StaticData[] = {
>>>>
>>>> Modified: cfe/trunk/lib/CodeGen/CodeGenFunction.cpp
>>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/
>>>> CodeGen/CodeGenFunction.cpp?rev=313096&r1=313095&r2=313096&view=diff
>>>> ============================================================
>>>> ==================
>>>> --- cfe/trunk/lib/CodeGen/CodeGenFunction.cpp (original)
>>>> +++ cfe/trunk/lib/CodeGen/CodeGenFunction.cpp Tue Sep 12 17:04:35 2017
>>>> @@ -429,6 +429,43 @@ bool CodeGenFunction::ShouldXRayInstrume
>>>>    return CGM.getCodeGenOpts().XRayInstrumentFunctions;
>>>>  }
>>>>
>>>> +llvm::Constant *
>>>> +CodeGenFunction::EncodeAddrForUseInPrologue(llvm::Function *F,
>>>> +                                            llvm::Constant *Addr) {
>>>> +  // Addresses stored in prologue data can't require run-time fixups
>>>> and must
>>>> +  // be PC-relative. Run-time fixups are undesirable because they
>>>> necessitate
>>>> +  // writable text segments, which are unsafe. And absolute addresses
>>>> are
>>>> +  // undesirable because they break PIE mode.
>>>> +
>>>> +  // Add a layer of indirection through a private global. Taking its
>>>> address
>>>> +  // won't result in a run-time fixup, even if Addr has linkonce_odr
>>>> linkage.
>>>> +  auto *GV = new llvm::GlobalVariable(CGM.getModule(),
>>>> Addr->getType(),
>>>> +                                      /*isConstant=*/true,
>>>> +                                      llvm::GlobalValue::PrivateLinkage,
>>>> Addr);
>>>> +
>>>> +  // Create a PC-relative address.
>>>> +  auto *GOTAsInt = llvm::ConstantExpr::getPtrToInt(GV, IntPtrTy);
>>>> +  auto *FuncAsInt = llvm::ConstantExpr::getPtrToInt(F, IntPtrTy);
>>>> +  auto *PCRelAsInt = llvm::ConstantExpr::getSub(GOTAsInt, FuncAsInt);
>>>> +  return (IntPtrTy == Int32Ty)
>>>> +             ? PCRelAsInt
>>>> +             : llvm::ConstantExpr::getTrunc(PCRelAsInt, Int32Ty);
>>>> +}
>>>> +
>>>> +llvm::Value *
>>>> +CodeGenFunction::DecodeAddrUsedInPrologue(llvm::Value *F,
>>>> +                                          llvm::Value *EncodedAddr) {
>>>> +  // Reconstruct the address of the global.
>>>> +  auto *PCRelAsInt = Builder.CreateSExt(EncodedAddr, IntPtrTy);
>>>> +  auto *FuncAsInt = Builder.CreatePtrToInt(F, IntPtrTy, "func_addr.int
>>>> ");
>>>> +  auto *GOTAsInt = Builder.CreateAdd(PCRelAsInt, FuncAsInt, "
>>>> global_addr.int");
>>>> +  auto *GOTAddr = Builder.CreateIntToPtr(GOTAsInt, Int8PtrPtrTy,
>>>> "global_addr");
>>>> +
>>>> +  // Load the original pointer through the global.
>>>> +  return Builder.CreateLoad(Address(GOTAddr, getPointerAlign()),
>>>> +                            "decoded_addr");
>>>> +}
>>>> +
>>>>  /// EmitFunctionInstrumentation - Emit LLVM code to call the specified
>>>>  /// instrumentation function with the current function and the call
>>>> site, if
>>>>  /// function instrumentation is enabled.
>>>> @@ -856,7 +893,10 @@ void CodeGenFunction::StartFunction(Glob
>>>>                CGM.getTargetCodeGenInfo().
>>>> getUBSanFunctionSignature(CGM)) {
>>>>          llvm::Constant *FTRTTIConst =
>>>>              CGM.GetAddrOfRTTIDescriptor(FD->getType(),
>>>> /*ForEH=*/true);
>>>> -        llvm::Constant *PrologueStructElems[] = { PrologueSig,
>>>> FTRTTIConst };
>>>> +        llvm::Constant *FTRTTIConstEncoded =
>>>> +            EncodeAddrForUseInPrologue(Fn, FTRTTIConst);
>>>> +        llvm::Constant *PrologueStructElems[] = {PrologueSig,
>>>> +                                                 FTRTTIConstEncoded};
>>>>          llvm::Constant *PrologueStructConst =
>>>>              llvm::ConstantStruct::getAnon(PrologueStructElems,
>>>> /*Packed=*/true);
>>>>          Fn->setPrologueData(PrologueStructConst);
>>>>
>>>> Modified: cfe/trunk/lib/CodeGen/CodeGenFunction.h
>>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/
>>>> CodeGen/CodeGenFunction.h?rev=313096&r1=313095&r2=313096&view=diff
>>>> ============================================================
>>>> ==================
>>>> --- cfe/trunk/lib/CodeGen/CodeGenFunction.h (original)
>>>> +++ cfe/trunk/lib/CodeGen/CodeGenFunction.h Tue Sep 12 17:04:35 2017
>>>> @@ -1776,6 +1776,15 @@ public:
>>>>    /// EmitMCountInstrumentation - Emit call to .mcount.
>>>>    void EmitMCountInstrumentation();
>>>>
>>>> +  /// Encode an address into a form suitable for use in a function
>>>> prologue.
>>>> +  llvm::Constant *EncodeAddrForUseInPrologue(llvm::Function *F,
>>>> +                                             llvm::Constant *Addr);
>>>> +
>>>> +  /// Decode an address used in a function prologue, encoded by \c
>>>> +  /// EncodeAddrForUseInPrologue.
>>>> +  llvm::Value *DecodeAddrUsedInPrologue(llvm::Value *F,
>>>> +                                        llvm::Value *EncodedAddr);
>>>> +
>>>>    /// EmitFunctionProlog - Emit the target specific LLVM code to load
>>>> the
>>>>    /// arguments for the given function. This is also responsible for
>>>> naming the
>>>>    /// LLVM function arguments.
>>>>
>>>> Modified: cfe/trunk/lib/CodeGen/TargetInfo.cpp
>>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/
>>>> CodeGen/TargetInfo.cpp?rev=313096&r1=313095&r2=313096&view=diff
>>>> ============================================================
>>>> ==================
>>>> --- cfe/trunk/lib/CodeGen/TargetInfo.cpp (original)
>>>> +++ cfe/trunk/lib/CodeGen/TargetInfo.cpp Tue Sep 12 17:04:35 2017
>>>> @@ -1086,8 +1086,8 @@ public:
>>>>    getUBSanFunctionSignature(CodeGen::CodeGenModule &CGM) const
>>>> override {
>>>>      unsigned Sig = (0xeb << 0) |  // jmp rel8
>>>>                     (0x06 << 8) |  //           .+0x08
>>>> -                   ('F' << 16) |
>>>> -                   ('T' << 24);
>>>> +                   ('v' << 16) |
>>>> +                   ('2' << 24);
>>>>      return llvm::ConstantInt::get(CGM.Int32Ty, Sig);
>>>>    }
>>>>
>>>> @@ -2277,17 +2277,10 @@ public:
>>>>
>>>>    llvm::Constant *
>>>>    getUBSanFunctionSignature(CodeGen::CodeGenModule &CGM) const
>>>> override {
>>>> -    unsigned Sig;
>>>> -    if (getABIInfo().has64BitPointers())
>>>> -      Sig = (0xeb << 0) |  // jmp rel8
>>>> -            (0x0a << 8) |  //           .+0x0c
>>>> -            ('F' << 16) |
>>>> -            ('T' << 24);
>>>> -    else
>>>> -      Sig = (0xeb << 0) |  // jmp rel8
>>>> -            (0x06 << 8) |  //           .+0x08
>>>> -            ('F' << 16) |
>>>> -            ('T' << 24);
>>>> +    unsigned Sig = (0xeb << 0) | // jmp rel8
>>>> +                   (0x06 << 8) | //           .+0x08
>>>> +                   ('v' << 16) |
>>>> +                   ('2' << 24);
>>>>      return llvm::ConstantInt::get(CGM.Int32Ty, Sig);
>>>>    }
>>>>
>>>>
>>>> Modified: cfe/trunk/test/CodeGenCXX/catch-undef-behavior.cpp
>>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/
>>>> CodeGenCXX/catch-undef-behavior.cpp?rev=313096&r1=
>>>> 313095&r2=313096&view=diff
>>>> ============================================================
>>>> ==================
>>>> --- cfe/trunk/test/CodeGenCXX/catch-undef-behavior.cpp (original)
>>>> +++ cfe/trunk/test/CodeGenCXX/catch-undef-behavior.cpp Tue Sep 12
>>>> 17:04:35 2017
>>>> @@ -16,6 +16,10 @@ struct S {
>>>>  // Check that type mismatch handler is not modified by ASan.
>>>>  // CHECK-ASAN: private unnamed_addr global { { [{{.*}} x i8]*, i32,
>>>> i32 }, { i16, i16, [4 x i8] }*, i8*, i8 } { {{.*}}, { i16, i16, [4 x i8] }*
>>>> [[TYPE_DESCR]], {{.*}} }
>>>>
>>>> +// CHECK: [[IndirectRTTI_ZTIFvPFviEE:@.+]] = private constant i8*
>>>> bitcast ({ i8*, i8* }* @_ZTIFvPFviEE to i8*)
>>>> +// CHECK-X86: [[IndirectRTTI_ZTIFvPFviEE:@.+]] = private constant i8*
>>>> bitcast ({ i8*, i8* }* @_ZTIFvPFviEE to i8*)
>>>> +// CHECK-X32: [[IndirectRTTI_ZTIFvPFviEE:@.+]] = private constant i8*
>>>> bitcast ({ i8*, i8* }* @_ZTIFvPFviEE to i8*)
>>>> +
>>>>  struct T : S {};
>>>>
>>>>  // CHECK-LABEL: @_Z17reference_binding
>>>> @@ -395,23 +399,30 @@ void downcast_reference(B &b) {
>>>>    // CHECK-NEXT: br i1 [[AND]]
>>>>  }
>>>>
>>>> -// CHECK-LABEL: @_Z22indirect_function_callPFviE({{.*}} prologue <{
>>>> i32, i8* }> <{ i32 1413876459, i8* bitcast ({ i8*, i8* }* @_ZTIFvPFviEE to
>>>> i8*) }>
>>>> -// CHECK-X32: @_Z22indirect_function_callPFviE({{.*}} prologue <{
>>>> i32, i8* }> <{ i32 1413875435, i8* bitcast ({ i8*, i8* }* @_ZTIFvPFviEE to
>>>> i8*) }>
>>>> -// CHECK-X86: @_Z22indirect_function_callPFviE({{.*}} prologue <{
>>>> i32, i8* }> <{ i32 1413875435, i8* bitcast ({ i8*, i8* }* @_ZTIFvPFviEE to
>>>> i8*) }>
>>>> +//
>>>> +// CHECK-LABEL: @_Z22indirect_function_callPFviE({{.*}} prologue <{
>>>> i32, i32 }> <{ i32 846595819, i32 trunc (i64 sub (i64 ptrtoint (i8** {{.*}}
>>>> to i64), i64 ptrtoint (void (void (i32)*)* @_Z22indirect_function_callPFviE
>>>> to i64)) to i32) }>
>>>> +// CHECK-X32: @_Z22indirect_function_callPFviE({{.*}} prologue <{
>>>> i32, i32 }> <{ i32 846595819, i32 sub (i32 ptrtoint (i8**
>>>> [[IndirectRTTI_ZTIFvPFviEE]] to i32), i32 ptrtoint (void (void (i32)*)*
>>>> @_Z22indirect_function_callPFviE to i32)) }>
>>>> +// CHECK-X86: @_Z22indirect_function_callPFviE({{.*}} prologue <{
>>>> i32, i32 }> <{ i32 846595819, i32 sub (i32 ptrtoint (i8**
>>>> [[IndirectRTTI_ZTIFvPFviEE]] to i32), i32 ptrtoint (void (void (i32)*)*
>>>> @_Z22indirect_function_callPFviE to i32)) }>
>>>>  void indirect_function_call(void (*p)(int)) {
>>>> -  // CHECK: [[PTR:%.+]] = bitcast void (i32)* {{.*}} to <{ i32, i8* }>*
>>>> +  // CHECK: [[PTR:%.+]] = bitcast void (i32)* {{.*}} to <{ i32, i32 }>*
>>>>
>>>>    // Signature check
>>>> -  // CHECK-NEXT: [[SIGPTR:%.+]] = getelementptr <{ i32, i8* }>, <{
>>>> i32, i8* }>* [[PTR]], i32 0, i32 0
>>>> +  // CHECK-NEXT: [[SIGPTR:%.+]] = getelementptr <{ i32, i32 }>, <{
>>>> i32, i32 }>* [[PTR]], i32 0, i32 0
>>>>    // CHECK-NEXT: [[SIG:%.+]] = load i32, i32* [[SIGPTR]]
>>>> -  // CHECK-NEXT: [[SIGCMP:%.+]] = icmp eq i32 [[SIG]], 1413876459
>>>> +  // CHECK-NEXT: [[SIGCMP:%.+]] = icmp eq i32 [[SIG]], 846595819
>>>>    // CHECK-NEXT: br i1 [[SIGCMP]]
>>>>
>>>>    // RTTI pointer check
>>>> -  // CHECK: [[RTTIPTR:%.+]] = getelementptr <{ i32, i8* }>, <{ i32,
>>>> i8* }>* [[PTR]], i32 0, i32 1
>>>> -  // CHECK-NEXT: [[RTTI:%.+]] = load i8*, i8** [[RTTIPTR]]
>>>> +  // CHECK: [[RTTIPTR:%.+]] = getelementptr <{ i32, i32 }>, <{ i32,
>>>> i32 }>* [[PTR]], i32 0, i32 1
>>>> +  // CHECK-NEXT: [[RTTIEncIntTrunc:%.+]] = load i32, i32* [[RTTIPTR]]
>>>> +  // CHECK-NEXT: [[RTTIEncInt:%.+]] = sext i32 [[RTTIEncIntTrunc]] to
>>>> i64
>>>> +  // CHECK-NEXT: [[FuncAddrInt:%.+]] = ptrtoint void (i32)* {{.*}} to
>>>> i64
>>>> +  // CHECK-NEXT: [[IndirectGVInt:%.+]] = add i64 [[RTTIEncInt]],
>>>> [[FuncAddrInt]]
>>>> +  // CHECK-NEXT: [[IndirectGV:%.+]] = inttoptr i64 [[IndirectGVInt]]
>>>> to i8**
>>>> +  // CHECK-NEXT: [[RTTI:%.+]] = load i8*, i8** [[IndirectGV]], align 8
>>>>    // CHECK-NEXT: [[RTTICMP:%.+]] = icmp eq i8* [[RTTI]], bitcast ({
>>>> i8*, i8* }* @_ZTIFviE to i8*)
>>>>    // CHECK-NEXT: br i1 [[RTTICMP]]
>>>> +
>>>>    p(42);
>>>>  }
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> cfe-commits mailing list
>>>> cfe-commits at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>>>
>>>
>


-- 
-- 
Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20171013/63d11e71/attachment-0001.html>


More information about the cfe-commits mailing list