[LLVMdev] [cfe-dev] UB in TypeLoc casting

Eli Friedman eli.friedman at gmail.com
Thu Nov 29 19:43:46 PST 2012


On Thu, Nov 29, 2012 at 3:49 PM, David Blaikie <dblaikie at gmail.com> wrote:
> Moving to LLVM dev to discuss the possibility of extending the cast
> infrastructure to handle this.
>
> On Tue, Nov 20, 2012 at 5:51 PM, John McCall <rjmccall at apple.com> wrote:
>> On Nov 18, 2012, at 5:05 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>> TypeLoc casting looks bogus.
>>>
>>> TypeLoc derived types return true from classof when the dynamic type
>>> is not actually an instance of the derived type. The TypeLoc is then
>>> (erroneously) cast to the derived type and the derived type's implicit
>>> copy ctor is executed (so long as you remember to assign the result of
>>> cast<SpecificTypeLoc>), if you copy, otherwise the SpecificTypeLoc's
>>> member functions are actually invoked on a TypeLoc object - bogas, but
>>> it works (because there's no other members in the SpecificTypeLoc
>>> types).
>>
>> Yep.  See also LLVM's IntrinsicInst.  This kind of UB is very common and
>> essentially harmless, but if you want to stamp it out, feel free.
>>
>>> Richard / Kostya: what's the word on catching this kind of UB
>>> (essentially: calling member functions of one type on an instance not
>>> of that type)? (in the specific case mentioned below, as discussed in
>>> the original thread, ASan or MSan might be able to help)
>>>
>>> Clang Dev: what should we do to fix this? I don't think it's helpful
>>> to add machinery to llvm::cast to handle concrete type conversion, so
>>> I think we should consider a non-llvm::cast solution for this
>>> hierarchy. Replace the llvm::isa calls with getTypeLocClass calls (or
>>> a separate wrapper for that) & add a SpecificTypeLoc(const TypeLoc&)
>>> ctor for every specific TypeLoc that has an appropriate assert & calls
>>> up through to the base copy ctor. It'll be a fair bit of code to add
>>> to TypeLoc, but nothing complicated.
>>>
>>> Any other ideas?
>>
>> I don't see why isa<> and cast<> aren't the right code interface for this,
>> rather than reinventing something else.  You just need to teach cast<>
>> to construct and return a proper temporary.
>
> LLVM-dev folks: what do you reckon? Should we extend the cast
> infrastructure to be able to handle value type conversions?

No, I don't think that makes sense; the casting infrastructure is
complicated enough without having to deal with this.  It would be easy
enough to implement a method on TypeLoc to handle this,

> It doesn't really feel like the right fit to me, but I'll defer to the
> community's wisdom if it's overwhelming. I'd just like to see a
> solution wherein we can't write this particular kind of bug again,
> ideally.

Attaching proof-of-concept solution which prevents this kind of bug.
(Ugly, and requires C++11 support as written, but potentially
interesting anyway.)

-Eli
-------------- next part --------------
Index: include/llvm/Support/Casting.h
===================================================================
--- include/llvm/Support/Casting.h	(revision 168949)
+++ include/llvm/Support/Casting.h	(working copy)
@@ -203,13 +203,27 @@
 //
 //  cast<Instruction>(myVal)->getParent()
 //
-template <class X, class Y>
+template <class X, class Y, class S = typename std::enable_if<!std::is_same<Y, typename simplify_type<Y>::SimpleType>::value>::type>
 inline typename cast_retty<X, Y>::ret_type cast(const Y &Val) {
   assert(isa<X>(Val) && "cast<Ty>() argument of incompatible type!");
   return cast_convert_val<X, Y,
                           typename simplify_type<Y>::SimpleType>::doit(Val);
 }
 
+template <class X, class Y, class S = typename std::enable_if<std::is_same<Y, typename simplify_type<Y>::SimpleType>::value>::type>
+inline typename cast_retty<X, Y>::ret_type cast(Y &Val) {
+  assert(isa<X>(Val) && "cast<Ty>() argument of incompatible type!");
+  return cast_convert_val<X, Y,
+                          typename simplify_type<Y>::SimpleType>::doit(Val);
+}
+
+template <class X, class Y, class S = typename std::enable_if<std::is_same<Y, typename simplify_type<Y>::SimpleType>::value>::type>
+inline typename cast_retty<X, Y*>::ret_type cast(Y *Val) {
+  assert(isa<X>(Val) && "cast<Ty>() argument of incompatible type!");
+  return cast_convert_val<X, Y*,
+                          typename simplify_type<Y*>::SimpleType>::doit(Val);
+}
+
 // cast_or_null<X> - Functionally identical to cast, except that a null value is
 // accepted.
 //
@@ -229,11 +243,21 @@
 //  if (const Instruction *I = dyn_cast<Instruction>(myVal)) { ... }
 //
 
-template <class X, class Y>
+template <class X, class Y, class S = typename std::enable_if<!std::is_same<Y, typename simplify_type<Y>::SimpleType>::value>::type>
 inline typename cast_retty<X, Y>::ret_type dyn_cast(const Y &Val) {
-  return isa<X>(Val) ? cast<X, Y>(Val) : 0;
+  return isa<X>(Val) ? cast<X>(Val) : 0;
 }
 
+template <class X, class Y, class S = typename std::enable_if<std::is_same<Y, typename simplify_type<Y>::SimpleType>::value>::type>
+inline typename cast_retty<X, Y>::ret_type dyn_cast(Y &Val) {
+  return isa<X>(Val) ? cast<X>(Val) : 0;
+}
+
+template <class X, class Y, class S = typename std::enable_if<std::is_same<Y, typename simplify_type<Y>::SimpleType>::value>::type>
+inline typename cast_retty<X, Y*>::ret_type dyn_cast(Y *Val) {
+  return isa<X>(Val) ? cast<X>(Val) : 0;
+}
+
 // dyn_cast_or_null<X> - Functionally identical to dyn_cast, except that a null
 // value is accepted.
 //
Index: lib/VMCore/AsmWriter.cpp
===================================================================
--- lib/VMCore/AsmWriter.cpp	(revision 168949)
+++ lib/VMCore/AsmWriter.cpp	(working copy)
@@ -1760,7 +1760,7 @@
 
   // Special case conditional branches to swizzle the condition out to the front
   if (isa<BranchInst>(I) && cast<BranchInst>(I).isConditional()) {
-    BranchInst &BI(cast<BranchInst>(I));
+    const BranchInst &BI(cast<BranchInst>(I));
     Out << ' ';
     writeOperand(BI.getCondition(), true);
     Out << ", ";
@@ -1769,14 +1769,14 @@
     writeOperand(BI.getSuccessor(1), true);
 
   } else if (isa<SwitchInst>(I)) {
-    SwitchInst& SI(cast<SwitchInst>(I));
+    const SwitchInst& SI(cast<SwitchInst>(I));
     // Special case switch instruction to get formatting nice and correct.
     Out << ' ';
     writeOperand(SI.getCondition(), true);
     Out << ", ";
     writeOperand(SI.getDefaultDest(), true);
     Out << " [";
-    for (SwitchInst::CaseIt i = SI.case_begin(), e = SI.case_end();
+    for (SwitchInst::ConstCaseIt i = SI.case_begin(), e = SI.case_end();
          i != e; ++i) {
       Out << "\n    ";
       writeOperand(i.getCaseValue(), true);
Index: lib/MC/MCAssembler.cpp
===================================================================
--- lib/MC/MCAssembler.cpp	(revision 168949)
+++ lib/MC/MCAssembler.cpp	(working copy)
@@ -336,7 +336,7 @@
   }
 
   case MCFragment::FT_Org: {
-    MCOrgFragment &OF = cast<MCOrgFragment>(F);
+    const MCOrgFragment &OF = cast<MCOrgFragment>(F);
     int64_t TargetLocation;
     if (!OF.getOffset().EvaluateAsAbsolute(TargetLocation, Layout))
       report_fatal_error("expected assembly-time absolute expression");
@@ -393,7 +393,7 @@
   uint64_t FragmentSize = Asm.computeFragmentSize(Layout, F);
   switch (F.getKind()) {
   case MCFragment::FT_Align: {
-    MCAlignFragment &AF = cast<MCAlignFragment>(F);
+    const MCAlignFragment &AF = cast<MCAlignFragment>(F);
     uint64_t Count = FragmentSize / AF.getValueSize();
 
     assert(AF.getValueSize() && "Invalid virtual align in concrete fragment!");
@@ -432,14 +432,14 @@
   }
 
   case MCFragment::FT_Data: {
-    MCDataFragment &DF = cast<MCDataFragment>(F);
+    const MCDataFragment &DF = cast<MCDataFragment>(F);
     assert(FragmentSize == DF.getContents().size() && "Invalid size!");
     OW->WriteBytes(DF.getContents().str());
     break;
   }
 
   case MCFragment::FT_Fill: {
-    MCFillFragment &FF = cast<MCFillFragment>(F);
+    const MCFillFragment &FF = cast<MCFillFragment>(F);
 
     assert(FF.getValueSize() && "Invalid virtual align in concrete fragment!");
 
@@ -456,19 +456,19 @@
   }
 
   case MCFragment::FT_Inst: {
-    MCInstFragment &IF = cast<MCInstFragment>(F);
+    const MCInstFragment &IF = cast<MCInstFragment>(F);
     OW->WriteBytes(StringRef(IF.getCode().begin(), IF.getCode().size()));
     break;
   }
 
   case MCFragment::FT_LEB: {
-    MCLEBFragment &LF = cast<MCLEBFragment>(F);
+    const MCLEBFragment &LF = cast<MCLEBFragment>(F);
     OW->WriteBytes(LF.getContents().str());
     break;
   }
 
   case MCFragment::FT_Org: {
-    MCOrgFragment &OF = cast<MCOrgFragment>(F);
+    const MCOrgFragment &OF = cast<MCOrgFragment>(F);
 
     for (uint64_t i = 0, e = FragmentSize; i != e; ++i)
       OW->Write8(uint8_t(OF.getValue()));
@@ -506,7 +506,7 @@
         // Check that we aren't trying to write a non-zero contents (or fixups)
         // into a virtual section. This is to support clients which use standard
         // directives to fill the contents of virtual sections.
-        MCDataFragment &DF = cast<MCDataFragment>(*it);
+        const MCDataFragment &DF = cast<MCDataFragment>(*it);
         assert(DF.fixup_begin() == DF.fixup_end() &&
                "Cannot have fixups in virtual section!");
         for (unsigned i = 0, e = DF.getContents().size(); i != e; ++i)
Index: lib/Bitcode/Writer/BitcodeWriter.cpp
===================================================================
--- lib/Bitcode/Writer/BitcodeWriter.cpp	(revision 168949)
+++ lib/Bitcode/Writer/BitcodeWriter.cpp	(working copy)
@@ -1174,7 +1174,7 @@
   case Instruction::Br:
     {
       Code = bitc::FUNC_CODE_INST_BR;
-      BranchInst &II = cast<BranchInst>(I);
+      const BranchInst &II = cast<BranchInst>(I);
       Vals.push_back(VE.getValueID(II.getSuccessor(0)));
       if (II.isConditional()) {
         Vals.push_back(VE.getValueID(II.getSuccessor(1)));
@@ -1189,7 +1189,7 @@
       SmallVector<uint64_t, 128> Vals64;
 
       Code = bitc::FUNC_CODE_INST_SWITCH;
-      SwitchInst &SI = cast<SwitchInst>(I);
+      const SwitchInst &SI = cast<SwitchInst>(I);
 
       uint32_t SwitchRecordHeader = SI.hash() | (SWITCH_INST_MAGIC << 16);
       Vals64.push_back(SwitchRecordHeader);
@@ -1198,9 +1198,9 @@
       pushValue64(SI.getCondition(), InstID, Vals64, VE);
       Vals64.push_back(VE.getValueID(SI.getDefaultDest()));
       Vals64.push_back(SI.getNumCases());
-      for (SwitchInst::CaseIt i = SI.case_begin(), e = SI.case_end();
+      for (SwitchInst::ConstCaseIt i = SI.case_begin(), e = SI.case_end();
            i != e; ++i) {
-        IntegersSubset& CaseRanges = i.getCaseValueEx();
+        const IntegersSubset& CaseRanges = i.getCaseValueEx();
         unsigned Code, Abbrev; // will unused.
 
         if (CaseRanges.isSingleNumber()) {


More information about the llvm-dev mailing list