[llvm] r361744 - llvm-undname: Make demangling of MD5 names more robust

Nico Weber via llvm-commits llvm-commits at lists.llvm.org
Sun May 26 17:48:59 PDT 2019


Author: nico
Date: Sun May 26 17:48:59 2019
New Revision: 361744

URL: http://llvm.org/viewvc/llvm-project?rev=361744&view=rev
Log:
llvm-undname: Make demangling of MD5 names more robust

Demangler::parse() for MD5 names would:

1. Put all remaining text into the MD5 name sight unseen
2. Not modify MangledName

This meant that if the demangler recursively called parse() (e.g. in
demangleLocallyScopedNamePiece()), every recursive call that started on
an MD5 name would add all remaining bytes to the output buffer but
only advance the input by a byte.  For valid inputs, MD5 types are
never (well, see comments for 2 exceptions) nested, but for invalid
input this could cause memory use quadratic in the input size.
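
As an illustration of the fix described above, here is a minimal standalone
sketch of bounded MD5-name parsing. It is not the LLVM code: it uses
std::string_view instead of LLVM's StringView, and consumeMD5Name is a
hypothetical helper invented for this example. The point it shows is that
the name ends at the first '@' after the "??@" prefix (plus an optional
trailing "??_R4@"), and that the input is advanced past everything that was
consumed, so a recursive caller cannot re-copy the same bytes.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper (not part of LLVM). If Input starts with an MD5 name,
// returns exactly that name and advances Input past it; otherwise returns
// std::nullopt and leaves Input untouched.
std::optional<std::string_view> consumeMD5Name(std::string_view &Input) {
  constexpr std::string_view Prefix = "??@";
  if (Input.substr(0, Prefix.size()) != Prefix)
    return std::nullopt;

  // The name runs up to and including the first '@' after the prefix.
  std::size_t Last = Input.find('@', Prefix.size());
  if (Last == std::string_view::npos)
    return std::nullopt; // Unterminated name: report an error.
  std::size_t Len = Last + 1;

  // Complete object locator special case: a trailing "??_R4@" belongs to
  // the name as well.
  constexpr std::string_view Locator = "??_R4@";
  if (Input.substr(Len, Locator.size()) == Locator)
    Len += Locator.size();

  std::string_view Name = Input.substr(0, Len);
  Input.remove_prefix(Len); // Advance the input by what was consumed.
  return Name;
}
```

Because the returned view and the consumed input have the same length, the
total output over any sequence of calls is bounded by the input size.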

Modified:
    llvm/trunk/lib/Demangle/MicrosoftDemangle.cpp
    llvm/trunk/test/Demangle/ms-md5.test

Modified: llvm/trunk/lib/Demangle/MicrosoftDemangle.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Demangle/MicrosoftDemangle.cpp?rev=361744&r1=361743&r2=361744&view=diff
==============================================================================
--- llvm/trunk/lib/Demangle/MicrosoftDemangle.cpp (original)
+++ llvm/trunk/lib/Demangle/MicrosoftDemangle.cpp Sun May 26 17:48:59 2019
@@ -747,16 +747,38 @@ SymbolNode *Demangler::demangleDeclarato
 
 // Parser entry point.
 SymbolNode *Demangler::parse(StringView &MangledName) {
-  // We can't demangle MD5 names, just output them as-is.
-  // Also, MSVC-style mangled symbols must start with '?'.
   if (MangledName.startsWith("??@")) {
     // This is an MD5 mangled name.  We can't demangle it, just return the
     // mangled name.
+    // An MD5 mangled name is ??@ followed by 32 characters and a terminating @.
+    size_t MD5Last = MangledName.find('@', strlen("??@"));
+    if (MD5Last == StringView::npos) {
+      Error = true;
+      return nullptr;
+    }
+    const char* Start = MangledName.begin();
+    MangledName = MangledName.dropFront(MD5Last + 1);
+
+    // There are two additional special cases for MD5 names:
+    // 1. For complete object locators where the object name is long enough
+    //    for the object to have an MD5 name, the complete object locator is
+    //    called ??@...@??_R4@ (with a trailing "??_R4@" instead of the usual
+    //    leading "??_R4". This is handled here.
+    // 2. For catchable types, in versions of MSVC before 2015 (<1900) or after
+    //    2017.2 (>= 1914), the catchable type mangling is _CT??@...@??@...@8
+    //    instead of_CT??@...@8 with just one MD5 name. Since we don't yet
+    //    demangle catchable types anywhere, this isn't handled for MD5 names
+    //    either.
+    MangledName.consumeFront("??_R4@");
+
+    StringView MD5(Start, MangledName.begin());
     SymbolNode *S = Arena.alloc<SymbolNode>(NodeKind::Md5Symbol);
-    S->Name = synthesizeQualifiedName(Arena, MangledName);
+    S->Name = synthesizeQualifiedName(Arena, MD5);
+
     return S;
   }
 
+  // MSVC-style mangled symbols must start with '?'.
   if (!MangledName.startsWith('?')) {
     Error = true;
     return nullptr;

Modified: llvm/trunk/test/Demangle/ms-md5.test
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Demangle/ms-md5.test?rev=361744&r1=361743&r2=361744&view=diff
==============================================================================
--- llvm/trunk/test/Demangle/ms-md5.test (original)
+++ llvm/trunk/test/Demangle/ms-md5.test Sun May 26 17:48:59 2019
@@ -1,4 +1,4 @@
-; These tests are based on clang/test/CodeGenCXX/mangle-ms-cxx11.cpp
+; These tests are based on clang/test/CodeGenCXX/mangle-ms-md5.cpp
 
 ; RUN: llvm-undname < %s | FileCheck %s
 
@@ -8,4 +8,16 @@
 ; two check lines here since the tool echos the input.
 ??@a6a285da2eea70dba6b578022be61d81@
 ; CHECK: ??@a6a285da2eea70dba6b578022be61d81@
-; CHECK-NEXT: ??@a6a285da2eea70dba6b578022be61d81@
\ No newline at end of file
+; CHECK-NEXT: ??@a6a285da2eea70dba6b578022be61d81@
+
+; Don't include trailing garbage:
+??@a6a285da2eea70dba6b578022be61d81@asdf
+; CHECK: ??@a6a285da2eea70dba6b578022be61d81@asdf
+; CHECK-NEXT: ??@a6a285da2eea70dba6b578022be61d81@
+
+; The complete object locator special case:
+; FIXME: This should probably print
+; ??@a6a285da2eea70dba6b578022be61d81@::`RTTI Complete Object Locator' instead.
+??@a6a285da2eea70dba6b578022be61d81@??_R4@
+; CHECK: ??@a6a285da2eea70dba6b578022be61d81@??_R4@
+; CHECK-NEXT: ??@a6a285da2eea70dba6b578022be61d81@??_R4@



