[llvm] r354998 - [llvm-cxxfilt] Split and demangle stdin input on certain non-alphanumerics.

Matt Davis via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 27 08:29:50 PST 2019


Author: mattd
Date: Wed Feb 27 08:29:50 2019
New Revision: 354998

URL: http://llvm.org/viewvc/llvm-project?rev=354998&view=rev
Log:
[llvm-cxxfilt] Split and demangle stdin input on certain non-alphanumerics.

Summary:
This patch attempts to replicate GNU c++filt behavior when splitting stdin input for demangling.

Previously, llvm-cxxfilt split input only on spaces and then demangled each delimited item.
From what I have tested, GNU c++filt also splits input on any character that cannot be part
of a mangled name (notably commas, but also a large set of other non-alphanumeric characters).

This patch splits stdin input on any character that does not belong to the Itanium mangling
format (since Itanium is currently the only supported format in llvm-cxxfilt).
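
As a rough illustration of the splitting rule described above, here is a minimal
standalone sketch that uses only the C++17 standard library rather than LLVM's
StringRef and SmallVector. The names splitKeepingDelims, fakeDemangle and
demangleLineSketch are hypothetical and exist only for this example, and
fakeDemangle is a stand-in lookup table, not the real demangler. The sketch also
casts to unsigned char before calling std::isalnum, which is a choice made here
and not something taken from the patch.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Characters treated as part of a mangled name: alphanumerics, '.', '$', '_'.
// The cast avoids undefined behavior in std::isalnum for negative char values.
static bool isLegalItaniumChar(char C) {
  return std::isalnum(static_cast<unsigned char>(C)) || C == '.' ||
         C == '$' || C == '_';
}

using Fragment = std::pair<std::string_view, std::string_view>;

// Hypothetical helper: split Source into (word, trailing delimiters) pairs.
// Leading delimiters, if any, are returned as a pair with an empty word.
static std::vector<Fragment> splitKeepingDelims(std::string_view Source) {
  std::vector<Fragment> Out;
  std::size_t Pos = 0;

  // Consume a maximal run of characters whose legality equals 'Legal'.
  auto TakeRun = [&](bool Legal) {
    std::size_t Begin = Pos;
    while (Pos < Source.size() && isLegalItaniumChar(Source[Pos]) == Legal)
      ++Pos;
    return Source.substr(Begin, Pos - Begin);
  };

  std::string_view Leading = TakeRun(false);
  if (!Leading.empty())
    Out.emplace_back(std::string_view(), Leading);

  while (Pos < Source.size()) {
    std::string_view Word = TakeRun(true);
    std::string_view Delims = TakeRun(false);
    Out.emplace_back(Word, Delims);
  }
  return Out;
}

// Stand-in for a real demangler (assumption: only knows two names).
static std::string fakeDemangle(std::string_view Word) {
  if (Word == "_Z3Foo")
    return "Foo";
  if (Word == "_Z3Bar")
    return "Bar";
  return std::string(Word); // Anything unrecognized is passed through as-is.
}

// Demangle each word and re-attach the delimiters that followed it, so the
// original punctuation and spacing of the line are preserved.
static std::string demangleLineSketch(std::string_view Line) {
  std::string Result;
  for (const auto &[Word, Delims] : splitKeepingDelims(Line)) {
    Result += fakeDemangle(Word);
    Result += Delims;
  }
  return Result;
}
```

With this sketch, demangleLineSketch(",,_Z3Foo!  _Z3Bar::_Z3Foo$") yields
",,Foo!  Bar::_Z3Foo$". The final token is left untouched because '$' counts as
part of the word, so "_Z3Foo$" is handed to the demangler as one unit.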

This is an update to PR39990.

Reviewers: jhenderson, tejohnson, compnerd

Reviewed By: compnerd

Subscribers: erik.pilkington, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58416

Added:
    llvm/trunk/test/tools/llvm-cxxfilt/delimiters.test
Modified:
    llvm/trunk/tools/llvm-cxxfilt/llvm-cxxfilt.cpp

Added: llvm/trunk/test/tools/llvm-cxxfilt/delimiters.test
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/llvm-cxxfilt/delimiters.test?rev=354998&view=auto
==============================================================================
--- llvm/trunk/test/tools/llvm-cxxfilt/delimiters.test (added)
+++ llvm/trunk/test/tools/llvm-cxxfilt/delimiters.test Wed Feb 27 08:29:50 2019
@@ -0,0 +1,64 @@
+RUN: echo ",,_Z3Foo!" \
+RUN:      "_Z3Foo\""  \
+RUN:      "_Z3Foo\""  \
+RUN:      "_Z3Foo#"   \
+RUN:      "_Z3Foo%"   \
+RUN:      "_Z3Foo&"   \
+RUN:      "_Z3Foo'"   \
+RUN:      "_Z3Foo("   \
+RUN:      "_Z3Foo)"   \
+RUN:      "_Z3Foo*"   \
+RUN:      "_Z3Foo+"   \
+RUN:      "_Z3Foo,"   \
+RUN:      "_Z3Foo-"   \
+RUN:      "_Z3Foo/"   \
+RUN:      "_Z3Foo:"   \
+RUN:      "_Z3Foo;"   \
+RUN:      "_Z3Foo<"   \
+RUN:      "_Z3Foo="   \
+RUN:      "_Z3Foo>"   \
+RUN:      "_Z3Foo?"   \
+RUN:      "_Z3Foo@"   \
+RUN:      "_Z3Foo["   \
+RUN:      "_Z3Foo\\"  \
+RUN:      "_Z3Foo]"   \
+RUN:      "_Z3Foo^"   \
+RUN:      "_Z3Foo\`"  \
+RUN:      "_Z3Foo{"   \
+RUN:      "_Z3Foo|"   \
+RUN:      "_Z3Foo}"   \
+RUN:      "_Z3Foo~,," \
+RUN:      "_Z3Foo,,_Z3Bar::_Z3Baz  _Z3Foo,_Z3Bar:_Z3Baz" \
+RUN:      '_Z3Foo$ ._Z3Foo' | llvm-cxxfilt | FileCheck %s
+
+CHECK: ,,Foo!
+CHECK: Foo"
+CHECK: Foo#
+CHECK: Foo%
+CHECK: Foo&
+CHECK: Foo'
+CHECK: Foo(
+CHECK: Foo)
+CHECK: Foo*
+CHECK: Foo+
+CHECK: Foo,
+CHECK: Foo-
+CHECK: Foo/
+CHECK: Foo:
+CHECK: Foo;
+CHECK: Foo<
+CHECK: Foo=
+CHECK: Foo>
+CHECK: Foo?
+CHECK: Foo@
+CHECK: Foo[
+CHECK: Foo\
+CHECK: Foo]
+CHECK: Foo^
+CHECK: Foo`
+CHECK: Foo{
+CHECK: Foo|
+CHECK: Foo}
+CHECK: Foo~,,
+CHECK: Foo,,Bar::Baz  Foo,Bar:Baz
+CHECK: _Z3Foo$ ._Z3Foo

Modified: llvm/trunk/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-cxxfilt/llvm-cxxfilt.cpp?rev=354998&r1=354997&r2=354998&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-cxxfilt/llvm-cxxfilt.cpp (original)
+++ llvm/trunk/tools/llvm-cxxfilt/llvm-cxxfilt.cpp Wed Feb 27 08:29:50 2019
@@ -78,19 +78,50 @@ static std::string demangle(llvm::raw_os
   return Result;
 }
 
+// Split 'Source' on any character that fails to pass 'IsLegalChar'.  The
+// returned vector consists of pairs where 'first' is the delimited word, and
+// 'second' are the delimiters following that word.
+static void SplitStringDelims(
+    StringRef Source,
+    SmallVectorImpl<std::pair<StringRef, StringRef>> &OutFragments,
+    function_ref<bool(char)> IsLegalChar) {
+  // The beginning of the input string.
+  const auto Head = Source.begin();
+
+  // Obtain any leading delimiters.
+  auto Start = std::find_if(Head, Source.end(), IsLegalChar);
+  if (Start != Head)
+    OutFragments.push_back({"", Source.slice(0, Start - Head)});
+
+  // Capture each word and the delimiters following that word.
+  while (Start != Source.end()) {
+    Start = std::find_if(Start, Source.end(), IsLegalChar);
+    auto End = std::find_if_not(Start, Source.end(), IsLegalChar);
+    auto DEnd = std::find_if(End, Source.end(), IsLegalChar);
+    OutFragments.push_back({Source.slice(Start - Head, End - Head),
+                            Source.slice(End - Head, DEnd - Head)});
+    Start = DEnd;
+  }
+}
+
+// This returns true if 'C' is a character that can show up in an
+// Itanium-mangled string.
+static bool IsLegalItaniumChar(char C) {
+  // Itanium CXX ABI [External Names]p5.1.1:
+  // '$' and '.' in mangled names are reserved for private implementations.
+  return isalnum(C) || C == '.' || C == '$' || C == '_';
+}
+
 // If 'Split' is true, then 'Mangled' is broken into individual words and each
 // word is demangled.  Otherwise, the entire string is treated as a single
 // mangled item.  The result is output to 'OS'.
 static void demangleLine(llvm::raw_ostream &OS, StringRef Mangled, bool Split) {
   std::string Result;
   if (Split) {
-    SmallVector<StringRef, 16> Words;
-    SplitString(Mangled, Words);
-    for (auto Word : Words)
-      Result += demangle(OS, Word) + ' ';
-    // Remove the trailing space character.
-    if (Result.back() == ' ')
-      Result.pop_back();
+    SmallVector<std::pair<StringRef, StringRef>, 16> Words;
+    SplitStringDelims(Mangled, Words, IsLegalItaniumChar);
+    for (const auto &Word : Words)
+      Result += demangle(OS, Word.first) + Word.second.str();
   } else
     Result = demangle(OS, Mangled);
   OS << Result << '\n';



