[llvm] r338226 - [MS Demangler] Demangle symbols in function scopes.
David Majnemer via llvm-commits
llvm-commits at lists.llvm.org
Sun Jul 29 20:43:45 PDT 2018
On Sun, Jul 29, 2018 at 8:12 PM Zachary Turner via llvm-commits <
llvm-commits at lists.llvm.org> wrote:
> Author: zturner
> Date: Sun Jul 29 20:12:34 2018
> New Revision: 338226
>
> URL: http://llvm.org/viewvc/llvm-project?rev=338226&view=rev
> Log:
> [MS Demangler] Demangle symbols in function scopes.
>
> There are a couple of issues you run into when you start getting into
> more complex names, especially with regards to function local statics.
> When you've got something like:
>
> int x() {
> static int n = 0;
> return n;
> }
>
> Then this needs to demangle to something like
>
> int `int __cdecl x()'::`1'::n
>
> The nested mangled symbols (e.g. `int __cdecl x()` in the above
> example) also share state with regards to back-referencing, so
> we need to be able to re-use the demangler in the middle of
> demangling a symbol while sharing back-ref state.
>
> To make matters more complicated, there are a lot of ambiguities
> when demangling a symbol's qualified name, because a function local
> scope pattern (usually something like `?1??name?`) looks suspiciously
> like many other possible things that can occur, such as `?1` meaning
> the second back-ref and disambiguating these cases is rather
> interesting. The `?1?` in a local scope pattern is actually a special
> case of the more general pattern of `? + <encoded number> + ?`, where
> "encoded number" can itself have embedded `@` symbols, which is a
> common delimiter in mangled names. So we have to take care during the
> disambiguation, which is the reason for the overly complicated
> `startsWithLocalScopePattern` function in this patch.
>
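To make the `? + <encoded number> + ?` shape concrete, here is a minimal
sketch of how the number between the two question marks maps to the
discriminator values used as input in the new test file (`@` is 0, `0`-`9`
are 1-10, and anything else is a run of hex digits `A`-`P` terminated by
`@`). The helper name decodeEncodedNumber and the use of std::string_view
are my own assumptions for illustration; this is not the demangler's
demangleNumber. It is also more lenient than the patch, since it does not
reject a leading `A`.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper, not part of the patch: decode the text that sits
// between the two '?' of a local scope pattern. Returns std::nullopt when
// the text is not a well-formed encoded number.
std::optional<unsigned long> decodeEncodedNumber(std::string_view S) {
  if (S.empty())
    return std::nullopt;
  // A lone '@' encodes 0.
  if (S == "@")
    return 0UL;
  // A single digit '0'-'9' encodes the values 1-10.
  if (S.size() == 1 && S[0] >= '0' && S[0] <= '9')
    return static_cast<unsigned long>(S[0] - '0') + 1;
  // Everything else is hex digits 'A'-'P' (A = 0 ... P = 15) plus a
  // terminating '@'.
  if (S.size() < 2 || S.back() != '@')
    return std::nullopt;
  S.remove_suffix(1);
  unsigned long Value = 0;
  for (char C : S) {
    if (C < 'A' || C > 'P')
      return std::nullopt;
    Value = Value * 16 + static_cast<unsigned long>(C - 'A');
  }
  return Value;
}
```

For example, the `BA@` in `?M@?BA@??L@@YAHXZ@4HA` decodes to 16, and the
`9` in `?M@?9??L@@YAHXZ@4HA` decodes to 10.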
> I've added some pretty obnoxious tests to exercise all of this, which
> exposed several other problems related to back-referencing, so those
> are fixed here as well. Finally, I've uncommented some tests that were
> previously marked as `FIXME`, since now these work.
>
> Differential Revision: https://reviews.llvm.org/D49965
>
> Added:
> llvm/trunk/test/Demangle/ms-nested-scopes.test
> Modified:
> llvm/trunk/lib/Demangle/MicrosoftDemangle.cpp
> llvm/trunk/lib/Demangle/StringView.h
> llvm/trunk/test/Demangle/ms-mangle.test
>
> Modified: llvm/trunk/lib/Demangle/MicrosoftDemangle.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Demangle/MicrosoftDemangle.cpp?rev=338226&r1=338225&r2=338226&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Demangle/MicrosoftDemangle.cpp (original)
> +++ llvm/trunk/lib/Demangle/MicrosoftDemangle.cpp Sun Jul 29 20:12:34 2018
> @@ -33,11 +33,21 @@ class ArenaAllocator {
> struct AllocatorNode {
> uint8_t *Buf = nullptr;
> size_t Used = 0;
> + size_t Capacity = 0;
> AllocatorNode *Next = nullptr;
> };
>
> + void addNode(size_t Capacity) {
> + AllocatorNode *NewHead = new AllocatorNode;
> + NewHead->Buf = new uint8_t[Capacity];
> + NewHead->Next = Head;
> + NewHead->Capacity = Capacity;
> + Head = NewHead;
> + NewHead->Used = 0;
> + }
> +
> public:
> - ArenaAllocator() : Head(new AllocatorNode) { Head->Buf = new uint8_t[Unit]; }
> + ArenaAllocator() { addNode(Unit); }
>
> ~ArenaAllocator() {
> while (Head) {
> @@ -49,10 +59,25 @@ public:
> }
> }
>
> + char *allocUnalignedBuffer(size_t Length) {
> + uint8_t *Buf = Head->Buf + Head->Used;
> +
> + Head->Used += Length;
> + if (Head->Used > Head->Capacity) {
> + // It's possible we need a buffer which is larger than our default unit
> + // size, so we need to be careful to add a node with capacity that is at
> + // least as large as what we need.
> + addNode(std::max(Unit, Length));
> + Head->Used = Length;
> + Buf = Head->Buf;
> + }
> +
> + return reinterpret_cast<char *>(Buf);
> + }
> +
> template <typename T, typename... Args> T *alloc(Args &&... ConstructorArgs) {
>
> size_t Size = sizeof(T);
> - assert(Size < Unit);
> assert(Head && Head->Buf);
>
> size_t P = (size_t)Head->Buf + Head->Used;
> @@ -62,15 +87,12 @@ public:
> size_t Adjustment = AlignedP - P;
>
> Head->Used += Size + Adjustment;
> - if (Head->Used < Unit)
> + if (Head->Used < Head->Capacity)
> return new (PP) T(std::forward<Args>(ConstructorArgs)...);
>
> - AllocatorNode *NewHead = new AllocatorNode;
> - NewHead->Buf = new uint8_t[ArenaAllocator::Unit];
> - NewHead->Next = Head;
> - Head = NewHead;
> - NewHead->Used = Size;
> - return new (NewHead->Buf) T(std::forward<Args>(ConstructorArgs)...);
> + addNode(ArenaAllocator::Unit);
> + Head->Used = Size;
> + return new (Head->Buf) T(std::forward<Args>(ConstructorArgs)...);
> }
>
> private:
> @@ -386,6 +408,47 @@ static void outputCallingConvention(Outp
> }
> }
>
> +static bool startsWithLocalScopePattern(StringView S) {
> + if (!S.consumeFront('?'))
> + return false;
> + if (S.size() < 2)
> + return false;
> +
> + size_t End = S.find('?');
> + if (End == StringView::npos)
> + return false;
> + StringView Candidate = S.substr(0, End);
> + if (Candidate.empty())
> + return false;
> +
> + // \?[0-9]\?
> + // ?@? is the discriminator 0.
> + if (Candidate.size() == 1)
> + return Candidate[0] == '@' || (Candidate[0] >= '0' && Candidate[0] <= '9');
> +
> + // If it's not 0-9, then it's an encoded number terminated with an @
> + if (Candidate.back() != '@')
> + return false;
> + Candidate = Candidate.dropBack();
> +
> + // An encoded number starts with B-P and all subsequent digits are in A-P.
> + // Note that the reason the first digit cannot be A is two fold. First, it
> + // would create an ambiguity with ?A which delimits the beginning of an
> + // anonymous namespace. Second, A represents 0, and you don't start a multi
> + // digit number with a leading 0. Presumably the anonymous namespace
> + // ambiguity is also why single digit encoded numbers use 0-9 rather than A-J.
> + if (Candidate[0] < 'B' || Candidate[0] > 'P')
>
We just did a dropBack(), so I think we need to check Candidate.empty() here;
otherwise the range check against 'B'-'P' will end up accessing Candidate
out of bounds.
Perhaps something like:
if (Candidate.empty() || Candidate[0] < 'B' || Candidate[0] > 'P')
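
To show where the guard would sit, here is a rough standalone sketch of the
whole check. It uses std::string_view in place of the demangler's
StringView, and the name looksLikeLocalScopePattern is made up for the
example, so treat it as an approximation of the patch rather than a drop-in
replacement.

```cpp
#include <cstddef>
#include <string_view>

// Sketch only: follows the shape of startsWithLocalScopePattern from the
// patch, with an explicit empty() guard before the first-digit range check.
bool looksLikeLocalScopePattern(std::string_view S) {
  if (S.empty() || S.front() != '?')
    return false;
  S.remove_prefix(1);
  if (S.size() < 2)
    return false;

  std::size_t End = S.find('?');
  if (End == std::string_view::npos)
    return false;
  std::string_view Candidate = S.substr(0, End);
  if (Candidate.empty())
    return false;

  // ?@? is discriminator 0; ?0? through ?9? are the single-digit forms.
  if (Candidate.size() == 1)
    return Candidate[0] == '@' || (Candidate[0] >= '0' && Candidate[0] <= '9');

  // Anything longer must be an encoded number terminated with '@'.
  if (Candidate.back() != '@')
    return false;
  Candidate.remove_suffix(1);

  // Guard before indexing, then require B-P for the leading digit.
  if (Candidate.empty() || Candidate[0] < 'B' || Candidate[0] > 'P')
    return false;
  Candidate.remove_prefix(1);

  // All remaining digits must be in A-P.
  for (char C : Candidate)
    if (C < 'A' || C > 'P')
      return false;
  return true;
}
```

The guard is cheap, and it keeps the indexing safe no matter how the
earlier size checks change later.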
> + return false;
> + Candidate = Candidate.dropFront();
> + while (!Candidate.empty()) {
> + if (Candidate[0] < 'A' || Candidate[0] > 'P')
> + return false;
> + Candidate = Candidate.dropFront();
> + }
> +
> + return true;
> +}
> +
> // Write a function or template parameter list.
> static void outputParameterList(OutputStream &OS, const ParamList &Params) {
> if (!Params.Current) {
> @@ -763,6 +826,10 @@ private:
> int demangleNumber(StringView &MangledName);
>
> void memorizeString(StringView s);
> +
> + /// Allocate a copy of \p Borrowed into memory that we own.
> + StringView copyString(StringView Borrowed);
> +
> Name *demangleFullyQualifiedTypeName(StringView &MangledName);
> Name *demangleFullyQualifiedSymbolName(StringView &MangledName);
>
> @@ -777,6 +844,7 @@ private:
> Name *demangleOperatorName(StringView &MangledName);
> Name *demangleSimpleName(StringView &MangledName, bool Memorize);
> Name *demangleAnonymousNamespaceName(StringView &MangledName);
> + Name *demangleLocallyScopedNamePiece(StringView &MangledName);
>
> void demangleOperator(StringView &MangledName, Name *);
> FuncClass demangleFunctionClass(StringView &MangledName);
> @@ -813,6 +881,13 @@ private:
> };
> } // namespace
>
> +StringView Demangler::copyString(StringView Borrowed) {
> + char *Stable = Arena.allocUnalignedBuffer(Borrowed.size() + 1);
> + std::strcpy(Stable, Borrowed.begin());
> +
> + return {Stable, Borrowed.size()};
> +}
> +
> // Parser entry point.
> Symbol *Demangler::parse(StringView &MangledName) {
> Symbol *S = Arena.alloc<Symbol>();
> @@ -956,6 +1031,18 @@ Name *Demangler::demangleClassTemplateNa
>
> Name *Node = demangleSimpleName(MangledName, false);
> Node->TemplateParams = demangleTemplateParameterList(MangledName);
> +
> + // Render this class template name into a string buffer so that we can
> + // memorize it for the purpose of back-referencing.
> + OutputStream OS = OutputStream::create(nullptr, nullptr, 1024);
> + outputName(OS, Node);
> + OS << '\0';
> + char *Name = OS.getBuffer();
> +
> + StringView Owned = copyString(Name);
> + memorizeString(Owned);
> + std::free(Name);
> +
> return Node;
> }
>
> @@ -1103,6 +1190,34 @@ Name *Demangler::demangleAnonymousNamesp
> return nullptr;
> }
>
> +Name *Demangler::demangleLocallyScopedNamePiece(StringView &MangledName) {
> + assert(startsWithLocalScopePattern(MangledName));
> +
> + Name *Node = Arena.alloc<Name>();
> + MangledName.consumeFront('?');
> + int ScopeIdentifier = demangleNumber(MangledName);
> +
> + // One ? to terminate the number
> + MangledName.consumeFront('?');
> +
> + assert(!Error);
> + Symbol *Scope = parse(MangledName);
> + if (Error)
> + return nullptr;
> +
> + // Render the parent symbol's name into a buffer.
> + OutputStream OS = OutputStream::create(nullptr, nullptr, 1024);
> + OS << '`';
> + output(Scope, OS);
> + OS << '\'';
> + OS << "::`" << ScopeIdentifier << "'";
> + OS << '\0';
> + char *Result = OS.getBuffer();
> + Node->Str = copyString(Result);
> + std::free(Result);
> + return Node;
> +}
> +
> // Parses a type name in the form of A@B@C@@ which represents C::B::A.
> Name *Demangler::demangleFullyQualifiedTypeName(StringView &MangledName) {
> Name *TypeName = demangleUnqualifiedTypeName(MangledName);
> @@ -1140,6 +1255,10 @@ Name *Demangler::demangleUnqualifiedType
> }
>
> Name *Demangler::demangleUnqualifiedSymbolName(StringView &MangledName) {
> + if (startsWithDigit(MangledName))
> + return demangleBackRefName(MangledName);
> + if (MangledName.startsWith("?$"))
> + return demangleClassTemplateName(MangledName);
> if (MangledName.startsWith('?'))
> return demangleOperatorName(MangledName);
> return demangleSimpleName(MangledName, true);
> @@ -1155,6 +1274,9 @@ Name *Demangler::demangleNameScopePiece(
> if (MangledName.startsWith("?A"))
> return demangleAnonymousNamespaceName(MangledName);
>
> + if (startsWithLocalScopePattern(MangledName))
> + return demangleLocallyScopedNamePiece(MangledName);
> +
> return demangleSimpleName(MangledName, true);
> }
>
> @@ -1727,9 +1849,6 @@ void Demangler::output(const Symbol *S,
> Type::outputPre(OS, *S->SymbolType);
> outputName(OS, S->SymbolName);
> Type::outputPost(OS, *S->SymbolType);
> -
> - // Null terminate the buffer.
> - OS << '\0';
> }
>
> char *llvm::microsoftDemangle(const char *MangledName, char *Buf, size_t *N,
> @@ -1745,5 +1864,6 @@ char *llvm::microsoftDemangle(const char
>
> OutputStream OS = OutputStream::create(Buf, N, 1024);
> D.output(S, OS);
> + OS << '\0';
> return OS.getBuffer();
> }
>
> Modified: llvm/trunk/lib/Demangle/StringView.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Demangle/StringView.h?rev=338226&r1=338225&r2=338226&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Demangle/StringView.h (original)
> +++ llvm/trunk/lib/Demangle/StringView.h Sun Jul 29 20:12:34 2018
> @@ -22,6 +22,8 @@ class StringView {
> const char *Last;
>
> public:
> + static const size_t npos = ~size_t(0);
> +
> template <size_t N>
> StringView(const char (&Str)[N]) : First(Str), Last(Str + N - 1) {}
> StringView(const char *First_, const char *Last_)
> @@ -35,6 +37,17 @@ public:
> return StringView(begin() + From, size() - From);
> }
>
> + size_t find(char C, size_t From = 0) const {
> + size_t FindBegin = std::min(From, size());
> + // Avoid calling memchr with nullptr.
> + if (FindBegin < size()) {
> + // Just forward to memchr, which is faster than a hand-rolled loop.
> + if (const void *P = ::memchr(First + FindBegin, C, size() - FindBegin))
> + return static_cast<const char *>(P) - First;
> + }
> + return npos;
> + }
> +
> StringView substr(size_t From, size_t To) const {
> if (To >= size())
> To = size() - 1;
> @@ -49,11 +62,22 @@ public:
> return StringView(First + N, Last);
> }
>
> + StringView dropBack(size_t N = 1) const {
> + if (N >= size())
> + N = size();
> + return StringView(First, Last - N);
> + }
> +
> char front() const {
> assert(!empty());
> return *begin();
> }
>
> + char back() const {
> + assert(!empty());
> + return *(end() - 1);
> + }
> +
> char popFront() {
> assert(!empty());
> return *First++;
>
> Modified: llvm/trunk/test/Demangle/ms-mangle.test
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Demangle/ms-mangle.test?rev=338226&r1=338225&r2=338226&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/Demangle/ms-mangle.test (original)
> +++ llvm/trunk/test/Demangle/ms-mangle.test Sun Jul 29 20:12:34 2018
> @@ -265,18 +265,18 @@
> ?s6@PR13182@@3PBQBDB
> ; CHECK: char const *const *PR13182::s6
>
> -; FIXME: We don't properly support static locals in functions yet.
> +; FIXME: We don't properly support extern "C" functions yet.
> ; ?local@?1??extern_c_func@@9@4HA
> ; FIXME: int `extern_c_func'::`2'::local
>
> ; ?local@?1??extern_c_func@@9@4HA
> ; FIXME: int `extern_c_func'::`2'::local
>
> -; ?v@?1??f@@YAHXZ@4U<unnamed-type-v>@?1??1@YAHXZ@A
> -; FIXME: struct `int __cdecl f(void)'::`2'::<unnamed-type-v> `int __cdecl f(void)'::`2'::v
> +?v@?1??f@@YAHXZ@4U<unnamed-type-v>@?1??1@YAHXZ@A
> +; CHECK: struct `int __cdecl f(void)'::`2'::<unnamed-type-v> `int __cdecl f(void)'::`2'::v
>
> -; ?v@?1???$f@H@@YAHXZ@4U<unnamed-type-v>@?1???$f@H@@YAHXZ@A
> -; FIXME: struct `int __cdecl f<int>(void)'::`2'::<unnamed-type-v> `int __cdecl f<int>(void)'::`2'::v
> +?v@?1???$f@H@@YAHXZ@4U<unnamed-type-v>@?1???$f@H@@YAHXZ@A
> +; CHECK: struct `int __cdecl f<int>(void)'::`2'::<unnamed-type-v> `int __cdecl f<int>(void)'::`2'::v
>
> ??2OverloadedNewDelete@@SAPAXI@Z
> ; CHECK: static void * __cdecl OverloadedNewDelete::operator new(unsigned int)
> @@ -335,8 +335,8 @@
> ; ?overloaded_fn@@$$J0YAXXZ
> ; FIXME-EXTERNC: extern \"C\" void __cdecl overloaded_fn(void)
>
> -; ?f@UnnamedType@@YAXQAPAU<unnamed-type-T1>@S@1@@Z
> -; FIXME: void __cdecl UnnamedType::f(struct UnnamedType::S::<unnamed-type-T1> ** const)
> +?f@UnnamedType@@YAXQAPAU<unnamed-type-T1>@S@1@@Z
> +; CHECK: void __cdecl UnnamedType::f(struct UnnamedType::S::<unnamed-type-T1> **const)
>
> ?f@UnnamedType@@YAXUT2@S@1@@Z
> ; CHECK: void __cdecl UnnamedType::f(struct UnnamedType::S::T2)
>
> Added: llvm/trunk/test/Demangle/ms-nested-scopes.test
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Demangle/ms-nested-scopes.test?rev=338226&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/Demangle/ms-nested-scopes.test (added)
> +++ llvm/trunk/test/Demangle/ms-nested-scopes.test Sun Jul 29 20:12:34 2018
> @@ -0,0 +1,146 @@
> +; RUN: llvm-undname < %s | FileCheck %s
> +
> +; CHECK-NOT: Invalid mangled name
> +
> +; Test demangling of function local scope discriminator IDs.
> +?M@?@??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`0'::M
> +
> +?M@?0??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`1'::M
> +
> +?M@?1??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`2'::M
> +
> +?M@?2??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`3'::M
> +
> +?M@?3??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`4'::M
> +
> +?M@?4??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`5'::M
> +
> +?M@?5??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`6'::M
> +
> +?M@?6??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`7'::M
> +
> +?M@?7??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`8'::M
> +
> +?M@?8??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`9'::M
> +
> +?M@?9??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`10'::M
> +
> +?M@?L@??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`11'::M
> +
> +?M@?M@??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`12'::M
> +
> +?M@?N@??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`13'::M
> +
> +?M@?O@??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`14'::M
> +
> +?M@?P@??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`15'::M
> +
> +?M@?BA@??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`16'::M
> +
> +?M@?BB@??L@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L(void)'::`17'::M
> +
> +?j@?1??L@@YAHXZ@4UJ@@A
> +; CHECK: struct J `int __cdecl L(void)'::`2'::j
> +
> +; Test demangling of name back-references
> +?NN@0XX@@3HA
> +; CHECK: int XX::NN::NN
> +
> +?MM@0NN@XX@@3HA
> +; CHECK: int XX::NN::MM::MM
> +
> +?NN@MM@0XX@@3HA
> +; CHECK: int XX::NN::MM::NN
> +
> +?OO@0NN@01XX@@3HA
> +; CHECK: int XX::NN::OO::NN::OO::OO
> +
> +?NN@OO@010XX@@3HA
> +; CHECK: int XX::NN::OO::NN::OO::NN
> +
> +; Test demangling of name back-references combined with function local scopes.
> +?M@?1??0@YAHXZ@4HA
> +; CHECK: int `int __cdecl M(void)'::`2'::M
> +
> +?L@?2??M@0?2??0@YAHXZ@QEAAHXZ@4HA
> +; CHECK: int `int __cdecl `int __cdecl L(void)'::`3'::L::M(void)'::`3'::L
> +
> +?M@?2??0L@?2??1@YAHXZ@QEAAHXZ@4HA
> +; CHECK: int `int __cdecl `int __cdecl L(void)'::`3'::L::M(void)'::`3'::M
> +
> +; Function local scopes of template functions
> +?M@?1???$L@H@@YAHXZ@4HA
> +; CHECK: int `int __cdecl L<int>(void)'::`2'::M
> +
> +; And member functions of template classes
> +?SN@?$NS@H@NS@@QEAAHXZ
> +; CHECK: int __cdecl NS::NS<int>::SN(void)
> +
> +?NS@?1??SN@?$NS@H@0@QEAAHXZ@4HA
> +; CHECK: int `int __cdecl NS::NS<int>::SN(void)'::`2'::NS
> +
> +?SN@?1??0?$NS@H@NS@@QEAAHXZ@4HA
> +; CHECK: int `int __cdecl NS::NS<int>::SN(void)'::`2'::SN
> +
> +?NS@?1??SN@?$NS@H@10@QEAAHXZ@4HA
> +; CHECK: int `int __cdecl NS::SN::NS<int>::SN(void)'::`2'::NS
> +
> +?SN@?1??0?$NS@H@0NS@@QEAAHXZ@4HA
> +; CHECK: int `int __cdecl NS::SN::NS<int>::SN(void)'::`2'::SN
> +
> +; Make sure instantiated templates participate in back-referencing.
> +; In the next 3 examples there should be 3 back-references:
> +; 0 = X (right most name)
> +; 1 = C<int> (second from right)
> +; 2 = C (third from right)
> +; Make sure all 3 work as expected by having the 4th component take each value
> +; from 0-2 and confirming it is the right component.
> +?X@?$C@H@C@0@2HB
> +; CHECK: static int const X::C::C<int>::X
> +
> +?X@?$C@H@C@1@2HB
> +; CHECK: static int const C<int>::C::C<int>::X
> +
> +?X@?$C@H@C@2@2HB
> +; CHECK: static int const C::C::C<int>::X
> +
> +; Putting everything together.
> +
> +; namespace A { namespace B { namespace C { namespace B { namespace C {
> +; template<typename T>
> +; struct C {
> +; int B() {
> +; static C<int> C;
> +; static int B = 7;
> +; static int A = 7;
> +; return C.B() + B + A;
> +; }
> +; };
> +; } } } } }
> +
> +?C@?1??B@?$C@H@0101A@@QEAAHXZ@4U201013@A
> +; CHECK: struct A::B::C::B::C::C<int> `int __cdecl A::B::C::B::C::C<int>::B(void)'::`2'::C
> +
> +?B@?1??0?$C@H@C@020A@@QEAAHXZ@4HA
> +; CHECK: int `int __cdecl A::B::C::B::C::C<int>::B(void)'::`2'::B
> +
> +?A@?1??B@?$C@H@C@1310@QEAAHXZ@4HA
> +; CHECK: int `int __cdecl A::B::C::B::C::C<int>::B(void)'::`2'::A
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>