[llvm] r351481 - [demangler] Ignore leading underscores if present
John McCall via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 23 23:39:36 PST 2019
On 23 Jan 2019, at 15:48, Duncan Exon Smith wrote:
> +John and Nick
>
> I'm not sure this is the right thing to do.
>
> I believe there was an intentional decision way back when (before my
> time) to require clients to strip the leading underscore. c++filt on
> my desktop does this (passing `-_` by default, which means symbols
> without the leading underscore don't get demangled).
>
> Nick and John, do you have any recollection of why the demangler was
> restrictive like this? Any thoughts on whether it's problematic to
> relax it?
Okay. If you step back and consider the whole operation of taking a
string, figuring out whether it's a C++ mangled name, and then trying to
demangle it, there is a plausible reason why you might not want to
accept `__Z` as a prefix: it permits an algorithm which, when followed
by clients on leading-underscore targets, reliably avoids demangling
unreserved C function names like `Z3zapEv`. If clients can present you
with either a stripped or unstripped name, then it's ambiguous whether
`_Z3zapEv` is a stripped reserved name (hence okay to demangle) or an
unstripped unreserved name (hence not okay to demangle). If clients on
leading-underscore targets can be trusted to first find and remove the
leading underscore, and to just not call you if there's no leading
underscore, this problem resolves itself. Not allowing `__Z` as a
prefix therefore encourages clients to implement the logic correctly so
that you can get the corner cases right.
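
To make that client-side algorithm concrete, here is a minimal sketch. The function name `mangledNameForDemangler` and the `TargetPrependsUnderscore` flag are hypothetical and exist only for illustration; nothing like this is claimed to be in LLVM.

```cpp
#include <optional>
#include <string_view>

// Hypothetical helper (not an LLVM API): given a raw symbol-table name,
// return the string that a strict "_Z"-only demangler should be handed,
// or std::nullopt if the client should not call the demangler at all.
std::optional<std::string_view>
mangledNameForDemangler(std::string_view Symbol, bool TargetPrependsUnderscore) {
  if (TargetPrependsUnderscore) {
    // On a leading-underscore target every C/C++ symbol carries the extra
    // underscore; if it is missing, do not attempt to demangle.
    if (Symbol.empty() || Symbol.front() != '_')
      return std::nullopt;
    Symbol.remove_prefix(1); // strip exactly one underscore
  }
  // Only names that now start with the reserved "_Z" prefix are candidates.
  if (Symbol.substr(0, 2) != "_Z")
    return std::nullopt;
  return Symbol;
}
```

With the flag set, `__Z3zapEv` is reduced to `_Z3zapEv` and passed on, while the symbol `_Z3zapEv` (the C function `Z3zapEv`) is reduced to `Z3zapEv` and rejected. With the flag clear, `_Z3zapEv` is passed through unchanged.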
On the other hand, you can make a strong argument that the only prefix
which matters is the complete prefix on symbol names, so that the right
way of thinking about it is that the C++ mangling prefix is `__Z`
instead of `_Z` on leading-underscore targets; certainly this leads to a
more reasonable conceptual model in the face of languages like Swift
that eschew the underscore on all targets. But actually following that
logic would mean that the demangler would need to have target-specific
behavior, and no tooling using the demangler is set up to propagate
target information down (and what would that mean for c++filt anyway?).
In practice it's incredibly frustrating that `c++filt` doesn't allow an
actual symbol name on the command line, although arguably this is a
usability problem with `c++filt` rather than a flaw in the lower-level
interface.
But none of that really matters. This patch is a change to a function
that's basically implementing (the first half of) `__cxa_demangle`, and
it's already inappropriate to call `__cxa_demangle` in a generalized
symbol-demangling use case with a non-C++ symbol name because
`__cxa_demangle` is required to try to demangle anything that doesn't
start with `_Z` as a type. Since `__Z` is not and will never be the
start of a valid type mangling, it should be harmless to change
`__cxa_demangle` to also recognize a `__Z` prefix, and it does permit
clients to more comfortably adopt that better conceptual model for the
nature of the C++ prefix.
John.