[PATCH] D102729: [Demangle][Rust] Parse binders
David Blaikie via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri May 21 17:02:19 PDT 2021
dblaikie added a comment.
Yeah, I think there's a bug in the bound count handling (off by one) due to the quirk in the parseOptionalBase62 parsing?
================
Comment at: llvm/lib/Demangle/RustDemangle.cpp:471-476
+ if (consumeIf('L')) {
+ if (auto Lifetime = parseBase62Number()) {
+ printLifetime(Lifetime);
+ print(' ');
+ }
+ }
----------------
Might be nice to have this written once/reused (rather than duplicated here between 'R' and 'Q' for instance) - not sure if it can be generalized over the use in `demangleGenericArg` too - all of them use the <lifetime> Production, but there's no common implementation of/function for that production - maybe there could be?
================
Comment at: llvm/lib/Demangle/RustDemangle.cpp:549
+ // Use the input size as a weak upper bound on the total number of bound
+ // lifetimes. Inputs exceeding the bound are invalid, but could generate
+ // excessive amounts of output.
----------------
Judging by the phrasing, was this meant to say "Inputs exceeding the bound are /valid/" rather than invalid?
Maybe it'd be more suitable to have a fixed upper bound on the size of a demangled name in general instead of these special cases? I guess that could cover the recursion and this bound on bound lifetimes...
That might be more predictable for users and for testing, etc.
================
Comment at: llvm/lib/Demangle/RustDemangle.cpp:707
// Parses optional base 62 number. The presence of a number is determined using
// Tag. Returns 0 when tag is absent and parsed value + 1 otherwise.
uint64_t Demangler::parseOptionalBase62Number(char Tag) {
----------------
This seems like a really subtle and confusing API, especially given the existing "+ 1" behavior of the Base 62 encoding. I've been staring at the code in demangleFnSig+demangleBinder to figure out the first binder test case for a while before I realized that.
Any chance this could use uintmax for the invalid value, rather than offsetting the result in this way?
================
Comment at: llvm/test/Demangle/rust.test:237-238
+
+CHECK: binders::<for<'a> fn(&'a _)>
+ _RIC7bindersFG_RL0_pEuE
+
----------------
I'm having a hard time understanding this test in the context of the spec.
The spec says that `_` encodes zero. So the `G_` should be zero, right? Is there somewhere in the spec that says that `<binder>` mangled value is then 1 larger than that? (so a binder value 0 (`_`) is actually for 1 binder)?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D102729/new/
https://reviews.llvm.org/D102729
More information about the llvm-commits
mailing list