[PATCH] D102729: [Demangle][Rust] Parse binders

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 21 17:02:19 PDT 2021


dblaikie added a comment.

Yeah, I think there's a bug in the bound count handling (off by one) due to the quirk in the parseOptionalBase62 parsing?



================
Comment at: llvm/lib/Demangle/RustDemangle.cpp:471-476
+    if (consumeIf('L')) {
+      if (auto Lifetime = parseBase62Number()) {
+        printLifetime(Lifetime);
+        print(' ');
+      }
+    }
----------------
Might be nice to have this written once/reused (rather than duplicated here between 'R' and 'Q' for instance) - not sure if it can be generalized over the use in `demangleGenericArg` too - all of them use the <lifetime> Production, but there's no common implementation of/function for that production - maybe there could be?


================
Comment at: llvm/lib/Demangle/RustDemangle.cpp:549
+  // Use the input size as a weak upper bound on the total number of bound
+  // lifetimes. Inputs exceeding the bound are invalid, but could generate
+  // excessive amounts of output.
----------------
Judging by the phrasing, was this meant to say "Inputs exceeding the bound are /valid/" rather than invalid?

Maybe it'd be more suitable to have a fixed upper bound on the size of a demangled name in general instead of these special cases? I guess that could cover the recursion and this bound on bound lifetimes... 

That might be more predictable for users and for testing, etc.


================
Comment at: llvm/lib/Demangle/RustDemangle.cpp:707
 // Parses optional base 62 number. The presence of a number is determined using
 // Tag. Returns 0 when tag is absent and parsed value + 1 otherwise.
 uint64_t Demangler::parseOptionalBase62Number(char Tag) {
----------------
This seems like a really subtle and confusing API, especially given the existing "+ 1" behavior of the Base 62 encoding. I've been staring at the code in demangleFnSig+demangleBinder to figure out the first binder test case for a while before I realized that.

Any chance this could use uintmax for the invalid value, rather than offsetting the result in this way?


================
Comment at: llvm/test/Demangle/rust.test:237-238
+
+CHECK: binders::<for<'a> fn(&'a _)>
+       _RIC7bindersFG_RL0_pEuE
+
----------------
I'm having a hard time understanding this test in the context of the spec. 

The spec says that `_` encodes zero. So the `G_` should be zero, right? Is there somewhere in the spec that says that `<binder>` mangled value is then 1 larger than that? (so a binder value 0 (`_`) is actually for 1 binder)?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102729/new/

https://reviews.llvm.org/D102729



More information about the llvm-commits mailing list