[PATCH] D111417: [Demangle] Add support for D symbols back referencing

Luís Ferreira via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 5 17:36:30 PST 2022


ljmf00 added inline comments.


================
Comment at: llvm/lib/Demangle/DLangDemangle.cpp:136-147
+  /// Extract the back reference position from a given string
+  ///
+  /// \param Mangled string to extract the back reference position
+  /// \param Ret assigned result value
+  ///
+  /// \return the remaining string on success or nullptr on failure
+  ///
----------------
dblaikie wrote:
> Perhaps it'd make sense if this took Mangled as a const char** in/out, and returned Optional<Ret>? (both callers overwrite their Mangled parameter with the return result anyway - so seems simpler for it to be in/out?) if Optional isn't available, maybe long min, -1, something could be the invalid result value?
> 
> (similarly for decodeBackref)
As is, I understand your point, but what if I want to preserve Mangled and use `decodeBackref`/`decodeBackrefPos` as an expression for an if, strcmp or else? e.g. On the patch I introduce integer literal values I use `decodeBackref` like the following:
```
if (decodeBackref(Mangled, &Backref) == nullptr)
{
    //...
}
```
If I pass a reference of Mangled and use old Mangled inside, I would need to preserve a copy of its old position to make sure it won't be a "dirty" value. e.g. back reference "AAAA0", will produce a dirty Mangled and can produce bad behaviour afterwards.


================
Comment at: llvm/lib/Demangle/DLangDemangle.cpp:386-387
+      Val += Mangled[0] - 'a';
+      if ((long)Val <= 0)
+        break;
+      *Ret = Val;
----------------
dblaikie wrote:
> is this an overflow check? Can it be reached if the other overflow check at 379 succeeds? (is it tested?)
This is a safe check when casting to long. The above overflow check is for unsigned long, but Ret is a long value. This is being done that way because signed overflow checks are considered undefined behaviour, since there is various ways to represent a signed value in different CPU targets. I checked locally, `_D8demangleQDXXXXXXXXXXXXx` triggers it with the fuzzer when that code is commented, so I'm going to add this to the test suite.


================
Comment at: llvm/lib/Demangle/DLangDemangle.cpp:402-403
+
+  if (Mangled == nullptr || *Mangled != 'Q')
+    return nullptr;
+
----------------
dblaikie wrote:
> Looks like this condition's probably untested? Should it be removed or turned into an assertion?
Well this shouldn't happen and since those methods are not exposed somewhere else, the compiler can optimize this out easily. I think it is reasonable to keep this check as a safety measure, although if we do some action, I maybe prefer to assert it. In my vision, triggering an assert on, e.g. a failed demangling for a linker error is worse than just outputting the mangled symbol anyway (`nullptr` will make `llvm::demangle` fallback to the initial Mangled reference). It is also worth mentioning that removing this when assert are disabled in release mode, can produce wrong and hard to debug behaviour.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111417/new/

https://reviews.llvm.org/D111417



More information about the llvm-commits mailing list