[PATCH] D34667: [Demangler] [DO NOT SUBMIT] Initial patch for Microsoft demangler.

Zachary Turner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 30 16:31:07 PDT 2017


zturner added a comment.

Will add more comments later.



================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:189
+  // Parser functions. This is a recursive-descendent parser.
+  void read_var_type(Type &ty);
+  void read_member_func_type(Type &ty);
----------------
compnerd wrote:
> ruiu wrote:
> > eugene wrote:
> > > Here a below a lot of function names violate LLVM's conventions (http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly)
> > > 
> > > Please let me know if that document is obsolete.
> > > 
> > That's intentional. This file is intended to be used not only in LLVM but also in other projects. So this file does not use any LLVM functions and does not follow the LLVM coding style. I did not make that decision -- I just followed ItaniumDemangler.cpp which exists in the same directory.
> I think that we should adhere to the LLVM style.  The itanium demangler was imported stock from libc++abi where it must adhere to the C++ ABI specification.  I expect that over time it will move back to the LLVM style.
Agree with this.  I would like it if we can adhere to LLVM style wherever possible.


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:317
+
+  if (0 < input.len && '0' <= *input.p && *input.p <= '9') {
+    int32_t ret = *input.p - '0' + 1;
----------------
`isdigit(*input.p)`


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:318
+  if (0 < input.len && '0' <= *input.p && *input.p <= '9') {
+    int32_t ret = *input.p - '0' + 1;
+    input.trim(1);
----------------
Why are we adding `1`?  If `*input.p == '0'` then wouldn't we want to return the number `0`?


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:531-536
+      if (consume("B"))
+        ty.sclass = Const;
+      else if (consume("C") || consume("D"))
+        ty.sclass = Const | Volatile;
+      else if (!consume("A"))
+        error = "unkonwn storage class: " + input.str();
----------------
Why not use `read_storage_class()` here?  This misses all the cases like `unaligned` and `restrict`, and it also treats `C` and `D` as equivalent, which doesn't seem right.   `C` is just `volatile`, while `D` is `const volatile`.  Regardless, I think you can just say `ty.sclass = read_storage_class();`


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:543
+
+  if (consume("P6A")) {
+    ty.prim = Ptr;
----------------
You should also handle `3A6`, which is a reference to a function.


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:143
+  Fastcall,
+  Regcall,
+};
----------------
Missing vectorcall.


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:357
+  type.func_class = (FuncClass)read_func_class();
+  expect("E"); // if 64 bit
+  type.sclass = read_func_access_class();
----------------
The comment here mentions that this is only for 64-bit, but we expect it always.  Won't this generate an error in non-x86?


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:382
+  if (input.startswith_digit()) {
+    int32_t ret = *input.p - '0' + 1;
+    input.trim(1);
----------------
What's the +1 for?


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:394
+    }
+    if ('A' <= c && c <= 'P') {
+      ret = (ret << 4) + (c - 'A');
----------------
You can save 2 lines if you invert this conditional and break on the negation.  You can drop the curly braces as well as the continue statement.


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:559
+  case 'C': return Private | Static;
+  case 'D': return Private | Static;
+  case 'E': return Private | Virtual;
----------------
Should this be `Private | Static | FFar`?


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:561
+  case 'E': return Private | Virtual;
+  case 'F': return Private | Virtual;
+  case 'I': return Protected;
----------------
Should this be `Private | Virtual | Far`?


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:562
+  case 'F': return Private | Virtual;
+  case 'I': return Protected;
+  case 'J': return Protected | FFar;
----------------
Are `G` and `H` not used?


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:605
+  case 'G': return Stdcall;
+  case 'I': return Fastcall;
+  default:
----------------
`vectorcall` is `Q`


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:656
+// Reads a variable type.
+void Demangler::read_var_type(Type &ty) {
+  if (consume("W4")) {
----------------
Can we have comments above every `read_xxx` function that describes the grammar of what is being read?


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:657
+void Demangler::read_var_type(Type &ty) {
+  if (consume("W4")) {
+    ty.prim = Enum;
----------------
If you write this code:

```
nullptr_t NullPtr;
```

Then this is mangled as `?NullPtr@@3$$TA`.  Because of the `3` it makes it into this function, but nothing in this function handles a `$$`.  In fact, the only place in the entire demangler that handles `$$` is in the processing of an array, and it expects `$$C`.  So this looks to be missing.


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:703
+  switch (input.get()) {
+  case 'X': return Void;
+  case 'D': return Char;
----------------
Can you sort this list in alphabetical order?


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:720
+    case 'J': return Int64;
+    case 'K': return Uint64;
+    case 'W': return Wchar;
----------------
```
case 'S': return Char16;
case 'U': return Char32;
```


================
Comment at: llvm/lib/Demangle/MicrosoftDemangle.cpp:737
+  ty.prim = prim;
+  expect("E"); // if 64 bit
+  ty.ptr = new (arena) Type;
----------------
The same problem here as mentioned earlier.  For 32-bit, it's `A`.

Note that both can theoretically appear in the same mangled name, for example:  `void foo(int * __ptr32 PA, int * __ptr64 PB);`


https://reviews.llvm.org/D34667





More information about the llvm-commits mailing list