[PATCH] D80512: [MC] Changes to help improve target specific symbol disassembly

Scott Linder via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 28 14:18:21 PDT 2020


scott.linder added inline comments.


================
Comment at: llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h:76-97
   /// Ternary decode status. Most backends will just use Fail and
   /// Success, however some have a concept of an instruction with
   /// understandable semantics but which is architecturally
   /// incorrect. An example of this is ARM UNPREDICTABLE instructions
   /// which are disassemblable but cause undefined behaviour.
   ///
   /// Because it makes sense to disassemble these instructions, there
----------------
jhenderson wrote:
> Seems like this whole comment needs updating if you are adding a new enum value.
It seems like beyond the need to just update the comment all the uses of `DeocdeStatus` need to be audited to possibly consider the new variant. The big one is the `CHECK` macro which relies on the bitwise-AND reduction property of the current encoding, which `Ignore` seems to break (and maybe it just doesn't really fit in to begin with; what should `Ignore & Success` mean? It currently means `Fail`, which doesn't seem particularly helpful).

It seems like what we want instead is to have two orthogonal statuses: a boolean one to indicate whether any decoding is possible and was attempted, and then the ternary status for the decode in the case that it happens. Maybe we just want to leave `DecodeStatus` alone and have `onSymbolStart` return an `Optional<onSymbolStart>` where `None` indicates no decoding was attempted?


================
Comment at: llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h:86
   ///
+  /// When disassembling symbols, targets can also return Ignore or SoftFail in
+  /// case they partially disassembled it.
----------------
This wording seems ambiguous; when I first read it I took it to mean that `Ignore` and `SoftFail` are equivalent. As the above describes `SoftFail` already can this just describe `Ignore` here? I also don't know what you mean when you say `SoftFail` can be used for partial disassembly; I thought from the description it just meant the decoded instruction is architecturally invalid, but is a decodeable form?

The comment also seems to use `Unpredictable` and `SoftFail` interchangeably, can this all just be changed to refer to `SoftFail`?


================
Comment at: llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp:130
     int64_t FunctionCount;
     if (!nextLEB(FunctionCount, Bytes, Size, false))
+      return MCDisassembler::Ignore;
----------------
I'm a little confused here on what the intended semantics for this impl are, but it doesn't seem like the intention would be to `Ignore` here, right? Since `llvm-objdump.cpp` ignores the return value completely it isn't clear what was really intended (i.e. this could have returned anything, yet chose to return different values for different cases).

Maybe @aardappel can describe the intent a bit better?


================
Comment at: llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp:155
   outs() << "\n";
-  return MCDisassembler::Success;
+  return MCDisassembler::SoftFail;
 }
----------------
This seems to be overloading what `SoftFail` means. It seems like the original intent of `SoftFail` was to capture a `Success` which is "poisoned" by architectural meaningless-ness. Thus `Success & SoftFail == SoftFail` but `SoftFail & Fail == Fail`.

It seems like `Success` is the right thing to return here still. Can't the callee know whether or not the "whole" symbol was consumed by just considering the `Size` value that is returned along with the status?


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1434
+
+      if (status == MCDisassembler::Success || status == MCDisassembler::Fail) {
+        if (status == MCDisassembler::Fail)
----------------
Going along with the comments I left above, I don't think we need to subvert `Success` and `SoftFail` here to support "partial" decoding. I would vote to just use `Size` to detect when the "entire" symbol was decoded.


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1437
+          outs() << "Error in decoding " << SymbolName << "\n";
+        continue;
+      }
----------------
I would prefer this just fall out of the code below, rather than need to be explicit here. Things like `onSymbolEnd` should still be considered even when the whole symbol is decoded, right?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80512/new/

https://reviews.llvm.org/D80512





More information about the llvm-commits mailing list