[Lldb-commits] [PATCH] D69210: [Disassembler] Simplify MCInst predicates

Mon Oct 21 11:42:58 PDT 2019

clayborg added a comment.

In D69210#1717042 <https://reviews.llvm.org/D69210#1717042>, @vsk wrote:

> In D69210#1716861 <https://reviews.llvm.org/D69210#1716861>, @clayborg wrote:
>
> > In D69210#1715679 <https://reviews.llvm.org/D69210#1715679>, @vsk wrote:
> >
> > > Hm, this patch is bugging me.
> > >
> > > It looks a bit like instructions are still decoded multiple times in different ways (e.g. in the `Decode` and `CalculateMnemonicOperandsAndComment` methods, which both modify `m_opcode`). Any ideas on whether/how to consolidate these?
> >
> >
> > I am all for anything that will improve efficiency. This class has evolved over time where we started with just the "CalculateMnemonicOperandsAndComment" and then many other features (can branch, etc) were later built into the class. I don't believe instructions are kept around for long so they typically serve one of two purposes:
> >
> > - disassembly of instruction stream where only CalculateMnemonicOperandsAndComment is needed
> > - inspection of multiple instructions for stepping, looking at can branch and other information requests
> >
> >   So I am not sure that decoding multiple times in different ways really matters unless we have a costly client that both calls CalculateMnemonicOperandsAndComment and inspects instruction attributes (can branch, etc). Again, AFAIK these objects are currently created, used and discarded.
>
>
> Thanks for your comment Greg. Let me try and restate the issue I see as my concern isn't about performance.
>
> It looks like `Decode` and `CalculateMnemonicOperandsAndComment` mutate `m_opcode` in different ways. Separately, the predicates read `m_opcode`. So I'm not sure whether/in-which-order the mutating methods need to be run before the predicates can safely be called. I'd like to consolidate all the code that mutates `m_opcode` in one place, to make the predicates always safe to call. Does that seem reasonable? Or am I overthinking something?

It seems that CalculateMnemonicOperandsAndComment only mutates m_opcode when the instruction size returned by:

  size_t inst_size = mc_disasm_ptr->GetMCInst(opcode_data, opcode_data_len, pc, inst);

is zero. It is also unclear to me whether the mutating calls in CalculateMnemonicOperandsAndComment really do anything: they decode a value from the data and then put it back into m_opcode? Also, if "inst_size" is zero in Decode:

  const size_t inst_size =
      mc_disasm_ptr->GetMCInst(opcode_data, opcode_data_len, pc, inst);
  if (inst_size == 0)
    m_opcode.Clear();
  else {
    m_opcode.SetOpcodeBytes(opcode_data, inst_size);
    m_is_valid = true;
  }

Then m_opcode.Clear() is called and m_opcode won't contain anything, so I am guessing only architectures with fixed opcode sizes will be able to show ".long" or that kind of output? And only those will trigger mutating the opcode value in CalculateMnemonicOperandsAndComment?
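
To make the question concrete, here is a minimal sketch of the "mutate m_opcode in one place" idea. It is not the LLDB code: `ToyInstruction` and `ToyGetInstSize` are hypothetical stand-ins, the one-byte decoding rule is invented for illustration, and resetting `m_is_valid` in the failure branch is my assumption rather than something the Decode snippet above does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the real decoder: returns the number of bytes
// consumed, or 0 when the bytes do not form a valid instruction.
// Toy rule (assumption): any byte other than 0xFF is a 1-byte instruction.
static std::size_t ToyGetInstSize(const std::uint8_t *data, std::size_t len) {
  return (len > 0 && data[0] != 0xFF) ? 1 : 0;
}

// Hypothetical instruction class, loosely modeled on the snippet above.
class ToyInstruction {
public:
  // The only method that mutates m_opcode, so the accessors below are safe
  // to call in any order once Decode has run.
  void Decode(const std::uint8_t *data, std::size_t len) {
    assert(data != nullptr || len == 0);
    const std::size_t inst_size = ToyGetInstSize(data, len);
    if (inst_size == 0) {
      // Same shape as the Decode snippet: the bytes are dropped, so nothing
      // is left afterwards to print as raw data.
      m_opcode.clear();
      m_is_valid = false; // Assumption: reset explicitly on failure.
    } else {
      m_opcode.assign(data, data + inst_size);
      m_is_valid = true;
    }
  }

  bool IsValid() const { return m_is_valid; }
  std::size_t GetByteSize() const { return m_opcode.size(); }

private:
  std::vector<std::uint8_t> m_opcode;
  bool m_is_valid = false;
};
```

In this toy version, a failed decode leaves `GetByteSize()` at zero, which is the situation I am asking about: any raw-data fallback would have to get its size from somewhere other than m_opcode.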

https://reviews.llvm.org/D69210