[PATCH] D120958: [TableGen] Add support for variable length instruction in decoder generator

Thu Mar 31 18:41:03 PDT 2022

0x59616e added a comment.

In D120958#3419921 <https://reviews.llvm.org/D120958#3419921>, @myhsu wrote:

> In D120958#3416426 <https://reviews.llvm.org/D120958#3416426>, @0x59616e wrote:
>
>> In D120958#3415717 <https://reviews.llvm.org/D120958#3415717>, @myhsu wrote:
>>
>>> The logics look cleaner now.
>>> Now a bigger question is: Should we still calling it //FixedLen//DecoderEmitter?
>>
>> What about `DecoderEmitter.cpp` ?
>
> Sounds good to me, but I would suggest you to send a RFC to the forum asking for consensus.
> Especially to see if there is any comment from existing FixedLenDecoder users (e.g. AArch64 and ARM). Also, in the RFC, please briefly explain why you want to build this feature on top of FixedLenDecoder rather than writing a separate disassembler.

No problem.

In D120958#3419939 <https://reviews.llvm.org/D120958#3419939>, @myhsu wrote:

> Also, please make sure all existing disassembler tests, especially targets that use FixedLenDecoder, are passing.

Sure.

================
Comment at: llvm/utils/TableGen/FixedLenDecoderEmitter.cpp:2386
+     << "      unsigned PtrLen = 0;\n"
+     << "      uint64_t ExpectedValue = decodeULEB128(++Ptr, &PtrLen);\n"
+     << "      Ptr += PtrLen;\n"
----------------
myhsu wrote:
> why did you use another PtrLen variable here?
quick answer: for debug purpose.

In the original code, `Len` is the length of the instruction field. But it is changed in the function `decodeULEB128'. That will break the debug message below.

================
Comment at: llvm/utils/TableGen/FixedLenDecoderEmitter.cpp:224
+  const RecordVal *RV = def.getValue(str);
+  if (BitsInit *bits = dyn_cast<BitsInit>(RV->getValue()))
+    return *bits;
----------------
myhsu wrote:
> 0x59616e wrote:
> > myhsu wrote:
> > > local variables should start with uppercase 
> > Just wondering: there are some variables that is started with lowercase in the original code. For example the parameter "&def" in this function.
> > 
> > Is that a mistake ? or is there any special naming rule that I don't know ?
> > 
> > 
> it's likely that those code were committed before LLVM coding style was solidified
Ah, that makes sense.

================
Comment at: llvm/utils/TableGen/FixedLenDecoderEmitter.cpp:2362
      << "      unsigned Len;\n"
-     << "      InsnType Val = decodeULEB128(++Ptr, &Len);\n"
+     << "      uint64_t Val = decodeULEB128(++Ptr, &Len);\n"
      << "      Ptr += Len;\n"
----------------
myhsu wrote:
> 0x59616e wrote:
> > myhsu wrote:
> > > why change to uint64_t? If you're worry about InsnType being APInt, APInt has `operator=(uint64_t)`
> > But it yells "conversion from ‘uint64_t’ {aka ‘long unsigned int’} to non-scalar type ‘llvm::APInt’ requested"
> > 
> you're right and I was wrong about `APInt::operator=(uint64_t)`: the operator won't be used in the case of initialization (constructor will be used instead but APInt doesn't have `APInt(uint64_t)`).
C++ is hard.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120958/new/

https://reviews.llvm.org/D120958