[PATCH] Add support for ARM modified immediate syntax

Mon Aug 19 09:06:03 PDT 2013

Tim,

I found a way to avoid having a big switch that converts opcodes between
ri and rii. Basically, I am assigning all the rii instructions to a new
DecoderNamespace.

In ARMDisassembler::getInstruction I change the flow so that when an
instruction is matched in the "ARM" namespace, the new namespace is tested
as well. If I have a match this second time, then the decoded instruction
has a modified immediate. All I have to do is decide which of the two
decoded instructions to keep (i.e. ri comes from the ARM namespace, rii
comes from the new namespace).
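
To make the control flow concrete, here is a rough, self-contained sketch
of the two-pass lookup. It is not the actual patch: DecodedInst, Decoder
and decodeTwoPass are hypothetical names standing in for MCInst and the
generated decoder tables, and the keep/drop predicate is left as a
callback.

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical stand-in for a decoded instruction; this is NOT llvm::MCInst.
struct DecodedInst {
  bool Valid = false;   // true if the table matched the encoding
  std::string Name;     // illustrative label only
};

// Hypothetical stand-in for one decoder table (one DecoderNamespace).
using Decoder = std::function<DecodedInst(uint32_t)>;

// Try the main ("ARM") table first. Only on a hit, try the second table.
// If both match, KeepModImm decides which decoded instruction is returned.
DecodedInst decodeTwoPass(uint32_t Bits, const Decoder &Main,
                          const Decoder &ModImm,
                          const std::function<bool(const DecodedInst &)> &KeepModImm) {
  DecodedInst RI = Main(Bits);
  if (!RI.Valid)
    return RI;               // no match at all
  DecodedInst RII = ModImm(Bits);
  if (!RII.Valid)
    return RI;               // not a modified-immediate instruction
  return KeepModImm(RII) ? RII : RI;
}
```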

The decision is as follows:

1. Locate the two immediates in the rii instruction's operand list.
2. Decide whether the <value, rot> pair is "canonical" (i.e. the lowest
   possible rot out of all representations of that immediate).
3. If it is canonical, return the ri version (i.e. friendly syntax);
   otherwise return the rii version (pair syntax).
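
Step 2 could look something like the sketch below. It assumes the usual
ARM modified-immediate encoding (an 8-bit value rotated right by twice a
4-bit rot field); the helper names rotr32, expandModImm, lowestRot and
isCanonicalModImm are made up for illustration and are not existing LLVM
functions.

```cpp
#include <cstdint>

// Rotate a 32-bit value right by Amt bits.
static uint32_t rotr32(uint32_t V, unsigned Amt) {
  Amt &= 31;
  return Amt == 0 ? V : (V >> Amt) | (V << (32 - Amt));
}

// Value denoted by a <value, rot> pair: imm8 rotated right by 2 * rot.
static uint32_t expandModImm(uint32_t Imm8, unsigned Rot) {
  return rotr32(Imm8 & 0xFF, 2 * (Rot & 0xF));
}

// Smallest rot field that can encode Value, or -1 if Value is not
// representable as a modified immediate.
static int lowestRot(uint32_t Value) {
  for (unsigned Rot = 0; Rot < 16; ++Rot) {
    // Undo the rotation (rotate left by 2 * Rot) and check for an 8-bit fit.
    uint32_t Imm8 = rotr32(Value, (32 - 2 * Rot) & 31);
    if (Imm8 <= 0xFF)
      return static_cast<int>(Rot);
  }
  return -1;
}

// A pair is "canonical" when no smaller rot encodes the same constant.
static bool isCanonicalModImm(uint32_t Imm8, unsigned Rot) {
  return lowestRot(expandModImm(Imm8, Rot)) == static_cast<int>(Rot & 0xF);
}
```

With this definition <0x01, 1> and <0x04, 2> both denote 0x40000000, but
only the first is canonical, so only the second would need the pair
syntax.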

Now, surprisingly, I am having problems with step 1!
Although the two immediates are always the final inputs, they are neither
the final MCOperands nor the final immediate MCOperands.

Their position varies from class to class and, given the lack of
information in MCInsts, I really can't see how I could locate them
reliably.

Any advice?

Regards,
Mihai

On Thu, Aug 15, 2013 at 11:57 AM, Mihail Popa <mihail.popa at gmail.com> wrote:

> I don't disagree with that. However, I doubt you can challenge the fact
> that tablegen and the assembler and disassembler it generates are
> somewhat lacking, and whatever they lack has to be compensated for in
> custom code.
>
> The disassembler is particularly foolish, the assembly parser is not even
> a parser, and the tablegen language is irregular, inexpressive and
> insufficiently well defined.
>
> Most of the issues we discussed during my work on the MC layer would have
> been fixed more quickly, more easily and more elegantly if:
>
> 1. the disassembler were a proper decoding automaton that looks at all
>    the bits in the instruction
> 2. the "inheritance" in the instruction classes were proper inheritance
> 3. the assembly "parser" were actually a simple LL(1) parser
> 4. the tablegen language were better defined and provided a few extra
>    features
>
> So the root cause of most disagreements is systemic. While your purpose as
> reviewer is to keep me from writing crap, you cannot demand that I fix
> systemic issues in order to support very specific use cases. It's simply
> not fair.
>
> People who own pieces of code should take note of what other engineers
> have to do to work with them and push for improvement. And yes, once new
> features are in place, they can enforce their use.
>
> I can appreciate you don't agree with the solution; I myself don't like
> it. It is, however, horrifically unfair to have me write a new
> disassembler just because the current one doesn't work properly. And
> really, short of this we will never be able to have a clean description
> free of duplication and switch-rich custom decoders.
>
> I am not even the first one to run into problems - just skim through the
> code and see what people have to do in custom decoders. Surely you know.
>
> Mihai
>
>
>
>
> On Thu, Aug 15, 2013 at 11:35 AM, Tim Northover <t.p.northover at gmail.com> wrote:
>
>> > The plan is to detect at disassembly time which immediates
>> > have multiple representations and choose between ri and rii versions
>> > based on that.
>>
>> A switch or something that knows each and every instruction? That's
>> horrific. And not a step towards a non-duplicated solution.
>>
>> Tim.
>>
>
>