[PATCH] Add support for ARM modified immediate syntax

JF Bastien jfb at google.com
Tue Jul 30 10:01:16 PDT 2013


This is actually covered very clearly by the ARM ARM, section A5.2.4
Modified immediate constants in ARM instructions, Constants with
multiple encodings:

======
Some constant values have multiple possible encodings. In this case, a
UAL assembler must select the encoding with the lowest unsigned value
of the rotation field.

[...]

An alternative syntax is available for a modified immediate constant
that permits the programmer to specify the encoding directly. In this
syntax, #<const> is instead written as #<byte>, #<rot>, where:
  <byte> is the numeric value of abcdefgh, in the range 0-255
  <rot> is twice the numeric value of rotation, an even number in the
range 0-30.

This syntax permits all ARM data-processing instructions with modified
immediate constants to be disassembled to assembler syntax that
assembles to the original instruction.

This syntax also makes it possible to write variants of some
flag-setting logical instructions that have different effects on
APSR.C to those obtained with the normal #<const> syntax. For example,
ANDS R1, R2, #12, #2 has the same behavior as ANDS R1, R2, #3 except
that it sets APSR.C to 0 instead of leaving it unchanged. Such
variants of flag-setting logical instructions do not have equivalents
in the Thumb instruction set, and ARM deprecates their use.
======
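
To make the APSR.C point concrete, here is a rough Python sketch of how an explicit #<byte>, #<rot> pair expands into a 32-bit constant and a carry-out. The helper names (ror32, expand_imm_c) are mine, for illustration only; they are not taken from the patch or from LLVM. The carry rule follows my reading of the manual's immediate-expansion pseudocode: no rotation leaves the carry alone, and any non-zero rotation copies bit 31 of the result.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def expand_imm_c(byte, rot, carry_in):
    """Expand an explicit #<byte>, #<rot> pair to (constant, carry_out).

    Hypothetical helper: `rot` is the assembler-level rotation, i.e.
    twice the value stored in the 4-bit rotation field.
    """
    if not 0 <= byte <= 255:
        raise ValueError("byte must be in the range 0-255")
    if rot % 2 or not 0 <= rot <= 30:
        raise ValueError("rot must be an even number in the range 0-30")
    const = ror32(byte, rot)
    # With no rotation the carry is passed through unchanged; otherwise
    # the carry-out is bit 31 of the rotated constant.
    carry_out = carry_in if rot == 0 else const >> 31
    return const, carry_out


if __name__ == "__main__":
    # Same constant, different effect on the carry flag when C was set.
    print(expand_imm_c(3, 0, carry_in=1))
    print(expand_imm_c(12, 2, carry_in=1))
```

With the carry set on entry, both pairs expand to the constant 3, but only the #12, #2 form clears the carry. That is exactly the ANDS difference the manual describes.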

There *is* a canonical encoding when using the regular syntax, but
that canonical encoding can be bypassed with the explicit
#<byte>, #<rot> syntax that Mihail is proposing. Re-assembling to the
same encoding is a nice goal, but preserving the same APSR.C behavior
is, I think, a much stronger case for supporting it. +1 from me
(though I haven't reviewed the code).
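
For reference, a small Python sketch of the "lowest rotation wins" rule, assuming the rule works as quoted above. The function names (rol32, all_encodings, canonical_encoding) are hypothetical and only meant to show why a value such as 3 has several encodings while the regular syntax can reach just one of them.

```python
def rol32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def all_encodings(const):
    """List every (byte, rot) pair whose right-rotation yields `const`.

    `rot` is the assembler-level rotation (even, 0-30); undoing it with
    a left rotation must leave a value that fits in 8 bits.
    """
    pairs = []
    for rot in range(0, 32, 2):
        byte = rol32(const, rot)
        if byte <= 0xFF:
            pairs.append((byte, rot))
    return pairs


def canonical_encoding(const):
    """Return the pair with the lowest rotation, or None if not encodable."""
    pairs = all_encodings(const)
    return pairs[0] if pairs else None


if __name__ == "__main__":
    print(all_encodings(3))
    print(canonical_encoding(3))
    print(canonical_encoding(0x101))
```

Under these assumptions, plain #3 always assembles to the (3, 0) pair, so the (12, 2) encoding can only be written, or round-tripped through a disassembler, with the two-operand form.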



More information about the llvm-commits mailing list