RFC: AArch64/ARM64 canonical assembler syntax

Tue Apr 29 08:21:54 PDT 2014

Hi all,

In case the paint on everyone's previous bike shed has thoroughly
dried and they find themselves unaccountably without a lawn...

There are a few areas where AArch64 and ARM64 have chosen different
ways to print the same assembly (actually, even within a single
backend it's happened), and I think it would be good to settle on one,
preferably with as much self-consistency as we can muster.

So, the main issues are:

1. Floating-point immediates: 1.2500000e-01 or 0.12500000?
2. System registers & kin: should they be UPPER_CASE or lower_case?
3. Complex immediates: do we stick to the ARM ARM or print convenience
syntax: "add x0, x1, #123, lsl #12" or "add x0, x1, #503808"?
4. Integer immediates: hex or decimal?
5. Hashes with relocated exprs or not: "ldr x0, [x1, #:lo12:symbol]"
or "ldr x0, [x1, :lo12:symbol]".
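
To make issues 1 and 3 concrete, here is a small Python sketch (not
code from either backend; all function names are made up for
illustration). It assumes the usual add/sub immediate form, a 12-bit
unsigned value optionally shifted left by 12, and shows that the two
spellings in issue 3 name the same value, and how the two candidate
floating-point spellings in issue 1 come out of ordinary format strings.

```python
def fold_shifted_imm(imm12: int, shift: int) -> int:
    """Fold an 'imm, lsl #shift' pair into the single value it denotes."""
    if not 0 <= imm12 < (1 << 12):
        raise ValueError("immediate must fit in 12 bits")
    if shift not in (0, 12):
        raise ValueError("shift must be 0 or 12")
    return imm12 << shift


def split_shifted_imm(value: int):
    """Return (imm12, shift) if value fits the add/sub immediate form, else None."""
    if 0 <= value < (1 << 12):
        return value, 0
    # Low 12 bits must be clear and the rest must fit in 12 bits.
    if value >= 0 and value & 0xFFF == 0 and (value >> 12) < (1 << 12):
        return value >> 12, 12
    return None


def fp_imm_forms(value: float):
    """Return the (scientific, plain decimal) spellings from issue 1."""
    return f"{value:.7e}", f"{value:.8f}"
```

With these definitions, fold_shifted_imm(123, 12) gives 503808,
split_shifted_imm(503808) gives (123, 12), and fp_imm_forms(0.125)
gives ("1.2500000e-01", "0.12500000").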

My opinions are:
1. 0.12500000 (I find scientific notation works intuitively in LaTeX,
but not with the "e")
2. lower_case (everything else is, switching to upper-case just for
those few operands looks weird).
3. Print it with the "lsl"/"msl"/etc (I prefer explicit notations).
4. Hex in almost all cases (certainly for logical insts, probably
arithmetic, my one uncertainty is loads & stores: "ldr x0, [x1,
#0x123]" looks odd somehow, but that could be me). The exception is
probably immediates in the 0-31 or 0-63 range (e.g. I like "lsl x0,
x1, #55")
5. I prefer with the hashes.
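
As a rough illustration of what opinion 4 would mean for a printer,
here is a Python sketch. The function name and the is_shift_amount flag
are hypothetical, and the rule itself (decimal only for shift-like
operands in the 0-63 range, hex otherwise) is just one reading of the
preference above, not existing backend behaviour.

```python
def format_int_imm(value: int, is_shift_amount: bool = False) -> str:
    """Print an integer immediate: decimal for small shift amounts, hex otherwise."""
    # Assumption: shift/bit-position operands in 0-63 read better in decimal.
    if is_shift_amount and 0 <= value <= 63:
        return f"#{value}"
    # Keep the sign outside the hex digits for negative values.
    if value < 0:
        return f"#-{-value:#x}"
    return f"#{value:#x}"
```

Under this sketch format_int_imm(55, is_shift_amount=True) prints "#55"
(as in "lsl x0, x1, #55"), while format_int_imm(0x123) prints "#0x123",
which is the load/store case I'm unsure about.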

What does everyone else think, before I run off and change everything?

Cheers.

Tim.