[llvm-dev] Inline assembly in intel syntax mishandling i constraint
Stephen Checkoway via llvm-dev
llvm-dev at lists.llvm.org
Tue Jan 7 14:44:27 PST 2020
Hi all,
I'm getting rather odd behavior from a call to asm inteldialect. TL;DR: "mov reg, $0" with an "i" constraint on $0 behaves identically to "mov reg, dword ptr [$0]" and differently from "movl $0, reg" in AT&T syntax.
I'm not sure how to get clang to emit inteldialect directly, so for this example I'm emitting LLVM IR and then modifying the resulting .ll file. (I get similar behavior with Rust's asm!(… : "intel"), so I'm assuming that's what Rust uses, although I didn't verify this.)
Here's the example:
static int foo;
static int bar;
void _start(void) {
  asm volatile("movl %0, %%eax" : : "i"(&foo));
  asm volatile("movl %0, %%ebx" : : "i"(&bar));
}
This produces
define void @_start() #0 {
  call void asm sideeffect "movl $0, %eax", "i,~{dirflag},~{fpsr},~{flags}"(i32* @foo) #1, !srcloc !3
  call void asm sideeffect "movl $0, %ebx", "i,~{dirflag},~{fpsr},~{flags}"(i32* @bar) #1, !srcloc !4
  ret void
}
When assembled, I get the expected output
80480a3: b8 b0 90 04 08 mov eax,0x80490b0
80480a8: bb b4 90 04 08 mov ebx,0x80490b4
After modifying the second one to be
call void asm sideeffect inteldialect "mov ebx, $0", "i,~{dirflag},~{fpsr},~{flags}"(i32* @bar) #1, !srcloc !4
and assembling, I get the unexpected output
80480a3: b8 b0 90 04 08 mov eax,0x80490b0
80480a8: 8b 1d b4 90 04 08 mov ebx,DWORD PTR ds:0x80490b4
This is identical to the output I get if I change the assembly template to "mov ebx, dword ptr [$0]".
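To make the difference between the two encodings in the disassembly explicit, here is a minimal sketch of a classifier for just these two MOV forms. It assumes 32-bit mode and no instruction prefixes, and it is not a general x86 decoder. The names mov_kind, read_le32 and classify_mov are made up for this illustration and are not part of LLVM or any other tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Which of the two MOV forms from the disassembly a byte sequence encodes. */
typedef enum {
    MOV_UNKNOWN = 0,
    MOV_REG_IMM32, /* B8+r imm32: the register receives the constant itself */
    MOV_REG_ABS32  /* 8B /r, ModRM mod=00 rm=101: 32-bit load from [disp32] */
} mov_kind;

/* Read a little-endian 32-bit value starting at p. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Classify one instruction (assumption: 32-bit mode, no prefixes).
 * On success, *reg receives the destination register number
 * (0 = eax, 3 = ebx) and *value receives either the immediate
 * or the absolute address being loaded from.
 */
static mov_kind classify_mov(const uint8_t *code, size_t len,
                             unsigned *reg, uint32_t *value) {
    assert(code != NULL && reg != NULL && value != NULL);

    /* B8..BF: mov r32, imm32 (register is in the low 3 opcode bits). */
    if (len >= 5 && (code[0] & 0xF8) == 0xB8) {
        *reg = code[0] & 0x07;
        *value = read_le32(code + 1);
        return MOV_REG_IMM32;
    }
    /* 8B /r with mod=00 and rm=101: mov r32, dword ptr [disp32]. */
    if (len >= 6 && code[0] == 0x8B && (code[1] & 0xC7) == 0x05) {
        *reg = (code[1] >> 3) & 0x07;
        *value = read_le32(code + 2);
        return MOV_REG_ABS32;
    }
    return MOV_UNKNOWN;
}
```

Given the bytes bb b4 90 04 08 from the AT&T build, classify_mov returns MOV_REG_IMM32 with register 3 (ebx) and value 0x80490b4. Given 8b 1d b4 90 04 08 from the inteldialect build, it returns MOV_REG_ABS32 with the same register and address, i.e. a load from bar rather than the address of bar.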
I think the underlying issue here is that whichever variant of Intel syntax this supports (MASM?) treats
mov reg, symbol
as a load and it wants
mov reg, offset symbol
E.g., if I ask Clang to output assembly in Intel syntax via -mllvm --x86-asm-syntax=intel, I get
#APP
mov eax, offset foo
#NO_APP
#APP
mov ebx, dword ptr [bar]
#NO_APP
(I have no idea where those extra newlines are coming from.)
If I try to change the assembly template to "mov ebx, offset $0" it complains about multiple symbols being present:
<inline asm>:2:18: error: cannot use more than one symbol in memory operand
mov ebx, offset bar
I attached my source file and my modified .ll file. I compiled the source file with
clang -m32 a.c -ffreestanding -nostdlib -S -emit-llvm
$ clang --version
clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Is this an LLVM bug or am I misusing inteldialect?
Thank you,
Steve
--
Stephen Checkoway
-------------- next part --------------
A non-text attachment was scrubbed...
Name: a.c
Type: application/octet-stream
Size: 151 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200107/2efa41e3/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bug.ll
Type: application/octet-stream
Size: 1301 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200107/2efa41e3/attachment-0001.obj>