<html>
<head>
<base href="http://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - [x86asm intel syntax] `mov` with a symbol from a .set directive not handled correctly (?)"
href="http://llvm.org/bugs/show_bug.cgi?id=22511">22511</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>[x86asm intel syntax] `mov` with a symbol from a .set directive not handled correctly (?)
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: X86
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>nicolasweber@gmx.de
</td>
</tr>
<tr>
<th>CC</th>
<td>llvmbugs@cs.uiuc.edu
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>Consider this asm hello world (on OS X):
.intel_syntax
str:
.ascii "Hello, ASM.\n"
.set mylen, .-str
.global start
start:
mov rdi, 1
lea rsi, qword ptr [rip+str] // [rip+str@GOTPCREL] for GOT instead of rip-rel
mov rdx, mylen // doesn't work: assembled as a load from [12]. clang -cc1as bug?
mov rax, 0x2000004 # SYSCALL_WRITE
syscall
mov rdi, 42
mov rax, 0x2000001 # SYSCALL_EXIT
syscall
$ clang -c -o hello.o hello.asm && ld -o hello hello.o
$ ./hello
Segmentation fault: 11
This crashes because `mov rdx, mylen` is assembled as `mov rdx, [12]` -- mylen
is correctly evaluated to 12, but clang treats the value as an address to be
dereferenced instead of as an immediate:
$ r2 hello
[0x00001fd1]> px 10
- offset - 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x00001fd1 48c7 c701 0000 0048 8d35 H......H.5
[0x00001fd1]> pd 10
;-- entry0:
0x00001fd1 48c7c701000. mov rdi, 1
0x00001fd8 488d35e6fff. lea rsi, qword [rip - 0x1a]
0x00001fdf 488b14250c0. mov rdx, qword [0xc]
This doesn't look right to me. In AT&T syntax, I have to write `mov $mylen, %rdx`
(with a $) to avoid dereferencing mylen, but that's consistent with other immediates.
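To make the difference between the two forms concrete, here is a rough Python
sketch that tells the two encodings of a `mov` into rdx apart. It is only an
illustration, not a disassembler: the helper name classify_mov_rdx is made up,
it recognizes just these two byte patterns, and the full byte strings in the
usage below are an assumption, since r2 truncates the hex column in the
listing above.

```python
def classify_mov_rdx(code: bytes) -> str:
    """Classify two specific x86-64 encodings that write to rdx.

    Hypothetical helper for illustration; anything else is "unknown".
    """
    # REX.W 8B /r, ModRM=0x14 (mod=00, reg=rdx, rm=SIB), SIB=0x25 (no base,
    # no index): a 64-bit load from an absolute 32-bit address.
    if len(code) == 8 and code[:4] == bytes.fromhex("488b1425"):
        addr = int.from_bytes(code[4:8], "little", signed=True)
        return f"load: mov rdx, qword [{addr:#x}]"
    # REX.W C7 /0, ModRM=0xC2 (mod=11, rm=rdx): a sign-extended 32-bit
    # immediate, the same form as the `mov rdi, 1` in the listing.
    if len(code) == 7 and code[:3] == bytes.fromhex("48c7c2"):
        imm = int.from_bytes(code[3:7], "little", signed=True)
        return f"immediate: mov rdx, {imm:#x}"
    return "unknown"


# Assumed full encoding of what clang emitted (truncated in the r2 output):
print(classify_mov_rdx(bytes.fromhex("488b14250c000000")))
# Assumed encoding of the immediate form I expected instead:
print(classify_mov_rdx(bytes.fromhex("48c7c20c000000")))
```

The first call should report the load form and the second the immediate form,
which is the distinction the `$` prefix makes explicit in the other syntax.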
The same program in AT&T syntax works fine (compiled with the same commands):
str:
.ascii "Hello world!\n"
.set mylen, .-str
.globl start
start:
movl $0x2000004, %eax
movl $1, %edi
movq str@GOTPCREL(%rip), %rsi
mov $mylen, %rdx
syscall
movl $42, %ebx
movl $0x2000001, %eax # exit 0
syscall
An equivalent program in Intel syntax works fine with gas on Linux:
.intel_syntax noprefix
str:
.ascii "Hello, ASM.\n"
.set len, .-str
.global _start
_start:
movq rdi, 1
movq rsi, OFFSET FLAT:str
movq rdx, len
movq rax, 1 # sys_write
syscall
movq rdi, 42
movq rax, 60 # sys_exit
syscall
$ gcc -c test.s && ld test.o
$ ./a.out
Hello, ASM.
So I think the behavior of clang's integrated assembler might be incorrect for
symbols defined with .set directives.</pre>
</div>
</p>
</body>
</html>