[llvm-bugs] [Bug 48742] New: X86AsmBackend::finishLayout causes different assembler output with and w/o -g (2-byte jmp/jcc vs 5-byte)
via llvm-bugs
llvm-bugs at lists.llvm.org
Wed Jan 13 10:10:33 PST 2021
https://bugs.llvm.org/show_bug.cgi?id=48742
Bug ID: 48742
Summary: X86AsmBackend::finishLayout causes different assembler
output with and w/o -g (2-byte jmp/jcc vs 5-byte)
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: DebugInfo
Assignee: unassignedbugs at nondot.org
Reporter: i at maskray.me
CC: dblaikie at gmail.com, jdevlieghere at apple.com,
jyknight at google.com, keith.walker at arm.com,
listmail at philipreames.com, llvm-bugs at lists.llvm.org,
paul_robinson at playstation.sony.com
Bug 42138 (comment 13) was reopened because the assembler output differs between
-O1 and -O1 -g. Because the assembler issue is so different from the original
BranchFolding bug, I am opening a new bug.
> < 40: eb 0e jmp 50 <_ZN1k1lEv+0x50>
> < 42: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
> < 49: 00 00 00
> < 4c: 0f 1f 40 00 nopl 0x0(%rax)
> ---
> > 40: e9 0b 00 00 00 jmpq 50 <_ZN1k1lEv+0x50>
> > 45: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
> > 4c: 00 00 00
> > 4f: 90 nop
The D75203 assembler optimization locates MCRelaxableFragments between two
MCSymbols and relaxes some of them (jmp/jcc) to reduce the size of an
MCAlignFragment.
Its behavior therefore depends on the MCSymbols in the text section.
A -g compile may have more labels (due to ranges/locations referenced by
.debug_* sections; currently it seems that some .Ltmp* labels may be redundant
(I am going to investigate further), but **many cannot be removed**).
.p2align 4, 0x90 is common due to loops. For a larger program with a lot of
temporary labels, a difference in assembler output is almost inevitable.
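
To make the label dependence concrete, here is a minimal toy model in Python.
It is not the LLVM implementation: the Frag type, the layout function and the
fragment kinds are hypothetical names made up for illustration, and the model
assumes that a label simply makes every earlier jmp ineligible for padding. It
only shows how one extra label between a jmp and a 16-byte alignment can flip
the jmp between the 2-byte and 5-byte encodings.

```python
from dataclasses import dataclass

SHORT_JMP, LONG_JMP = 2, 5  # byte sizes of "eb rel8" and "e9 rel32"

@dataclass
class Frag:
    kind: str      # "data", "jmp", "label" or "align" (toy fragment kinds)
    size: int = 0  # byte size for "data"; alignment for "align"

def layout(frags, start=0):
    """Toy pad-for-align pass; returns (kind, size) pairs after padding."""
    out = []
    relaxable = []  # indices in `out` of jmps seen since the last label
    offset = start
    for f in frags:
        if f.kind == "label":
            # Assumption of this sketch: a label ends the window of
            # instructions that may be grown.
            relaxable.clear()
            out.append(["label", 0])
        elif f.kind == "jmp":
            relaxable.append(len(out))
            out.append(["jmp", SHORT_JMP])
            offset += SHORT_JMP
        elif f.kind == "align":
            padding = -offset % f.size
            offset += padding  # the end address is fixed either way
            # Grow the nearest eligible jmps first, trading nop bytes
            # for longer encodings.
            while relaxable and padding >= LONG_JMP - SHORT_JMP:
                out[relaxable.pop()][1] = LONG_JMP
                padding -= LONG_JMP - SHORT_JMP
            relaxable.clear()
            out.append(["nop", padding])
        else:
            out.append([f.kind, f.size])
            offset += f.size
    return [tuple(x) for x in out]

# jmp at offset 0x40 followed by an alignment to 16 bytes, no label between.
no_label = layout([Frag("data", 0x40), Frag("jmp"), Frag("align", 16)])
# Same code with one extra (hypothetical) temporary label before the alignment.
with_label = layout([Frag("data", 0x40), Frag("jmp"), Frag("label"),
                     Frag("align", 16)])
print(no_label)
print(with_label)
```

In this model the first layout has a 5-byte jmp followed by 11 bytes of nops,
and the second has a 2-byte jmp followed by 14 bytes of nops, which are the same
sizes as in the two disassemblies quoted above. Both layouts end at the same
address; only the encoding in between differs.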
I think the cost of D75203 outweighs the benefits, so I think we should
default to -x86-pad-for-align=false for now (https://reviews.llvm.org/D94542).
When -mbranches-within-32B-boundaries (to mitigate the performance impact of
the microcode update for the Intel JCC Erratum) is used, there are many
alignment fragments. I think D75203 is useful in that case. In the absence of
-mbranches-within-32B-boundaries, the advantage of D75203 is questionable.
Other opinions: https://reviews.llvm.org/D75203#2496082 (jyknight) and the
comment before it (skan).
I agree that to make the behavior of D75203 deterministic with and without -g,
we will need to "find all sections referenced by a relaxable fixup in the text
section", and do so recursively. This will be very complex and will dilute the
gain of D75203.