<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/155045>155045</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[X86] emitInstructionBegin macro-fusion check broken after getOrCreateDataFragment optimization
</td>
</tr>
<tr>
<th>Labels</th>
<td>
bug,
backend:X86,
llvm:mc
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
aleks-tmb
</td>
</tr>
</table>
<pre>
### Description
When X86AsmBackend::emitInstructionBegin starts emitting an instruction, it avoids inserting alignment padding between macro-fused instructions. It does so by checking that the current MC fragment (the one into which the previous macro-fused instruction was inserted) immediately follows the pending branch-alignment fragment, which means that no other fragment has been inserted after the previous instruction:
```
PendingBA -> CurrentFragment (contains previous instr)
```
```c++
if (PendingBA && PendingBA->getNext() == OS.getCurrentFragment()) {
...
return;
}
```
Commit 39c8cfb70d20 optimized getOrCreateDataFragment by eagerly allocating an empty fragment when a fragment with a variable-size tail is added. In that case the current MC fragment is no longer the one into which the instruction was inserted, so the check `PendingBA && PendingBA->getNext() == OS.getCurrentFragment()` fails: CurrentFragment is now the empty fragment instead of the fragment containing the instruction.
```
PendingBA -> Fragment with a variable-size tail (contains previous instruction) -> CurrentFragment (newly allocated empty fragment)
```
This breaks the macro-fusion logic: emitInstructionBegin incorrectly concludes that another fragment has been inserted between the fused instructions, so alignment padding can end up between them.
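The failure can be illustrated with a toy model of the fragment list. This is only a sketch: `Frag`, `Section`, `looksFused` and `checkAfterEmit` are hypothetical names invented for illustration and are not LLVM's MCFragment/MCSection API. Only the shape of the comparison mirrors the check quoted above.

```c++
#include <memory>
#include <vector>

// Toy stand-in for an MC fragment (hypothetical, not LLVM's MCFragment).
struct Frag {
  Frag *Next = nullptr;
  bool HasInstr = false;
};

// Toy stand-in for a section that tracks its current (last) fragment.
struct Section {
  std::vector<std::unique_ptr<Frag>> Storage;
  Frag *Current = nullptr;

  // Append a new fragment and make it the current one.
  Frag *append() {
    Storage.push_back(std::make_unique<Frag>());
    Frag *F = Storage.back().get();
    if (Current)
      Current->Next = F;
    Current = F;
    return F;
  }
};

// Same shape as: PendingBA && PendingBA->getNext() == OS.getCurrentFragment()
bool looksFused(const Frag *PendingBA, const Section &S) {
  return PendingBA && PendingBA->Next == S.Current;
}

// Builds PendingBA -> fragment holding the previous instruction, optionally
// followed by an eagerly allocated empty fragment, then runs the check.
bool checkAfterEmit(bool EagerEmptyFragment) {
  Section S;
  Frag *PendingBA = S.append();
  Frag *Instr = S.append();
  Instr->HasInstr = true;
  if (EagerEmptyFragment)
    S.append(); // models the empty fragment allocated after a variable-size tail
  return looksFused(PendingBA, S);
}
```

In this model `checkAfterEmit(false)` returns true (the old layout), while `checkAfterEmit(true)` returns false even though nothing was emitted after the previous instruction, which is the situation described above.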
### Reproducer
```asm
.globl f
.type f,@function
f:
.p2align 5
xchgw %ax, %ax
pushq %rbp
pushq %r14
pushq %rbx
subq $16, %rsp
movq %rdx, %r14
movq %rsi, %rbx
cmpl $0, %gs:104
jne .L_EXIT
.LBB0_1:
testq %rcx, %rcx
je .L_EXIT
.L_EXIT:
ret
.size f, .L_EXIT-f
```
```
$bin/llvm-mc -filetype=obj -mattr=+prfchw -x86-pad-max-prefix-size=15 -triple=x86_64 -x86-align-branch-boundary=32 -x86-align-branch=jcc+fused test.s -o test.out
$objdump -d test.out
```
Disassembly of section .text:
```asm
0000000000000000 <f>:
0: 66 90 xchg %ax,%ax
2: 55 push %rbp
3: 41 56 push %r14
5: 53 push %rbx
6: 48 83 ec 10 sub $0x10,%rsp
a: 49 89 d6 mov %rdx,%r14
d: 48 89 f3 mov %rsi,%rbx
10: 65 83 3c 25 68 00 00 cmpl $0x0,%gs:0x68
17: 00 00
19: 75 07 jne 22 <f+0x22>
1b: 48 85 c9 test %rcx,%rcx
1e: 66 90 xchg %ax,%ax
20: 74 00 je 22 <f+0x22>
22: c3 retq
```
In this case macro fusion is broken by the inserted NOP. The `je` is no longer immediately adjacent to the `test`, so the CPU cannot fuse them into a single µop:
```asm
test %rcx,%rcx
xchg %ax,%ax
je 22 <f+0x22>
```
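The offsets in the disassembly are consistent with the `je` having been padded on its own. The arithmetic below is a sketch under stated assumptions: a jump needs padding when it crosses or ends on a 32-byte boundary (matching -x86-align-branch-boundary=32), and padding moves the start of the padded unit to the next boundary. The helper names are hypothetical and are not taken from the LLVM sources.

```c++
#include <cstdint>

constexpr std::uint64_t Boundary = 32; // from -x86-align-branch-boundary=32

// True if the byte range [Start, Start + Size) spans a boundary (Size > 0).
constexpr bool crossesBoundary(std::uint64_t Start, std::uint64_t Size) {
  return Start / Boundary != (Start + Size - 1) / Boundary;
}

// True if the byte range ends exactly on a boundary.
constexpr bool endsOnBoundary(std::uint64_t Start, std::uint64_t Size) {
  return (Start + Size) % Boundary == 0;
}

constexpr bool needsPadding(std::uint64_t Start, std::uint64_t Size) {
  return crossesBoundary(Start, Size) || endsOnBoundary(Start, Size);
}

// Bytes of padding needed so that Start lands on the next boundary.
constexpr std::uint64_t padToNextBoundary(std::uint64_t Start) {
  return (Boundary - Start % Boundary) % Boundary;
}

// Offsets without the NOP: test at 0x1b (3 bytes), je at 0x1e (2 bytes).
constexpr std::uint64_t TestStart = 0x1b, TestSize = 3;
constexpr std::uint64_t JeStart = TestStart + TestSize, JeSize = 2;

// Padding the je alone yields the 2-byte NOP seen at 0x1e.
constexpr std::uint64_t PadBeforeJe = padToNextBoundary(JeStart);
// Padding the fused pair as a unit would instead go before the test.
constexpr std::uint64_t PadBeforeTest = padToNextBoundary(TestStart);
```

Under these assumptions both the lone `je` and the `test`+`je` pair end on the boundary at 0x20, so padding is required either way. The difference is placement: 2 bytes before the `je` (the observed output) versus 5 bytes before the `test`, which would keep the pair adjacent.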
</pre>