[PATCH] D70157: Align branches within 32-Byte boundary
Fangrui Song via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Dec 4 16:32:01 PST 2019
MaskRay added a comment.
I am still trying to understand the patch. Just made some comments about the tests.
================
Comment at: llvm/include/llvm/MC/MCFragment.h:663
+ enum SubType : uint8_t {
+ // BranchPadding - The variable size fragment to insert NOP before branch.
+ BranchPadding,
----------------
Don’t duplicate function or class name at the beginning of the comment (`BranchPadding - `). (ref: https://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments)
================
Comment at: llvm/include/llvm/MC/MCFragment.h:674
+ FusedJccPadding,
+ // HardCodeBegin - The zero size fragment to mark the begin of the sequence
+ // of hard code
----------------
Add a full stop at the end of the comment.
================
Comment at: llvm/include/llvm/MC/MCFragment.h:709
+ switch (SubKind) {
+ default:
+ llvm_unreachable("Unknown subtype of MCMachineDependentFragment");
----------------
Move llvm_unreachable below the switch, otherwise clang will give a warning:
warning: default label in switch which covers all enumeration values [-Wcovered-switch-default]
Unfortunately, every GCC version (even 9) warns under -Wall with `warning: control reaches end of non-void function [-Wreturn-type]` unless you place an unreachable statement after the switch.
================
Comment at: llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp:537
+ // Linker may rewrite the instruction with variant symbol operand.
+ if(hasVariantSymbol(Inst)) return false;
+
----------------
Space after `if`
================
Comment at: llvm/test/MC/X86/x86-64-align-branch-1a.s:1
+# Check option --x86-branches-within-32B-boundaries is equivalent to the combination of options --x86-align-branch-boundary=32 --x86-align-branch=fused+jcc+jmp --x86-align-branch-prefix-size=5
+# RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown --x86-branches-within-32B-boundaries %s | llvm-objdump -d - > %t
----------------
1a~1g use the same source file. Move the source to `Inputs/align-branch-1-64.s`
According to the local naming convention, this test should probably be renamed to `align-branch-1-64.s`
================
Comment at: llvm/test/MC/X86/x86-64-align-branch-1a.s:7
+
+# CHECK: file format {{.*}}
+
----------------
Delete
================
Comment at: llvm/test/MC/X86/x86-64-align-branch-1a.s:11
+
+# CHECK: Disassembly of section .text:
+# CHECK: 0000000000000000 foo:
----------------
Delete `Disassembly of section .text:`. Ditto below.
================
Comment at: llvm/test/MC/X86/x86-64-align-branch-1b.s:4
+
+# CHECK: file format {{.*}}
+
----------------
Delete
================
Comment at: llvm/test/MC/X86/x86-64-align-branch-1b.s:10
+# CHECK: 0000000000000000 foo:
+# CHECK-NEXT: 0: 64 89 04 25 01 00 00 00 movl %eax, %fs:1
+# CHECK-NEXT: 8: 2e 55 pushq %rbp
----------------
I think 1a.s and 1b.s should be merged. FileCheck supports
--check-prefixes=CHECK,PREFIX5
--check-prefixes=CHECK,PREFIX1
```
CHECK: common part
CHECK-NEXT: common part
PREFIX5:
PREFIX5-NEXT:
PREFIX1:
PREFIX1-NEXT:
CHECK:
CHECK-NEXT:
```
```
% diff -U1 x86-64-align-branch-1[ab].s
# CHECK: 0000000000000000 foo:
-# CHECK-NEXT: 0: 64 64 64 64 89 04 25 01 00 00 00 movl %eax, %fs:1
-# CHECK-NEXT: b: 55 pushq %rbp
-# CHECK-NEXT: c: 55 pushq %rbp
-# CHECK-NEXT: d: 55 pushq %rbp
+# CHECK-NEXT: 0: 64 89 04 25 01 00 00 00 movl %eax, %fs:1
+# CHECK-NEXT: 8: 2e 55 pushq %rbp
+# CHECK-NEXT: a: 2e 55 pushq %rbp
+# CHECK-NEXT: c: 2e 55 pushq %rbp
# CHECK-NEXT: e: 48 89 e5 movq %rsp, %rbp
```
Is there a performance benefit to adding 4 prefixes to the same instruction?
================
Comment at: llvm/test/MC/X86/x86-64-align-branch-1c.s:2
+# Check only fused conditional jumps and conditional jumps are aligned with option --x86-align-branch-boundary=32 --x86-align-branch=fused+jcc --x86-align-branch-prefix-size=5
+# RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=fused+jcc --x86-align-branch-prefix-size=5 %s | llvm-objdump -d - | FileCheck %s
+
----------------
The difference between 1a and 1c is that 1c does not allow "jmp", but in 1a no jmp instructions get a prefix in the test, so it is unclear why 1c has different output.
================
Comment at: llvm/test/MC/X86/x86-64-align-branch-1d.s:4
+
+# CHECK: file format {{.*}}
+
----------------
Delete
================
Comment at: llvm/test/MC/X86/x86-64-align-branch-1e.s:46
+# CHECK-NEXT: 5c: eb 27 jmp {{.*}}
+# CHECK-NEXT: 5e: 90 nop
+# CHECK-NEXT: 5f: 90 nop
----------------
This is weird. Comparing this with 1d, 1e allows more instruction types, yet it inserts two NOPs which actually seems to degrade performance.
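To make the arithmetic concrete, here is a small sketch of the boundary check. It assumes the rule is "a branch must neither cross nor end on a 32-byte boundary"; the helper names are hypothetical and are not taken from the patch. Under that assumption the 2-byte `jmp` at 0x5c is fine where it is, while a 2-byte branch starting at 0x5e would end on 0x60 and would need exactly two bytes of padding. Whether that is what the two NOPs are for cannot be told from the excerpt above.

```cpp
#include <cstdint>

// Assumed rule: an instruction occupying [Start, Start + Size) is misplaced
// if it crosses a boundary or its last byte sits right before one.
constexpr bool needsPadding(std::uint64_t Start, std::uint64_t Size,
                            std::uint64_t Boundary = 32) {
  return Start / Boundary != (Start + Size) / Boundary;
}

// Number of padding bytes that move Start up to the next boundary
// (zero if Start is already aligned).
constexpr std::uint64_t paddingToBoundary(std::uint64_t Start,
                                          std::uint64_t Boundary = 32) {
  return (Boundary - Start % Boundary) % Boundary;
}

// The jmp at 0x5c (bytes 0x5c-0x5d) stays inside one 32-byte window.
static_assert(!needsPadding(0x5c, 2), "jmp at 0x5c should not need padding");
// A hypothetical 2-byte branch at 0x5e would end on the 0x60 boundary.
static_assert(needsPadding(0x5e, 2), "branch at 0x5e ends on a boundary");
static_assert(paddingToBoundary(0x5e) == 2, "two bytes reach 0x60");
```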
================
Comment at: llvm/test/MC/X86/x86-64-align-branch-1f.s:8
+
+# CHECK: Disassembly of section .text:
+# CHECK: 0000000000000000 foo:
----------------
No disassembly is needed. Just check that `--x86-align-branch-boundary=0` and the default (no x86-specific options) produce identical output (`cmp %t %t2`).
================
Comment at: llvm/test/MC/X86/x86-64-align-branch-1g.s:1
+# RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown --x86-align-branch-boundary=32 -mcpu=x86-64 --x86-align-branch=jcc+jmp --x86-align-branch-prefix-size=5 %s | llvm-objdump -d - | FileCheck %s
+
----------------
Merge 1e and 1g. State that `-mcpu=x86-64` generates `66 90` instead of `90 90` (but why?)
================
Comment at: llvm/test/MC/X86/x86-64-align-branch-1g.s:3
+
+# CHECK: file format {{.*}}
+
----------------
Delete
================
Comment at: llvm/test/MC/X86/x86-64-align-branch-1g.s:7
+
+# CHECK: Disassembly of section .text:
+# CHECK: 0000000000000000 foo:
----------------
Delete `Disassembly of section .text:`
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D70157/new/
https://reviews.llvm.org/D70157