[llvm] [llvm-exegesis] [AArch64] Add support for Load Instructions in subprocess execution mode (PR #144895)
Lakshay Kumar via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 7 05:37:41 PDT 2025
lakshayk-nv wrote:
The current state of this PR enables the following:
- Manual memory snippets are now supported for AArch64 in exegesis.
- Load instructions no longer segfault: registers are now initialized with valid memory addresses in subprocess execution mode.
- Basic subprocess execution: load instructions (e.g., LD1B) execute without errors in subprocess mode.
- The auxiliary memory mmap and the manual snippet mmap both work using anonymous memory mappings.
Limitations:
- Measurements are unreliable because the ioctl syscall fails: the file descriptor for the perf event is not populated at the auxiliary memory location where it is expected. (I am inclined to take this up in the future.)
- (FYI) All memory mapping is anonymous; this is not a functional problem. (I am inclined to take this up in the future.)
- The temporary fix of loading the first register with a valid memory address needs to be generalized so that registers are loaded based on the instruction structure. (Out of scope for this PR; to be resolved in a future PR.)
For reference, here is the output for LD1B in subprocess mode when benchmarking latency:
```yaml
$ build/bin/llvm-exegesis -mode=latency --execution-mode=subprocess --opcode-name=LD1B --debug-only="preview-gen-assembly"
Warning: generateMmapAuxMem using anonymous mapping
Warning: setStackRegisterToAuxMem called but not required for AArch64
Warning: configurePerfCounter ioctl syscall failing
Warning: configurePerfCounter ioctl syscall failing
Warning: generateMmapAuxMem using anonymous mapping
Warning: setStackRegisterToAuxMem called but not required for AArch64
Warning: configurePerfCounter ioctl syscall failing
Warning: configurePerfCounter ioctl syscall failing
Generated assembly snippet:
'''
0: fc1f0fed str d13, [sp, #-16]!
4: f90007f7 str x23, [sp, #8]
8: b26f77e0 mov x0, #140737488224256
c: d2820001 mov x1, #4096
10: d2800062 mov x2, #3
14: d2800423 mov x3, #33
18: f2a00203 movk x3, #16, lsl #16
1c: 92800004 mov x4, #-1
20: d2800005 mov x5, #0
24: d2801bc8 mov x8, #222
28: d4000001 svc #0
2c: f81f0fe0 str x0, [sp, #-16]!
30: 2518e3e6 ptrue p6.b
34: f84107ef ldr x15, [sp], #16
38: d2800017 mov x23, #0
3c: 2518e3e0 ptrue p0.b
40: 25f8c00d mov z13.d, #0
44: f81f0fe8 str x8, [sp, #-16]!
48: f81f0fe0 str x0, [sp, #-16]!
4c: f81f0fe1 str x1, [sp, #-16]!
50: f81f0fe2 str x2, [sp, #-16]!
54: b26f77f0 mov x16, #140737488224256
58: b9400200 ldr w0, [x16]
5c: d2848061 mov x1, #9219
60: d2800022 mov x2, #1
64: d28003a8 mov x8, #29
68: d4000001 svc #0
6c: f84107e2 ldr x2, [sp], #16
70: f84107e1 ldr x1, [sp], #16
74: f84107e0 ldr x0, [sp], #16
78: f84107e8 ldr x8, [sp], #16
7c: a41759f0 ld1b { z16.b }, p6/z, [x15, x23]
80: 244d0207 cmphs p7.h, p0/z, z16.h, z13.h
84: a41759f0 ld1b { z16.b }, p6/z, [x15, x23]
88: 244d0207 cmphs p7.h, p0/z, z16.h, z13.h
... (9994 more instructions)
9cb4: a41759f0 ld1b { z16.b }, p6/z, [x15, x23]
9cb8: 244d0207 cmphs p7.h, p0/z, z16.h, z13.h
9cbc: b26f77f0 mov x16, #140737488224256
9cc0: b9400200 ldr w0, [x16]
9cc4: d2848021 mov x1, #9217
9cc8: d2800022 mov x2, #1
9ccc: d28003a8 mov x8, #29
9cd0: d4000001 svc #0
9cd4: d2800000 mov x0, #0
9cd8: d2800ba8 mov x8, #93
9cdc: d4000001 svc #0
9ce0: f94007f7 ldr x23, [sp, #8]
9ce4: fc4107ed ldr d13, [sp], #16
9ce8: d65f03c0 ret
'''
---
mode: latency
key:
instructions:
- 'LD1B Z16 P6 X15 X23'
- 'CMPHS_PPzZZ_H P7 P0 Z16 Z13'
config: ''
register_initial_values:
- 'P6=0x0'
- 'X15=0x0'
- 'X23=0x0'
- 'P0=0x0'
- 'Z13=0x0'
cpu_name: neoverse-v2
llvm_triple: aarch64-unknown-linux-gnu
min_instructions: 10000
measurements:
- { key: latency, value: 7.2836, per_snippet_value: 14.5672, validation_counters: {} }
error: ''
info: Repeating two instructions
assembled_snippet: ED0F1FFCF70700F9E0776FB2010082D2620080D2230480D20302A0F204008092050080D2C81B80D2010000D4E00F1FF8E6E31825EF0741F8170080D2E0E318250DC0F825E80F1FF8E00F1FF8E10F1FF8E20F1FF8F0776FB2000240B9618084D2220080D2A80380D2010000D4E20741F8E10741F8E00741F8E80741F8F05917A407024D24F05917A407024D24F05917A407024D24F05917A407024D24F0776FB2000240B9218084D2220080D2A80380D2010000D4000080D2A80B80D2010000D4F70740F9ED0741FCC0035FD6
...
```
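The immediates in the generated snippet above can be read back as the syscalls this comment describes. The following is a minimal sketch, not part of the PR: it assumes the standard AArch64 Linux syscall numbers and Linux UAPI constant values, restated by hand, and the helper names `decode_bits` and `mov_movk` are hypothetical.

```python
# Linux UAPI constants, restated by hand as assumptions (not read from the host).
PROT_BITS = {0x1: "PROT_READ", 0x2: "PROT_WRITE", 0x4: "PROT_EXEC"}
MAP_BITS = {
    0x01: "MAP_SHARED",
    0x02: "MAP_PRIVATE",
    0x20: "MAP_ANONYMOUS",
    0x100000: "MAP_FIXED_NOREPLACE",
}
# AArch64 Linux syscall numbers, as passed in x8 before `svc #0`.
SYSCALLS = {29: "ioctl", 93: "exit", 222: "mmap"}
# perf ioctl requests are _IO('$', n), i.e. 0x2400 + n.
PERF_REQUESTS = {
    0x2400: "PERF_EVENT_IOC_ENABLE",
    0x2401: "PERF_EVENT_IOC_DISABLE",
    0x2403: "PERF_EVENT_IOC_RESET",
}


def decode_bits(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for bit, name in sorted(table.items()) if value & bit]


def mov_movk(low, high, shift):
    """Combine a `mov` immediate with a following `movk #high, lsl #shift`."""
    mask = 0xFFFF << shift
    return (low & ~mask) | (high << shift)


# mov x3, #33 ; movk x3, #16, lsl #16
mmap_flags = mov_movk(33, 16, 16)

# First svc: x8=222, x0=address, x2=prot, x3=flags
print(SYSCALLS[222], hex(140737488224256),
      decode_bits(3, PROT_BITS), decode_bits(mmap_flags, MAP_BITS))
# Second and third svc: x8=29, x1=9219 and x1=9217
print(SYSCALLS[29], PERF_REQUESTS[9219], PERF_REQUESTS[9217])
# Final svc: x8=93
print(SYSCALLS[93])
```

Under these assumptions, the first `svc` decodes to a shared anonymous `mmap` of the auxiliary memory page with read/write protection, which matches the "using anonymous mapping" warning, and the two `ioctl` calls around the measured loop decode to a perf counter reset and disable on the descriptor loaded from `[x16]`.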
https://github.com/llvm/llvm-project/pull/144895