[llvm-dev] Stackmap question

Thu Dec 1 08:43:00 PST 2016

Hello, all

We are working on the Multi-OS-Engine (https://multi-os-engine.org), which uses Android's ART runtime ported to iOS to allow Java developers to write iOS applications. Currently we are working on version 2.0, which will use LLVM to generate native code from Dalvik bytecode in order to support Apple's new bitcode requirement for its Store. We already have a working version in which the generated code actively maintains the managed stack data, but this has a significant performance impact.

Currently I am trying to use stackmaps to make important values accessible to the runtime through libunwind. The general design was to put one stackmap after every call. From the entry points we walk over the frames and, for each frame, compute the instruction offset by subtracting the function's starting IP from the current IP. We then use that offset to look up the relevant stackmap record, and finally load the values with libunwind based on that record.
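To make the lookup step concrete, here is a minimal Python sketch of the offset computation and record lookup. The names (StackMapRecord, build_index, find_record) and the sample addresses are hypothetical, and reading the actual values through libunwind is left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class StackMapRecord:
    """One stackmap record: where it sits and which locations it describes."""
    function_start: int          # address of the function's first instruction
    instruction_offset: int      # offset of the stackmap from function_start
    locations: Tuple[str, ...]   # hypothetical, human-readable location names


def build_index(records: List[StackMapRecord]) -> Dict[Tuple[int, int], StackMapRecord]:
    """Index records by (function start, instruction offset)."""
    return {(r.function_start, r.instruction_offset): r for r in records}


def find_record(index: Dict[Tuple[int, int], StackMapRecord],
                function_start: int,
                current_ip: int) -> Optional[StackMapRecord]:
    """Look up the record for a frame, given the IP recovered by the unwinder."""
    offset = current_ip - function_start  # instruction offset within the function
    return index.get((function_start, offset))


# Hypothetical values: a function at 0x1000 with one stackmap 13 bytes in.
index = build_index([StackMapRecord(0x1000, 13, ("rbx",))])
hit = find_record(index, 0x1000, 0x1000 + 13)   # IP exactly at the stackmap
miss = find_record(index, 0x1000, 0x1000 + 11)  # IP two bytes before it
```

The lookup only succeeds when the frame's IP maps exactly onto a recorded offset, which is the assumption the rest of this mail is about.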

We encountered some unforeseen issues with this design, as it is not guaranteed that the stackmap position is immediately adjacent to the entry point call. In most cases the instructions in between are just register copies that restore previous register values. This can break not only the instruction offset but also the locations, because we might try to load a value from a register before it has been restored to its original, proper value.

Here is an example in which the generated stackmap position is not directly adjacent to the entry point call.

LLVM IR:

declare i32 @A()
declare void @llvm.experimental.stackmap(i64, i32, ...)

define i32 @F()  {
entry:
 %0 = call i32 @A()
 call void (i64, i32, ...)  @llvm.experimental.stackmap(i64 0, i32 0, i32 %0)
 %1 = call i32 @A()
 ret i32 %0
}

Generated x86-64 assembly:

	.section	__TEXT,__text,regular,pure_instructions
	.macosx_version_min 10, 11
	.globl	_F
	.p2align	4, 0x90
_F:                                     ## @F
	.cfi_startproc
## BB#0:                                ## %entry
	pushq	%rbp
Ltmp0:
	.cfi_def_cfa_offset 16
Ltmp1:
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
Ltmp2:
	.cfi_def_cfa_register %rbp
	pushq	%rbx
	pushq	%rax
Ltmp3:
	.cfi_offset %rbx, -24
	callq	_A
	movl	%eax, %ebx
Ltmp4:
	callq	_A
	movl	%ebx, %eax
	addq	$8, %rsp
	popq	%rbx
	popq	%rbp
	retq
	.cfi_endproc

	.section	__LLVM_STACKMAPS,__llvm_stackmaps
__LLVM_StackMaps:
	.byte	1
	.byte	0
	.short	0
	.long	1
	.long	0
	.long	1
	.quad	_F
	.quad	24
	.quad	0
	.long	Ltmp4-_F
	.short	0
	.short	1
	.byte	1
	.byte	4
	.short	3
	.long	0
	.short	0
	.short	0
	.p2align	3

.subsections_via_symbols

The label Ltmp4, which is used to compute the instruction offset, is not right after the first call to _A: it comes after the movl %eax, %ebx that follows the call.
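For reference, the record can be decoded by hand. The Python sketch below packs the values from the listing above and parses them back, assuming the version 1 stackmap layout as I understand it from the LLVM documentation (header, function/constant/record counts, one stack-size entry, then the record). The function address 0x1000 and the offset 13 standing in for Ltmp4-_F are placeholders I made up, since the real values are only known after assembly and linking. The names in the code are hypothetical as well, and the second location byte is treated as a size only because the listing emits 4 there.

```python
import struct

# Placeholder values (assumptions): the real ones are resolved later.
F_ADDRESS = 0x1000   # stands in for `.quad _F`
LTMP4_OFFSET = 13    # stands in for `.long Ltmp4-_F`

# Location kinds as listed in the LLVM stackmap documentation.
LOCATION_KINDS = {1: "Register", 2: "Direct", 3: "Indirect",
                  4: "Constant", 5: "ConstantIndex"}

# Pack the section in the same order as the assembly directives.
section = struct.pack("<BBHIII", 1, 0, 0, 1, 0, 1)      # version, reserved, counts
section += struct.pack("<QQ", F_ADDRESS, 24)            # function address, stack size
section += struct.pack("<QIHH", 0, LTMP4_OFFSET, 0, 1)  # id, offset, flags, NumLocations
section += struct.pack("<BBHi", 1, 4, 3, 0)             # kind, size, DWARF reg, offset
section += struct.pack("<HH", 0, 0)                     # padding, NumLiveOuts
section += b"\x00" * (-len(section) % 8)                # .p2align 3


def parse_v1(data):
    """Parse a version 1 stackmap section (layout assumed, see lead-in)."""
    version, _, _, num_functions, num_constants, num_records = \
        struct.unpack_from("<BBHIII", data, 0)
    pos = 16
    functions = []
    for _ in range(num_functions):
        functions.append(struct.unpack_from("<QQ", data, pos))
        pos += 16
    pos += 8 * num_constants  # skip the large-constant table
    records = []
    for _ in range(num_records):
        patch_id, offset, _flags, num_locations = \
            struct.unpack_from("<QIHH", data, pos)
        pos += 16
        locations = []
        for _ in range(num_locations):
            kind, size, dwarf_reg, off = struct.unpack_from("<BBHi", data, pos)
            locations.append({"kind": LOCATION_KINDS[kind], "size": size,
                              "dwarf_reg": dwarf_reg, "offset": off})
            pos += 8
        _pad, num_live_outs = struct.unpack_from("<HH", data, pos)
        pos += 4 + 4 * num_live_outs  # each live-out entry is 4 bytes
        pos += -pos % 8               # records are aligned to 8 bytes
        records.append({"id": patch_id, "offset": offset,
                        "locations": locations})
    return version, functions, records


version, functions, records = parse_v1(section)
```

Decoded this way, the single location is a 4-byte value in DWARF register 3, which is %rbx on x86-64: the register written by the movl %eax, %ebx just before Ltmp4.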

As I understand it, stackmaps should be usable for extracting interesting values across a whole stack trace, but I don't see how that is possible if the stackmap is not generated for the position right after the call.

One possible alternative would be for the stackmap to reference the registers and stack slots that are used for restoring the clobbered registers. That way the stackmap would not have to rely on registers that still have to be restored.

Did I make some incorrect assumptions, or is something wrong with the IR I generate?

Thanks!

Best regards,
Daniel Mihalyi