[llvm-dev] [RFC] Introducing an explicit calling convention

Frej Drejhammar via llvm-dev llvm-dev at lists.llvm.org
Tue Jan 15 00:20:07 PST 2019


Hi All,

TLDR: Allow calling conventions to be defined on-the-fly for functions
in LLVM IR. Comments are requested on the mechanism and syntax.

Summary
=======

This is a proposal for adding a mechanism by which LLVM can be used to
generate code fragments adhering to an arbitrary calling
convention. Intended use cases are: generating code intended to be
called from the shadow of a stackmap or patchpoint; generating the
target function of a statepoint; or simply for generating a piece of
shell-code during reverse engineering or binary patching.

Motivation
==========

The LLVM assembly language provides stackmaps, patchpoints, and
statepoints, all of which provide the user with the value or storage
location of the operands given to the respective intrinsic. All three
intrinsics emit the information in a special stackmap section [3] of
the produced object file. These intrinsics are useful to the
implementer of a JIT compiler for an interpreted language [2] (the
author's use case): stackmaps can be used to incrementally extend
blocks of native code, and a statepoint can be used both as a
mechanism to call native code and as a landing pad for reentry from
native code. Other uses, such as inserting a stackmap and later
overwriting its shadow with a call to a logging function, are also
possible.
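
As a reminder of what the existing mechanism looks like, here is a
minimal sketch of a stackmap with a shadow. The function name, the ID
(7) and the shadow size (8 bytes) are arbitrary choices made for
illustration only.

```llvm
declare void @llvm.experimental.stackmap(i64, i32, ...)

define i64 @add_and_record(i64 %a, i64 %b) {
entry:
  %sum = add i64 %a, %b
  ; First operand: stackmap ID (7, arbitrary).
  ; Second operand: number of shadow bytes (8, arbitrary).
  ; Remaining operands: live values whose locations are recorded
  ; in the stackmap section.
  call void (i64, i32, ...) @llvm.experimental.stackmap(i64 7, i32 8, i64 %sum)
  ret i64 %sum
}
```

The stackmap record for ID 7 tells the runtime where %sum lives at
that point, and the shadow is the region which can later be
overwritten, for example with a call to a code fragment that expects
%sum in exactly that location.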

The information in the stackmap section can be seen as a custom
calling convention which is unique for this particular
location. Unfortunately, there is currently no way to define the
details of an LLVM calling convention dynamically, as LLVM only allows
the user to choose among a fixed set of predefined conventions.

Approach
========

This proposal adds a new calling convention called 'explicitcc', which
can be applied to void functions. A function using the explicit
calling convention requires that each element of the argument list has
a parameter attribute 'hwreg(metadata)' specifying the register from
which the argument gets its value. An 'explicitcc' function can have
an optional 'noclobber(metadata)' function attribute to tell the
compiler which registers are to be treated as callee-saved.
Additionally, a new '@llvm.experimental.retwr(...)' (standing for
return with registers) intrinsic is introduced. By giving each
parameter to retwr a hwreg attribute, it allows the 'explicitcc'
function to return to its caller with a defined register state.

Only parameters passed in registers are considered, as the
llvm.addressofreturnaddress intrinsic can be used to calculate the
location of values on the caller's stack.
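
A rough sketch of how a stack value could be reached this way is shown
below. It assumes x86-64, where the caller's stack contents start
directly above the return address slot, and it uses the typed-pointer
IR syntax and the non-overloaded form of the intrinsic current at the
time of writing. The function name and the offset of 8 bytes are
assumptions made for illustration only.

```llvm
declare i8* @llvm.addressofreturnaddress()

define i64 @first_caller_stack_slot() {
entry:
  ; Address of the slot holding this function's return address.
  %ra = call i8* @llvm.addressofreturnaddress()
  ; Assumption: on x86-64 the first value on the caller's stack sits
  ; 8 bytes above the return address slot.
  %slot = getelementptr i8, i8* %ra, i64 8
  %p = bitcast i8* %slot to i64*
  %v = load i64, i64* %p
  ret i64 %v
}
```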

Example
=======

The following is a function which exchanges the values of rcx and rdx
without clobbering rax and rbx.

define explicitcc void @example(i64 hwreg(metadata !1) %a,
                                i64 hwreg(metadata !2) %b)
                                noclobber(metadata !0) {
  call void (...) @llvm.experimental.retwr(i64 hwreg(metadata !2) %a,
                                           i64 hwreg(metadata !1) %b)
  ret void
}

!0 = !{!"rax", !"rbx"}
!1 = !{!"rcx"}
!2 = !{!"rdx"}

Open Questions
==============

Are parameter attributes the best way to encode the register
information? The metadata reference requires adding a pointer to the
ISD::ArgFlagsTy struct, thus growing it by 50%. Is this acceptable?
An alternative could be to instead have a function attribute which
points to a metadata tuple with the explicit registers.
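
To make the alternative concrete, it might look something like the
sketch below. The attribute name 'hwregs' and the convention that the
tuple lists one register per argument, in argument order, are purely
hypothetical and not part of the proposal. How the retwr operands
would be annotated under this scheme is left open, so the call is
omitted here.

```llvm
; Hypothetical syntax: 'hwregs' is an invented name for a function
; attribute that replaces the per-parameter hwreg attributes.
define explicitcc void @example(i64 %a, i64 %b)
                                hwregs(metadata !3)
                                noclobber(metadata !0) {
  ; ... body as before ...
  ret void
}

!0 = !{!"rax", !"rbx"}
; Assumed convention: one entry per argument, in argument order.
!3 = !{!"rcx", !"rdx"}
```

This keeps ISD::ArgFlagsTy unchanged, at the cost of separating each
register from the argument it belongs to.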

References
==========

[1] https://llvm.org/docs/StackMaps.html#stack-map-format

[2] https://dl.acm.org/citation.cfm?id=2633450

[3] https://llvm.org/docs/StackMaps.html#stack-map-section

Regards,

--Frej Drejhammar

