[cfe-dev] 32-bit pointers and calls from 64-bit code

John McCall via cfe-dev cfe-dev at lists.llvm.org
Tue Aug 7 14:34:43 PDT 2018


> On Aug 7, 2018, at 4:46 PM, Charles Davis via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> Oops, forgot to reply all...
> ---------- Forwarded message ----------
> From: Charles Davis <cdavis5x at gmail.com>
> Date: Tue, Aug 7, 2018 at 3:45 PM
> Subject: Re: [cfe-dev] 32-bit pointers and calls from 64-bit code
> To: John McCall <rjmccall at apple.com>
> 
> 
> 
> 
> On Tue, Aug 7, 2018 at 3:11 PM, John McCall <rjmccall at apple.com> wrote:
>> On Aug 7, 2018, at 4:06 PM, Charles Davis <cdavis5x at gmail.com> wrote:
>> On Mon, Aug 6, 2018 at 7:39 PM, John McCall <rjmccall at apple.com> wrote:
>> 
>> 
>>> On Aug 6, 2018, at 6:34 PM, John McCall via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>> 
>>> 
>>> 
>>>> On Aug 6, 2018, at 5:49 PM, Charles Davis via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>>> 
>>>> Resending to the new mailing list address...
>>>> 
>>>> ---------- Forwarded message ----------
>>>> From: Charles Davis <cdavis5x at gmail.com>
>>>> Date: Thu, Aug 2, 2018 at 5:59 PM
>>>> Subject: 32-bit pointers and calls from 64-bit code
>>>> To: Clang Developers List <cfe-dev at cs.uiuc.edu>
>>>> 
>>>> 
>>>> Hello,
>>>> 
>>>> With Apple's impending 32-bit deprecation, my new employer is having a bit of an existential crisis. We think we've found a way forward, but we need a little help from the compiler to accomplish this.
>>> 
>>> This is a pretty big project to be throwing together like this, but good luck.
>>> 
>>>> Basically, what we want to do is have code that is technically x86-64 but that can use 32-bit pointers and that can make calls to 32-bit code.
>>> 
>>> Sure, this is a relatively well-understood problem.  The best-known modern example is Linux's support for "x32".
>>> 
>>>> In a separate thread on llvmdev, I asked about LLVM changes we need for this. Here are the corresponding C language extensions we need:
>>>> (Possibly) a preprocessor macro to be defined when we're building 64-bit code to be called by 32-bit code. This code, while still technically being 64-bit, needs to look and act like 32-bit code to our clients--so we'd like to define __i386__ et al. as though we were building real 32-bit code.
>>>
>>> __i386__ generally means, well, the 32-bit Intel target; muddying this by pretending that your variant target is i386 seems like a bad decision.  Can you ask your clients to switch to __LP64__ or some other portable pointer-size macro instead?
>>>
>>>> Microsoft's __ptr32 and __ptr64 keywords--to distinguish 32-bit and 64-bit pointers.
>>>
>>> I don't know if we actually support those keywords, but conceptually they're just address spaces so it's not a problem.
>>>
>>>> A pragma to choose the default pointer size--with this, we can avoid littering our code with lots of __ptr32 keywords.
>>>
>>> Doable.  You'll want something that works like the ARC auditing pragmas and tries to force the pragma to stay bounded to a single header, I think.
>>>
>>>> Attributes for various 32-bit calling conventions. We can't use the existing ones, because they are defined not to do anything in 64-bit code. I'll probably define new ones that are just the old ones suffixed with '32' (e.g. __attribute__((cdecl32))).
>>>
>>> Why do you want this?  You're recompiling all of your code in this new mode, and the default x86_64 CC is more efficient than all of the specialized i386 CCs.  Why not just ignore the attributes?
>>>
>>>> An attribute to declare that a function pointer must be called far, and to declare the segment selector to use when calling it. We need this to be able to transition to a 32-bit code segment. I'm currently leaning towards (ab)using Microsoft's __based extension (which originally supported something like this, I believe) for this purpose.
>>>
>>> Are you really messing around with segments, or are you just trying to be able to distinguish 32-bit from 64-bit function pointers?
>> 
>> Wait, I just put a few things together.  Are you planning to perform a far call to existing, i386-compiled code in the same process?
>> Well... Not exactly.
>> 
>> We plan to use the Hypervisor framework to create a VM with 32-bit and 64-bit code segments. We'll then make the entirety of the host process's memory visible inside the guest--the intent is to reduce the number of expensive VM exits. But inside this VM... yes, we intend to make far calls directly to existing i386 code. (We also intend to get called by existing i386 code.) We're trying not to implement an entire OS for this purpose--we want to use the host OS to support as much as we can, which is why we're mapping the host process memory into the VM's physical memory.
>> That is not going to work without operating system support, and I strongly doubt that macOS has that support, probably not today, and especially not after i386 support is removed.
>> I thought so too, but it turns out you don't need a call gate to do this. All you need is the 32-bit segment selector--and on most OSes, that almost never changes, since it's an entry in the GDT. We're well aware that macOS won't even have a 32-bit CS in 10.15--that's why we want to use the Hypervisor framework.
> 
> I don't know anything about the Hypervisor framework, but what happens if you take an interrupt while in 32-bit code?
> There shouldn't be any virtual devices to generate external interrupts. Exceptions and syscalls (the other two primary causes of interrupts) will be handled with a VM exit.

I think I understand the architecture here a little better, thanks.  So you have a guest VM running in a host process, and within the guest you have a minimal kernel and a user process that includes both some uncontrolled i386-compiled payload code and some amount of VM-aware support code that you do control.  Since you want the guest to map the host process's full address space, you need at least some of the guest support code to be 64-bit, but it also needs to be able to interoperate directly with the payload code; hence the special target which generates 64-bit code but can perform layout, accesses, etc. with 32-bit data pointers and which can also perform a far call when calling a 32-bit function pointer.
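
To make the "layout and accesses with 32-bit data pointers" part concrete, here is a rough sketch in plain C of what such a pointer amounts to when emulated by hand in 64-bit code.  All of the names (ptr32_t, ptr32_from, ptr32_to, payload_node) are hypothetical and made up for illustration; nothing here is an existing Clang feature, and a real implementation would have the compiler do the narrowing and widening implicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit data pointer inside 64-bit code. */
typedef uint32_t ptr32_t;

/* Narrow a native pointer to 32 bits.  Returns 1 on success and 0 if the
   address does not fit below 4 GiB (in which case *out is left untouched). */
static int ptr32_from(const void *p, ptr32_t *out) {
    uintptr_t bits = (uintptr_t)p;
    if (bits > UINT32_MAX)
        return 0;
    *out = (ptr32_t)bits;
    return 1;
}

/* Widen a 32-bit pointer back to a native pointer by zero-extension. */
static void *ptr32_to(ptr32_t p) {
    return (void *)(uintptr_t)p;
}

/* A list node laid out the way i386-compiled code would see it:
   a 4-byte pointer followed by a 4-byte integer, 8 bytes in total. */
struct payload_node {
    ptr32_t next;   /* 32-bit pointer to the next node, 0 for end of list */
    int32_t value;
};
```

The point of the struct is only that its size and field offsets match what the 32-bit payload expects, even though the code manipulating it is compiled as x86-64.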

Address spaces definitely seem like the right language tool for this.
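
As a rough illustration of the address-space approach, Clang's existing address_space attribute already lets you tag a pointee type so that pointers into the payload are a distinct type from ordinary pointers.  The sketch below assumes address space number 1, which is an arbitrary choice for the example, and the names PAYLOAD_AS, payload_int_ptr, as_payload and payload_load are hypothetical.  It does not change the pointer width; that part would need the target work discussed in this thread.

```c
#include <assert.h>

/* Assumption: address space 1 is picked arbitrarily for this example.
   On compilers other than Clang the macro expands to nothing, so the
   code still builds but loses the type distinction. */
#if defined(__clang__)
#define PAYLOAD_AS __attribute__((address_space(1)))
#else
#define PAYLOAD_AS
#endif

/* Pointer to an int that lives in the (hypothetical) payload address space. */
typedef int PAYLOAD_AS *payload_int_ptr;

/* Crossing between address spaces takes an explicit cast; as far as I know
   Clang rejects the implicit conversion. */
static payload_int_ptr as_payload(int *p) {
    return (payload_int_ptr)p;
}

/* Load through a payload pointer. */
static int payload_load(payload_int_ptr p) {
    return *p;
}
```

The useful property is that mixing up the two kinds of pointer becomes a type error instead of a silent bug, which is roughly what __ptr32 and __ptr64 would give you as well.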

John.