[PATCH] Use ARM-style representation for C++ method pointers under PNaCl

Mon Jul 15 12:49:09 PDT 2013

On 8 July 2013 19:29, John McCall <rjmccall at apple.com> wrote:

> On Jul 8, 2013, at 7:16 PM, Mark Seaborn <mseaborn at chromium.org> wrote:
>
> On 8 July 2013 15:42, John McCall <rjmccall at apple.com> wrote:
>
>> On Jul 8, 2013, at 3:38 PM, Mark Seaborn <mseaborn at chromium.org> wrote:
>>
>> On 8 July 2013 13:51, John McCall <rjmccall at apple.com> wrote:
>>
>>> On Jul 8, 2013, at 12:56 PM, Mark Seaborn <mseaborn at chromium.org> wrote:
>>>
>>> On 19 June 2013 22:43, Mark Seaborn <mseaborn at chromium.org> wrote:
>>>
>>>> On 19 June 2013 15:20, John McCall <rjmccall at apple.com> wrote:
>>>>
>>>>> On Jun 19, 2013, at 3:17 PM, Mark Seaborn <mseaborn at chromium.org>
>>>>> wrote:
>>>>>
>>>>> On 19 June 2013 13:17, John McCall <rjmccall at apple.com> wrote:
>>>>>
>>>>>> On Jun 19, 2013, at 1:05 PM, Mark Seaborn <mseaborn at chromium.org>
>>>>>> wrote:
>>>>>>
>>>>>> On 19 June 2013 13:01, Mark Seaborn <mseaborn at chromium.org> wrote:
>>>>>>
>>>>>>> Use ARM-style representation for C++ method pointers under PNaCl
>>>>>>>
>>>>>>> Before this change, Clang uses the x86 representation for C++ method
>>>>>>> pointers when generating code for PNaCl.  However, the resulting code
>>>>>>> will assume that function pointers are 0 mod 2.  This assumption is
>>>>>>> not safe for PNaCl, where function pointers could have any value
>>>>>>> (especially in future sandboxing models).
>>>>>>>
>>>>>>> So, switch to using the ARM representation for PNaCl code, which
>>>>>>> makes
>>>>>>> no assumptions about the alignment of function pointers.
>>>>>>>
>>>>>>> See: https://code.google.com/p/nativeclient/issues/detail?id=3450
>>>>>>>
>>>>>>
>>>>>> Oops, I meant to send this to cfe-commits rather than llvm-commits.
>>>>>>
>>>>>>
>>>>>> I do not think you should just unconditionally opt in to random ARM
>>>>>> behavior.  In particular, ARM uses 32-bit guard variables because
>>>>>> that's
>>>>>> the size of a pointer on ARM;  PNaCl needs to be able to efficiently
>>>>>> support 64-bit platforms as well.
>>>>>>
>>>>>
>>>>> The code does always use 64-bit guard variables on 64-bit systems.  It
>>>>> does this:
>>>>>
>>>>>     // Guard variables are 64 bits in the generic ABI and size width
>>>>> on ARM
>>>>>     // (i.e. 32-bit on AArch32, 64-bit on AArch64).
>>>>>     guardTy = (IsARM ? CGF.SizeTy : CGF.Int64Ty);
>>>>>
>>>>> Having said that, PNaCl is 32-bit-only:  PNaCl programs assume a
>>>>> 32-bit address space.  We don't support 64-bit pointers in PNaCl.  In
>>>>> Clang, targeting PNaCl is identified by "le32" being in the triple, and I
>>>>> assume there's no way to get 64-bit pointers with "le32". :-)
>>>>>
>>>>>
>>>>> Interesting, okay.
>>>>>
>>>>> I still do not want PNaCl to claim to be ARM.  Abstract the code so
>>>>> that
>>>>> you can opt into the specific behaviors you want without pretending to
>>>>> be ARM.
>>>>>
>>>>
>>>> OK, I have changed the patch to split IsARM into two separate fields.
>>>> I called the fields UseARMMethodPtrABI and UseARMGuardVarABI out of a lack
>>>> of imagination.
>>>>
>>>> I've changed the patch to use the non-ARM guard variable ABI for
>>>> PNaCl.  Having looked at the code more carefully, I see it's inlining a
>>>> different guard variable check on ARM, which we don't necessarily want to
>>>> use for PNaCl; it's not just a different guard var size.
>>>>
>>>
>>> I've updated the patch to also add a test for C++ guard variable usage
>>> under PNaCl.  Is this OK to commit?
>>>
>>>
>>> +  bool UseARMMethodPtrABI;
>>> +  bool UseARMGuardVarABI;
>>>
>>> Good enough.
>>>
>>> +    if (CGM.getContext().getTargetInfo().getTriple().getArch()
>>> +        == llvm::Triple::le32) {
>>>
>>> I think testing the OS would be more reasonable here.
>>>
>>
>> Actually, we do want to test the architecture field here.  If we use the
>> triple "i686-unknown-nacl", we want to get the x86 ABI for C++ method
>> pointers, so that the generated code is interoperable with the NaCl GCC
>> toolchain.  If Clang tested the OS field here, that would break this use
>> case.
>>
>>
>> So, PNaCl cannot be used to generate interoperable i386 code; you always
>> have to emit a separate i386 slice?
>>
>
> That's right.  At least, interoperability with native ABIs is limited
> unless you pass an architecture-specific target triple to Clang when
> compiling.
>
>
> Okay, then I agree that you need to at least check the arch.
>
> I'm not familiar with the LLVM history, but the comment in Triple.h
> suggests that there are multiple platforms on le32.  So either (1) check
> specifically for your platform, which is the combination of le32 and nacl,
> or (2) convince me (shouldn't take much) that all the platforms either want
> this or explicitly don't care.
>

Yes, "le32" covers PNaCl and Emscripten.  The comment in Triple.h is:

    le32,    // le32: generic little-endian 32-bit CPU (PNaCl / Emscripten)

I think both platforms want this change.  Emscripten contains a workaround
for making C++ method pointers work when Clang generates code using the
Itanium/x86 ABI.  Emscripten allocates every function a small integer ID,
which is an index into a function table.  Currently, Emscripten ensures
that every function gets an even-numbered ID by inserting dummy entries
into the function table.  If my Clang change were committed, and if it were
enabled for Emscripten, Emscripten could remove this workaround.

Specifically, the workaround is implemented by the following line in
Emscripten's modules.js:
        this.nextIndex += 2; // Need to have indexes be even numbers, see
|polymorph| test
(https://github.com/kripken/emscripten/blob/master/src/modules.js)

That said, I don't work on Emscripten -- CC'ing Emscripten maintainer Alon
Zakai in case he wants to comment.

Cheers,
Mark
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130715/265b1f8f/attachment.html>