[llvm-dev] Is it possible to execute Objective-C code via LLVM JIT?

Stanislav Pankevich via llvm-dev llvm-dev at lists.llvm.org
Mon Sep 3 12:12:47 PDT 2018


Hi Lang,

Since my last email in May, I have done more testing of the solution
based on objc_readClassPair/objc_registerClassPair, and I can confirm
that I have not encountered any major issues related to Objective-C or
Swift. We have also connected my prototype code to our mutation
testing tool and obtained the first mutation testing reports. We
tested it on Swift code that contains Objective-C classes (a few small
projects picked at random from GitHub, for example
raywenderlich/swift-algorithm-club).

It turns out that I no longer have enough spare time to invest in
this Objective-C/Swift branch of our work on Mull, the mutation
testing tool. I have written a post in which I share what I learned
while trying to make LLVM JIT run Objective-C and Swift. I have also
tried to make it friendly to beginners, in case anyone wants to build
something on top of this prototype. The post is called "LLVM JIT,
Objective-C, and Swift on macOS: knowledge dump" and can be found
here: https://stanislaw.github.io/2018/09/03/llvm-jit-objc-and-swift-knowledge-dump.html.

Another part of the story about Mull, Swift, and Objective-C is that
we now face another task, one that is not related to LLVM JIT but
requires good knowledge of the Swift compiler. I have written a
separate post about this issue and about our progress on mutation
testing for Swift: "Mutation testing for Swift with Mull: how it could
work. Looking for contributors.",
https://stanislaw.github.io/2018/09/03/mull-and-swift-how-it-almost-works.html.

It is sad that this ObjC code for LLVM JIT is still only a prototype,
but I hope that my post helps someone make further progress
(be it on mutation testing or on any other tool that requires LLVM JIT
to run Swift/ObjC).

On a positive note, Alex has been working on great performance
improvements in Mull: instead of recompiling whole modules with
mutated LLVM IR, mutated functions are now precompiled along with the
non-mutated functions. There is no overhead for recompiling and
relinking mutated clones of the program, and Mull now runs on C/C++
programs faster than ever before.

Thanks for your attention,
Stanislav

On Sun, May 13, 2018 at 8:38 PM Stanislav Pankevich
<s.pankevich at gmail.com> wrote:
>
> Hi Lang,
>
> Thank you very much for the answer. It turns out that, within my code,
> objc_readClassPair works much better than my previous attempts to use
> allocateClassPair/registerClassPair. My conclusion is that
> allocate/registerClassPair could work for Objective-C code but
> definitely not for Swift code.
>
> With objc_readClassPair it is all much better. I use my own XCTest
> runner and I am able to run various parts of Swift projects from
> GitHub without any issues: SwiftGraph,
> raywenderlich/swift-algorithm-club, VectorMath. They all have
> XCTest-based test suites, so I am running their code using their
> tests inside the JIT.
>
> The only minor issue that I am having is a funny warning familiar to
> every Mac/iOS developer, the one printed when two classes with the
> same name are loaded into the runtime:
>
> objc[7620]: Class TestCase is implemented in both ?? (0x105ff8270) and
> ?? (0x105ff8270). One of the two will be used. Which one is undefined.
>
> However, looking at the reason for this warning in libobjc, it seems
> to me that it just does not implement the slightly different case of
> the classes registered via objc_readClassPair so it is harmless in
> this custom case and can be ignored (for now). I have verified that
> only a single instance of each class is registered.
>
> I am still missing many test cases, so I want to ask if you can point
> me to the non-trivial test cases that you mentioned. I am happy to
> try my code on those to see if I hit the same issues as you did.
>
> Meanwhile I am going to add more complex test cases and run some more
> real-world Swift projects.
>
> Thanks again for the pointer,
>
> Stanislav
>
> On Fri, May 4, 2018 at 1:44 AM, Lang Hames <lhames at gmail.com> wrote:
> > Hi Stanislav,
> >
> > Sorry -- I am not much help here. I would like to get ObjC runtime support
> > for the JIT, but have not had time to look into it closely. When I last
> > tested the idea (a couple of years ago now) we used selector registration
> > and objc_readClassPair to get basic test cases working as you have, but ran
> > into (possibly similar) failures on non-trivial test cases. I think the
> > interpretation (and possibly the layout?) of some runtime metadata varies
> > between relocatable objects and linked images. The advice we got at the time
> > was that we would be better off parsing the metadata in the object file and
> > calling the runtime registration methods in objc/runtime.h, rather than
> > fixing up the metadata ourselves. That might be worth investigating.
> >
> > If you are not worried about performance at the moment, another option may
> > be to ditch RuntimeDyld entirely. ORC is now decoupled from RuntimeDyld, so
> > you are free to substitute your own linking layer. Last year I hacked up an
> > alternative linking layer that dumps objects to disk and links them into
> > dylibs, then dlopens them. The performance is obviously terrible, but this
> > punts all the ObjC/Swift metadata parsing to ld64 and dyld (and you get
> > debug info for free!) so it is handy for prototyping. Let me know if you
> > would like me to share the code for that.
> >
> > Cheers,
> > Lang.
> >
> > On Fri, Apr 6, 2018 at 12:19 PM, Stanislav Pankevich <s.pankevich at gmail.com>
> > wrote:
> >>
> >> Hi again,
> >>
> > >> I tried to follow David's suggestion to take a step back and look
> > >> into codegen instead of hacking on RuntimeDyld, but I quickly
> > >> realized that I don't understand what exactly needs to be done to
> > >> fully register the code with the Objective-C runtime. I decided to
> > >> iterate on the JIT code again and found that I can hook into
> > >> SectionMemoryManager by subclassing it and overriding its
> > >> allocateDataSection method:
> >>
> > >> 1) I collect pointers to the objc-related sections for which memory
> > >> is allocated. Before the SectionMemoryManager::finalizeMemory()
> > >> method is called, I register the ObjC runtime classes.
> > >> 2) I iterate over the __objc_selrefs sections and fix up the
> > >> selectors. This does fix the original crash of this thread.
> > >> 3) I iterate over the __objc_classlist sections and register the new
> > >> classes using the objc_allocateClassPair function, add the properties
> > >> and ivars to these new classes, and run objc_registerClassPair to
> > >> complete the registration. (I still have to implement the protocol
> > >> registration.)
> > >> 4) I iterate over __objc_classrefs and __objc_superrefs and fix up the
> > >> class pointers with the new classes created at step 3.
> > >> 5) I iterate over __objc_classlist and fix up its classes with the new
> > >> classes created at step 3.
> >>
> > >> Very basic Objective-C code now seems to work without any issues.
> > >> However, when I switch to mixed Objective-C/Swift code I start getting
> > >> crashes that I don't fully understand. In particular, I am trying
> > >> to run a simple XCTestCase test written in Swift, and my code crashes
> > >> when I access a property of this ObjC-based Swift class.
> >>
> >> My questions are:
> >>
> >> 1) Should I do anything else to properly register Objective-C besides
> >> what I am doing at steps 1-5?
> >>
> > >> 2) Do I have to do anything additional to run Swift code with
> > >> LLVM JIT? One of the attached files shows that the object I am working
> > >> with has sections like __swift3_* or __swift2_*. Should I do anything
> > >> about these sections, or are they irrelevant?
> >>
> >> 3) Anything that I am missing / should know about if I want to run
> >> mixed Objective-C/Swift code via LLVM JIT?
> >>
> >> My very hacky code is located here [1]. If needed I can also share the
> >> code of my latest attempts to run the combined ObjC/Swift code.
> >>
> >>
> >> https://github.com/mull-project/mull-jit-lab/tree/master/lab-jit-objc/llvm-jit-lab/src
> >>
> > >> Any help is very much appreciated.
> >>
> >> Stanislav
> >>
> >>
> >> On Thu, Feb 15, 2018 at 2:33 AM, Lang Hames <lhames at gmail.com> wrote:
> >> > Hi David, Stanislav,
> >> >
> >> > Sorry for the delayed reply.
> >> >
> > >> > Short version: There hasn't been any progress on this just yet, as I
> > >> > have been busy with an overhaul of the underlying ORC APIs.
> >> >
> > >> >> 1) Hack up something in RuntimeDyldMachO to handle the data structures
> > >> >> currently generated by clang.  This is fragile, because the interface
> > >> >> between the compiler and the runtime is not documented, and is unique
> > >> >> to each runtime.  This code will be different on i386 and ARM, for
> > >> >> example.
> > >> >>
> > >> >> 2) Create a new CGObjCRuntime subclass that creates a module init
> > >> >> function that constructs all of the classes using the public APIs, by
> > >> >> adding something like -fobjc-runtime=jit to the clang flags.  This is
> > >> >> not particularly difficult and means that the same code can be used
> > >> >> with any Objective-C runtime.
> >> >
> >> >
> > >> > (1) is the preferred long-term solution as we want to minimize
> > >> > differences between generated code in the JIT'd and non-JIT'd cases.
> > >> > (2) seems like a reasonable interim solution if it is easier to
> > >> > implement.
> >> >
> >> > Steven -- what is the status of the ObjC parsing code these days?
> >> >
> >> > -- Lang.
> >> >
> >> >
> >> > On Wed, Feb 14, 2018 at 3:08 AM, David Chisnall via llvm-dev
> >> > <llvm-dev at lists.llvm.org> wrote:
> >> >>
> >> >>
> >> >> > On 13 Feb 2018, at 17:42, Stanislav Pankevich <s.pankevich at gmail.com>
> >> >> > wrote:
> >> >> >
> >> >> > On Tue, Feb 13, 2018 at 12:18 PM, David Chisnall
> >> >> > <David.Chisnall at cl.cam.ac.uk> wrote:
> >> >> >> On 12 Feb 2018, at 22:31, Stanislav Pankevich via llvm-dev
> >> >> >> <llvm-dev at lists.llvm.org> wrote:
> >> >> >>>
> > >> >> >>> Specifically I explored the latest objc4-723 from Apple Open
> > >> >> >>> Source and it looks like all of the APIs that allow the
> > >> >> >>> registration of Objective-C classes, selectors, etc. are all
> > >> >> >>> very private.
> >> >> >>
> > >> >> >> The Objective-C runtime provides public APIs for doing all of this.
> > >> >> >> They’re even documented.  They are also more or less standard and so
> > >> >> >> work with all runtime implementations, not just the Apple one.  I was
> > >> >> >> using them for JIT’d code on macOS and FreeBSD 10 years ago.
> >> >> >
> > >> >> > Which methods are you referring to? For class registration, for
> > >> >> > example, do you mean objc_allocateClassPair/objc_registerClassPair
> > >> >> > or something else?
> >> >>
> > >> >> Yes, that set of APIs.  They provide an interface for building
> > >> >> classes, protocols, and so on.
> >> >>
> > >> >> >>> One year ago you said you could help anyone interested in working
> > >> >> >>> on this. Let me check here again as a volunteer (if this work can
> > >> >> >>> ever be accomplished by someone outside Apple).
> >> >> >>
> > >> >> >> As I said in the earlier thread, the best way of doing this is to
> > >> >> >> add a new subclass of CGObjCRuntime that generates the code using
> > >> >> >> the public APIs.
> >> >> >
> > >> >> > Let me get this right. What does clang::CodeGen::CGObjCRuntime have
> > >> >> > to do with this? My understanding of Lang's hint was that one has to
> > >> >> > extend LLVM classes like RuntimeDyldMachO to parse Mach-O, find
> > >> >> > classes, selectors, categories, etc. and register them all manually.
> > >> >> > Are you saying that something has to be added to CodeGen/*?
> >> >>
> >> >> You have two options:
> >> >>
> > >> >> 1) Hack up something in RuntimeDyldMachO to handle the data structures
> > >> >> currently generated by clang.  This is fragile, because the interface
> > >> >> between the compiler and the runtime is not documented, and is unique
> > >> >> to each runtime.  This code will be different on i386 and ARM, for
> > >> >> example.
> > >> >>
> > >> >> 2) Create a new CGObjCRuntime subclass that creates a module init
> > >> >> function that constructs all of the classes using the public APIs, by
> > >> >> adding something like -fobjc-runtime=jit to the clang flags.  This is
> > >> >> not particularly difficult and means that the same code can be used
> > >> >> with any Objective-C runtime.
> >> >>
> > >> >> >> If you’re running in the same process as the JIT, you could register
> > >> >> >> the selectors in the host environment and just inject the values as
> > >> >> >> symbols (this is what I did).  I’d be happy to help out someone who
> > >> >> >> wants to do this.
> >> >> >
> >> >> > It would be nice to get this working without embedding any of
> >> >> > Objective-C to the host process this is
> >> >>
> >> >> It’s an optimisation, not a compulsory part of the process.
> >> >>
> >> >> > why I am particularly
> >> >> > interested in knowing how to do the work that objc4 does in the
> >> >> > methods such as: objc4/_objc_init, objc4/map_images_nolock and
> >> >> > objc4/_read_images.
> >> >> >
> > >> >> > My understanding of the goal is to get the lli example from this
> > >> >> > thread working:
> >> >> >
> >> >> >
> >> >> > https://stackoverflow.com/questions/10375324/all-selectors-unrecognised-when-invoking-objective-c-methods-using-the-llvm-exec.
> >> >> >
> > >> >> > I would be happy to get a hint on which functions of the Objective-C
> > >> >> > runtime's public API I should use to get that simple example working
> > >> >> > in a quick and dirty way.
> >> >>
> > >> >> You seem to have decided that you want to use unmodified IR from a
> > >> >> specific version of Apple's Objective-C implementation.  I can’t help
> > >> >> you there.
> >> >>
> >> >> David
> >> >>
> >> >> _______________________________________________
> >> >> LLVM Developers mailing list
> >> >> llvm-dev at lists.llvm.org
> >> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >> >
> >> >
> >
> >

