[cfe-dev] New diff

Tue Jul 1 06:14:47 PDT 2008

Okay, I think this all makes sense.  I'll add back in the Selector and  
CGM dependency, and resubmit the diff.  It will take me a little while  
to refactor and test it with both codebases, but I hope to have  
something for you this week (the code for typed selectors isn't dead,  
it's just resting.  Both the GNU and Étoilé runtimes use these and can  
easily - or do already - support errors on lookups with incorrectly  
typed selectors.  The GNUstep implementation of Distributed Objects is  
also faster if typed selectors are used, since it avoids the extra  
call to -methodSignatureForSelector: which can trigger a dozen or so  
message sends to look up information which can be provided by the  
compiler).

David

On 1 Jul 2008, at 06:36, Chris Lattner wrote:

> On Jun 30, 2008, at 6:43 AM, David Chisnall wrote:
>> On 29 Jun 2008, at 00:16, Chris Lattner wrote:
>>> <moving this to cfe-dev, which should have been cc'd the whole time>
>>>> Here's a new version.  It pulls out most of the cases where the  
>>>> value from an llvm::Constant is taken in CodeGen and passes in  
>>>> the initialisers instead.
>>>
>>> Now you're copying strings around all over the place for no  
>>> reason.  Converting things to a vector<std::string> does a bunch  
>>> of string copies.  Why do we want this?
>>
>> Ideally I wouldn't.  I'd want it to just take references, but since  
>> Selector returns a string, rather than a pointer to a string, a  
>> copy is required, and I made the others strings for consistency.
>
> Why not pass the selector around?  It is exactly one word and is  
> cheap to copy.  It is perfect for this sort of interface.
>
>> Selector objects are not an adequate abstraction here because they  
>> do not contain type information,
>
> Neither do strings.
>
>> which we can extract from  the AST, and they are not really the  
>> right thing to pass in semantically since it is the method name,  
>> not the selector, that we want (the two are often used  
>> interchangeably, but are not the same).
>
> Why not?
>
>>>> I've also changed it so that the Selector passed to the the  
>>>> methods for generating message sends is a selector, rather than  
>>>> the name of a selector.  This should be previously obtained by  
>>>> GetSelector.  There are now two versions of this, one taking a  
>>>> pair of char*s and one taking a pair of llvm::Value*s as arguments.
>>>
>>> I don't understand where you're going with this.  You've taken all  
>>> the nice simple APIs that take Selectors and pass Value*'s  
>>> around.  You are inverting the logic of the code generator to  
>>> avoid passing in clang data structures.
>>
>> This part of the change is definitely cleaner.  The runtime  
>> interface now more closely mirrors how message dispatch works (as a  
>> two stage process, getting the selector and sending the message).   
>> It also allows as much type information to be provided to the  
>> runtime as is available.  Since clang knows the types of the  
>> arguments being passed, it can pass this when looking up the  
>> selector (it doesn't yet), without changing the message send  
>> method.  This will aid debugging Objective-C code in the future,  
>> since a runtime exception can be thrown if a method is called with  
>> arguments of the wrong type: this is not always possible to detect  
>> at compile time, and currently causes a corrupt stack and is a  
>> colossal pain to trace.
>
> Is this something supported by existing ObjC runtimes or some new  
> feature you are prototyping?  I don't see how this relates to the  
> Selector vs string issue, since neither provides type info.
>
>>> Further, I don't see why the ObjC codegen stuff shouldn't pull in  
>>> CodeGenModule.  It seems that you're again inverting control flow  
>>> here to eliminate a very reasonable dependency.  Because of this,  
>>> you've reinvented things (like CGObjCGNU::MakeConstantString), and  
>>> introduced problems (e.g. MakeConstantString doesn't unique the  
>>> strings like CGM's does).
>>
>> I would be more than happy to refactor this to reuse existing code  
>> if it can be done in a way that didn't introduce direct  
>> dependencies on clang.  I would propose creating a delegate object  
>> that can be passed in, which exposes any methods in CGM that are  
>> needed and can be replaced by a different delegate in other  
>> implementations, if this would be acceptable.
>
> Not acceptable.  Why not just have your other implementations  
> provide a corresponding version of CGM and Selector that they can  
> pass around?
>
>
>>> I understand (now) that you are trying to reuse the clang codegen  
>>> in the context of your smalltalk implementation.
>>
>> And in other compilers.  The Nu team recently approached me to  
>> discuss using the code for their object model (currently they  
>> target the Apple runtime directly), and others are planning on  
>> using it for JavaScript and Io ports.
>>
>> I have found bugs in the clang code directly as a result of using  
>> it in other work, and I consider reuse to be an important way of  
>> testing the code.  I would rather not have to maintain a portable  
>> and a clang version of this code, since it would mean that bug  
>> fixes would take much longer to make their way into clang.
>
> That is very cool and I don't want to lose that.  However, clang  
> maintainability and performance is my #1 prio, so it trumps  
> reusability of the code in other contexts.  I think that we really  
> just need to find the right way to factor this, I think we can come  
> to a design that fits both purposes reasonably well.
>
>>> While this is an interesting goal, this is *not* an acceptable  
>>> reason to make the clang ObjC codegen code inefficient and weird.
>>
>> Weird is highly subjective - I consider that the code now  
>> corresponds closely to the object model semantics, and so don't  
>> find it weird.  Inefficient is probably true, and I would welcome  
>> any constructive suggestions for optimisation.
>
> You're right and I was being overly critical here.  The thing I  
> really didn't like about the old design is that you emitted dead  
> strings to LLVM IR and read them back out on the "other side" of the  
> boundary.  Passing std::strings around *is* a step up from that, but  
> it still isn't as efficient as passing around Selectors.
>
>>> I'm somewhat ok with you adding codepaths that are not used by  
>>> clang (and are thus dead) as long as they are not huge, but making  
>>> codepaths clang *does* use inefficient and inelegant is not  
>>> acceptable.
>>
>> I have not written anything in this file which is not intended for  
>> use in compiling Objective-C.  Some parts are not yet connected to  
>> implementations, however anything intended for other languages I  
>> have kept in separate files and not submitted.  The only code paths  
>> that are dead are ones where I have not yet written the calling code.
>
> There was code in the file for handling "typed selectors" which was  
> dead.
>
> -Chris
>