[llvm] r209386 - Override runOnMachineFunction for ARMISelDAGToDAG so that we can

Bob Wilson bob.wilson at apple.com
Thu May 22 21:54:00 PDT 2014


On May 22, 2014, at 2:25 PM, Jim Grosbach <grosbach at apple.com> wrote:

> 
> On May 22, 2014, at 1:51 PM, Bob Wilson <bob.wilson at apple.com> wrote:
> 
>> 
>> On May 22, 2014, at 1:07 PM, Eric Christopher <echristo at gmail.com> wrote:
>> 
>>> On Thu, May 22, 2014 at 1:00 PM, Bob Wilson <bob.wilson at apple.com> wrote:
>>>> 
>>>> On May 22, 2014, at 12:48 PM, Eric Christopher <echristo at gmail.com> wrote:
>>>> 
>>>>> On Thu, May 22, 2014 at 12:46 PM, Bob Wilson <bob.wilson at apple.com> wrote:
>>>>>> 
>>>>>> On May 22, 2014, at 11:29 AM, Tim Northover <t.p.northover at gmail.com> wrote:
>>>>>> 
>>>>>>>> Preparation for something bigger, basically the CGContext stuff from
>>>>>>>> another thread. The basic idea comes down to being able to link
>>>>>>>> together modules (or include in a single module) functions which
>>>>>>>> target different subtargets/subtarget features and still be able to
>>>>>>>> codegen and optimize them correctly. It's similar to some of the
>>>>>>>> function specific optimization that happened recently.
>>>>>>> 
>>>>>>> Oh, excellent! Looking forward to seeing more of that.
>>>>>> 
>>>>>> I’m concerned that this is the wrong approach. Especially as we start thinking about parallel compilation, I really think we need to be explicit about the context. We ought to be moving toward having runOnMachineFunction take a CGContext as an argument, with all the target information and other context accessible via the CGContext.  This patch seems to be further entrenching the current approach of having a single global target that is specified implicitly, not explicitly.
>>>>>> 
>>>>>> If I’ve missed some discussions that explain how this fits into the bigger picture of what we need to do, can you point me to them? If those discussions haven’t happened yet, this would be a good time to start.
>>>>> 
>>>>> This happened in the CGContext thread.
>>>>> 
>>>>> And yes, I agree, however, unless you want parallel pass managers then
>>>>> you'll _still_ need to do this. Also you still don't want to globally
>>>>> set what's legal based on a single target machine.
>>>>> 
>>>>> This is all scaffolding to make this work and is incremental.
>>>> 
>>>> I have not seen any discussion at all about how to move toward explicit use of CGContext. I just re-read the CGContext thread and didn’t see any discussion of that at all.
>>>> 
>>>> I understand we may need some incremental changes now, and that’s fine as long as we have a plan for where we’re trying to go in the longer term.
>>>> 
>>>> I definitely agree that we should not set what is legal based on a single target machine.  That is exactly what I’m concerned about. We need to get away from the whole idea that there is a single target machine. Your change here seems to be perpetuating that instead of fixing it.
>>> 
>>> There's "single target machine" as in X86, and "single target machine"
>>> as in Haswell. Right now I'm avoiding the issue of mixing ARM and X86 in
>>> the same module and assuming a single TM instance with different
>>> versions of the subtarget.
>> 
>> Mixing ARM and x86 (or more realistically, ARM and Thumb) isn’t a priority but it would be good to do things in a way that will move us in that direction.
> 
> Mixing ARM and Thumb is a completely different beast than mixing ARM and X86. The former are expressed as Subtarget features. The latter are completely different targets. Conflating those two use cases in any way beyond the conceptual is a mistake.

This is really a tangent to my main point, but Thumb goes way beyond other subtarget features. It is treated as a completely separate architecture in the target triple and supporting mixing of ARM and Thumb will require a lot of the same work as mixing completely unrelated targets like x86 and ARM. That said, yes, you are right that those are different cases.

> 
>> 
>>> This will sit at the higher level, with each individual CGContext
>>> controlling the subtarget features for an individual function.
>>> Anything that needs the subtarget can ask for a CGContext/Subtarget
>>> from the target machine without the churn necessary to update all
>>> callers of passes, and the correct one will be returned to it.
>> 
>> This seems backwards to me. You shouldn’t ask the target for a CGContext. It should be the other way around.
> 
> Seems an implementation detail to me, honestly. There will be, somewhere, a collection of contexts. Each codegen pass will want to say “which one should I use for this function?” That can be done either as an explicit parameter in the runOnMachineFunction() (and friends) APIs, or it can be a getter method to obtain the right information. Either way, there will be a collection of these contexts with a mapping from functions to contexts that gets queried. It’s just a question of whether the pass manager does the query and passes it in directly or the pass itself is responsible for asking for the information.

I think it’s much more fundamental than that. If we get the right abstraction here, it will make a lot of things easier in the future. If we get it wrong, we’ll at least have a lot of unnecessary churn in the code to rewrite a bunch of stuff.

For example, if the target and subtarget are recorded as part of the CGContext, and if we start migrating all of code-gen to access the target info via the CGContext, then we will be much closer to eventually supporting mixing different targets, even if in the short term we continue to have a constraint that all the CGContexts must have the same target.

We should also view CGContext more broadly than just a way to solve the immediate problem of compiling for different subtargets. We’ve got a bunch of code-gen state scattered all over the place, and we have to pass various data structures all over the backend. Wouldn’t it be easier if we could always get at that state via CGContext? (Thanks to Andy for suggesting that to me earlier.) And, if we’re going to tackle parallel compilation of different functions, I expect there will be some context information that each thread needs to keep separate. CGContext would be the natural place to stash that.
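
To make that a bit more concrete, here is a very rough sketch of the sort of thing I am imagining. None of these names exist in the tree; they are made-up stand-ins for illustration, not a proposal for the actual API:

// Rough sketch only; all names here are hypothetical, not existing LLVM classes.
#include <memory>
#include <string>
#include <unordered_map>

namespace sketch {

// Stand-ins for real codegen classes.
struct TargetSubtargetInfo { std::string FeatureString; };
struct MachineFrameInfo {};
struct Function { std::string Name; };

// One CGContext per function: the subtarget plus whatever other
// per-function codegen state we currently scatter around the backend.
class CGContext {
public:
  explicit CGContext(const TargetSubtargetInfo &STI) : STI(STI) {}

  const TargetSubtargetInfo &getSubtargetInfo() const { return STI; }
  MachineFrameInfo &getFrameInfo() { return FrameInfo; }

private:
  const TargetSubtargetInfo &STI; // uniqued, owned elsewhere
  MachineFrameInfo FrameInfo;     // example of other per-function state
};

// Somewhere there is a mapping from functions to their contexts; a
// parallel codegen driver could hand each worker its own CGContext.
class CGContextMap {
public:
  CGContext &getOrCreate(const Function &F, const TargetSubtargetInfo &STI) {
    auto It = Contexts.find(F.Name);
    if (It == Contexts.end())
      It = Contexts.emplace(F.Name, std::make_unique<CGContext>(STI)).first;
    return *It->second;
  }

private:
  std::unordered_map<std::string, std::unique_ptr<CGContext>> Contexts;
};

} // namespace sketch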

> 
>> 
>>> Right now the scaffolding to do that is getting put
>>> into place so that those queries can be a simple and obvious change.
>>> The first thing for this is to make sure that the various passes and
>>> backends don't do things based on the initial subtarget, which is the
>>> set of changes I've been making.
>> 
>> The fact that you refer to “the initial subtarget” is the root of my concern. We should get away from the implicit argument of “the current subtarget” and make that an explicit argument (via the CGContext) of everything in code-gen.
> 
> I don’t think there’s any high-level disagreement here. I certainly agree with both of you that there should not be any state involved. There will be no “current subtarget” in any meaningful sense. I hear “initial subtarget” and related language as saying that the various Subtarget data structures will be created lazily when they’re requested for a function and will be uniqued in the backing storage. Personally, I’d probably pre-compute them all in a first pass and make it lazy in a later cleanup, but I don’t care too much about doing it lazily from the start.
> 
> Right now we have getSubtarget<>(). Eric’s proposal will make that be getSubtarget<>(const MachineFunction*) or something along those lines. That gets called often enough that there will probably need to be some backend caching on a per-function data structure, but that’s details.

Instead of calling getSubtarget for a MachineFunction all over the place with caching to make it reasonably efficient, it would make a lot more sense to have a separate CGContext for each function, store the subtarget info in the CGContext once at the start of code-gen for that function, and then just grab the subtarget info from CGContext wherever it is needed.
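
Very roughly, and again with hypothetical names rather than anything that exists today, the difference between the two patterns looks something like this:

// Hypothetical illustration of the two access patterns; not real LLVM API.
#include <cassert>
#include <map>
#include <string>

struct Subtarget { bool HasFancyFeature = false; };
struct MachineFunction { std::string Name; };

// Pattern A: every query goes through the target machine, keyed by the
// function, with a cache behind it to keep the lookups cheap.
class TargetMachineA {
public:
  const Subtarget &getSubtarget(const MachineFunction &MF) {
    return SubtargetCache[MF.Name]; // per-function cache lookup on every query
  }
private:
  std::map<std::string, Subtarget> SubtargetCache;
};

// Pattern B: the subtarget is resolved once, at the start of codegen for
// the function, and stored on a per-function CGContext that passes receive.
struct CGContext {
  const Subtarget *STI = nullptr; // fixed for the lifetime of the function's codegen
};

void somePass(const MachineFunction &MF, CGContext &Ctx) {
  // No global or TargetMachine query; the context already knows the answer.
  assert(Ctx.STI && "context must be initialized before codegen starts");
  if (Ctx.STI->HasFancyFeature) {
    // ... emit the fancy form ...
  }
}

The point is that in the second pattern a pass never reaches back into a global TargetMachine at all; the context it was handed already carries the answer.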

> 
> Basically, anything that can be expressed as a function attribute or equivalent (i.e., varies per function) is moving to live on the Subtarget (so far, that’s all making sense as a place for it). Current queries that rely on it being elsewhere will change to ask the subtarget for it instead. To ask for the subtarget, they must provide the MachineFunction that they’re dealing with, so they’ll make sure to get the right subtarget information specifically for that function.
> 
> Now the interesting thing to me is that this is 90% the exact same work that would need to be done to implement having a CGContext passed in explicitly. All the same implicit assumptions need to be unraveled and made to explicitly ask for the information in a way that is contextually accurate for the function being dealt with. What’s different is how all of that gets surfaced in a final C++ API. I haven’t seen anything yet that makes me concerned we won’t be able to refactor that aspect of all of this into whatever shape we want when the time comes.

Doing it this way would essentially be doing the work twice: once to implement a caching mechanism and change all the getSubtarget calls to take a MachineFunction, and then again to convert everything to CGContext. That’s a lot more work and unnecessary churn in the source.

> 
> -Jim
> 
>>> Next will be making sure that
>>> subtargets really do own their own code generation while things like
>>> the register sets can stay on the target machine.
>>> 
>>> If we decide to mix targets in a single module at some point in the
>>> future, then we'll need another level of indirection, but that can sit on
>>> top of it (and I'm not convinced we wouldn't want different pass
>>> managers still and just have a target machine for each actual target).
>>> 
>>> Make sense?
>> 
>> I think I understand what you’re trying to do. I just don’t agree that it is the right approach.


