[LLVMdev] Instruction bundles before RA: Rematerialization

Fri Jun 8 09:35:37 PDT 2012

Hi again!

On 08/06/2012 17:11, Ivan Llopard wrote:
> Hi Sergei, Jakob,
>
> Thanks for your comments!
>
> On 07/06/2012 20:41, Sergei Larin wrote:
>>
>> Jakob,
>>
>> Please see my comments below. Hope this helps.
>>
>> Sergei
>>
>> --
>>
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum.
>>
>> *From:*Jakob Stoklund Olesen [mailto:stoklund at 2pi.dk]
>> *Sent:* Thursday, June 07, 2012 1:02 PM
>> *To:* Sergei Larin
>> *Cc:* 'Ivan Llopard'; 'LLVM Developers Mailing List'
>> *Subject:* Re: [LLVMdev] Instruction bundles before RA: Rematerialization
>>
>> On Jun 7, 2012, at 10:25 AM, "Sergei Larin" <slarin at codeaurora.org>
>> wrote:
>>
>>
>>
>> Generally, as far as I am concerned, there is no way “generic” (platform
>> independent) code can add instructions to bundles optimally
>>
>> I agree, there are too many ways of modeling stuff with bundles. That
>> is why I took the philosophical stance of treating bundles as black
>> boxes during RA. I think the target should be involved whenever
>> bundles are formed, and we shouldn't delete instructions from inside
>> bundles without permission from the target.
>>
>
> I also agree. Adding instructions into a bundle strongly depends on the
> target and may be a quite complex task, sometimes too complex to be done
> at allocation time if allocation speed is an issue. Removing them should
> relax resource constraints in almost every conventional VLIW target and
> it's something that the RA can potentially handle in a simple way by
> *consulting the target before*.
>
>> I think we need to tweak some of the TargetInstrInfo hooks to make
>> bundle remat possible. I would like your input.
>>
>> Rematerialization has multiple steps:
>>
>> 1. Feasibility. RA knows the bundle defining a given SSA value of a
>> virtual register. It calls TII.isTriviallyReMaterializable() to
>> determine if the defining instruction can (and should) be
>> rematerialized. See LiveRangeEdit::anyRematerializable().
>>
>> */[Larin, Sergei] Obviously, if you treat a bundle as a black box this
>> does not make much sense… or isTriviallyReMaterializable() should be
>> able to parse the bundle and produce a compound answer. The problem is
>> that you will not find many opportunities like that. On the other
>> hand, if you detect a _single_ instruction as a remat candidate inside
>> a bundle, you might choose to dissolve (disassemble) the bundle (if
>> possible, as I said before) and treat the new serial group as you
>> normally would for remat. There should be a platform dependent pass to
>> rebundle this new serial sequence again, but even if that is not done,
>> as long as the dissolution was performed properly, this will be a
>> performance issue, not a correctness issue. As a side note, you might
>> choose to simply remove the desired instruction from the bundle (often
>> it is possible and trivial to do without affecting semantics) and
>> proceed as described above. _BUT_ instruction removal does need back
>> end support. Example:/*
>>
>> */{ /*
>>
>> */r0 = add (r0, r1);/*
>>
>> */P0 = cmp (r0.new, r0);/*
>>
>> */}/*
>>
>> */The r0.new means that the new value of r0 is used (reg file bypass
>> in the same cycle). You can see all possible implications of this. To
>> offload this mental logic to the back end, we need a utility of the form
>> CanMoveMIBeforeBundle(MI, BundleHeader)/ CanMoveMIAfterBundle(MI,
>> BundleHeader)/ MoveMIBeforeBundle(MI, BundleHeader)/
>> MoveMIAfterBundle(MI, BundleHeader). Calling these repeatedly should
>> achieve the desired effect – remove what can be removed, and leave what
>> needs to remain bundled intact. The move utility can change the
>> instruction properly for each target./*
>>
>
> That explains the hook isLegalToPruneDependencies() in the packetizer :-).
>
> I don't see why that case cannot be handled by the existing model and
> needs that kind of API. For example, if 'add' needs to be rematted,
> it's possible as long as all its operands are available at the remat
> position. The same holds for 'cmp'. The original 'add' cannot be removed
> because of its internal read, and the RA should consult the target first.
> Anyway, I think we both agree that instruction removal is a
> target-specific task. It also remains simple compared to the insertion
> of instructions into a bundle.

I rushed my answer, I apologize. I had virtual registers dancing in my 
head instead of those you put in the example!

Please correct me if I'm wrong: as long as the second operand of 'cmp'
has not been allocated (and if it's available, of course), it can be
rematted. Otherwise, it's tied to the bundle. Is that correct?
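
To make the question concrete, here is a minimal sketch of the kind of check I have in mind. It is written against a toy model, not the real MachineInstr API: Instr, Bundle, hasInternalReader() and readsInternalValue() are hypothetical names made up for illustration, and the register names are the ones from Sergei's example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a bundled instruction; NOT the LLVM MachineInstr API.
struct Instr {
    std::string name;
    std::vector<std::string> defs;     // registers written
    std::vector<std::string> newUses;  // registers read through the bypass (e.g. r0.new)
    std::vector<std::string> uses;     // registers read from the register file
};

using Bundle = std::vector<Instr>;

// Hypothetical target query: does another instruction in the bundle read,
// via .new, a value defined by bundle[idx]? If so, the original instruction
// cannot simply be deleted from the bundle.
bool hasInternalReader(const Bundle &b, std::size_t idx) {
    assert(idx < b.size());
    for (std::size_t i = 0; i < b.size(); ++i) {
        if (i == idx)
            continue;
        for (const auto &d : b[idx].defs) {
            const auto &nu = b[i].newUses;
            if (std::find(nu.begin(), nu.end(), d) != nu.end())
                return true;
        }
    }
    return false;
}

// Hypothetical target query: an instruction that reads a bypassed value
// only makes sense next to its producer, so it is tied to the bundle.
bool readsInternalValue(const Instr &mi) { return !mi.newUses.empty(); }

// The bundle from the example: { r0 = add(r0, r1); P0 = cmp(r0.new, r0); }
Bundle makeExampleBundle() {
    return {
        {"add", {"r0"}, {}, {"r0", "r1"}},
        {"cmp", {"p0"}, {"r0"}, {"r0"}},
    };
}
```

In this model the 'add' has an internal reader and must stay, while the 'cmp' reads r0.new and cannot be taken out on its own. Whether a real target agrees is exactly what the RA would have to ask.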

>
>>
>> 2. Feasibility at desired location.
>> LiveRangeEdit::canRematerializeAt() then checks that the instruction
>> can be rematerialized at the new location. This can fail if the
>> instruction depends on virtual register values that are not available
>> at the new location.
>>
>> */[Larin, Sergei] This looks similar to the previous case. The good
>> thing is that you can potentially have zero-cost remat (if you can
>> place your new instruction inside an existing bundle), but to know this
>> you need an answer _while_ you are computing the cost. For that I can
>> easily imagine a back end hook, something like CanAddMIToBundle(MI,
>> BundleHeader). This is often easier than the previous task./*
>>
>
> I like the idea. The cost can potentially be zero, at least from a
> latency point of view (we care about power consumption also).
> Conversely, I don't like the fact that the RA looks for bundling
> opportunities while rematting, but it's just a personal feeling. As you
> pointed out for Hexagon, bundling is a complex task and sometimes it
> cannot be managed in a simple way. In our BE, a DFA to model resource
> constraints is not enough and a wrapper has been created which adds
> complexity to the process.
>
>> 3. Remat. LiveRangeEdit::rematerializeAt() calls TII.reMaterialize()
>> (sic) to insert a copy of the defining instruction at the new location.
>>
>> */[Larin, Sergei] At this point you need to update liveness at the
>> bundle level, and then update the global picture. Updating liveness at
>> the bundle level also might need help from the back end. See the above
>> example with .new, and you can easily imagine local defs/kills inside a
>> bundle that should not even be visible outside the black box. As of now
>> I consider this mechanism somewhat broken on trunk (it is overly
>> pessimistic)… but the API in this case is rather straightforward. /*
>>
>
> I thought internal defs/uses were already modelled. Is that right?
>
>> 4. Shrink original live range. The original live range may be smaller
>> after some uses have been rematerialized. This may discover dead defs
>> if there are no remaining uses.
>>
>
> IMHO, as long as internal defs/uses are taken into account, I don't see
> any particular problem.
>
>> 5. LiveRangeEdit::eliminateDeadDefs() erases the dead defs.
>>
>> */[Larin, Sergei] The last two should be easy with the above API
>> support – if CanMoveMIBeforeBundle(..) says the instruction can be moved
>> outside the bundle, you can use the existing API to delete it./*
>>
>
> How can dead defs be eliminated by calling CanMoveMIBeforeBundle()?
>
>
> Ivan
>
>> It looks like each of these steps requires some help from the target if
>> it is to handle bundles.
>>
>> /jakob
>>
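
For step 2 and the zero-cost case discussed above, the following is a rough sketch of how the two questions (are the operands available at the new location, and does the instruction fit into a nearby bundle) could feed a cost. Everything here is an assumption for illustration: RematCandidate, BundleSlotInfo, operandsAvailableAt(), canAddToBundle() and rematCost() are hypothetical names, canAddToBundle() only stands in for the proposed CanAddMIToBundle() hook, and plain slot counting is far simpler than what a real target needs.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy description of the instruction we would like to rematerialize.
struct RematCandidate {
    std::string opcode;
    std::vector<std::string> operands;  // virtual registers it reads
    unsigned slotsNeeded;               // issue slots it would occupy
};

// Toy resource model of an existing bundle near the remat position.
struct BundleSlotInfo {
    unsigned capacity;  // total issue slots
    unsigned used;      // slots already taken
};

// Analogue of the availability check: every operand value must be
// available at the new location.
bool operandsAvailableAt(const RematCandidate &mi,
                         const std::set<std::string> &liveAtUse) {
    for (const auto &op : mi.operands)
        if (liveAtUse.count(op) == 0)
            return false;
    return true;
}

// Stand-in for a target hook in the spirit of CanAddMIToBundle().
bool canAddToBundle(const RematCandidate &mi, const BundleSlotInfo &b) {
    assert(b.used <= b.capacity);
    return b.capacity - b.used >= mi.slotsNeeded;
}

// Returns -1 if remat is not feasible here, 0 if the copy fits into the
// nearby bundle (no extra cycle), and 1 if it needs a cycle of its own.
int rematCost(const RematCandidate &mi,
              const std::set<std::string> &liveAtUse,
              const BundleSlotInfo &nearby) {
    if (!operandsAvailableAt(mi, liveAtUse))
        return -1;
    return canAddToBundle(mi, nearby) ? 0 : 1;
}
```

The point of the sketch is only that the bundling answer is needed while the cost is being computed, so the target has to be asked from inside the RA.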