[LLVMdev] VLIW Ports

Evan Cheng evan.cheng at apple.com
Mon Oct 31 09:36:02 PDT 2011

The key is that there should be a *single* mechanism to represent instruction bundles. That means it has to be able to model intra-bundle dependencies. It doesn't mean the support has to be in the codegen on day one. That can be added when a target needs it. But the representation must have buy-in from the code owners who are responsible for the affected components, e.g. the register allocator.


On Oct 26, 2011, at 1:01 PM, Sergei Larin wrote:

> Evan, 
>  What would change if tomorrow we got a VLIW target/back end with
> certain properties - let's say no intra-packet deps - would it sway your
> opinion in either direction? Would it be a natural prerogative to implement
> it a certain way for such a hypothetical contributor/submitter?
> Thanks. 
> Sergei Larin
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
> Behalf Of Evan Cheng
> Sent: Wednesday, October 26, 2011 2:08 PM
> To: Stripf, Timo
> Cc: LLVM Dev
> Subject: Re: [LLVMdev] VLIW Ports
> On Oct 25, 2011, at 1:59 AM, Stripf, Timo wrote:
>> Hi all,
>>> Ok, so in your proposal a bundle is just a special MachineInstr? That
> sounds good. How are the MachineInstr's embedded inside a bundle? How are
> the cumulative operands, implicit register defs and uses represented?
>> I attached the packing and unpacking pass I used within my backend. In my
> solution multiple MachineInstructions are packed into one variadic "PACK"
> MachineInstruction. The opcode and operands of each original instruction are
> encoded as operands of the PACK instruction. The opcode is added as an
> immediate, followed by the operands of the original instruction. Within the
> operands, each instruction is terminated by an "EndOfOp" operand. The implicit
> defs/uses are also added to the PACK instruction but not used for unpacking.
> Unpacking reconstructs them from the TargetDescriptionInfo.
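
The PACK encoding described above can be sketched roughly as follows. This is only an illustration of the operand layout (opcode as immediate, then operands, then an EndOfOp terminator). Operand, Instr, pack and unpack are hypothetical stand-ins, not LLVM classes and not the code from the attached passes, and implicit defs/uses are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; these are not LLVM classes.
struct Operand {
  enum Kind { Reg, Imm, EndOfOp } kind;
  int64_t value;  // register number or immediate; unused for EndOfOp
};

struct Instr {
  unsigned opcode;
  std::vector<Operand> operands;
};

// Flatten a bundle into the operand list of one variadic PACK instruction:
// the opcode as an immediate, then the operands, then an EndOfOp marker.
std::vector<Operand> pack(const std::vector<Instr> &bundle) {
  std::vector<Operand> packed;
  for (const Instr &mi : bundle) {
    packed.push_back({Operand::Imm, static_cast<int64_t>(mi.opcode)});
    packed.insert(packed.end(), mi.operands.begin(), mi.operands.end());
    packed.push_back({Operand::EndOfOp, 0});
  }
  return packed;
}

// Rebuild the original instructions from a PACK operand list.
std::vector<Instr> unpack(const std::vector<Operand> &packed) {
  std::vector<Instr> bundle;
  std::size_t i = 0;
  while (i < packed.size()) {
    assert(packed[i].kind == Operand::Imm && "instruction must start with its opcode");
    Instr mi;
    mi.opcode = static_cast<unsigned>(packed[i++].value);
    while (i < packed.size() && packed[i].kind != Operand::EndOfOp)
      mi.operands.push_back(packed[i++]);
    ++i;  // skip the EndOfOp terminator
    bundle.push_back(mi);
  }
  return bundle;
}
```

The opcode is told apart from an ordinary immediate operand purely by position: it is always the first operand after the start of the list or after an EndOfOp.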
>> I took a look at the packing/unpacking solution of Evan and I think it is
> more elegant to use a derived class of MachineInstr for storing multiple
> instructions into one.
> Here are my thoughts on instruction bundle.
> First, let's talk about the prerequisite for adding a codegen level IR
> extension. A MachineInstr bundle should be generic enough to support the
> following: 1) VLIW bundles (where there are no intra-dependencies between
> instructions in a bundle), 2) bundles for other targets where there may be
> intra-dependencies between instructions in a bundle. #2 is very important
> for the extension to be accepted into LLVM mainline today since there are no
> proper VLIW targets.
> Now let's look at the options.
> 1. Extend MachineInstr to represent a bundle. This can be achieved either with a
> derived class or by adding a pointer in MachineInstr that points to the next
> instruction in the bundle.
> 2. Add a bit to MachineInstr that indicates it is part of a bundle /
> sequence.
> The advantage of #1 is that it requires minimal changes to the register allocator
> and many other codegen passes. However, that's only true for VLIW targets
> with no intra-bundle dependencies. For other targets or for use of
> optimizations which model a sequence of instructions, this is not true. The
> register allocator and scheduler need to know the cumulative properties of a
> bundle. For example, the register allocator needs to know what are the input
> operands, what are the outputs. The scheduler needs to know the cumulative
> latency of the bundle. Other passes that examine individual instruction
> properties (e.g. is it a load / store, control flow) will need to know the
> combined properties of individual instructions in a bundle.
> Of course, this is a solvable problem. The pass that combines instructions
> into bundles can construct the bundle MachineInstr properly so it presents
> the right information. The downside is that this will add memory overhead, and it
> needs to be carefully studied.
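
To make the "cumulative properties" point concrete, here is a minimal sketch of what a bundle-building pass would have to compute. It assumes the instructions inside a bundle execute in order, so that intra-bundle dependencies exist. For a pure VLIW bundle where all reads happen in parallel, every use would count as an input. Instr, BundleSummary and summarize are hypothetical names, not LLVM APIs, and latency is not modeled.

```cpp
#include <set>
#include <vector>

// Hypothetical model of one instruction; not LLVM's MachineInstr.
struct Instr {
  std::vector<unsigned> uses;  // registers read
  std::vector<unsigned> defs;  // registers written
  bool mayLoad = false;
  bool mayStore = false;
};

// What a bundle presented as one instruction would expose to other passes.
struct BundleSummary {
  std::set<unsigned> inputs;   // registers that must be live into the bundle
  std::set<unsigned> outputs;  // registers defined somewhere in the bundle
  bool mayLoad = false;
  bool mayStore = false;
};

// Walk the bundle in order. A use counts as a bundle input only if no
// earlier instruction in the same bundle defined that register.
BundleSummary summarize(const std::vector<Instr> &bundle) {
  BundleSummary s;
  for (const Instr &mi : bundle) {
    for (unsigned r : mi.uses)
      if (!s.outputs.count(r))
        s.inputs.insert(r);
    for (unsigned r : mi.defs)
      s.outputs.insert(r);
    // Per-instruction flags combine with a logical OR.
    s.mayLoad = s.mayLoad || mi.mayLoad;
    s.mayStore = s.mayStore || mi.mayStore;
  }
  return s;
}
```

The extra sets and flags stored per bundle are the memory overhead mentioned above.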
> The advantage of #2 is the low overhead. Adding a bit won't add much, if any,
> memory overhead. Packing / unpacking are both very easy. This is especially
> good for the register allocator, which can still model register liveness even
> when there are intra-bundle dependencies. The downside of #2 is also
> obvious. Every pass that operates on MachineInstr will have to be aware of
> bundles. This is the only real downside that I can think of, but it's a big
> one.
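
As a rough illustration of option 2, the sketch below keeps a flat instruction list and marks bundle membership with a single bit per instruction. The field name bundledWithPrev and the helper findBundles are assumptions made for this example, not an existing LLVM interface.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical flat instruction stream; not LLVM's MachineInstr.
struct Instr {
  unsigned opcode;
  bool bundledWithPrev;  // the one extra bit: continues the previous bundle
};

// Half-open index range [begin, end) covering one bundle.
struct BundleRange {
  std::size_t begin;
  std::size_t end;
};

// A bundle-aware pass has to group instructions like this before it can
// reason about them. An unbundled instruction is a bundle of size one.
std::vector<BundleRange> findBundles(const std::vector<Instr> &block) {
  std::vector<BundleRange> bundles;
  for (std::size_t i = 0; i < block.size(); ++i) {
    if (i > 0 && block[i].bundledWithPrev)
      bundles.back().end = i + 1;     // extend the current bundle
    else
      bundles.push_back({i, i + 1});  // start a new bundle
  }
  return bundles;
}
```

Packing and unpacking only flip bits here, which is where the low overhead comes from. The cost is that every pass walking the block needs grouping logic of this kind.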
> Evan
>> Best regards,
>> Timo Stripf
>> -----Original Message-----
>> From: Evan Cheng [mailto:evan.cheng at apple.com] 
>> Sent: Tuesday, October 25, 2011 01:55
>> To: Carlos Sánchez de La Lama
>> Cc: Stripf, Timo; LLVM Dev
>> Subject: Re: [LLVMdev] VLIW Ports
>> On Oct 24, 2011, at 2:38 PM, Carlos Sánchez de La Lama wrote:
>>> Hi Evan (and all),
>>>> I think any implementation that makes a "bundle" a different entity from
> MachineInstr is going to be difficult to use. All of the current backend
> passes will have to be taught to know about bundles. 
>>> The approach in the patch I sent (and I believe Timo's code works
> similar, according to his explanations) is precisely to make "bundles" no
> different from MachineInstructions. They are MIs (a class derived from it),
> so all other passes work transparently with them. For example, in my code
> register allocator does not know it is allocating regs for a bundle, it sees
> it just as a MI using a lot of registers. Of course, normal (scalar) passes
> cannot "inspect" inside bundles, and won't be able, for example, to put
> spilling code into bundles or anything like that.
>>> But the good point is that bundles (which are MIs) and regular MIs can
> coexist inside a MachineBasicBlock, and bundles can easily be "broken back"
> to regular MIs when needed for some pass.
>> Ok, so in your proposal a bundle is just a special MachineInstr? That
> sounds good. How are the MachineInstr's embedded inside a bundle? How are
> the cumulative operands, implicit register defs and uses represented?
>>>> I think what we need is a concept of a sequence of fixed machine
> instructions. Something that represents a number of MachineInstr's that are
> scheduled as a unit, something that is never broken up by MI passes such as
> branch folding. This is something that current targets can use to, for
> example, pre-schedule instructions. This can be useful for macro-fusing
> optimization. It can also be used for VLIW targets.
>>> There might be something I am missing, but I do not see the advantage
> here. Even more, if you use sequences you need to find a way to tell the
> passes how long a sequence is. On the other hand, if you use a class derived
> from MI, the passes know already (from their POV they are just dealing with
> MIs). You have of course to be careful on how you build the bundles so they
> have the right properties matching those of the inner MIs, and there is
> where the pack/unpack methods come in.
>> A "sequence" would not be actually a sequence of MachineInstr's. I'm
> merely proposing that you use a generic concept that is not tied to VLIW. In
> the VLIW bundle, there are no inter-dependencies between the instructions.
> However, I'm looking for a more generic concept that may represent a
> sequence of instructions which may or may not have dependencies between
> them. The key is to introduce a concept that can be used by an existing
> target today.
>> Sounds like what you are proposing is not very far from what I've described. Do
> you have patches ready for review?
>> Evan
>>> BR
>>> Carlos
>>>> On Oct 21, 2011, at 4:52 PM, Stripf, Timo wrote:
>>>>> Hi all,
>>>>> I worked the last 2 years on a LLVM back-end that supports clustered
> and non-clustered VLIW architectures. I also wrote a paper about it that is
> currently within the review process and is hopefully going to be accepted.
> Here is a small summary how I realized VLIW support with a LLVM back-end. I
> also used packing and unpacking of VLIW bundles. My implementations do not
> require any modification of the LLVM core.
>>>>> To support VLIW I added two representations for VLIW instructions:
> packed and unpacked. Within the unpacked representation, VLIW bundles
> are separated by NEXT instructions, as was done within the
> IA-64 back-end. The packed representation packs all instructions of one bundle
> into a single PACK instruction, and I used this representation especially for
> register allocation.
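
A small sketch of how the unpacked form with NEXT separators maps to bundles. The Instr type, its mnemonic field and packAtNext are hypothetical names chosen for the example. The sketch also assumes that two consecutive NEXTs denote an empty bundle (an idle cycle), which the description above does not spell out.

```cpp
#include <string>
#include <vector>

// Hypothetical instruction; "NEXT" is the bundle separator pseudo-op.
struct Instr {
  std::string mnemonic;
  bool isNext() const { return mnemonic == "NEXT"; }
};

// Unpacked form: a flat list in which each NEXT ends the current bundle.
// Result: one vector of instructions per bundle, with the NEXTs dropped.
std::vector<std::vector<Instr>> packAtNext(const std::vector<Instr> &flat) {
  std::vector<std::vector<Instr>> bundles;
  std::vector<Instr> current;
  for (const Instr &mi : flat) {
    if (mi.isNext()) {
      bundles.push_back(current);  // may be empty: assumed to be an idle cycle
      current.clear();
    } else {
      current.push_back(mi);
    }
  }
  if (!current.empty())
    bundles.push_back(current);  // trailing bundle without a final NEXT
  return bundles;
}
```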
>>>>> I used the following pass order for the clustered VLIW back-end:
>>>>> DAG->DAG Pattern Instruction Selection
>>>>> ...
>>>>> Clustering (Not required for unicluster VLIW architectures)
>>>>> Scheduling
>>>>> Packing
>>>>> ...
>>>>> Register Allocation
>>>>> ...
>>>>> Prolog/Epilog Insertion & Frame Finalization
>>>>> Unpacking
>>>>> Reclustering
>>>>> ...
>>>>> Rescheduling (Splitting, Packing, Scheduling, Unpacking)
>>>>> Assembly Printer
>>>>> In principle, it is possible to use the LLVM scheduler to generate
> parallel code by providing a custom hazard recognizer that checks true data
> dependencies of the current bundle. The scheduler has also the capability to
> output NEXT operations by using NoopHazard and outputting a NEXT instruction
> instead of a NOP. However, the scheduler that is used within "DAG->DAG
> Pattern Instruction Selection" uses the glue mechanism, and that could be
> problematic since no NEXT instructions are issued between glued
> instructions.
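
The hazard-recognizer idea can be sketched in isolation like this. It is a standalone model, not LLVM's hazard recognizer interface: BundleHazardRecognizer, Hazard::NeedNext and the issue-width limit are assumptions made for the example. NeedNext plays the role described above for NoopHazard, i.e. the scheduler would emit a NEXT rather than a NOP.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical instruction model; not LLVM's MachineInstr or SUnit.
struct Instr {
  std::vector<unsigned> uses;  // registers read
  std::vector<unsigned> defs;  // registers written
};

enum class Hazard { None, NeedNext };

// Tracks the registers defined in the bundle that is currently being filled.
class BundleHazardRecognizer {
  std::set<unsigned> defsInBundle;
  std::size_t issued = 0;
  std::size_t width;

public:
  explicit BundleHazardRecognizer(std::size_t issueWidth) : width(issueWidth) {
    assert(issueWidth > 0 && "a bundle needs at least one slot");
  }

  // A true data dependence on the open bundle, or a full bundle, forces NEXT.
  Hazard check(const Instr &mi) const {
    if (issued >= width)
      return Hazard::NeedNext;
    for (unsigned r : mi.uses)
      if (defsInBundle.count(r))
        return Hazard::NeedNext;
    return Hazard::None;
  }

  // Record an instruction that was placed into the open bundle.
  void emit(const Instr &mi) {
    defsInBundle.insert(mi.defs.begin(), mi.defs.end());
    ++issued;
  }

  // Called when a NEXT is emitted: the following instruction opens a new bundle.
  void next() {
    defsInBundle.clear();
    issued = 0;
  }
};
```

Only read-after-write dependencies are checked here, matching the "true data dependencies" wording. A real target would likely also need resource and slot constraints.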
>>>>> Within my back-end I added a parallelizing scheduling after "DAG->DAG
> Pattern Instruction Selection" by reusing the LLVM Post-RA scheduler
> together with a custom hazard recognizer as explained. The Post-RA scheduler
> works very well with some small modifications (special PHI instruction
> handling and a small performance issue due to the high virtual register
> numbers) also before register allocation.
>>>>> Before register allocation the Packing pass converts the unpacked
> representation outputted by the scheduler into the packed representation. So
> the register allocation sees each VLIW bundle as one instruction. After
> "Prolog/Epilog Insertion & Frame Finalization" the Unpack pass converts the
> PACK instructions back to the unpacked representation. Thereby, instructions
> that were added within the Register Allocation and Prolog/Epilog Insertion
> are recognized and get into one bundle since they are not parallelized.
>>>>> At the end (just before assembly output) I added several passes for
> doing a rescheduling. First, the splitting pass tries to split a VLIW bundle
> into single instructions (if possible). The Packing pass packs all Bundles
> with more than one instruction into a single PACK instruction. The scheduler
> will recognize the PACK instruction as a single scheduling unit. Scheduling
> is nearly the same as before RA. Unpacking establishes again the unpacked
> representation. 
>>>>> If anyone is interested in more information please send me an email.
> I'm also interested in increasing support for VLIW architectures within
>>>>> Kind regards,
>>>>> Timo Stripf
>>>>> -----Original Message-----
>>>>> From: llvmdev-bounces at cs.uiuc.edu 
>>>>> [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Carlos Sánchez 
>>>>> de La Lama
>>>>> Sent: Thursday, October 6, 2011 13:14
>>>>> To: LLVM Dev
>>>>> Subject: Re: [LLVMdev] VLIW Ports
>>>>> Hi all,
>>>>> here is the current (unfinished) version of the VLIW support I
> mentioned. It is a patch over svn rev 141176. It includes the
> MachineInstrBundle class, and small required changes in a couple of outside
> LLVM files.
>>>>> Also includes a modification to Mips target to simulate a 2-wide VLIW
> MIPS. The scheduler is really silly, I did not want to implement a
> scheduler, just the bundle class, and the test scheduler is just provided as
> an example.
>>>>> Main thing still missing is to finish the "pack" and "unpack" methods
> in the bundle class. Right now it manages operands, both implicit and
> explicit, but it should also manage memory references, and update MIB flags
> according to sub-MI flags.
>>>>> For any question I would be glad to help.
>>>>> BR
>>>>> Carlos
>>>>> On Tue, 2011-09-20 at 16:02 +0200, Carlos Sánchez de La Lama wrote:
>>>>>> Hi,
>>>>>>> Has anyone attempted the port of LLVM to a VLIW architecture?  Is 
>>>>>>> there any publication about it?
>>>>>> I have developed a derivation of MachineInstr class, called 
>>>>>> MachineInstrBundle, which is essentially a VLIW-style machine 
>>>>>> instruction which can store any MI on each "slot". After the 
>>>>>> scheduling phase has grouped MIs in bundles, it has to call
>>>>>> MIB->pack() method, which takes operands from the MIs in the "slots" 
>>>>>> and transfers them to the superinstruction. From this point on the 
>>>>>> bundle is a normal machineinstruction which can be processed by 
>>>>>> other LLVM passes (such as register allocation).
>>>>>> The idea was to make a framework on top of which VLIW/ILP 
>>>>>> scheduling could be studied using LLVM. It is not completely 
>>>>>> finished, but it is more or less usable and works with a trivial 
>>>>>> scheduler in a synthetic MIPS-VLIW architecture. Code emission does 
>>>>>> not work though (yet) so bundles have to be unpacked prior to
> emission.
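
The derived-class idea can be illustrated with a toy model. This is not the MachineInstrBundle code from the patch: Instr, InstrBundle and renameAll are hypothetical, only register operands are modeled, and writing operands back to the slots in unpack() is an assumption about how a round trip might work.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical base class standing in for MachineInstr.
struct Instr {
  std::vector<unsigned> operands;  // register operands only, for brevity
  virtual ~Instr() = default;
};

// A bundle is itself an Instr, so bundle-unaware passes can treat it as one.
struct InstrBundle : Instr {
  std::vector<Instr> slots;

  // Copy every slot's operands up to the bundle ("superinstruction").
  void pack() {
    operands.clear();
    for (const Instr &mi : slots)
      operands.insert(operands.end(), mi.operands.begin(), mi.operands.end());
  }

  // Push the (possibly rewritten) bundle operands back down to the slots,
  // e.g. after a pass replaced virtual registers with physical ones.
  void unpack() {
    std::size_t next = 0;
    for (Instr &mi : slots)
      for (unsigned &op : mi.operands) {
        assert(next < operands.size() && "operand count changed since pack()");
        op = operands[next++];
      }
  }
};

// A pass that knows nothing about bundles: it rewrites every operand it sees.
void renameAll(Instr &mi, unsigned offset) {
  for (unsigned &op : mi.operands)
    op += offset;
}
```

Because renameAll takes a plain Instr reference, it works on a packed bundle unchanged, which is the transparency argument made for this approach. It also cannot see the individual slots, which is the limitation.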
>>>>>> I was waiting to finish it to send a patch to the list, but if you 
>>>>>> are interested I can send you a patch over svn of my current code.
>>>>>> BR
>>>>>> Carlos
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> <TS1VLIWPacking.cpp><TS1VLIWUnpacking.cpp>