[llvm-commits] Deterministic finite automaton based packetizer for VLIW architectures

Wed Nov 30 14:44:37 PST 2011

Jakob,

I've attached a modified patch with the changes you suggested. Two
comments: I will remove the emission of the second table when I split
the transition tables per Hexagon subtarget, since those two tasks are
related. Also, I was able to change most, but not all, of the STL
containers to their LLVM equivalents. Let me know if it's okay to
commit. I will post the CodeGenerator.html changes as a separate patch.

Thanks
-Anshu

--
Qualcomm Innovation Center, Inc is a member of Code Aurora Forum

On 11/29/2011 12:22 PM, Jakob Stoklund Olesen wrote:
>
> On Nov 29, 2011, at 7:34 AM, Anshuman Dasgupta wrote:
>
>> > You are building CachedTable on demand. Is that really necessary?
>> > How much of the table is built in a typical run? Would it be
>> > better/faster to just build the whole thing up front?
>>
>> That was an interesting decision. So the reason why I implemented 
>> CachedTable is that currently the DFA generator constructs one common 
>> transition table for all versions (subtargets) of Hexagon. As a 
>> result, when we compile for a particular Hexagon subtarget, I noticed 
>> that only part of the transition table is typically used. I have 
>> plans to augment the DFA generator to create separate tables for each 
>> Hexagon subtarget and once I do, I will change this to load up the 
>> entire transition table. But for now CachedTable is useful.
>
> That makes perfect sense.
>
>> > You have 4 table lookups per DFA transition by my count. It seems
>> > that 1 is enough. Can the API be improved to allow that?
>>
>> > Could you get rid of the DFAStateEntryTable by renumbering your states?
>>
>> > s -> DFAStateEntryTable[s]
>>
>> > This would preclude the sparse transition matrix representation, of
>> > course.
>>
>> If I understand correctly, these two questions are related. Let me 
>> answer them together. I can change the code so that there is one 
>> lookup per transition. That was my original design. However, most of 
>> the table entries were invalid transitions. So, while running 
>> TableGen, the I/O required to emit the table became a bottleneck. It 
>> significantly slowed down the Hexagon backend build time. Therefore I 
>> moved to a sparse matrix representation.
>>
>
> OK, I actually meant CachedTable lookups. Hash table lookups are an 
> order of magnitude slower than array lookups.
>
> I assume the API is meant to be used like this:
>
>  if (DFA.canReserveResources(MI))
>    DFA.reserveResources(MI);
>
> That's 4 hash table lookups for a valid transition. 1 should be 
> enough. For invalid transitions, you need two lookups because 
> CachedTable is lazily built.
>
>
> As for renumbering the states, since the state numbers aren't used for 
> anything but indexing DFAStateEntryTable, you can simply replace all 
> state numbers with DFAStateEntryTable[s], and you don't have to waste 
> memory on the table.
>
> /jakob
>
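
To make the lookup-count point above concrete, here is a minimal sketch of
a query-then-commit API next to a combined call. It is a toy model, not the
code in the patch: ToyDFA, tryReserve, and the packed (state, input) key are
hypothetical names, and std::unordered_map stands in for whatever hash table
the real CachedTable uses. The toy does 2 lookups in the two-call style
rather than the 4 counted above; it only illustrates why a single combined
call can get by with 1.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical toy model of a hash-table-backed DFA; not the patch's code.
class ToyDFA {
  // Maps a packed (state, input class) key to the next state.
  std::unordered_map<uint64_t, int> Transitions;
  int CurrentState = 0;
  unsigned Lookups = 0; // counts hash lookups, for illustration only

  static uint64_t key(int State, unsigned Input) {
    return (static_cast<uint64_t>(static_cast<uint32_t>(State)) << 32) | Input;
  }

public:
  void addTransition(int From, unsigned Input, int To) {
    assert(From >= 0 && To >= 0 && "states are non-negative");
    Transitions[key(From, Input)] = To;
  }

  // Two-call style: the query does one hash lookup...
  bool canReserve(unsigned Input) {
    ++Lookups;
    return Transitions.find(key(CurrentState, Input)) != Transitions.end();
  }

  // ...and the commit repeats the same lookup.
  void reserve(unsigned Input) {
    ++Lookups;
    auto It = Transitions.find(key(CurrentState, Input));
    assert(It != Transitions.end() && "invalid transition");
    CurrentState = It->second;
  }

  // Combined style: one lookup both answers the query and advances.
  bool tryReserve(unsigned Input) {
    ++Lookups;
    auto It = Transitions.find(key(CurrentState, Input));
    if (It == Transitions.end())
      return false; // state is left unchanged on an invalid transition
    CurrentState = It->second;
    return true;
  }

  int state() const { return CurrentState; }
  unsigned lookups() const { return Lookups; }
};
```

A caller would then write `if (DFA.tryReserve(Input)) ...` in place of the
canReserve/reserve pair. Whether the packetizer can always fold the query
and the commit together depends on how its callers use the query, which this
sketch does not model.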

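The renumbering suggestion can be sketched in the same spirit. The layout
below is an assumption made for illustration, not the table format TableGen
emits in the patch: Row, IndexedDFA, OffsetDFA, renumber, and the
EndOfState sentinel are all hypothetical. In the indexed form, Entry plays
the role of DFAStateEntryTable. In the renumbered form, every state number
is replaced by the offset of that state's first row, and a sentinel row
marks the end of each state's transitions (one assumed way to find the end
of a state once the entry table is gone).

```cpp
#include <cassert>
#include <vector>

struct Row {
  int Input; // input class consumed by this transition
  int Next;  // target state (meaning depends on the layout)
};

constexpr int EndOfState = -1; // hypothetical sentinel input value

// Indexed layout: rows of state S live in [Entry[S], Entry[S + 1]).
struct IndexedDFA {
  std::vector<Row> Rows;
  std::vector<int> Entry; // one entry per state, plus one past-the-end
  int step(int State, int Input) const {
    for (int I = Entry[State]; I < Entry[State + 1]; ++I)
      if (Rows[I].Input == Input)
        return Rows[I].Next;
    return -1; // invalid transition
  }
};

// Renumbered layout: a state *is* the offset of its first row.
struct OffsetDFA {
  std::vector<Row> Rows;
  int step(int State, int Input) const {
    for (int I = State; Rows[I].Input != EndOfState; ++I)
      if (Rows[I].Input == Input)
        return Rows[I].Next;
    return -1; // invalid transition
  }
};

OffsetDFA renumber(const IndexedDFA &D) {
  const int NumStates = static_cast<int>(D.Entry.size()) - 1;
  // S sentinel rows precede state S, so its new offset is Entry[S] + S.
  std::vector<int> NewId(NumStates);
  for (int S = 0; S < NumStates; ++S)
    NewId[S] = D.Entry[S] + S;

  OffsetDFA Out;
  for (int S = 0; S < NumStates; ++S) {
    for (int I = D.Entry[S]; I < D.Entry[S + 1]; ++I)
      Out.Rows.push_back({D.Rows[I].Input, NewId[D.Rows[I].Next]});
    Out.Rows.push_back({EndOfState, -1}); // terminate this state's rows
  }
  return Out;
}
```

The trade is one sentinel row per state in exchange for dropping the entry
table and the extra indexed load on every transition. Other encodings, such
as a row count stored ahead of each state's rows, would serve equally well.
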
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: dfa_packetizer_2.patch
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111130/8a853ff8/attachment.ksh>