[llvm-commits] Deterministic finite automaton based packetizer for VLIW architectures

Tue Nov 29 11:57:42 PST 2011

 > OK, I actually meant CachedTable lookups. Hash table lookups are an
 > order of magnitude slower than array lookups.
 > I assume the API is meant to be used like this:
 > if (DFA.canReserveResources(MI))
 >  DFA.reserveResources(MI);
 >

Ah, I see what you mean. On Hexagon, canReserveResources() and
reserveResources() are never called back to back. We do a fair amount of
dependency pruning/checking between the two calls. Since the dependency
checking is relatively intricate, we do not want to run it unless there
are enough resources to accommodate the instruction in the packet.
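
To make the call pattern concrete, here is a minimal sketch of the flow
described above. It is not the Hexagon code: Instr, ToyDFA, kTransitions,
hasDependence() and packetize() are all hypothetical stand-ins, and the
four-state table is made up. It only shows the cheap resource check guarding
the expensive dependency check, with the reservation happening afterwards.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; none of these are the real LLVM classes.
struct Instr {
  int Class;            // instruction class, used as the DFA input symbol
  bool DependsOnPacket; // pretend result of the expensive dependency analysis
};

// kTransitions[State][Class] is the next state, or -1 if the transition is
// invalid. Toy machine: a packet holds at most one instruction per class.
static const int kTransitions[4][2] = {
    {1, 2},   // state 0: empty packet
    {-1, 3},  // state 1: class 0 slot used
    {3, -1},  // state 2: class 1 slot used
    {-1, -1}, // state 3: packet full
};

class ToyDFA {
  int State = 0;

public:
  bool canReserveResources(const Instr &I) const {
    return kTransitions[State][I.Class] != -1;
  }
  void reserveResources(const Instr &I) {
    assert(canReserveResources(I) && "invalid DFA transition");
    State = kTransitions[State][I.Class];
  }
  void clearResources() { State = 0; }
};

// Stand-in for the intricate dependency check; counts how often it runs.
static bool hasDependence(const Instr &I, int &DepChecks) {
  ++DepChecks;
  return I.DependsOnPacket;
}

// Groups instructions into packets, returned as lists of instruction indices.
std::vector<std::vector<int>> packetize(const std::vector<Instr> &Instrs,
                                        int &DepChecks) {
  std::vector<std::vector<int>> Packets(1);
  ToyDFA DFA;
  for (int Idx = 0; Idx < static_cast<int>(Instrs.size()); ++Idx) {
    const Instr &I = Instrs[Idx];
    // Cheap resource query first; && short-circuits, so the dependency
    // check only runs when the instruction could fit at all.
    bool Fits = DFA.canReserveResources(I) && !hasDependence(I, DepChecks);
    if (!Fits && !Packets.back().empty()) {
      Packets.emplace_back(); // close the current packet, start a new one
      DFA.clearResources();
    }
    DFA.reserveResources(I);
    Packets.back().push_back(Idx);
  }
  return Packets;
}
```

In this toy setup, an instruction that finds the packet full never reaches
hasDependence(), which is the saving we are after.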

 > As for renumbering the states, since the state numbers aren't used
 > for anything but indexing DFAStateEntryTable,
 > you can simply replace all state numbers with DFAStateEntryTable[s],
 > and you don't have to waste memory on the table.

That's a good idea; thanks! I'll need to remove CachedTable for this and 
I can implement it along with the "splitting up the transition table" work.
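
For reference, a rough sketch of how I read the renumbering suggestion. This
is an assumption about the eventual shape, not the current TableGen output:
Transition, renumberStates() and step() are hypothetical names, and the
sentinel entry (Input == -1) that ends each state's list is my own addition,
so that a state can find the end of its transitions without the entry table.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One entry of the sparse transition list. Input == -1 is a sentinel that
// terminates a state's run of transitions (an assumption of this sketch).
struct Transition {
  int Input;
  int Next;
};

// Rewrites every target state s as EntryTable[s], i.e. the offset of that
// state's first transition. After this, EntryTable is no longer needed.
std::vector<Transition> renumberStates(const std::vector<Transition> &Sparse,
                                       const std::vector<int> &EntryTable) {
  std::vector<Transition> Out = Sparse;
  for (Transition &T : Out) {
    if (T.Input == -1)
      continue; // sentinel entries carry no target state
    assert(T.Next >= 0 &&
           static_cast<std::size_t>(T.Next) < EntryTable.size());
    T.Next = EntryTable[T.Next];
  }
  return Out;
}

// One DFA step on the renumbered table. State is an offset into Table.
// Returns the next state (again an offset), or -1 for an invalid transition.
int step(const std::vector<Transition> &Table, int State, int Input) {
  for (int I = State; Table[I].Input != -1; ++I)
    if (Table[I].Input == Input)
      return Table[I].Next;
  return -1;
}
```

With an entry table of {0, 3, 5, 7}, old state 1 becomes offset 3, old state
2 becomes offset 5, and so on; the start state is simply the old
EntryTable[0]. Each step still scans the state's short transition list, but
the extra indirection through the entry table goes away.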

-Anshu

--
Qualcomm Innovation Center, Inc is a member of Code Aurora Forum

On 11/29/2011 12:22 PM, Jakob Stoklund Olesen wrote:
>
> On Nov 29, 2011, at 7:34 AM, Anshuman Dasgupta wrote:
>
>> > You are building CachedTable on demand. Is that really necessary?
>> > How much of the table is built in a typical run? Would it be
>> > better/faster to just build the whole thing up front?
>>
>> That was an interesting decision. So the reason why I implemented 
>> CachedTable is that currently the DFA generator constructs one common 
>> transition table for all versions (subtargets) of Hexagon. As a 
>> result, when we compile for a particular Hexagon subtarget, I noticed 
>> that only part of the transition table is typically used. I have 
>> plans to augment the DFA generator to create separate tables for each 
>> Hexagon subtarget and once I do, I will change this to load up the 
>> entire transition table. But for now CachedTable is useful.
>
> That makes perfect sense.
>
>> > You have 4 table lookups per DFA transition by my count. It seems
>> > that 1 is enough. Can the API be improved to allow that?
>>
>> > Could you get rid of the DFAStateEntryTable by renumbering your states?
>>
>> > s -> DFAStateEntryTable[s]
>>
>> > This would preclude the sparse transition matrix representation, of
>> > course.
>>
>> If I understand correctly, these two questions are related. Let me 
>> answer them together. I can change the code so that there is one 
>> lookup per transition. That was my original design. However, most of 
>> the table entries were invalid transitions. So, while running 
>> TableGen, the I/O required to emit the table became a bottleneck. It 
>> significantly slowed down the Hexagon backend build time. Therefore I 
>> moved to a sparse matrix representation.
>>
>
> OK, I actually meant CachedTable lookups. Hash table lookups are an 
> order of magnitude slower than array lookups.
>
> I assume the API is meant to be used like this:
>
>  if (DFA.canReserveResources(MI))
>    DFA.reserveResources(MI);
>
> That's 4 hash table lookups for a valid transition. 1 should be 
> enough. For invalid transitions, you need two lookups because 
> CachedTable is lazily built.
>
>
> As for renumbering the states, since the state numbers aren't used for 
> anything but indexing DFAStateEntryTable, you can simply replace all 
> state numbers with DFAStateEntryTable[s], and you don't have to waste 
> memory on the table.
>
> /jakob
>
