[llvm-commits] Deterministic finite automaton based packetizer for VLIW architectures
Anshuman Dasgupta
adasgupt at codeaurora.org
Tue Nov 29 11:57:42 PST 2011
> OK, I actually meant CachedTable lookups. Hash table lookups are an
> order of magnitude slower than array lookups.
>
> I assume the API is meant to be used like this:
>
>   if (DFA.canReserveResources(MI))
>     DFA.reserveResources(MI);
>
Ah, I see what you mean. On Hexagon, canReserveResources() and
reserveResources() are never called back to back: we do a fair amount
of dependency pruning and checking between the two calls. Because that
dependency checking is relatively intricate, we only want to run it
once we know there are enough resources to accommodate the instruction
in a packet.
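
To make that ordering concrete, here is a rough, self-contained sketch of
the call pattern. It is not the Hexagon packetizer and not LLVM's
DFAPacketizer: ToyInstr, ToyResourceTracker, PacketResult and buildPacket
are hypothetical names invented for illustration, the tracker is a greedy
slot bitmask rather than a DFA, and the "dependency check" is a simple
membership test standing in for the real (much more involved) analysis.
Only the two method names are borrowed from the discussion above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a machine instruction.
struct ToyInstr {
  unsigned SlotMask; // bit i set => the instruction may issue in slot i
  int DependsOn;     // index of the instruction it depends on, or -1
};

// Toy resource tracker (assumption: fewer than 32 slots). A real DFA tracks
// every legal slot assignment; this one greedily takes the lowest free slot.
class ToyResourceTracker {
  unsigned FreeSlots;

public:
  explicit ToyResourceTracker(unsigned NumSlots)
      : FreeSlots((1u << NumSlots) - 1) {}

  bool canReserveResources(const ToyInstr &I) const {
    return (FreeSlots & I.SlotMask) != 0;
  }

  void reserveResources(const ToyInstr &I) {
    unsigned Avail = FreeSlots & I.SlotMask;
    assert(Avail != 0 && "check canReserveResources() first");
    FreeSlots &= ~(Avail & (~Avail + 1)); // clear the lowest usable free slot
  }
};

struct PacketResult {
  std::vector<int> Members;      // indices of instructions in the packet
  unsigned DependencyChecks = 0; // how often the costly check actually ran
};

// Builds the first packet in program order; the packet ends at the first
// instruction that cannot be added.
PacketResult buildPacket(const std::vector<ToyInstr> &Instrs,
                         unsigned NumSlots) {
  ToyResourceTracker Tracker(NumSlots);
  PacketResult R;
  for (int Idx = 0; Idx < static_cast<int>(Instrs.size()); ++Idx) {
    const ToyInstr &I = Instrs[Idx];

    // 1. Cheap resource query first.
    if (!Tracker.canReserveResources(I))
      break;

    // 2. Only now run the (stand-in for the) intricate dependency check.
    ++R.DependencyChecks;
    bool Conflicts = std::find(R.Members.begin(), R.Members.end(),
                               I.DependsOn) != R.Members.end();
    if (Conflicts)
      break;

    // 3. Commit the resources once both checks have passed.
    Tracker.reserveResources(I);
    R.Members.push_back(Idx);
  }
  return R;
}
```

With two slots and three independent instructions, the third one fails the
resource query, so the dependency check runs only twice. That skipped work
is the reason for keeping the two calls separate.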
> As for renumbering the states, since the state numbers aren't used for
> anything but indexing DFAStateEntryTable, you can simply replace all
> state numbers with DFAStateEntryTable[s], and you don't have to waste
> memory on the table.
That's a good idea; thanks! I'll need to remove CachedTable for this and
I can implement it along with the "splitting up the transition table" work.
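
For reference, this is roughly how I read the renumbering suggestion. It is
a sketch only and does not match the layout of the generated tables:
Transition, IndexedDFA, OffsetDFA, renumber and makeExampleDFA are
hypothetical names, and terminating each state's row with a sentinel entry
(Input == -1) is my own assumption so that a row's end can be found without
the entry table.

```cpp
#include <cassert>
#include <vector>

// One sparse transition. Input == -1 is a sentinel that ends a state's row
// (an assumption made for this sketch).
struct Transition {
  int Input;
  int Next;
};

// Before: states are numbered 0..N-1 and EntryTable[s] gives the index of
// the first transition of state s.
struct IndexedDFA {
  std::vector<Transition> Transitions;
  std::vector<int> EntryTable;

  int step(int State, int Input) const {
    for (int i = EntryTable[State]; Transitions[i].Input != -1; ++i)
      if (Transitions[i].Input == Input)
        return Transitions[i].Next;
    return -1; // invalid transition
  }
};

// After: a state's id is the offset of its row, so no entry table is needed.
struct OffsetDFA {
  std::vector<Transition> Transitions;

  int step(int State, int Input) const {
    for (int i = State; Transitions[i].Input != -1; ++i)
      if (Transitions[i].Input == Input)
        return Transitions[i].Next;
    return -1; // invalid transition
  }
};

// Replace every state number s with EntryTable[s].
OffsetDFA renumber(const IndexedDFA &D) {
  OffsetDFA R{D.Transitions};
  for (Transition &T : R.Transitions)
    if (T.Input != -1)
      T.Next = D.EntryTable[T.Next];
  return R;
}

// Small example: state 0 --0--> 1, state 0 --1--> 2, state 1 --1--> 2,
// and state 2 has no outgoing transitions.
IndexedDFA makeExampleDFA() {
  IndexedDFA D;
  D.Transitions = {{0, 1}, {1, 2}, {-1, -1}, {1, 2}, {-1, -1}, {-1, -1}};
  D.EntryTable = {0, 3, 5};
  return D;
}
```

In the example, old states 0, 1 and 2 become 0, 3 and 5, and each step
saves one indirection through the entry table.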
-Anshu
--
Qualcomm Innovation Center, Inc is a member of Code Aurora Forum
On 11/29/2011 12:22 PM, Jakob Stoklund Olesen wrote:
>
> On Nov 29, 2011, at 7:34 AM, Anshuman Dasgupta wrote:
>
>> > You are building CachedTable on demand. Is that really necessary?
>> > How much of the table is built in a typical run? Would it be
>> > better/faster to just build the whole thing up front?
>>
>> That was an interesting decision. So the reason why I implemented
>> CachedTable is that currently the DFA generator constructs one common
>> transition table for all versions (subtargets) of Hexagon. As a
>> result, when we compile for a particular Hexagon subtarget, I noticed
>> that only part of the transition table is typically used. I have
>> plans to augment the DFA generator to create separate tables for each
>> Hexagon subtarget and once I do, I will change this to load up the
>> entire transition table. But for now CachedTable is useful.
>
> That makes perfect sense.
>
>> > You have 4 table lookups per DFA transition by my count. It seems
>> > that 1 is enough. Can the API be improved to allow that?
>>
>> > Could you get rid of the DFAStateEntryTable by renumbering your states?
>>
>> > s -> DFAStateEntryTable[s]
>>
>> > This would preclude the sparse transition matrix representation, of
>> > course.
>>
>> If I understand correctly, these two questions are related. Let me
>> answer them together. I can change the code so that there is one
>> lookup per transition. That was my original design. However, most of
>> the table entries were invalid transitions. So, while running
>> TableGen, the I/O required to emit the table became a bottleneck. It
>> significantly slowed down the Hexagon backend build time. Therefore I
>> moved to a sparse matrix representation.
>>
>
> OK, I actually meant CachedTable lookups. Hash table lookups are an
> order of magnitude slower than array lookups.
>
> I assume the API is meant to be used like this:
>
>   if (DFA.canReserveResources(MI))
>     DFA.reserveResources(MI);
>
> That's 4 hash table lookups for a valid transition. 1 should be
> enough. For invalid transitions, you need two lookups because
> CachedTable is lazily built.
>
>
> As for renumbering the states, since the state numbers aren't used for
> anything but indexing DFAStateEntryTable, you can simply replace all
> state numbers with DFAStateEntryTable[s], and you don't have to waste
> memory on the table.
>
> /jakob
>