[llvm-commits] [llvm] r78492 - /llvm/trunk/utils/TableGen/AsmMatcherEmitter.cpp
Chris Lattner
clattner at apple.com
Sat Aug 8 16:43:39 PDT 2009
On Aug 8, 2009, at 1:55 PM, Daniel Dunbar wrote:
>> +typedef std::pair<std::string, std::string> StringPair;
>
> Not that it matters, but at least in the context of this functionality
> I think this can use a StringRef; it always deals with substrings of
> the existing strings, right?
Sure, but it is actually more natural to express it without
substrings. The code just uses pointers to the entries in the original
array; it doesn't do any string slicing and dicing.
>> + for (unsigned i = 0, e = Matches.size(); i != e; ++i)
>> + MatchesByLetter[Matches[i]->first[CharNo]].push_back(Matches[i]);
>> +
>> +
>> + // If we have exactly one bucket to match, see how many characters are common
>> + // across the whole set and match all of them at once.
>> + // length, just verify the rest of it with one if.
>
> Edito.
Fixed.
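
For readers following the thread, the bucketing loop and the single-bucket shortcut quoted above can be sketched roughly as follows. This is a minimal standalone sketch, not the code from r78492: the names bucketByChar and commonPrefixLen are hypothetical, and it copies std::string values where the patch stores pointers to StringPair entries.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: group the candidate strings by the character found at
// position CharNo. Assumes every string is longer than CharNo.
std::map<char, std::vector<std::string>>
bucketByChar(const std::vector<std::string> &Matches, std::size_t CharNo) {
  std::map<char, std::vector<std::string>> Buckets;
  for (const std::string &S : Matches) {
    assert(CharNo < S.size() && "string too short for this position");
    Buckets[S[CharNo]].push_back(S);
  }
  return Buckets;
}

// Hypothetical helper: count how many characters, starting at CharNo, are
// identical across the whole set, so they can be checked in one comparison.
std::size_t commonPrefixLen(const std::vector<std::string> &Matches,
                            std::size_t CharNo) {
  assert(!Matches.empty() && "need at least one string");
  const std::string &First = Matches[0];
  std::size_t Len = First.size() > CharNo ? First.size() - CharNo : 0;
  for (const std::string &S : Matches) {
    std::size_t I = 0;
    while (I < Len && CharNo + I < S.size() &&
           S[CharNo + I] == First[CharNo + I])
      ++I;
    Len = I; // The shared run can only shrink as more strings are examined.
  }
  return Len;
}
```

When bucketByChar yields a single bucket, commonPrefixLen says how many characters a generated matcher could verify at once before it needs to branch again.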
> I think another similar simple optimization which can be done is to
> match common suffixes. See the code that gets generated for "st(0)",
> "st(1)", etc.
Ah, that could be interesting, match from both ends! :)
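
A rough sketch of the suffix half of that idea, assuming a hypothetical helper named commonSuffixLen (nothing like it is in the patch under review). It measures how many trailing characters a set of strings shares, so a matcher could check them once up front:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: number of trailing characters shared by every string.
std::size_t commonSuffixLen(const std::vector<std::string> &Strs) {
  assert(!Strs.empty() && "need at least one string");
  const std::string &First = Strs[0];
  std::size_t Len = First.size();
  for (const std::string &S : Strs) {
    std::size_t I = 0;
    // Walk backwards from the end of both strings while they agree.
    while (I < Len && I < S.size() &&
           S[S.size() - 1 - I] == First[First.size() - 1 - I])
      ++I;
    Len = I; // The shared suffix can only shrink as more strings are seen.
  }
  return Len;
}
```

For "st(0)", "st(1)" and so on, the shared suffix is the closing parenthesis, which leaves only the digit to distinguish once the common prefix has also been matched.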
>> +/// EmitStringMatcher - Given a list of strings and code to execute when they
>> +/// match, output a simple switch tree to classify the input string. If a
>> +/// match is found, the code in Vals[i].second is executed. This code should do
>> +/// a return to avoid falling through. If nothing matches, execution falls
>> +/// through. StrVariableName is the name of the variable to test.
>
> IMHO, we should just implement this as a (string -> unsigned) matcher.
> That's very frequently the use case, and when it isn't you aren't
> necessarily worse off by using it as (string -> unsigned -> my generic
> code), and you end up with more readable code (instead of intertwining
> the generic actions with the matching code).
>
> This lets the matcher implement assorted fun optimizations, like:
>
> 1. { "0" -> a + b*0, "1" -> a + b*1, ... "9" -> a + b*9 } to { "[0-9]"
> -> a + (char - '0') * b }.
>
> 2. { "foo" -> 1, "bar" -> 1, "baz" -> 1, etc -> 0 } into a hash match.
I'd rather just make it generic and implement those optimizations in
the code generator as we discussed on IRC, but I really don't care
that much. I don't see a big reason to implement these in tblgen. If
you're going to implement them, it would make more sense to do them in
one place that is used for all switches.
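
To make the two proposed optimizations concrete, here is a sketch of what the folded forms might look like when written by hand. Everything in it is an assumption for illustration: matchDigit, isKnownName, Base and Stride are hypothetical names (Base and Stride stand in for the a and b of the quoted example), and neither tblgen nor the code generator is claimed to emit this.

```cpp
#include <string>
#include <unordered_set>

// Optimization 1, folded by hand: ten cases "0".."9" mapping to
// Base + Stride * digit collapse into one range check plus arithmetic.
// Returns true and sets Result on a match, false otherwise.
bool matchDigit(const std::string &Str, unsigned Base, unsigned Stride,
                unsigned &Result) {
  if (Str.size() != 1 || Str[0] < '0' || Str[0] > '9')
    return false;
  Result = Base + static_cast<unsigned>(Str[0] - '0') * Stride;
  return true;
}

// Optimization 2, folded by hand: when many strings map to the same value,
// a hash-set membership test can replace the switch tree.
unsigned isKnownName(const std::string &Str) {
  static const std::unordered_set<std::string> Names = {"foo", "bar", "baz"};
  return Names.count(Str) ? 1u : 0u;
}
```

Either form depends on seeing the whole (string -> unsigned) table at once, which is the trade-off being debated: a matcher that only receives opaque code snippets cannot tell that the results follow a pattern.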
-Chris