[PATCH] D14257: [AsmParser] Generalize matching for grammars without mnemonic-lead statements.

Mon Jan 4 17:12:02 PST 2016

colinl added a comment.

I agree about the scalability of the match table.  Part of the goal with this design was to impact other targets as little as possible.  If a design had required a rewrite of all targets, it would probably have been a non-starter as far as getting Hexagon parsing working.  The change preserved the existing behavior of string match + small linear search if the first actual parsed operand was a string.  This was so that matching on existing targets would see no performance impact.
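As a rough illustration of the "string match + small linear search" scheme mentioned above, here is a minimal sketch.  It assumes a table sorted by mnemonic; `MatchEntry`, `OpKind`, `exampleTable` and `matchInstruction` are hypothetical names invented for this example and are not the TableGen-generated matcher.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for illustration only.
enum class OpKind { Reg, Imm };

struct MatchEntry {
  std::string Mnemonic;         // leading string operand of the statement
  std::vector<OpKind> Operands; // operand kinds that follow the mnemonic
  unsigned Opcode;              // made-up opcode number
};

// Orders entries by mnemonic, and allows comparing an entry to a bare string.
struct MnemonicLess {
  bool operator()(const MatchEntry &L, const MatchEntry &R) const {
    return L.Mnemonic < R.Mnemonic;
  }
  bool operator()(const MatchEntry &L, const std::string &R) const {
    return L.Mnemonic < R;
  }
  bool operator()(const std::string &L, const MatchEntry &R) const {
    return L < R.Mnemonic;
  }
};

// A tiny example table, already sorted by mnemonic.
std::vector<MatchEntry> exampleTable() {
  return {
      {"add", {OpKind::Reg, OpKind::Imm}, 1},
      {"add", {OpKind::Reg, OpKind::Reg}, 2},
      {"mov", {OpKind::Reg, OpKind::Reg}, 3},
  };
}

// Returns the opcode of the first matching entry, or -1 if nothing matches.
int matchInstruction(const std::vector<MatchEntry> &Table,
                     const std::string &Mnemonic,
                     const std::vector<OpKind> &Operands) {
  assert(std::is_sorted(Table.begin(), Table.end(), MnemonicLess()));
  // Step 1: binary search narrows the table to entries with this mnemonic.
  auto Range =
      std::equal_range(Table.begin(), Table.end(), Mnemonic, MnemonicLess());
  // Step 2: short linear scan over that range, comparing operand kinds.
  for (auto It = Range.first; It != Range.second; ++It)
    if (It->Operands == Operands)
      return static_cast<int>(It->Opcode);
  return -1;
}
```

The linear scan stays short only because the mnemonic lookup narrows the candidates first.  A statement with no leading string operand has nothing to search on, so in this sketch it would have to scan the whole table.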

Was the majority of the issue with these changes related to the tool code size increase?  I obviously miscalculated the impact: I thought the size increase would be minimal and would only affect machines that are running the assembler, presumably machines with sufficient resources to deal with one extra operand.

The Hexagon instructions can be parsed easily with a parser generator that accepts EBNF style definitions.  The biggest issue is that we have separate parsing and matching loops.  We do one pass through the statement to parse tokens into operands and then a second pass to match operands to instructions.
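To make the two-pass structure concrete, here is a minimal sketch under simplifying assumptions: registers are spelled `r` followed by digits, every punctuation character is its own token, and patterns use a made-up `$reg` placeholder.  `Operand`, `Pattern`, `parseOperands`, `matchOperands` and `examplePatterns` are hypothetical names for this example and do not correspond to the actual AsmParser classes.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical operand type: either a literal token or a register.
struct Operand {
  enum Kind { Token, Reg };
  Kind K;
  std::string Text;
};

// Assumption for this sketch: a register is 'r' followed by one or more digits.
bool isRegister(const std::string &Word) {
  if (Word.size() < 2 || Word[0] != 'r')
    return false;
  for (std::size_t I = 1; I < Word.size(); ++I)
    if (!std::isdigit(static_cast<unsigned char>(Word[I])))
      return false;
  return true;
}

// Pass 1: walk the statement once and turn tokens into operands.
std::vector<Operand> parseOperands(const std::string &S) {
  std::vector<Operand> Ops;
  std::size_t I = 0;
  while (I < S.size()) {
    unsigned char C = static_cast<unsigned char>(S[I]);
    if (std::isspace(C)) {
      ++I;
    } else if (std::isalnum(C)) {
      std::size_t J = I;
      while (J < S.size() && std::isalnum(static_cast<unsigned char>(S[J])))
        ++J;
      std::string Word = S.substr(I, J - I);
      Ops.push_back({isRegister(Word) ? Operand::Reg : Operand::Token, Word});
      I = J;
    } else {
      Ops.push_back({Operand::Token, std::string(1, S[I])}); // punctuation
      ++I;
    }
  }
  return Ops;
}

// Hypothetical pattern: "$reg" marks a register slot, anything else is literal.
struct Pattern {
  std::vector<std::string> Shape;
  unsigned Opcode; // made-up opcode number
};

std::vector<Pattern> examplePatterns() {
  return {{{"$reg", "=", "add", "(", "$reg", ",", "$reg", ")"}, 1}};
}

// Pass 2: compare the finished operand list against each pattern in turn.
int matchOperands(const std::vector<Operand> &Ops,
                  const std::vector<Pattern> &Table) {
  for (const Pattern &P : Table) {
    if (P.Shape.size() != Ops.size())
      continue;
    bool OK = true;
    for (std::size_t I = 0; I < Ops.size() && OK; ++I) {
      if (P.Shape[I] == "$reg")
        OK = Ops[I].K == Operand::Reg;
      else
        OK = Ops[I].K == Operand::Token && Ops[I].Text == P.Shape[I];
    }
    if (OK)
      return static_cast<int>(P.Opcode);
  }
  return -1;
}
```

With a statement such as `r0 = add(r1, r2)`, the first operand is a register rather than a string, so pass 2 in this sketch has no mnemonic to key on and falls back to scanning every pattern.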

Fixing this would require either slicing these loops differently or overhauling the matcher with something that generates more generalized matching grammars.  There's a lot of code in various targets dedicated to pasting tokens together or splitting tokens apart, so making this type of change seems like it would require all targets to transition, which is what I was trying to avoid.

Let me know your thoughts; I'm open to suggestions.

Repository:
  rL LLVM

http://reviews.llvm.org/D14257