[PATCH] D14257: [AsmParser] Generalize matching for grammars without mnemonic-lead statements.

Colin LeMahieu via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 4 15:41:09 PST 2016


colinl added a comment.

Hey everyone, I have a revert up for review which undoes the changes in this three series of patches.  http://reviews.llvm.org/D15874

In the mean time here's an expanded rationale for the changes that were made.

A couple regressions:
size_t to unsigned and string::find to StringRef::find due to merge issues.  Even though rebasing before committing was performed, the noise due to trying to maintain 80 column formatting both in source and it generated code complicated the merge and these two issues were missed.  My apologies, I'll try to be more careful.

MatchTable size:
The match table is an N x M array of match entries where N is the number of instructions and M is the number of operands for the instruction with the largest number of operands in the entire instruction set.  Instructions that use fewer operands than the longest instruction are value-initialized as unspecified array elements.  The issue is if the number of operands for the longest instruction changes, the entire array is resized to N x (M + 1).  By adding the initial mnemonic to the match table we hit this case which increased the binary size of the tools themselves.

r256669 was created to address this issue
An alternative I would have suggested in a code review for this patch would be to remove Mnemonic from OperandMatchEntry and rewrite the binary search that uses LessOpcodeOperand to use OperandMatchEntry.Classes[0]  This should address the binary size issue without introducing the HasMnemonicFirst flag and still allowing instructions without a leading mnemonic.

matchTokenString function size
This function converts known strings to an enum value via a nested switch table.  By adding mnemonics to the generated match set we're indeed increasing the size of this function.  If we remove OperandMatchEntry.Mnemonic and use Classes[0] we'd be making use of these code paths, otherwise it's true we could optimize this and only add strings that were actually needed for target that have HasMnemonicFirst=0.

Linear search
Right now the match loop narrows down the match table by the first entry, which is assumed to be a string, and does a linear search within these entries to find a match.  When expanding to a generalized search the first impulse is to try and perform a binary search on matches though there are some issues.

Some backends don't allow some combinations of operand fitting tests to be performed, they have asserts in some branches that aren't hit with a linear search but are hit with a binary search.  See ARMAsmParser::checkTargetMatchPredicate which has some asserts when certain checks don't pass despite the fact that the matcher loop is designed to survive match predicate failures with "if ((MatchResult = checkTargetMatchPredicate(Inst)) != Match_Success)"

Some backends test specific error messages out of the assembler, see noneon-diagnostics.s.  Since the matcher loop tracks the last instruction tested and prints diagnostics for this instruction sometimes, these tests would need to be changed.

The fundamental issue with binary searching the match entries has to do with the fact that we can't generate a total order for all operand fits.  Consider the register classes
ClassA{reg1, reg2}, ClassB{reg1, reg3}, ClassC{reg2, reg3}.  There's no way to sort these register classes such that valid entries are in a contiguous region which makes binary searching not work.  I think the ARM backend had register classes in this category that tripped up the matcher.  Also consider the immediate classes s8, s16, u8, u16 when trying to match -1 and 257.  There's no way to sort these immediate classes such that there's a contiguous match region.

If someone can propose an efficient way to do this searching I'd be interested, it'd be even more interesting if it didn't involve rewriting all the assembler backends.


Repository:
  rL LLVM

http://reviews.llvm.org/D14257





More information about the llvm-commits mailing list