[PATCH] D104572: [lit] Add the ability to parse regexes in Lit boolean expressions
Joel E. Denny via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 28 11:48:00 PDT 2021
jdenny added a comment.
There should be some user-level documentation about this new feature. It looks like the appropriate place is:
https://llvm.org/docs/TestingGuide.html#constraining-test-execution
================
Comment at: llvm/utils/lit/lit/BooleanExpression.py:184
self.assertTrue(BooleanExpression.evaluate('-os', {}, triple))
self.assertFalse(BooleanExpression.evaluate('arch-os', {}, triple))
----------------
Should there be some minimal test somewhere (maybe here) ensuring that a regex is handled as a literal string when matching triples? I think that means it cannot ever match (due to `{{` and `}}`), so maybe it's not a very interesting behavior, but we still might want to be aware if we accidentally change it.
================
Comment at: llvm/utils/lit/lit/BooleanExpression.py:114
self.expect(')')
elif not BooleanExpression.isIdentifier(self.token):
raise ValueError("expected: '!' or '(' or identifier\nhave: %s" %
----------------
ldionne wrote:
> jdenny wrote:
> > `isIdentifier` seems misnamed if it accepts `{{`.
> >
> > The logic should probably be something like either `not isIdentifier and not isRegexOpen` or `not isIdentiferOrRegexOpen`.
> I agree that `isIdentifier` is misnamed, I'll fix that.
>
> I also agree that it should be something like `not isIdentifier and not isRegexOpen`, however this implementation does not track the fact that we're parsing the content of a regular expression -- it doesn't know that it's inside a `{{`. Instead, the tokenization pattern was augmented to treat anything with `{{<whatever>}}` as a token of its own - that was by far the simplest way I could find to implement it. So instead, I believe the fix is to simply rename `isIdentifier` to `isMatchExpression`, and to acknowledge that we now have a new leaf in the grammar, and that `isMatchExpression` basically allows us to detect that.
That seems reasonable. Thanks for addressing it.
================
Comment at: llvm/utils/lit/lit/BooleanExpression.py:115
elif not BooleanExpression.isIdentifier(self.token):
raise ValueError("expected: '!' or '(' or identifier\nhave: %s" %
self.quote(self.token))
----------------
ldionne wrote:
> jdenny wrote:
> > Shouldn't this mention `{{` as a possibility?
> I'll say `expected '!' or '(' or match-expression` instead, LMK if that's not satisfying.
While that does reflect the implementation, I think `expected '!', '(', '{{', or identifier` would be more meaningful to a user who isn't familiar with the internal grammar symbol names. As far as they know, `match-expression` could be, for example, the start symbol for the whole grammar. In contrast, `{{` and `identifier` are probably clear enough to any user.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D104572/new/
https://reviews.llvm.org/D104572
More information about the llvm-commits
mailing list