[llvm-dev] Linking Linux kernel with LLD

Fri Jan 27 11:17:28 PST 2017

> Hmm..., the crux of not being able to lex arithmetic expressions seems to
> be due to lack of context sensitivity. E.g. consider `foo*bar`. Could be a
> multiplication, or could be a glob pattern.
>
> Looking at the code more closely, adding context sensitivity wouldn't be
> that hard. In fact, our ScriptParserBase class is actually a lexer (look at
> the interface; it is a lexer's interface). It shouldn't be hard to change
> from an up-front tokenization to a more normal lexer approach of scanning
> the text for each call that wants the next token. Roughly speaking, just
> take the body of the for loop inside ScriptParserBase::tokenize and add a
> helper which does that on the fly and is called by consume/next/etc.
> Instead of an index into a token vector, just keep a `const char *` pointer
> that we advance.
>
> Once that is done, we can easily add a `nextArithmeticToken` or something
> like that which just lexes with different rules.

I like that idea. I first thought of always having '*' as a token, but
then space has to be a token, which is an incredible pain.

I then thought of having a "setLexMode" method, but the lex mode can
always be implicit from where we are in the parser. The parser should
always know if it should call next or nextArithmetic.

And I agree we should probably implement this. Even if it is not common,
it looks pretty silly to not be able to handle 2*5.

Cheers,
Rafael