[llvm-dev] Linking Linux kernel with LLD
George Rimar via llvm-dev
llvm-dev at lists.llvm.org
Tue Jan 24 07:57:50 PST 2017
>Our tokenizer recognize
>
> [A-Za-z0-9_.$/\\~=+[]*?\-:!<>]+
>
>as a token. gold uses more complex rules to tokenize. I don't think we need that much complex rules, but there seems to be room to improve our tokenizer. In particular, I believe we can parse the Linux's linker script by changing the tokenizer rules as follows.
>
> [A-Za-z_.$/\\~=+[]*?\-:!<>][A-Za-z0-9_.$/\\~=+[]*?\-:!<>]*
>
>or
>
> [0-9]+?
After more investigation, it seems this will not work so simply.
Here are some examples where it breaks:
. = 0x1000; (gives tokens "0" and "x1000")
. = A*10; (gives the single token "A*10")
. = 10k; (gives tokens "10" and "k")
. = 10*5; (gives tokens "10" and "*5")
The number rule "[0-9]+" could be changed to "[0-9][kmhKMHx0-9]*",
but for "10*5" that still gives the tokens "10" and "*5".
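
To make the cases above easy to reproduce, here is a minimal Python sketch of the two proposed rules. It is not LLD code: the names HEAD, BODY and make_tokenizer are made up for this illustration, and treating any other non-space character (such as ";") as a one-character token is an assumption about how the rest of the tokenizer behaves.

```python
import re

# Characters allowed at the start of a token under the proposed rule
# (the original character set without the digits).
HEAD = r"A-Za-z_.$/\\~=+\[\]*?\-:!<>"
# Characters allowed in the rest of a token (digits are allowed here).
BODY = "0-9" + HEAD


def make_tokenizer(number_rule):
    """Build a tokenizer from the proposed word rule plus a number rule.

    Assumption: any other non-space character is a one-character token.
    """
    pattern = re.compile(rf"[{HEAD}][{BODY}]*|{number_rule}|\S")
    return pattern.findall


strict = make_tokenizer(r"[0-9]+")               # the rule as proposed
relaxed = make_tokenizer(r"[0-9][kmhKMHx0-9]*")  # the extended number rule

if __name__ == "__main__":
    for line in [". = 0x1000;", ". = A*10;", ". = 10k;", ". = 10*5;"]:
        print(f"{line:14} strict={strict(line)} relaxed={relaxed(line)}")
```

With the extended number rule, "0x1000" and "10k" come out as single tokens, but "10*5" is still split into "10" and "*5", and "A*10" stays a single token under both rules.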
And I do not think we can add special handling of operators there,
as it is hard to assume any context at the tokenizing step:
we do not know whether we are parsing a file name or a math expression.
Maybe it is worth trying to handle this at a higher level, during
evaluation of expressions?
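
One possible shape for the higher-level approach, as a rough Python sketch rather than a proposal for the actual code: the tokenizer stays context-free, and the parser re-splits a token on operators only once it knows it is inside an expression. The function name split_expr_token and the operator set are hypothetical and are not taken from LLD or gold.

```python
import re

# Hypothetical operator set, for illustration only. Two-character
# operators are listed first so that "<<" is not split into "<" and "<".
_EXPR_PIECE = re.compile(
    r"<<|>>|==|!=|<=|>=|&&|\|\|"   # two-character operators
    r"|[-+*/%&|~!<>=?:()]"         # one-character operators
    r"|[^-+*/%&|~!<>=?:()\s]+"     # operands: everything else
)


def split_expr_token(token):
    """Split one lexer token into operands and operators.

    Meant to be called only when the parser already knows it is inside an
    expression; a file name such as "a-b.o" must not be passed through it.
    """
    return _EXPR_PIECE.findall(token)


if __name__ == "__main__":
    for tok in ["A*10", "*5", "0x1000", "10k"]:
        print(tok, "->", split_expr_token(tok))
```

Under these assumptions "A*10" becomes "A", "*", "10" and "*5" becomes "*", "5", while "0x1000" and "10k" are left whole for the number parser to interpret.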
George.