[PATCH] D98306: [ELF] Support . and $ in symbol names in expressions

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 10 10:32:03 PST 2021


MaskRay added a comment.

In D98306#2616116 <https://reviews.llvm.org/D98306#2616116>, @peter.smith wrote:

> Noticed that ld.bfd will also accept ~ which may be worth adding as well; other than that looks reasonable to me.

We could support `~` in non-leading positions in a symbol name but that would just add unneeded complexity.

`~` is also a valid unary operator, so even if we allowed it inside symbol names we would still have to tokenize a leading `~` as an operator. A simple attempt

  --- a/lld/ELF/ScriptLexer.cpp
  +++ b/lld/ELF/ScriptLexer.cpp
  @@ -174,3 +174,3 @@ bool ScriptLexer::atEOF() { return errorCount() || tokens.size() == pos; }
   static std::vector<StringRef> tokenizeExpr(StringRef s) {
  -  StringRef ops = "+-*/:!~=<>"; // List of operators
  +  StringRef ops = "+-*/:!=<>"; // List of operators
   
  diff --git a/lld/ELF/ScriptParser.cpp b/lld/ELF/ScriptParser.cpp
  index 4b15a71f029b..0b10d3a67754 100644
  --- a/lld/ELF/ScriptParser.cpp
  +++ b/lld/ELF/ScriptParser.cpp
  @@ -1236,2 +1236,9 @@ static void checkIfExists(OutputSection *cmd, StringRef location) {
   
  +static bool isValidSymbolName(StringRef s) {
  +  auto valid = [](char c) {
  +    return isAlnum(c) || c == '$' || c == '.' || c == '_' || c == '~';
  +  };
  +  return !s.empty() && !isDigit(s[0]) && s[0] != '~' && llvm::all_of(s, valid);
  +}
  +
   Expr ScriptParser::readPrimary() {

does not work because `symbol5 = symbol - ~0xfffb;` in `symbol-assignexpr.s` fails to parse.
With `~` removed from the operator list, `tokenizeExpr` no longer splits off the leading `~` in `~0xfffb`; it would need extra logic to recognize a leading `~` as a unary operator.
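
To make the failure concrete, here is a minimal standalone sketch, not the lld code: `splitByOps` is a hypothetical, simplified stand-in for `tokenizeExpr` (it uses `std::string_view` rather than `StringRef` and ignores two-character operators), and `isValidSymbolNameSketch` approximates the predicate from the diff above using `std::isalnum`.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for tokenizeExpr: split `s`, treating every
// character in `ops` as a one-character operator token.
std::vector<std::string> splitByOps(std::string_view s, std::string_view ops) {
  std::vector<std::string> ret;
  while (!s.empty()) {
    std::size_t e = s.find_first_of(ops);
    if (e == std::string_view::npos) { // no operator left, keep the rest
      ret.emplace_back(s);
      break;
    }
    if (e != 0) // token before the operator
      ret.emplace_back(s.substr(0, e));
    ret.emplace_back(s.substr(e, 1)); // the operator itself
    s.remove_prefix(e + 1);
  }
  return ret;
}

// Approximation of the predicate in the diff: `~` allowed, but not leading.
bool isValidSymbolNameSketch(std::string_view s) {
  auto valid = [](unsigned char c) {
    return std::isalnum(c) || c == '$' || c == '.' || c == '_' || c == '~';
  };
  return !s.empty() && !std::isdigit(static_cast<unsigned char>(s[0])) &&
         s[0] != '~' && std::all_of(s.begin(), s.end(), valid);
}
```

With `~` in the operator list, `splitByOps("~0xfffb", "+-*/:!~=<>")` yields the two tokens `~` and `0xfffb`. Without it, `splitByOps("~0xfffb", "+-*/:!=<>")` yields the single token `~0xfffb`, which is neither a number nor a name accepted by `isValidSymbolNameSketch`.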

Putting these together, I think accepting `~` in symbol names is a misdesign in GNU ld. Since no practical symbol names use `~`, I am inclined not to support it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98306/new/

https://reviews.llvm.org/D98306


