[PATCH] D74687: [LLD][ELF] - Linker script: do not fail parsing when "/DISCARD/" follows the fill expression.
Fangrui Song via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 26 08:51:28 PST 2020
MaskRay added a comment.
In D74687#1893041 <https://reviews.llvm.org/D74687#1893041>, @grimar wrote:
> In D74687#1880879 <https://reviews.llvm.org/D74687#1880879>, @MaskRay wrote:
>
> > I prefer an alternative: limit `=` to accept `readPrimary` instead of `readFill`:
> >
> > if (peek() == "=" || peek().startswith("=")) {
> > inExpr = true;
> > consume("=");
> > cmd->filler = readFill();
> > inExpr = false;
> > }
> >
> >
> >
>
>
> Honestly I just do not feel comfortable to do this change.
> If I currectly understood, you're suggesting something like:
>
> if (peek() == "=" || peek().startswith("=")) {
> inExpr = true;
> consume("=");
> uint64_t value = readPrimary()().val;
> std::array<uint8_t, 4> buf;
> write32be(buf.data(), (uint32_t)value);
>
> cmd->filler = buf;
> inExpr = false;
> }
>
>
> With it we stop supporting syntax like `SECTIONS { .mysec : { *(.mysec*) } =0x11223300+0x44 }`.
> It is a deviation from the specification which says that "fillexp is an expression" and mentions a unary +.
>
> > As you can see, GNU ld's `=` syntax is weird...
>
> Yes, but it the current state LLD seems to have a reasonable behavior.
> I am not sure we should intentionally ignore specification and remove this already supported feature just because of a specific issue
> with "/DISCARD/".
See my example in https://reviews.llvm.org/D74687#1878409 I suspect `=0x10+2+3` never works properly in GNU ld.
If we disallow that in lld, a user can easily work around the limitation by using `=(0x10+2+3)`.
In D74687#1893065 <https://reviews.llvm.org/D74687#1893065>, @grimar wrote:
> In D74687#1893064 <https://reviews.llvm.org/D74687#1893064>, @psmith wrote:
>
> > Would it make sense to make /DISCARD/ its own token recognised by the Lexer? I believe that is what BFD does. It would prevent it from being recognised as "/" "DISCARD" "/". Apologies if this is not appropriate, I'm not too familiar with this part of the code.
>
>
> It is what my patch does :) "/DISCARD/" is a single token initially, but during parsing of the FILL expression,
> we call `tokenizeExpr` to return "3", "*" and "5" tokens for "3*5", for example.
>
> I.e. `tokenizeExpr` should probably ignore and don't split the "/DISCARD/", and that is what I do.
The trailing tokens of one output section description can cause other ambiguity when parsing the next output section description:
SECTIONS {
.foo : { } =3
/a : { }
}
I think it is just not necessary to support a syntax that will inherently cause problems, especially when there is no functionality loss.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D74687/new/
https://reviews.llvm.org/D74687
More information about the llvm-commits
mailing list