[PATCH] [lld] Teach LLD how to parse complete linker scripts

Thu Oct 23 20:02:03 PDT 2014

> Note: Sorry for the size of this patch. I started with the goal of incrementally
> developing this parser but ultimately failed. On the other side, the logic of the
> parser is kind of repetitive, as one would expect, so this patch does not feature
> lots of new things.

It's totally OK (and I think normal) to go back and split the history into incremental pieces after the fact if you are unhappy with the patch size. That way, you can still get the same high-quality code review that you would have gotten by incrementally developing it, which is most of the benefit of the incremental approach.


As a sanity check, I would also recommend running this over all the linker scripts in the linux kernel and the linker scripts from a typical embedded application. If you haven't already, definitely take a read through http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-December/057421.html

Also, to test your "dump" functionality, you should verify that the various projects still work if you replace the real scripts with "dump"ed scripts.

In test/LinkerScript/sections.test, it looks like you directly used the tool output as the expected output for the test. This doesn't give any confidence in the correctness by itself; if you haven't already done so, definitely try to build some code using the dumped script in place of the real script as a sanity check.

Also, please add links (or at least mention) any resources you used when developing this. What did you use as your "langref"? In the future, somebody is going to have to debug this code and they need to be able to determine if it is doing the right thing (or what the right thing is supposed to be).

================
Comment at: lib/ReaderWriter/LinkerScript.cpp:423-424
@@ -121,17 +422,4 @@
     }
-    // -l<lib name>
-    if (_buffer.startswith("-l")) {
-      _buffer = _buffer.drop_front(2);
-      StringRef::size_type start = 0;
-      if (_buffer[start] == ':')
-        ++start;
-      if (!canStartName(_buffer[start]))
-        // Create 'unknown' token.
-        break;
-      auto libNameEnd =
-          std::find_if(_buffer.begin() + start + 1, _buffer.end(),
-                       [=](char c) { return !canContinueName(c); });
-      StringRef::size_type libNameLen =
-          std::distance(_buffer.begin(), libNameEnd);
-      tok = Token(_buffer.substr(0, libNameLen), Token::libname);
-      _buffer = _buffer.drop_front(libNameLen);
+    /// Handle slashes '/', which can be either an operator inside an expression
+    /// or the beginning of an identifier
+    if (_buffer.startswith("/=")) {
----------------
Don't use doxygen comments inside the function. Here and elsewhere.

================
Comment at: test/LinkerScript/expr-precedence.test:27-32
@@ +26,8 @@
+CHECK: number: 1
+CHECK: semicolon: ;
+CHECK: r_brace: }
+CHECK: eof:
+CHECK: SECTIONS
+CHECK: {
+CHECK: . = (foo >= (bar + 1)) ? bar : (1 - (-(-(-1))))
+CHECK: }
----------------
Having the token dump together with the AST dump is really gross. Could you commit a separate patch that enhances linker-script-test to have two different modes and then clean up these test cases?

================
Comment at: test/LinkerScript/missing-operand.test:19
@@ +18,3 @@
+CHECK: eof:
+CHECK-ERR: 6:15: error: expected symbol, number, minus or left parenthesis.
+ */
----------------
FileCheck has some special functionality for checking diagnostics that avoids the need to hard-code absolute line numbers. You should use it: http://llvm.org/docs/CommandGuide/FileCheck.html#filecheck-expressions

Also, it should be pretty easy to do caret diagnostics.

http://reviews.llvm.org/D5852