[lld] [LLD] Fix crash on parsing ':ALIGN' in linker script (PR #146723)

Sat Jul 5 00:41:45 PDT 2025

https://github.com/parth-07 updated https://github.com/llvm/llvm-project/pull/146723

>From 2cbaacba40bb6ea1b3c3a0df97dc910d8f8a8c6c Mon Sep 17 00:00:00 2001
From: Parth Arora <partaror at qti.qualcomm.com>
Date: Wed, 2 Jul 2025 07:32:41 -0700
Subject: [PATCH] [LLD] Fix crash on parsing ':ALIGN' in linker script

The linker crashed with a stack overflow when parsing ':ALIGN' in
an output section description. This commit fixes the linker script
parser so that the crash no longer happens.

The root cause of the stack overflow is the way we parse expressions
(readExpr) in a linker script, combined with the behavior of the
ScriptLexer::expect(...) utility. ScriptLexer::expect does nothing if
errors have already been encountered during linker script parsing. In
particular, it never advances the current token position in the script
file, even if the current token matches the expected token. This causes
an infinite call cycle when parsing an expression such as '(4096)' after
an error has already been encountered.

readExpr() calls readPrimary()
readPrimary() calls readParenExpr()

readParenExpr():

  expect("("); // no-op, current token still points to '('
  Expr e = readExpr(); // The cycle continues...
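
The cycle above, and the effect of an early return at the top of
readExpr(), can be reproduced with a small standalone model. This is only
a sketch: ToyLexer and ToyParser are hypothetical stand-ins, not lld's
classes, and it assumes that atEOF() also reports true once an error has
been recorded, which the description above does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical toy lexer; a stand-in for illustration, not lld's ScriptLexer.
struct ToyLexer {
  std::vector<std::string> tokens;
  std::size_t pos = 0;
  int errorCount = 0;

  // Assumption: a recorded error is treated the same as end of input.
  bool atEOF() const { return errorCount != 0 || pos >= tokens.size(); }

  // Mirrors the behavior described above: once an error has been
  // recorded, expect() returns without advancing the token position.
  void expect(const std::string &tok) {
    if (errorCount != 0)
      return;
    if (pos < tokens.size() && tokens[pos] == tok) {
      ++pos;
      return;
    }
    ++errorCount;
  }
};

using Expr = std::function<long()>;

// Hypothetical toy parser with the same call shape as the cycle above.
struct ToyParser {
  ToyLexer lex;
  int calls = 0; // number of readExpr() invocations

  Expr readExpr() {
    ++calls;
    // The guard: without it, a prior error makes readParenExpr() call
    // readExpr() on the same '(' token forever.
    if (lex.atEOF())
      return [] { return 0L; };
    return readPrimary();
  }

  Expr readPrimary() {
    if (lex.tokens[lex.pos] == "(")
      return readParenExpr();
    // Toy simplification: any other token must be a decimal number.
    long value = std::stol(lex.tokens[lex.pos++]);
    return [value] { return value; };
  }

  Expr readParenExpr() {
    lex.expect("(");
    Expr e = readExpr();
    lex.expect(")");
    return e;
  }
};
```

With the tokens "(", "4096", ")" and errorCount preset to 1, readExpr()
in this model returns immediately and yields 0. Removing the guard makes
the same input recurse until the stack is exhausted.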

Closes #146722

Signed-off-by: Parth Arora <partaror at qti.qualcomm.com>
---
 lld/ELF/ScriptParser.cpp                     |  2 ++
 lld/test/ELF/linkerscript/align-section.test | 21 ++++++++++++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/lld/ELF/ScriptParser.cpp b/lld/ELF/ScriptParser.cpp
index 593d5636f2455..ddfa24d9cacf5 100644
--- a/lld/ELF/ScriptParser.cpp
+++ b/lld/ELF/ScriptParser.cpp
@@ -1229,6 +1229,8 @@ SymbolAssignment *ScriptParser::readSymbolAssignment(StringRef name) {
 // This is an operator-precedence parser to parse a linker
 // script expression.
 Expr ScriptParser::readExpr() {
+  if (atEOF())
+    return []() { return 0; };
   // Our lexer is context-aware. Set the in-expression bit so that
   // they apply different tokenization rules.
   SaveAndRestore saved(lexState, State::Expr);
diff --git a/lld/test/ELF/linkerscript/align-section.test b/lld/test/ELF/linkerscript/align-section.test
index 7a28fef2076ed..f8c00dd27b005 100644
--- a/lld/test/ELF/linkerscript/align-section.test
+++ b/lld/test/ELF/linkerscript/align-section.test
@@ -1,7 +1,24 @@
 # REQUIRES: x86
-# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux /dev/null -o %t.o
-# RUN: ld.lld -o %t --script %s %t.o -shared
+# RUN: rm -rf %t && split-file %s %t && cd %t
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux /dev/null -o a.o
+# RUN: ld.lld --script a.t a.o -shared
 
 # lld shouldn't crash.
 
+#--- a.t
 SECTIONS { .foo : ALIGN(2M) {} }
+
+# RUN: not ld.lld --script b.t 2>&1 | FileCheck %s
+
+# lld should not crash and report the error properly.
+
+# CHECK: error: b.t:3: malformed number: :
+# CHECK: >>>   S :ALIGN(4096) {}
+# CHECK: >>>     ^
+
+#--- b.t
+SECTIONS
+{
+  S :ALIGN(4096) {}
+}