[llvm] [llvm][Tablegen][BUG] : The correct td file ending with #endif (there… (PR #69411)

zhao jiangtao via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 17 19:43:43 PDT 2023


https://github.com/whousemyname created https://github.com/llvm/llvm-project/pull/69411

### Brief introduction
A valid td file that ends with #endif (with no characters after #endif, not even a newline) still fails to compile. This PR fixes that bug.
The test file in this submission reproduces the problem when it is compiled with llvm-tblgen.

### Bug occurrence scenarios
The bug occurs when a td file ends with #endif and nothing follows it, not even a newline or a comment. Note that when you write a td file in vim and save it with :wq, vim automatically appends a newline as the last character, so you need to delete that trailing newline manually. Compiling the file with llvm-tblgen then fails.
For example, open an IDE (VS Code or similar), create a new file test.td, and enter the following content (make sure there are no characters after #endif):
```
#ifdef asdasdsasd
#endif
```
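Because many editors append a final newline on save, it can be easier to write the exact bytes programmatically. The following is only a minimal sketch for reproducing the input; `writeExact` is a hypothetical helper name and is not part of LLVM or of this PR.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not part of LLVM): writes `contents` to `path`
// byte-for-byte in binary mode, so no trailing newline is added unless
// `contents` already ends with one. Returns false if the write failed.
bool writeExact(const std::string &path, const std::string &contents) {
  std::ofstream out(path, std::ios::binary | std::ios::trunc);
  out.write(contents.data(), static_cast<std::streamsize>(contents.size()));
  return static_cast<bool>(out.flush());
}
```

Calling `writeExact("test.td", "#ifdef asdasdsasd\n#endif")` should leave a file whose last byte is the `f` of #endif.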
I can confirm that compiling this file with llvm-tblgen produces the following errors:
```
tdtest.td:2:7: error: Reached EOF without matching #endif
#endif
      ^
tdtest.td:1:8: error: The latest preprocessor control is here
#ifdef asdasdsasd
       ^
tdtest.td:2:7: error: Unexpected token at top level
#endif
```
However, if TableGen did not allow #endif at the end of a file, this would not be a bug, so I checked the official documentation and found the following rules:
```
LineEnd                ::=  newline | return | EOF
PreEndif               ::=  LineBegin (WhiteSpaceOrCComment)*
                            "#endif" (WhiteSpaceOrAnyComment)* LineEnd
```
I'm not sure what return means here, but since LineEnd includes EOF, it seems wrong that a file cannot end with #endif.
Adding NextChar == '\0' to the check fixes the compile error, but there may be a more correct way to solve this problem.
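To illustrate the idea outside of TGLexer, here is a minimal standalone sketch of a LineEnd check that also accepts EOF. It assumes the input buffer is NUL-terminated, so that '\0' can stand in for EOF; `isLineEnd` and `startsDirective` are hypothetical names, not functions from TGLexer.cpp, and the sketch ignores the whitespace and comments that the grammar allows after #endif.

```cpp
#include <cstring>
#include <string_view>

// Hypothetical helper: true when `c` may terminate a directive line.
// '\0' stands in for EOF, assuming a NUL-terminated buffer.
bool isLineEnd(char c) {
  return c == '\n' || c == '\r' || c == '\0';
}

// Hypothetical helper: true when the NUL-terminated text at `cur` begins
// with `word` and the very next character is a line end (or EOF).
bool startsDirective(const char *cur, std::string_view word) {
  // strncmp stops at the terminator, so a short buffer is not overrun.
  if (std::strncmp(cur, word.data(), word.size()) != 0)
    return false;
  return isLineEnd(cur[word.size()]);
}
```

With this check, `startsDirective("#endif", "#endif")` is accepted even though nothing follows the directive, whereas a check that only looks for '\n' or '\r' would reject it.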

From bcb61edab7b3673de5fa25224c1aa0c30e8c7f08 Mon Sep 17 00:00:00 2001
From: angryZ <lazytortoisezzzz at gmail.com>
Date: Wed, 18 Oct 2023 10:39:56 +0800
Subject: [PATCH] [llvm][Tablegen][BUG] : The correct td file ending with
 #endif (there are no other characters after #endif, including newlines) still
 cannot be compiled. This PR is to solve this bug.

---
 llvm/lib/TableGen/TGLexer.cpp           | 2 +-
 llvm/test/TableGen/prep-endif-diag-1.td | 8 ++++++++
 2 files changed, 9 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/TableGen/prep-endif-diag-1.td

diff --git a/llvm/lib/TableGen/TGLexer.cpp b/llvm/lib/TableGen/TGLexer.cpp
index d5140e91fce9e94..3bedbcd6307fd15 100644
--- a/llvm/lib/TableGen/TGLexer.cpp
+++ b/llvm/lib/TableGen/TGLexer.cpp
@@ -664,7 +664,7 @@ tgtok::TokKind TGLexer::prepIsDirective() const {
           // It looks like TableGen does not support '\r' as the actual
           // carriage return, e.g. getNextChar() treats a single '\r'
           // as '\n'.  So we do the same here.
-          NextChar == '\r')
+          NextChar == '\r' || NextChar == '\0')
         return Kind;
 
       // Allow comments after some directives, e.g.:
diff --git a/llvm/test/TableGen/prep-endif-diag-1.td b/llvm/test/TableGen/prep-endif-diag-1.td
new file mode 100644
index 000000000000000..95151e0dfb360d2
--- /dev/null
+++ b/llvm/test/TableGen/prep-endif-diag-1.td
@@ -0,0 +1,8 @@
+// // RUN: not llvm-tblgen %s 2>&1 | FileCheck %s
+
+
+// CHECK: error: Reached EOF without matching #endif
+// I think this is a correct td file (at the syntax 
+// level), but it still compiles incorrectly.
+#ifdef adasd
+#endif
\ No newline at end of file


