[flang-commits] [flang] [flang] Adjust prescanner fix for preprocessing (PR #122779)

Peter Klausler via flang-commits flang-commits at lists.llvm.org
Mon Jan 13 12:01:28 PST 2025


https://github.com/klausler created https://github.com/llvm/llvm-project/pull/122779

Because commas are optional in FORMAT statements, tokenizing things like 3I9HHOLLERITH is tricky.  After tokenizing the initial '3', we don't want to take the apparent identifier "I9HHOLLERITH" as the next token, so the prescanner just consumes the letter ("I") as its own token in this context.

A recent bug report pointed out that this can lead to incorrect results when the letter is the name of a defined preprocessing macro.  As a first fix, I updated the prescanner to check that the letter is actually followed by a problematic Hollerith literal (decimal digits and then 'H').

That first fix broke two tests in the Fujitsu Fortran test suite that Linaro runs, because the check couldn't detect a following Hollerith literal that wasn't on the same source line, and NextToken() can't do look-ahead line continuation processing.

So here's a second attempt at fixing the original problem: the letter that follows a decimal integer token is checked to see whether it's the name of a defined macro, and it is consumed as its own token only when it is not.
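
The decision described above can be sketched in isolation.  This is a toy model, not the prescanner code: SplitLetterAfterInteger and MacroNames are hypothetical names standing in for the condition in Prescanner::NextToken() and for the preprocessor's table of defined macros, and the preventHollerith_ flag from the real condition is left out.  Exact-match lookup of the one-letter name is also an assumption of the sketch.

```cpp
#include <cctype>
#include <set>
#include <string>

// Hypothetical stand-in for the preprocessor's set of defined macro names.
using MacroNames = std::set<std::string>;

// Toy model of the check: after a decimal integer token inside parentheses,
// should the next letter be emitted as a one-character token of its own?
// It is split off (as for the I in FORMAT(3I9HHOLLERITH)) unless that
// single letter is the name of a defined macro.
bool SplitLetterAfterInteger(
    char letter, int parenthesisNesting, const MacroNames &defined) {
  return std::isalpha(static_cast<unsigned char>(letter)) != 0 &&
      parenthesisNesting > 0 &&
      defined.count(std::string(1, letter)) == 0;
}
```

With no macros defined, the I that follows 3 inside parentheses is split off, which keeps I9HHOLLERITH from being read as an identifier.  With "a" defined, as in the test case, the letter is left alone so that it can be tokenized as a name and macro-replaced.  Neither case needs to look at what follows the letter, so a continuation line no longer matters.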

From e24c60fe85a2f639111eff3ad0d1912a9840c1a5 Mon Sep 17 00:00:00 2001
From: Peter Klausler <pklausler at nvidia.com>
Date: Mon, 13 Jan 2025 11:45:45 -0800
Subject: [PATCH] [flang] Adjust prescanner fix for preprocessing

Commas being optional in FORMAT statements, the tokenization
of things like 3I9HHOLLERITH is tricky.  After tokenizing
the initial '3', we don't want to then take apparent identifier
"I9HHOLLERITH" as the next token.  So the prescanner just consumes
the letter ("I") as its own token in this context.

A recent bug report complained that this can lead to incorrect
results when (in this case) the letter is a defined preprocessing
macro.  I updated the prescanner to check that the letter is
actually followed by an instance of a problematic Hollerith literal.

And this broke two tests in the Fujitsu Fortran test suite that
Linaro runs, as it couldn't detect a following Hollerith literal
that wasn't on the same source line.  We can't do look-ahead
line continuation processing in NextToken(), either.

So here's a second attempt at fixing the original problem:
namely, the letter that follows a decimal integer token
is checked to see whether it's the name of a defined macro.
---
 flang/lib/Parser/prescan.cpp         | 22 +++++-----------------
 flang/test/Preprocessing/bug129131.F |  3 +++
 2 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp
index 703a02792a1c4e..c5939a1e0b6c2c 100644
--- a/flang/lib/Parser/prescan.cpp
+++ b/flang/lib/Parser/prescan.cpp
@@ -708,23 +708,11 @@ bool Prescanner::NextToken(TokenSequence &tokens) {
       EmitCharAndAdvance(tokens, *at_);
       QuotedCharacterLiteral(tokens, start);
     } else if (IsLetter(*at_) && !preventHollerith_ &&
-        parenthesisNesting_ > 0) {
-      const char *p{at_};
-      int digits{0};
-      for (;; ++digits) {
-        ++p;
-        if (InFixedFormSource()) {
-          p = SkipWhiteSpace(p);
-        }
-        if (!IsDecimalDigit(*p)) {
-          break;
-        }
-      }
-      if (digits > 0 && (*p == 'h' || *p == 'H')) {
-        // Handles FORMAT(3I9HHOLLERITH) by skipping over the first I so that
-        // we don't misrecognize I9HOLLERITH as an identifier in the next case.
-        EmitCharAndAdvance(tokens, *at_);
-      }
+        parenthesisNesting_ > 0 &&
+        !preprocessor_.IsNameDefined(CharBlock{at_, 1})) {
+      // Handles FORMAT(3I9HHOLLERITH) by skipping over the first I so that
+      // we don't misrecognize I9HHOLLERITH as an identifier in the next case.
+      EmitCharAndAdvance(tokens, *at_);
     }
     preventHollerith_ = false;
   } else if (*at_ == '.') {
diff --git a/flang/test/Preprocessing/bug129131.F b/flang/test/Preprocessing/bug129131.F
index 5b1a914a2c9e35..00aba5da2c7cbd 100644
--- a/flang/test/Preprocessing/bug129131.F
+++ b/flang/test/Preprocessing/bug129131.F
@@ -1,5 +1,8 @@
 ! RUN: %flang -fc1 -fdebug-unparse %s 2>&1 | FileCheck %s
 ! CHECK: PRINT *, 2_4
+! CHECK: PRINT *, 1_4
 #define a ,3
       print *, mod(5 a)
+      print *, mod(4 a
+     +)
       end


