[PATCH] D82157: Fix crash on `user defined literals`

Dmitri Gribenko via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jul 8 05:54:44 PDT 2020


gribozavr2 added a comment.

> Fix crash on `user defined literals`

WDYT:

Implement support for user defined literals (which also fixes a crash)

> Given an UserDefinedLiteral 1.2_w:
>  Problem: Lexer generates one Token for the literal, but ClangAST
>  references two source locations
>  Fix: Ignore the operator and interpret it as the underlying literal.
>  e.g.: 1.2_w token generates syntax node IntegerLiteral(1.2_w)

WDYT:

A user defined literal (for example, `1.2_w`) is one token. The Clang AST for a user defined literal references two source locations: the beginning of the token (the location of `1` in `1.2_w`) and the beginning of the suffix (the location of `_`). When constructing the syntax tree, we were trying to find a token that starts at the underscore, but couldn't find one, and crashed on an assertion failure. To fix this issue, we ignore the Clang AST nodes for UDLs whose source locations point into the middle of a token.
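
To make the failure mode concrete, here is a minimal sketch of the lookup that goes wrong. It is not the code in BuildTree.cpp: `Token`, `findTokenStartingAt` and `lexExample` are hypothetical names, and character offsets stand in for `SourceLocation`s.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a lexed token: start offset plus spelling.
struct Token {
  std::size_t Begin;
  std::string Text;
};

// Returns the index of the token that starts exactly at Offset, or -1 if no
// token starts there.
int findTokenStartingAt(const std::vector<Token> &Tokens, std::size_t Offset) {
  for (std::size_t I = 0; I < Tokens.size(); ++I)
    if (Tokens[I].Begin == Offset)
      return static_cast<int>(I);
  return -1;
}

// Tokens for the source text "1.2_w;": the whole literal is one token.
std::vector<Token> lexExample() {
  return {{0, "1.2_w"}, {5, ";"}};
}
```

Offset 0 (the `1`) resolves to the literal token, but offset 3 (the `_`) resolves to nothing, which is the situation the assertion tripped on.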



================
Comment at: clang/lib/Tooling/Syntax/BuildTree.cpp:634
+    // `SourceLocation`s. As a result one of these nodes has a valid
+    // `SourceLocation` that doesn't point to a token.
+    //
----------------
"The semantic AST node for a user-defined literal has child nodes that reference two source locations: the location of the beginning of the token (`1`), and the location of the beginning of the UDL suffix (`_`). The UDL suffix location does not point to the beginning of a token, so we can't represent the UDL suffix as a separate syntax tree node."


================
Comment at: clang/unittests/Tooling/Syntax/TreeTest.cpp:1197-1199
+    1.2_w; // calls operator "" _w(1.2L)
+    12_w;  // calls operator "" _w("12")
+    12_x;  // calls operator<'1', '2'> "" _x()
----------------
Indent -2.
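
For reference, the three comments in this test correspond to three different literal operator forms. A minimal sketch of declarations that would produce those calls, assuming a hypothetical result type `W` (the declarations in the actual test may differ):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical result type that records which operator was selected.
struct W {
  long double Cooked;
  std::size_t RawLength;
};

// Cooked floating-point form: 1.2_w calls operator""_w(1.2L).
W operator""_w(long double Value) { return {Value, 0}; }

// Raw form: there is no `unsigned long long` overload, so 12_w falls back to
// operator""_w("12").
W operator""_w(const char *Digits) { return {0.0L, std::strlen(Digits)}; }

// Numeric literal operator template: 12_x calls operator""_x<'1', '2'>().
template <char... Chars> constexpr std::size_t operator""_x() {
  return sizeof...(Chars);
}
```

Note that member access on a numeric UDL needs parentheses, e.g. `(12_w).RawLength`, because `12_w.RawLength` would be lexed as a single preprocessing number.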


================
Comment at: clang/unittests/Tooling/Syntax/TreeTest.cpp:1200
+    12_x;  // calls operator<'1', '2'> "" _x()
+}
+    )cpp",
----------------
Could you also add tests for user-defined string literals and user-defined character literals? ("abc"_mystr, u"abc"_mystr, 'c'_mychar)
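
Something along these lines would be enough to exercise those cases. This is only a sketch: `MyStr` and `MyChar` are hypothetical types, and the operators just record what they were given.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result types for the string and character literal operators.
struct MyStr {
  std::size_t Length;
};
struct MyChar {
  char Value;
};

// "abc"_mystr calls operator""_mystr("abc", 3).
MyStr operator""_mystr(const char *, std::size_t Length) { return {Length}; }

// u"abc"_mystr needs a separate overload taking const char16_t *.
MyStr operator""_mystr(const char16_t *, std::size_t Length) { return {Length}; }

// 'c'_mychar calls operator""_mychar('c').
MyChar operator""_mychar(char C) { return {C}; }
```

Each of `"abc"_mystr`, `u"abc"_mystr` and `'c'_mychar` is still a single token, so they should hit the same code path as the numeric cases.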


================
Comment at: clang/unittests/Tooling/Syntax/TreeTest.cpp:1265
+    | |-UserDefinedLiteralExpression
+    | | `-12_w
+    | `-;
----------------
It looks somewhat weird to me that integer and floating point literals end up with the same syntax tree node type. WDYT about making different nodes for different literals (integer, floating-point, string, character)?
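
To illustrate the distinction I have in mind, here is a rough sketch that picks a kind from the token spelling alone. `UdlKind` and `classifyUdl` are hypothetical names, the sketch assumes a well-formed token whose suffix starts with an underscore, and the real implementation would more likely consult the AST than re-inspect the spelling.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical node kinds, one per literal category.
enum class UdlKind { Integer, Float, String, Char };

// Classifies a UDL token by its spelling. Assumes the token is well-formed and
// that the ud-suffix begins with '_' (as it must for non-library suffixes).
UdlKind classifyUdl(const std::string &Spelling) {
  assert(!Spelling.empty());
  const unsigned char First = static_cast<unsigned char>(Spelling[0]);
  if (!std::isdigit(First) && First != '.') {
    // String and character literals start with a quote or an encoding prefix
    // (u8, u, U, L, R); the first quote character tells them apart.
    const std::size_t Quote = Spelling.find_first_of("\"'");
    assert(Quote != std::string::npos);
    return Spelling[Quote] == '"' ? UdlKind::String : UdlKind::Char;
  }
  // Numeric literal: drop the suffix so letters in it are not misread.
  const std::string Number = Spelling.substr(0, Spelling.find('_'));
  const bool IsHex = Number.size() > 1 && Number[0] == '0' &&
                     (Number[1] == 'x' || Number[1] == 'X');
  // Hex floats use a 'p' exponent; in hex, 'e' is just a digit.
  const char *FloatMarkers = IsHex ? ".pP" : ".eE";
  return Number.find_first_of(FloatMarkers) != std::string::npos
             ? UdlKind::Float
             : UdlKind::Integer;
}
```

With this, `12_w` and `1.2_w` would map to different kinds instead of sharing one node type.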


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82157/new/

https://reviews.llvm.org/D82157




