[PATCH] D88450: [ms] [llvm-ml] Accept whitespace around the dot operator

Mon Sep 28 17:14:47 PDT 2020

rnk added a comment.

This code is already so convoluted that I don't feel like I can provide very good code review for it. More tech debt can't hurt here, this codepath is already overleveraged with tech debt and needs to be rewritten. But if you want to push on through and add the functionality, it has good tests, so go for it.

================
Comment at: llvm/lib/MC/MCParser/MasmParser.cpp:1383
     }
+
     // Parse symbol variant.
----------------
Please remote the stray whitespace change.

================
Comment at: llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp:1706
+          consumeToken();
+          if (dotOffset < Identifier.size() - 1) {
+            getLexer().UnLex(AsmToken(AsmToken::Identifier,
----------------
My suggestion to try to simplify this code would be to use .slice to build two StringRefs, then check if they are non-empty, and unlex them if needed:
  StringRef LHS = Identifer.slice(0, dotOffset);
  StringRef RHS = Identifer.slice(dotOffset+1, Identifer.size());
  if (LHS.empty())
    .. UnLex LHS
  .. Unlex dot
  if (RHS.empty())
    .. UnLex RHS

That keeps all the index math closer together, less spread out.

================
Comment at: llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp:1707
+          if (dotOffset < Identifier.size() - 1) {
+            getLexer().UnLex(AsmToken(AsmToken::Identifier,
+                                      Identifier.substr(dotOffset + 1)));
----------------
Truly, breaking up identifiers into small chunks is the job of the lexer, but here we are anyway, breaking things up and "unlexing" them.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88450/new/

https://reviews.llvm.org/D88450