[PATCH] D91460: [AsmParser] make .ascii support spaces as separators

Jian Cai via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 3 17:32:51 PST 2020


jcai19 added inline comments.


================
Comment at: llvm/lib/MC/MCParser/AsmParser.cpp:3017-3030
+  auto parseManyWithOptionalComma = [&]() -> bool {
+    if (parseOptionalToken(AsmToken::EndOfStatement))
+      return false;
+    while (true) {
+      if (parseOp())
+        return true;
+      if (parseOptionalToken(AsmToken::EndOfStatement))
----------------
jrtc27 wrote:
> jcai19 wrote:
> > jrtc27 wrote:
> > > It'd be nicer to push this all up into parseOp so it just parses all the strings it finds one at a time. For `.ascii` you can get away with treating spaces and commas the same, but if we ever want to support the new `.asciz`/`.string` behaviour (which I feel we should one day, provided binutils doesn't decide to revert its behaviour) we'll need it to be like that, so it'd make it trivial to enable. Also I feel it's a cleaner (and likely simpler) way to write it regardless.
> > Thanks for the comment. Just to clarify, do you suggest we merge the code of parseManyWithOptionalComma into parseOp? parseOp is used as an argument to parseMany later, so if we change it, then we may have to make adjustments to parseMany too, which is used in other places. Or maybe what you suggest is that we should completely rewrite parseDirectiveAscii function and not call parseMany at all? Also I am not sure about rewriting the code assuming the new GNU behaviors will be the way going forward. I could not find any discussion on the change and its potential implications, and therefore could not confirm if that would be what GNU community agrees on.
> I mean:
> - remove parseManyWithOptionalComma
> - keep using parseMany(parseOp)
> - leave parseMany unchanged
> - make parseOp parse _all_ strings up to the first comma, effectively treating them like a single multi-token operand (i.e. keep calling parseEscapedString and emitting the string while there's still another AsmToken::String)
Thanks for the clarification. IIUC, we only change parseOp to treat space-separated strings as one string until we hit a comma or end of statement, and keep everything else unchanged. Wouldn't that also affect .asciz and .string and make them behave like the new GNU behavior? 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91460/new/

https://reviews.llvm.org/D91460



More information about the llvm-commits mailing list