[PATCH] D91460: [AsmParser] make .ascii support spaces as separators

Jessica Clarke via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 3 17:34:42 PST 2020


jrtc27 added inline comments.


================
Comment at: llvm/lib/MC/MCParser/AsmParser.cpp:3017-3030
+  auto parseManyWithOptionalComma = [&]() -> bool {
+    if (parseOptionalToken(AsmToken::EndOfStatement))
+      return false;
+    while (true) {
+      if (parseOp())
+        return true;
+      if (parseOptionalToken(AsmToken::EndOfStatement))
----------------
jcai19 wrote:
> jrtc27 wrote:
> > jcai19 wrote:
> > > jrtc27 wrote:
> > > > It'd be nicer to push this all up into parseOp so it just parses all the strings it finds one at a time. For `.ascii` you can get away with treating spaces and commas the same, but if we ever want to support the new `.asciz`/`.string` behaviour (which I feel we should one day, provided binutils doesn't decide to revert its behaviour) we'll need it to be like that, so it'd make it trivial to enable. Also I feel it's a cleaner (and likely simpler) way to write it regardless.
> > > Thanks for the comment. Just to clarify, do you suggest we merge the code of parseManyWithOptionalComma into parseOp? parseOp is used as an argument to parseMany later, so if we change it, then we may have to make adjustments to parseMany too, which is used in other places. Or maybe what you suggest is that we should completely rewrite parseDirectiveAscii function and not call parseMany at all? Also I am not sure about rewriting the code assuming the new GNU behaviors will be the way going forward. I could not find any discussion on the change and its potential implications, and therefore could not confirm if that would be what GNU community agrees on.
> > I mean:
> > - remove parseManyWithOptionalComma
> > - keep using parseMany(parseOp)
> > - leave parseMany unchanged
> > - make parseOp parse _all_ strings up to the first comma, effectively treating them like a single multi-token operand (i.e. keep calling parseEscapedString and emitting the string while there's still another AsmToken::String)
> Thanks for the clarification. IIUC, we only change parseOp to treat space-separated strings as one string until we hit a comma or end of statement, and keep everything else unchanged. Wouldn't that also affect .asciz and .string and make them behave like the new GNU behavior? 
Yes, so for now you can add an additional check to bail out after the first iteration of the do-while loop if ZeroTerminated is true, and at some point in the future all that's needed is to delete that additional check to make .asciz and .string behave like the new GNU binutils.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91460/new/

https://reviews.llvm.org/D91460



More information about the llvm-commits mailing list