[llvm] Treat ';' and '\n' as assembly instruction separators in collectAsmInstrs (PR #149365)

Mingming Liu via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 21 20:49:06 PDT 2025


================
@@ -60,17 +60,23 @@ FunctionType *InlineAsm::getFunctionType() const {
   return FTy;
 }
 
-void InlineAsm::collectAsmStrs(SmallVectorImpl<StringRef> &AsmStrs) const {
+SmallVector<StringRef> InlineAsm::collectAsmInstrs() const {
   StringRef AsmStr(AsmString);
-  AsmStrs.clear();
-
-  // TODO: 1) Unify delimiter for inline asm, we also meet other delimiters
-  // for example "\0A", ";".
-  // 2) Enhance StringRef. Some of the special delimiter ("\0") can't be
-  // split in StringRef. Also empty StringRef can not call split (will stuck).
-  if (AsmStr.empty())
-    return;
-  AsmStr.split(AsmStrs, "\n\t", -1, false);
+  SmallVector<StringRef> AsmLines;
+  AsmStr.split(AsmLines, '\n');
+
+  SmallVector<StringRef> AsmInstrs;
+  AsmInstrs.reserve(AsmLines.size());
+  for (StringRef AsmLine : AsmLines) {
+    // Trim most general comments. We don't handle comment blocks (/* ... */).
+    // We also don't handle '@' (ARM) and ';' (MachO) since they have different
+    // interpretations in different targets and we don't have target info in IR.
----------------
mingmingl-llvm wrote:

I think it's fine to handle the common cases like '#' and '//' right now. 

Just as a minor clarification, my understanding is that we do have the target info available in the IR (e.g., the call instruction from which `InlineAsm` is extracted). https://github.com/llvm/llvm-project/pull/149365#issuecomment-3095679009 also mentions this.

To keep this change simple and still record the limitation for later, how about we turn the comment into a TODO?
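
For illustration, here is a minimal standalone sketch of the behavior discussed above, with the comment reworded as a TODO. It is an assumption-laden approximation, not the code in this PR: `collectAsmInstrsSketch` and `trimBlanks` are hypothetical names, it uses `std::string` instead of `StringRef`, and it unconditionally treats ';' as a separator and '#' and '//' as comment starters, which is not correct for every target.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not an LLVM API): strip leading/trailing blanks.
static std::string trimBlanks(const std::string &S) {
  const char *WS = " \t\r";
  std::string::size_type B = S.find_first_not_of(WS);
  if (B == std::string::npos)
    return std::string();
  std::string::size_type E = S.find_last_not_of(WS);
  return S.substr(B, E - B + 1);
}

// Hypothetical stand-in for InlineAsm::collectAsmInstrs(): split the asm
// string on '\n', drop '#' and '//' comments, then split each line on ';'.
std::vector<std::string> collectAsmInstrsSketch(const std::string &AsmStr) {
  std::vector<std::string> Instrs;
  std::string::size_type LineStart = 0;
  while (LineStart <= AsmStr.size()) {
    std::string::size_type LineEnd = AsmStr.find('\n', LineStart);
    if (LineEnd == std::string::npos)
      LineEnd = AsmStr.size();
    std::string Line = AsmStr.substr(LineStart, LineEnd - LineStart);
    LineStart = LineEnd + 1;

    // TODO: Handle target-specific comment markers such as '@' (ARM) and
    // ';' (MachO), and comment blocks (/* ... */). Only the common '#' and
    // '//' markers are trimmed here.
    Line = Line.substr(0, Line.find('#'));
    Line = Line.substr(0, Line.find("//"));

    // Split the remaining text on ';' and keep the non-empty pieces.
    std::string::size_type Pos = 0;
    while (Pos <= Line.size()) {
      std::string::size_type Semi = Line.find(';', Pos);
      if (Semi == std::string::npos)
        Semi = Line.size();
      std::string Instr = trimBlanks(Line.substr(Pos, Semi - Pos));
      if (!Instr.empty())
        Instrs.push_back(Instr);
      Pos = Semi + 1;
    }
  }
  return Instrs;
}
```

With this sketch, the input `"movl $1, %eax; nop # set eax\n\tret // done\n"` yields three entries: `movl $1, %eax`, `nop`, and `ret`.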

https://github.com/llvm/llvm-project/pull/149365

