[llvm] MC: Support quoted symbol names (PR #138817)

Nikita Popov via llvm-commits llvm-commits at lists.llvm.org
Sun Aug 31 13:36:20 PDT 2025


================
@@ -212,6 +212,27 @@ MCDataFragment *MCContext::allocInitialFragment(MCSection &Sec) {
 MCSymbol *MCContext::getOrCreateSymbol(const Twine &Name) {
   SmallString<128> NameSV;
   StringRef NameRef = Name.toStringRef(NameSV);
+  if (NameRef.contains('\\')) {
+    NameSV = NameRef;
+    size_t S = 0;
+    // Support escaped \\ and \" as in GNU Assembler. GAS issues a warning for
+    // other characters following \\, which we do not implement due to code
+    // structure.
+    for (size_t I = 0, E = NameSV.size(); I < E; ++I) {
+      char C = NameSV[I];
+      if (C == '\\') {
+        switch (NameSV[I + 1]) {
+        case '"':
+        case '\\':
+          C = NameSV[++I];
+          break;
+        }
+      }
+      NameSV[S++] = C;
+    }
+    NameSV.resize(S);
+    NameRef = NameSV;
+  }
----------------
nikic wrote:

Shouldn't this unescaping only be done when parsing *textual* assembly in AsmParser, but not when programmatically creating symbols?

I'd expect the LLVM IR `@"\\\22" = constant i8 0` to result in `"\\\"":`, but with this patch it results in `"\"":` instead. (Previously it resulted in `"\\"":`, which is also wrong, but at least that only affected the case where you're emitting assembly.)
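
To make the distinction concrete, here is a minimal standalone sketch (not LLVM code; `unescapeQuotedSymbol` and `escapeForQuotedSymbol` are hypothetical names, and it uses `std::string` rather than LLVM's string types). It assumes unescaping belongs on the text-parsing side and escaping on the assembly-printing side, so a programmatically created name is never unescaped:

```cpp
#include <cstddef>
#include <string>

// Hypothetical parser-side helper: turn the contents of a quoted symbol
// in textual assembly into the raw symbol name. Only \\ and \" are
// treated as escapes; any other backslash is kept as-is.
std::string unescapeQuotedSymbol(const std::string &Text) {
  std::string Out;
  for (std::size_t I = 0, E = Text.size(); I < E; ++I) {
    char C = Text[I];
    // Check I + 1 < E so a trailing backslash is not read past the end.
    if (C == '\\' && I + 1 < E && (Text[I + 1] == '"' || Text[I + 1] == '\\'))
      C = Text[++I];
    Out.push_back(C);
  }
  return Out;
}

// Hypothetical printer-side helper: turn a raw symbol name into a quoted
// symbol for textual assembly, escaping both backslash and double quote.
std::string escapeForQuotedSymbol(const std::string &Name) {
  std::string Out = "\"";
  for (char C : Name) {
    if (C == '"' || C == '\\')
      Out.push_back('\\');
    Out.push_back(C);
  }
  Out.push_back('"');
  return Out;
}
```

Under these assumptions, the raw name from the IR example (a backslash followed by a double quote) is printed as `"\\\""`, and parsing that text back yields the same raw name. Applying `unescapeQuotedSymbol` to the raw name itself, which is effectively what happens when the unescaping lives in `getOrCreateSymbol`, drops the backslash and leaves only the double quote.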

See https://github.com/rust-lang/rust/issues/146065 for a regression from this change. Of course we could easily apply additional escaping on the Rust side, but I think the current behavior here isn't right.

https://github.com/llvm/llvm-project/pull/138817

