[llvm] [X86] Quote symbol names that collide with registers/keywords in Intel syntax (PR #186570)

LIU Hao via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 16 00:28:09 PDT 2026


lhmouse wrote:

> > I think it's safer to just blindly quote all identifiers, though that may break existing tests.
> 
> Quoting everything works but is noisy - every symbol in the output gets wrapped in "..." unnecessarily. Updating too many tests is indeed a problem.

Indeed, but it's safer. There are copies of MASM manuals scattered around the Internet, for example https://www.pcjs.org/documents/books/mspl13/masm/mpguide/ which says:

> The following list names all the new reserved words in MASM 6.0:
> 
> .BREAK           CMPXCHG          IRETDF           PUSHW
> .CONTINUE        ECHO             IRETF            REAL10
> .DOSSEG          EXTERN           LENGTHOF         REAL4
> .ELSE            EXTERNDEF        LOOPD            REAL8
> .ELSEIF          FAR16            LOOPED           REPEAT
> .ENDIF           FAR32            LOOPEW           SBYTE
> .ENDW            FLAT             LOOPNED          SDWORD
> .EXIT            FLDENVD          LOOPNEW          SIGN?
> .IF              FLDENVW          LOOPNZD          SIZEOF
> .LISTALL         FNSAVED          LOOPNZW          STDCALL
> .LISTIF          FNSAVEW          LOOPW            STRUCT
> .LISTMACRO       FNSTENVD         LOOPZW           SUBTITLE
> .LISTMACROALL    FNSTENVW         LOWWORD          SWORD
> .NO87            FOR              LROFFSET         SYSCALL
> .NOCREF          FORC             NEAR16           TEXTEQU
> .NOLIST          FRSTORD          NEAR32           TR3
> .NOLISTIF        FRSTORW          OPATTR           TR4
> .NOLISTMACRO     FSAVED           OPTION           TR5
> .REPEAT          FSAVEW           OVERFLOW?        TYPEDEF
> .STARTUP         FSTENVD          PARITY?          UNION
> .UNTIL           FSTENVW          POPAW            VARARG
> .UNTILCXZ        GOTO             POPCONTEXT       WBINVD
> .WHILE           HIGHWORD         PROTO            WHILE
> ADDR             INVD             PUSHAW           XADD
> ALIAS            INVLPG           PUSHCONTEXT      ZERO?
> BSWAP            INVOKE           PUSHD
> CARRY?

... which seems like too many.
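
To make the trade-off concrete, here is a minimal sketch of the two strategies. It is not the code in the PR: `kReservedSample`, `quoteIfReserved` and `quoteAlways` are hypothetical names, the set holds only a couple of registers plus a few entries from the list above, and case-insensitive matching is an assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_set>

// Hypothetical, deliberately incomplete sample of reserved names: two
// registers plus a few entries from the MASM 6.0 list quoted above.
static const std::unordered_set<std::string> kReservedSample = {
    "EAX", "RAX", "FLAT", "PROTO", "SIZEOF", "STRUCT", "UNION", "XADD"};

// Assumption: reserved names match case-insensitively, so compare in
// upper case.
static std::string toUpper(const std::string &s) {
  std::string out(s);
  std::transform(out.begin(), out.end(), out.begin(), [](unsigned char c) {
    return static_cast<char>(std::toupper(c));
  });
  return out;
}

// Selective strategy: quote only names found in the reserved set.
// It is only as safe as the set is complete.
std::string quoteIfReserved(const std::string &name) {
  if (kReservedSample.count(toUpper(name)) != 0)
    return "\"" + name + "\"";
  return name;
}

// Blanket strategy: quote every name, so no list has to be maintained.
std::string quoteAlways(const std::string &name) {
  return "\"" + name + "\"";
}
```

With this sketch, `quoteIfReserved("flat")` gives `"flat"` in quotes while `quoteIfReserved("my_symbol")` is left bare; any reserved word missing from the set slips through unquoted, which is the risk with the selective approach.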


> This is an orthogonal concern, but both gas and LLVM's integrated assembler handle `foo@SECREL32` correctly: `@SECREL32` is recognized as a relocation specifier, producing IMAGE_REL_AMD64_SECREL on symbol `foo`. In LLVM, the lexer produces a single token `foo@SECREL32` (since AllowAtInIdentifier is true for COFF), then the parser does `Identifier.split('@')` and checks the suffix via `getSpecifierForName`. If it's a known specifier, it splits; if not (e.g., `_foo@8` for stdcall), the whole string is kept as the symbol name.

This sounds safe AFAICT. (Maybe a bad actor could force a symbol name with something like `int bad_variable __asm__("bad_variable@SECREL32") = 42;`, but then they are really on their own.)
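
For reference, the split-and-check behaviour described in the quote above can be sketched roughly as follows. This is a standalone illustration using `std::string`, not LLVM's parser: `isKnownSpecifier` and `splitSpecifier` are hypothetical names, the sample set contains only the one specifier named in this thread, and exact-case matching is an assumption.

```cpp
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical stand-in for the specifier lookup described above; the
// sample set only contains the one specifier named in this thread.
static bool isKnownSpecifier(const std::string &suffix) {
  static const std::unordered_set<std::string> kSample = {"SECREL32"};
  return kSample.count(suffix) != 0;
}

// Returns {symbol, specifier}. The specifier is empty when there is no '@'
// or when the text after the first '@' is not a recognized specifier; in
// both cases the whole token is kept as the symbol name.
std::pair<std::string, std::string> splitSpecifier(const std::string &token) {
  const std::string::size_type at = token.find('@');
  if (at == std::string::npos)
    return {token, ""};
  const std::string suffix = token.substr(at + 1);
  if (!isKnownSpecifier(suffix))
    return {token, ""};
  return {token.substr(0, at), suffix};
}
```

In this sketch `splitSpecifier("foo@SECREL32")` yields `foo` plus the specifier `SECREL32`, while `splitSpecifier("_foo@8")` keeps `_foo@8` whole. A name forced through an asm label such as `bad_variable@SECREL32` would be split the same way as the first case.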


https://github.com/llvm/llvm-project/pull/186570

