[PATCH] [1/6] Improve handling of .file, .include and .incbin directives

Yunzhong Gao Yunzhong_Gao at playstation.sony.com
Wed Sep 4 15:17:06 PDT 2013


ygao added you to the CC list for the revision "[1/6] Improve handling of .file, .include and .incbin directives".

Hi,
When -S is specified or -save-temps is specified, the compiler will generate
assembly files and include .file directives in the format of
    .file "<filename>"
or  .file <num> "<filename>"
with any non-printing characters converted to their escaped octal sequence,
see EmitFileDirective() in llvm/lib/MC/MCAsmStreamer.cpp.

Problem is that when the integrated assembler reads back the assembly file,
it is not converting these escaped sequence back to the original value. The
fix is needed to carry out the necessary conversion. There are two related
directives (.incbin and .include) that might take a file name as argument.
Clang does not generate them but may read in such directives from other
compilers. I think it is reasonable to fix these two directives as well,
just to be compatible with GNU binutils behavior.

This is part of a bigger effort to support foreign characters in file names.

Could someone review this patch?

Many thanks,
- Gao.

http://llvm-reviews.chandlerc.com/D1289

Files:
  lib/MC/MCParser/AsmParser.cpp
  test/MC/AsmParser/directive_file.s
  test/MC/AsmParser/directive_include.s
  test/MC/AsmParser/directive_incbin.s

Index: lib/MC/MCParser/AsmParser.cpp
===================================================================
--- lib/MC/MCParser/AsmParser.cpp
+++ lib/MC/MCParser/AsmParser.cpp
@@ -2546,17 +2546,21 @@
     return TokError("unexpected token in '.file' directive");
 
   // Usually the directory and filename together, otherwise just the directory.
-  StringRef Path = getTok().getString();
-  Path = Path.substr(1, Path.size()-2);
+  // Allow the strings to have escaped octal character sequence.
+  std::string Path = getTok().getString();
+  if (parseEscapedString(Path))
+    return true;
   Lex();
 
   StringRef Directory;
   StringRef Filename;
+  std::string FilenameData;
   if (getLexer().is(AsmToken::String)) {
     if (FileNumber == -1)
       return TokError("explicit path specified, but no file number");
-    Filename = getTok().getString();
-    Filename = Filename.substr(1, Filename.size()-2);
+    if (parseEscapedString(FilenameData))
+      return true;
+    Filename = FilenameData;
     Directory = Path;
     Lex();
   } else {
@@ -3492,16 +3496,16 @@
   if (getLexer().isNot(AsmToken::String))
     return TokError("expected string in '.include' directive");
 
-  std::string Filename = getTok().getString();
+  // Allow the strings to have escaped octal character sequence.
+  std::string Filename;
+  if (parseEscapedString(Filename))
+    return true;
   SMLoc IncludeLoc = getLexer().getLoc();
   Lex();
 
   if (getLexer().isNot(AsmToken::EndOfStatement))
     return TokError("unexpected token in '.include' directive");
 
-  // Strip the quotes.
-  Filename = Filename.substr(1, Filename.size()-2);
-
   // Attempt to switch the lexer to the included file before consuming the end
   // of statement to avoid losing it when we switch.
   if (EnterIncludeFile(Filename)) {
@@ -3518,16 +3522,16 @@
   if (getLexer().isNot(AsmToken::String))
     return TokError("expected string in '.incbin' directive");
 
-  std::string Filename = getTok().getString();
+  // Allow the strings to have escaped octal character sequence.
+  std::string Filename;
+  if (parseEscapedString(Filename))
+    return true;
   SMLoc IncbinLoc = getLexer().getLoc();
   Lex();
 
   if (getLexer().isNot(AsmToken::EndOfStatement))
     return TokError("unexpected token in '.incbin' directive");
 
-  // Strip the quotes.
-  Filename = Filename.substr(1, Filename.size()-2);
-
   // Attempt to process the included file.
   if (ProcessIncbinFile(Filename)) {
     Error(IncbinLoc, "Could not find incbin file '" + Filename + "'");
Index: test/MC/AsmParser/directive_file.s
===================================================================
--- test/MC/AsmParser/directive_file.s
+++ test/MC/AsmParser/directive_file.s
@@ -1,7 +1,7 @@
 # RUN: llvm-mc -triple i386-unknown-unknown %s | FileCheck %s
 
         .file "hello"
-        .file 1 "world"
+        .file 1 "worl\144"   # "\144" is "d"
         .file 2 "directory" "file"
 
 # CHECK: .file "hello"
Index: test/MC/AsmParser/directive_include.s
===================================================================
--- test/MC/AsmParser/directive_include.s
+++ test/MC/AsmParser/directive_include.s
@@ -5,5 +5,5 @@
 # CHECK: a = 0
 # CHECK: TESTB:
 TESTA:  
-	.include       "directive_set.s"
+	.include       "directive\137set.s"   # "\137" is underscore "_"
 TESTB:
Index: test/MC/AsmParser/directive_incbin.s
===================================================================
--- test/MC/AsmParser/directive_incbin.s
+++ test/MC/AsmParser/directive_incbin.s
@@ -1,6 +1,6 @@
 # RUN: llvm-mc -triple i386-unknown-unknown %s -I %p | FileCheck %s
 
 .data
-.incbin "incbin_abcd"
+.incbin "incbin\137abcd"  # "\137" is underscore "_"
 
 # CHECK: .ascii	 "abcd\n"
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D1289.3.patch
Type: text/x-patch
Size: 3717 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130904/86a082e1/attachment.bin>


More information about the llvm-commits mailing list