[LLVMbugs] [Bug 903] NEW: Better way to handle asm writing from the tools.

bugzilla-daemon at cs.uiuc.edu bugzilla-daemon at cs.uiuc.edu
Fri Sep 8 10:41:51 PDT 2006


http://llvm.org/bugs/show_bug.cgi?id=903

           Summary: Better way to handle asm writing from the tools.
           Product: tools
           Version: trunk
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: llc
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: jlaskey at apple.com


I thought that if I batted around some ideas, we could work out some long-term solutions for maintaining 
asm writing.

So let's say we have these .tai (target asm info) files around, and we stick asm properties in them. (Why 
not .td files?  Well I'll get to that.)

Thus we might have something like:

namespace Darwin;

properties {
  Comment               = ";";
  Global                = "_";
  Local                 = "L";
  ZeroDirective         = ".space";
  SetDirective          = ".set";
  Data64bitsDirective   = isPPC64 ? ".quad" : 0; 
  AlignmentIsInBytes    = false;
  ConstantPoolSection   = ".const";
  JumpTableDataSection  = ".const";
  JumpTableTextSection  = ".text";
  LCOMMDirective        = ".lcomm";
  StaticCtorsSection    = ".mod_init_func";
  StaticDtorsSection    = ".mod_term_func";
  InlineAsmStart        = "# InlineAsm Start";
  InlineAsmEnd          = "# InlineAsm End";
 
  NeedsSet              = true;
  AddressSize           = isPPC64 ? 8 : 4;
  DwarfAbbrevSection    = ".section __DWARF,__debug_abbrev";
  DwarfInfoSection      = ".section __DWARF,__debug_info";
  DwarfLineSection      = ".section __DWARF,__debug_line";
  DwarfFrameSection     = ".section __DWARF,__debug_frame";
  DwarfPubNamesSection  = ".section __DWARF,__debug_pubnames";
  DwarfPubTypesSection  = ".section __DWARF,__debug_pubtypes";
  DwarfStrSection       = ".section __DWARF,__debug_str";
  DwarfLocSection       = ".section __DWARF,__debug_loc";
  DwarfARangesSection   = ".section __DWARF,__debug_aranges";
  DwarfRangesSection    = ".section __DWARF,__debug_ranges";
  DwarfMacInfoSection   = ".section __DWARF,__debug_macinfo";
}
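
As a rough illustration of how such a properties block could map onto C++, here is a minimal sketch.  The 
names DarwinAsmProperties and makeDarwinProperties are hypothetical (not existing LLVM API), only a few 
of the properties are shown, and an empty string stands in for the "0" (no directive) case.

```cpp
#include <string>

// Hypothetical container for a subset of the properties listed above.
struct DarwinAsmProperties {
  std::string Comment;
  std::string Global;
  std::string Local;
  std::string Data64bitsDirective; // empty when the target has none (assumption)
  bool AlignmentIsInBytes;
  unsigned AddressSize;
};

// Builds the Darwin values; isPPC64 selects the 64-bit variants.
inline DarwinAsmProperties makeDarwinProperties(bool isPPC64) {
  DarwinAsmProperties P;
  P.Comment = ";";
  P.Global = "_";
  P.Local = "L";
  P.Data64bitsDirective = isPPC64 ? ".quad" : "";
  P.AlignmentIsInBytes = false;
  P.AddressSize = isPPC64 ? 8 : 4;
  return P;
}
```

A generated asm printer could then read, say, makeDarwinProperties(true).AddressSize instead of 
hard-coding the value.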

Straightforward.  But what I really want is code smidgens.  For example, in the Darwin asm printer we have:

      SwitchToTextSection(".section __TEXT,__symbol_stub1,symbol_stubs,"
                          "pure_instructions,16", 0);
      EmitAlignment(4);
      O << "L" << *i << "$stub:\n";
      O << "\t.indirect_symbol " << *i << "\n";
      O << "\tlis r11,ha16(L" << *i << "$lazy_ptr)\n";
      if (isPPC64)
        O << "\tldu r12,lo16(L" << *i << "$lazy_ptr)(r11)\n";
      else
        O << "\tlwzu r12,lo16(L" << *i << "$lazy_ptr)(r11)\n";
      O << "\tmtctr r12\n";
      O << "\tbctr\n";
      SwitchToDataSection(".lazy_symbol_pointer", 0);
      O << "L" << *i << "$lazy_ptr:\n";
      O << "\t.indirect_symbol " << *i << "\n";
      if (isPPC64)
        O << "\t.quad dyld_stub_binding_helper\n";
      else
        O << "\t.long dyld_stub_binding_helper\n";

What if I could do something like the following in the .tai file?

function printFunctionStub(Name) {
    .section                __TEXT,__symbol_stub1,symbol_stubs,pure_instructions,16
    .align                  4
L#{Name}$stub:
    .indirect_symbol        #{Name}     ; comment of some sort
    lis                     r11,ha16(#{Local}#{Name}$lazy_ptr)
    #{isPPC64?"ldu":"lwzu"} r12,lo16(#{Local}#{Name}$lazy_ptr)(r11)
    mtctr                   r12
    bctr
    .lazy_symbol_pointer
L#{Name}$lazy_ptr:
    .indirect_symbol        #{Name}     ; comment of some sort
    .ptr                    dyld_stub_binding_helper
}

Or more generically:

function printFunctionStub(Name) {
    .section                __TEXT,__symbol_stub1,symbol_stubs,pure_instructions,16
    .align                  4
#{Local}#{Name}$stub:
    .indirect_symbol        #{Name}     #{Comment} comment of some sort
    lis                     r11,ha16(#{Local}#{Name}$lazy_ptr)
    #{isPPC64?"ldu":"lwzu"} r12,lo16(#{Local}#{Name}$lazy_ptr)(r11)
    mtctr                   r12
    bctr
    .lazy_symbol_pointer
#{Local}#{Name}$lazy_ptr:
    .indirect_symbol        #{Name}     #{Comment} comment of some sort
    .ptr                    dyld_stub_binding_helper
}

General Notes:

- TAI functions generate equivalent C++ functions.
- The TAI tool recognizes generic features of asm code, like labels, ops, operands and comments, and 
does the correct thing based on the current properties.
- All arguments and variables would always be assumed to be strings.  So numeric values and machine 
operands would be cast to strings.
- Tokens in the form of .xxxxx would be recognized as directives and may be acted upon.  
For example, .section would only be emitted if it is not the same as the current section, and .text may 
become .section __TEXT,__text or whatever.
- Patterns in the form of #{expression} get substituted with the result of the expression (a la scripting 
languages such as Ruby).
- Variables in expressions may be visible properties, target constructed or arguments.
- L$ and G$ are shorthand for #{Local} and #{Global}.  Simplifies defining labels.
- ";" is shorthand for #{Comment}.  Simplifies defining comments.  If the verbose flag is off then all 
comments are suppressed.
- Spaces between first column/directives, directives/operand and operand/comment are converted to 
tabs.
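
To make the #{expression} substitution concrete, here is a minimal sketch of an expander.  The function 
name expandPatterns is hypothetical, and it only handles plain variable names, not full expressions such 
as the isPPC64 conditional; a real tool would presumably generate C++ rather than expand text at runtime.

```cpp
#include <map>
#include <string>

// Hypothetical helper: replaces each #{Name} in Text with Vars[Name].
// Only plain variable names are handled, not full expressions.
// Unknown names and an unterminated "#{" are copied through unchanged.
inline std::string expandPatterns(
    const std::string &Text,
    const std::map<std::string, std::string> &Vars) {
  std::string Out;
  std::string::size_type Pos = 0;
  while (Pos < Text.size()) {
    std::string::size_type Open = Text.find("#{", Pos);
    if (Open == std::string::npos) {
      Out.append(Text, Pos, std::string::npos); // no more patterns
      break;
    }
    std::string::size_type Close = Text.find('}', Open + 2);
    if (Close == std::string::npos) {
      Out.append(Text, Pos, std::string::npos); // unterminated pattern
      break;
    }
    Out.append(Text, Pos, Open - Pos); // literal text before the pattern
    std::string Name = Text.substr(Open + 2, Close - Open - 2);
    std::map<std::string, std::string>::const_iterator It = Vars.find(Name);
    if (It != Vars.end())
      Out += It->second;
    else
      Out.append(Text, Open, Close - Open + 1); // keep unknown pattern as is
    Pos = Close + 1;
  }
  return Out;
}
```

With Local bound to "L" and Name bound to "_foo", the label line "#{Local}#{Name}$stub:" expands to 
"L_foo$stub:".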

Example Notes:
- The above tai function gets translated into a C++ function called 
DarwinAsmPrinter::printFunctionStub (based on name and namespace).
- .align would be assumed to be in bytes but properties may imply a conversion to log2 bytes.
- .indirect_symbol is not recognized so is written as is.
- #{isPPC64?"ldu":"lwzu"} gets evaluated at runtime and the result is substituted into the output.
- .ptr is recognized as a special directive and emits .word or .quad (based on properties.)  .quad could 
get converted to two .long if not supported.
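
The .align and .ptr handling described in these notes could be sketched as two small helpers.  Both 
function names are hypothetical, and the default directive names (.long and .quad, as in the Darwin 
printer code earlier) are assumptions that would really come from the properties.

```cpp
#include <string>

// Hypothetical: returns the operand for .align. This is the byte count
// itself, or its log2 when the target expects a power-of-two exponent.
// Bytes is assumed to be a nonzero power of two.
inline unsigned alignOperand(unsigned Bytes, bool AlignmentIsInBytes) {
  if (AlignmentIsInBytes)
    return Bytes;
  unsigned Log2 = 0;
  while (Bytes > 1) {
    Bytes >>= 1;
    ++Log2;
  }
  return Log2;
}

// Hypothetical lowering of the ".ptr" pseudo-directive: picks the data
// directive from the address size. The directive names are parameters
// because they would be supplied by the properties block.
inline std::string lowerPtr(const std::string &Value, unsigned AddressSize,
                            const std::string &Data32 = ".long",
                            const std::string &Data64 = ".quad") {
  const std::string &Dir = (AddressSize == 8) ? Data64 : Data32;
  return "\t" + Dir + "\t" + Value + "\n";
}
```

Splitting an unsupported .quad into two .long directives is left out here, since the order of the halves 
would depend on the target.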

Dwarf example:

Instead of:

      EmitInt32(0x1c); EOL("Length of Address Ranges Info");
     
      EmitInt16(DWARF_VERSION); EOL("Dwarf Version");
     
      EmitReference("info_begin", Unit->getID());
      EOL("Offset of Compilation Unit Info");

      EmitInt8(TAI->getAddressSize()); EOL("Size of Address");

      EmitInt8(0); EOL("Size of Segment Descriptor");

      EmitInt16(0);  EOL("Pad (1)");
      EmitInt16(0);  EOL("Pad (2)");

      // Range 1
      EmitReference("text_begin", 0); EOL("Address");
      EmitDifference("text_end", 0, "text_begin", 0); EOL("Length");

      EmitInt32(0); EOL("EOM (1)");
      EmitInt32(0); EOL("EOM (2)");

We'd have:

namespace Dwarf;       
 
function printDwarfARanges(Version, InfoNum) {
    .word           0x1c                                ; Length of Address Ranges Info
    .short          #{Version}                          ; Dwarf Version
    .ptr            L$debug_info_begin#{InfoNum}        ; Offset of Compilation Unit Info
    .byte           #{AddressSize}                      ; Size of Address
    .byte           0                                   ; Size of Segment Descriptor
    .short          0, 0                                ; Pad
    .ptr            L$debug_text_begin                  ; Address
    .ptr            L$debug_text_end-L$debug_text_begin ; Length
    .word           0, 0                                ; EOM
}
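
For a sense of what each line of such a function might turn into, here is a minimal sketch of a line 
formatter that applies the tab and comment rules from the general notes.  The name formatAsmLine and its 
parameters are hypothetical; the comment string would come from the Comment property.

```cpp
#include <string>

// Hypothetical formatter for one TAI line: separates the fields with
// tabs and drops the comment unless Verbose is set.
inline std::string formatAsmLine(const std::string &Directive,
                                 const std::string &Operand,
                                 const std::string &Comment, bool Verbose,
                                 const std::string &CommentString = ";") {
  std::string Line = "\t" + Directive + "\t" + Operand;
  if (Verbose && !Comment.empty())
    Line += "\t" + CommentString + " " + Comment;
  Line += "\n";
  return Line;
}
```

With the verbose flag off, the "Size of Segment Descriptor" line reduces to a tab, ".byte", a tab and 
"0"; with it on, the comment follows after another tab.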

This is much easier to read and maintain, and allows generic code patterns to be used across targets.

Now back to why not .td files.  No good reason - just add "here" strings to td and it might work out 
fine.

<<<EOS
    .word           0x1c                                ; Length of Address Ranges Info
    .short          #{Version}                          ; Dwarf Version
    .ptr            L$debug_info_begin#{InfoNum}        ; Offset of Compilation Unit Info
    .byte           #{AddressSize}                      ; Size of Address
    .byte           0                                   ; Size of Segment Descriptor
    .short          0, 0                                ; Pad
    .ptr            L$debug_text_begin                  ; Address
    .ptr            L$debug_text_end-L$debug_text_begin ; Length
    .word           0, 0                                ; EOM
EOS


