[LLVMbugs] [Bug 903] NEW: Better way to handle asm writing from the tools.
bugzilla-daemon at cs.uiuc.edu
Fri Sep 8 10:41:51 PDT 2006
http://llvm.org/bugs/show_bug.cgi?id=903
Summary: Better way to handle asm writing from the tools.
Product: tools
Version: trunk
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: llc
AssignedTo: unassignedbugs at nondot.org
ReportedBy: jlaskey at apple.com
I thought that if I batted around some ideas, we could work out some long-term solutions for
maintaining asm writing.
So let's say we have these .tai (target asm info) files around, and we stick asm properties in them. (Why
not .td files? Well, I'll get to that.)
Thus we might have something like:
namespace Darwin;
properties {
Comment = ";";
Global = "_";
Local = "L";
ZeroDirective = ".space";
SetDirective = ".set";
Data64bitsDirective = isPPC64 ? ".quad" : 0;
AlignmentIsInBytes = false;
ConstantPoolSection = ".const";
JumpTableDataSection = ".const";
JumpTableTextSection = ".text";
LCOMMDirective = ".lcomm";
StaticCtorsSection = ".mod_init_func";
StaticDtorsSection = ".mod_term_func";
InlineAsmStart = "# InlineAsm Start";
InlineAsmEnd = "# InlineAsm End";
NeedsSet = true;
AddressSize = isPPC64 ? 8 : 4;
DwarfAbbrevSection = ".section __DWARF,__debug_abbrev";
DwarfInfoSection = ".section __DWARF,__debug_info";
DwarfLineSection = ".section __DWARF,__debug_line";
DwarfFrameSection = ".section __DWARF,__debug_frame";
DwarfPubNamesSection = ".section __DWARF,__debug_pubnames";
DwarfPubTypesSection = ".section __DWARF,__debug_pubtypes";
DwarfStrSection = ".section __DWARF,__debug_str";
DwarfLocSection = ".section __DWARF,__debug_loc";
DwarfARangesSection = ".section __DWARF,__debug_aranges";
DwarfRangesSection = ".section __DWARF,__debug_ranges";
DwarfMacInfoSection = ".section __DWARF,__debug_macinfo";
}
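A properties block like this could plausibly be lowered to a plain C++ structure by the TAI tool. The following is only a sketch of what such output might look like; `DarwinAsmProperties` and `makeDarwinProperties` are hypothetical names, not existing LLVM classes, and only a handful of the properties are shown.

```cpp
#include <cassert>
#include <string>

// Hypothetical container for a subset of the Darwin properties above.
// An empty string stands in for "directive not available" (the 0 case).
struct DarwinAsmProperties {
  std::string Comment;
  std::string Global;
  std::string Local;
  std::string Data64bitsDirective;
  bool AlignmentIsInBytes;
  unsigned AddressSize;
};

// Evaluates the conditional properties once for the selected subtarget.
DarwinAsmProperties makeDarwinProperties(bool isPPC64) {
  DarwinAsmProperties P;
  P.Comment = ";";
  P.Global = "_";
  P.Local = "L";
  P.Data64bitsDirective = isPPC64 ? ".quad" : "";
  P.AlignmentIsInBytes = false;
  P.AddressSize = isPPC64 ? 8 : 4;
  assert(P.AddressSize == 4 || P.AddressSize == 8);
  return P;
}
```

Evaluating `isPPC64` once at construction is a design assumption; the conditionals could equally be evaluated each time a property is read.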
Straightforward. But what I really want is code smidgens. For example, in the Darwin asm printer we have:
SwitchToTextSection(".section __TEXT,__symbol_stub1,symbol_stubs,"
                    "pure_instructions,16", 0);
EmitAlignment(4);
O << "L" << *i << "$stub:\n";
O << "\t.indirect_symbol " << *i << "\n";
O << "\tlis r11,ha16(L" << *i << "$lazy_ptr)\n";
if (isPPC64)
  O << "\tldu r12,lo16(L" << *i << "$lazy_ptr)(r11)\n";
else
  O << "\tlwzu r12,lo16(L" << *i << "$lazy_ptr)(r11)\n";
O << "\tmtctr r12\n";
O << "\tbctr\n";
SwitchToDataSection(".lazy_symbol_pointer", 0);
O << "L" << *i << "$lazy_ptr:\n";
O << "\t.indirect_symbol " << *i << "\n";
if (isPPC64)
  O << "\t.quad dyld_stub_binding_helper\n";
else
  O << "\t.long dyld_stub_binding_helper\n";
What if I could do something like the following in the .tai file?
function printFunctionStub(Name) {
.section __TEXT,__symbol_stub1,symbol_stubs,pure_instructions,16
.align 4
L#{Name}$stub:
.indirect_symbol #{Name} ; comment of some sort
lis r11,ha16(#{Local}#{Name}$lazy_ptr)
#{isPPC64?"ldu":"lwzu"} r12,lo16(#{Local}#{Name}$lazy_ptr)(r11)
mtctr r12
bctr
.lazy_symbol_pointer
L#{Name}$lazy_ptr:
.indirect_symbol #{Name} ; comment of some sort
.ptr dyld_stub_binding_helper
}
Or more generically:
function printFunctionStub(Name) {
.section __TEXT,__symbol_stub1,symbol_stubs,pure_instructions,16
.align 4
#{Local}#{Name}$stub:
.indirect_symbol #{Name} #{Comment} comment of some sort
lis r11,ha16(#{Local}#{Name}$lazy_ptr)
#{isPPC64?"ldu":"lwzu"} r12,lo16(#{Local}#{Name}$lazy_ptr)(r11)
mtctr r12
bctr
.lazy_symbol_pointer
#{Local}#{Name}$lazy_ptr:
.indirect_symbol #{Name} #{Comment} comment of some sort
.ptr dyld_stub_binding_helper
}
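To make the idea concrete, here is a rough sketch of the C++ that the generic printFunctionStub might expand to. Everything in it is an assumption for illustration: `StubProperties` and `functionStubText` are hypothetical names, the output goes to a plain `std::ostream`, `.align` is passed through unchanged, and `.ptr` is resolved to `.long` or `.quad` to match the hand-written printer shown earlier.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical subset of the properties a generated printer would consult.
struct StubProperties {
  std::string Local;   // local label prefix, "L" on Darwin
  std::string Comment; // comment leader, ";" in the properties above
  bool isPPC64;
  bool Verbose;        // when false, comments are dropped
};

// Sketch of the C++ that printFunctionStub(Name) might expand to.
void printFunctionStub(std::ostream &O, const StubProperties &P,
                       const std::string &Name) {
  assert(!Name.empty() && "stub needs a symbol name");
  const std::string LazyPtr = P.Local + Name + "$lazy_ptr";
  const std::string Note =
      P.Verbose ? "\t" + P.Comment + " comment of some sort" : std::string();

  O << "\t.section\t__TEXT,__symbol_stub1,symbol_stubs,pure_instructions,16\n";
  O << "\t.align\t4\n";
  O << P.Local << Name << "$stub:\n";
  O << "\t.indirect_symbol\t" << Name << Note << "\n";
  O << "\tlis\tr11,ha16(" << LazyPtr << ")\n";
  O << "\t" << (P.isPPC64 ? "ldu" : "lwzu") << "\tr12,lo16(" << LazyPtr
    << ")(r11)\n";
  O << "\tmtctr\tr12\n";
  O << "\tbctr\n";
  O << "\t.lazy_symbol_pointer\n";
  O << LazyPtr << ":\n";
  O << "\t.indirect_symbol\t" << Name << Note << "\n";
  // .ptr becomes a pointer-sized data directive.
  O << "\t" << (P.isPPC64 ? ".quad" : ".long")
    << "\tdyld_stub_binding_helper\n";
}

// Convenience wrapper that returns the stub text as a string.
std::string functionStubText(const StubProperties &P, const std::string &Name) {
  std::ostringstream OS;
  printFunctionStub(OS, P, Name);
  return OS.str();
}
```

The section-switching bookkeeping (emit .section only when the section changes) is left out here to keep the sketch short.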
General Notes:
- TAI functions generate equivalent C++ functions.
- The TAI tool recognizes generic features of asm code, like labels, ops, operands and comments, and
does the correct thing based on the current properties.
- All arguments and variables would always be assumed to be strings, so numeric values and machine
operands would be converted to strings.
- Tokens in the form of .xxxxx would be recognized as directives and may be acted upon.
For example, .section would only be emitted if it differs from the current section, and .text may become
.section __TEXT,__text or whatever.
- Patterns in the form of #{expression} get substituted with the result of the expression (à la scripting/
Ruby.)
- Variables in expressions may be visible properties, target-constructed values or arguments.
- L$ and G$ are shorthand for #{Local} and #{Global}. Simplifies defining labels.
- ";" is shorthand for #{Comment}. Simplifies defining comments. If the verbose flag is off then all
comments are suppressed.
- Spaces between first column/directives, directives/operand and operand/comment are converted to
tabs.
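The substitution rules in these notes can be illustrated with a small expander. This is a minimal sketch under stated assumptions: `expandLine` is a hypothetical helper, it handles plain variable names only (a real tool would need to evaluate full expressions such as isPPC64?"ldu":"lwzu"), and it treats L$ and G$ as shorthand only at the start of a token.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Expands one line of a TAI body.
//  - #{Name} is replaced by the value bound to Name (variables only).
//  - L$ and G$ are shorthand for #{Local} and #{Global}.
// Unknown variables expand to the empty string in this sketch.
std::string expandLine(const std::string &Line,
                       const std::map<std::string, std::string> &Vars) {
  auto lookup = [&Vars](const std::string &Key) -> std::string {
    auto It = Vars.find(Key);
    return It == Vars.end() ? std::string() : It->second;
  };

  std::string Out;
  for (std::string::size_type I = 0; I < Line.size();) {
    // #{...}: substitute the named variable.
    if (Line.compare(I, 2, "#{") == 0) {
      std::string::size_type Close = Line.find('}', I + 2);
      if (Close != std::string::npos) {
        Out += lookup(Line.substr(I + 2, Close - I - 2));
        I = Close + 1;
        continue;
      }
    }
    // L$ / G$: only when not preceded by an identifier character.
    bool AtTokenStart =
        I == 0 || !(std::isalnum(static_cast<unsigned char>(Line[I - 1])) ||
                    Line[I - 1] == '_');
    if (AtTokenStart &&
        (Line.compare(I, 2, "L$") == 0 || Line.compare(I, 2, "G$") == 0)) {
      Out += lookup(Line[I] == 'L' ? "Local" : "Global");
      I += 2;
      continue;
    }
    Out += Line[I++];
  }
  return Out;
}
```

With Local bound to "L" and Name bound to "_foo", the line `#{Local}#{Name}$lazy_ptr:` expands to `L_foo$lazy_ptr:`.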
Example Notes:
- The above tai function gets translated into a C++ function called
DarwinAsmPrinter::printFunctionStub (based on name and namespace.)
- .align would be assumed to be in bytes but properties may imply a conversion to log2 bytes.
- .indirect_symbol is not recognized so is written as is.
- #{isPPC64?"ldu":"lwzu"} gets evaluated at runtime and the result is substituted into the output.
- .ptr is recognized as a special directive and emits .word or .quad (based on properties.) .quad could
get converted to two .long if not supported.
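The .align and .ptr handling described in these notes might look something like the following. This is a sketch under assumptions: `DirectiveProperties`, `alignOperand` and `pointerDirective` are hypothetical names, and the choice of .word for the 32-bit case simply follows the wording of the note above.

```cpp
#include <cassert>
#include <string>

// Hypothetical properties consulted when rewriting directives.
struct DirectiveProperties {
  bool AlignmentIsInBytes; // false on Darwin: .align takes log2(bytes)
  unsigned AddressSize;    // 4 or 8
};

// Converts a byte alignment (a power of two) to the operand that the
// target's .align directive expects.
unsigned alignOperand(unsigned Bytes, const DirectiveProperties &P) {
  assert(Bytes != 0 && (Bytes & (Bytes - 1)) == 0 && "power of two expected");
  if (P.AlignmentIsInBytes)
    return Bytes;
  unsigned Log2 = 0;
  while ((1u << Log2) < Bytes)
    ++Log2;
  return Log2;
}

// Picks the data directive that the special .ptr directive stands for.
std::string pointerDirective(const DirectiveProperties &P) {
  return P.AddressSize == 8 ? ".quad" : ".word";
}
```

Splitting an unsupported .quad into two .long directives would need the target's byte order as well, so it is not attempted here.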
Dwarf example:
Instead of:
EmitInt32(0x1c); EOL("Length of Address Ranges Info");
EmitInt16(DWARF_VERSION); EOL("Dwarf Version");
EmitReference("info_begin", Unit->getID());
EOL("Offset of Compilation Unit Info");
EmitInt8(TAI->getAddressSize()); EOL("Size of Address");
EmitInt8(0); EOL("Size of Segment Descriptor");
EmitInt16(0); EOL("Pad (1)");
EmitInt16(0); EOL("Pad (2)");
// Range 1
EmitReference("text_begin", 0); EOL("Address");
EmitDifference("text_end", 0, "text_begin", 0); EOL("Length");
EmitInt32(0); EOL("EOM (1)");
EmitInt32(0); EOL("EOM (2)");
We'd have:
namespace Dwarf;
function printDwarfARanges(Version, InfoNum) {
.word 0x1c ; Length of Address Ranges Info
.short #{Version} ; Dwarf Version
.ptr L$debug_info_begin#{InfoNum} ; Offset of Compilation Unit Info
.byte #{AddressSize} ; Size of Address
.byte 0 ; Size of Segment Descriptor
.short 0, 0 ; Pad
.ptr L$debug_text_begin ; Address
.ptr L$debug_text_end-L$debug_text_begin ; Length
.word 0, 0 ; EOM
}
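Each line in a function like this has a directive, an operand and a comment, so the tab conversion and verbose-flag rules from the general notes could be handled by one small formatter. The following is a hedged sketch; `formatDirective` and its parameters are hypothetical and not part of any existing LLVM interface.

```cpp
#include <cassert>
#include <string>

// Formats one directive line: tab, directive, tab, operand, and (only when
// Verbose is set and a comment was given) a tab followed by the comment
// leader and the comment text.
std::string formatDirective(const std::string &Directive,
                            const std::string &Operand,
                            const std::string &Comment,
                            const std::string &CommentLeader, bool Verbose) {
  assert(!Directive.empty() && Directive[0] == '.' && "expected a directive");
  std::string Line = "\t" + Directive + "\t" + Operand;
  if (Verbose && !Comment.empty())
    Line += "\t" + CommentLeader + " " + Comment;
  return Line + "\n";
}
```

Passing the comment leader in as an argument keeps the formatter target-neutral; a generated printer would presumably read it from the Comment property.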
Much easier to read and maintain, and it allows generic code patterns to be used across targets.
Now back to why not .td files. No good reason; just add "here" strings to .td files and it might work
out fine.
<<<EOS
.word 0x1c ; Length of Address Ranges Info
.short #{Version} ; Dwarf Version
.ptr L$debug_info_begin#{InfoNum} ; Offset of Compilation Unit Info
.byte #{AddressSize} ; Size of Address
.byte 0 ; Size of Segment Descriptor
.short 0, 0 ; Pad
.ptr L$debug_text_begin ; Address
.ptr L$debug_text_end-L$debug_text_begin ; Length
.word 0, 0 ; EOM
EOS