[LLVMdev] me being stupid: me vs the llvm codebase...
Gordon Henriksen
gordonhenriksen at mac.com
Tue Oct 23 08:45:48 PDT 2007
On Oct 23, 2007, at 05:52, BGB wrote:
> I am assuming then that some external assembler is used (such as
> 'gas')?...
In the static compilers, yes. The JIT directly serializes
instructions into memory without the aid of an external assembler.
There are also experimental built-in assemblers; LLVM calls them
object writers[1].
> it looks like much of the interconnection and data sharing is done
> through objects and templates?...
That's correct. The LLVM intermediate representation (IR) is well-
suited for many transformations and analyses, which are generally
structured as passes[2]. The LLVM IR has object-oriented[3],
textual (.ll)[4], and binary (.bc "bitcode")[5] representations;
all are fully equivalent. However, it is more efficient not to wring
the program through multiple print/parse or write/read cycles, so the
object-oriented representation is generally maintained within any
single process.
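As an illustrative sketch, here is what a trivial function looks like in the textual (.ll) form; `llvm-as` converts it to the equivalent bitcode, `llvm-dis` round-trips it back to text, and the C++ API builds the same structure in memory:

```llvm
; add.ll -- the textual form of a trivial function.
; `llvm-as add.ll` produces the equivalent bitcode (add.bc);
; `llvm-dis add.bc` recovers this text.
define i32 @add(i32 %a, i32 %b) {
entry:
  %sum = add i32 %a, %b
  ret i32 %sum
}
```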
The code generators also convert the program into the SelectionDAG
and MachineFunction forms, both of which are target-independent in
form but not in content.[6] Each of these forms has multiple states
with differing invariants. (Strictly speaking, however, these forms
are private to each code generator; the C backend does not use
either.) These code generation forms do not have first-class textual
or binary representations, since they are ephemeral data structures
used only during code generation. They can, however, be dumped to
human-readable text or viewed with Graphviz.
> it doesn't appear that working like a dynamic compiler is a major
> design goal (so I hear it can be used this way, but this is not
> the focus).
>
> so, it looks like the design focuses mostly on taking the input
> modules, grinding and mixing them, and doing lots of spiffy inter-
> module optimizations (presumably forming a monolithic output
> representing the entire project?...).
LLVM does work well as a static (offline) compiler, where inter-
procedural optimization and link-time optimization are useful. In
llvm-gcc, link-time optimization ("mixing," as you say) occurs only
at -O4. Typically, IPO is performed only within a single
compilation unit (-O3/-O2). No IPO is performed at -O0.
> as a result, my compiler generally refrains from inlining things or
> doing brittle inter-function optimizations (after all, one could
> potentially relink parts of the image and break things...).
It's possible to use LLVM in the same manner by simply refraining
from the use of inter-procedural optimizations.
If LLVM bitcode is used as the on-disk representation, however, LLVM
would allow the use of offline optimizations before starting the JIT
program. This could include IPO or LTO at the developer's option, and
would be entirely safe if the unit of dynamism were restricted to an
LLVM module, since LTO merges modules together.
> how well would LLVM work for being used in a manner comparable to
> LISP-style eval (or Self, Smalltalk, or Python style incremental
> restructuring)?...
Simply codegen the string into a function at runtime, JIT it, and
call it.[7] Afterwards, the IR and the machine code representation
can be deleted.
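For instance, to eval the expression "4+5", a front end might build an anonymous nullary function like the one below (the name @__eval is hypothetical), hand it to the JIT, call it, and then discard both the IR and the machine code:

```llvm
; IR a front end might build to eval "4+5": a no-argument
; function wrapping the expression. (The constant folder would
; typically reduce the body to `ret i32 9` before the JIT runs.)
define i32 @__eval() {
entry:
  %tmp = add i32 4, 5
  ret i32 %tmp
}
```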
> and incrementally replacing functions or modules at runtime?...
Generally speaking, LLVM neither helps nor hinders here. Maybe
someone will follow up on whether the JIT uses stub functions, which
would enable dynamic relinking. If not, it would be a straightforward,
if platform-specific, feature to add.
— Gordon
[1]
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/
ELFWriter.cpp?view=markup
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/
MachOWriter.cpp?view=markup
[2] http://llvm.org/docs/WritingAnLLVMPass.html
[3] http://llvm.org/docs/ProgrammersManual.html#coreclasses
[4] http://llvm.org/docs/LangRef.html
[5] http://llvm.org/docs/BitCodeFormat.html
[6] http://llvm.org/docs/CodeGenerator.html
[7] watch this space, currently under rapid construction:
http://llvm.org/docs/tutorial/
In particular, observe the HandleTopLevelExpression function in §3.3
"Implementing Code Generation to LLVM IR." That function will be
extended to handle the eval usage in §3.4 "Adding JIT and Optimizer
Support."