Hi all,<br><br>We are interested in implementing a loop-aware memory profiling tool in LLVM, that is, one that tracks memory dependences across loop back edges in particular. To do so, we need to annotate individual instructions with profile information. Our earlier profiling system worked as follows:<br>
<br>(1) The first pass in the compiler instruments individual load/store instructions in the program with calls to the profiling library, which keeps track of the src/dst addresses of load/store instructions and how often they are executed.<br>
(2) The instrumented program is run with some input, and the profiling library writes its output to a separate file.<br>(3) The second pass in the compiler reads in this profile information and annotates each load/store instruction with the set of other load/store instructions (along with profile weights) that it depends on for the current input set. This information is then used by the subsequent passes and eventually dropped before code generation.<br>
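<br>To make the three steps concrete, here is a minimal sketch of what the profiling library from step (1) might look like. It assumes the instrumentation pass has already given every load/store a numeric id, that it also inserts a call at each loop back edge, and that dependences are matched on exact addresses only. All names (MemProfiler, onLoad, onStore, onBackEdge) are hypothetical and are not LLVM APIs.<br>

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <map>
#include <unordered_map>
#include <utility>

// Hypothetical profiling runtime linked into the instrumented program.
// Instruction ids are assumed to be assigned by the instrumentation pass.
class MemProfiler {
public:
  // How often a store -> load dependence was observed.
  struct Counts {
    std::uint64_t sameIteration = 0;   // store and load in the same iteration
    std::uint64_t crossIteration = 0;  // store happened in an earlier iteration
  };

  // Called at each loop back edge (assumed to be instrumented as well).
  void onBackEdge() { ++iteration_; }

  // Called before each instrumented store: remember the last writer.
  void onStore(std::uint32_t storeId, std::uintptr_t addr) {
    lastWriter_[addr] = {storeId, iteration_};
  }

  // Called before each instrumented load: if an instrumented store last
  // wrote this exact address, record a store -> load dependence.
  void onLoad(std::uint32_t loadId, std::uintptr_t addr) {
    auto it = lastWriter_.find(addr);
    if (it == lastWriter_.end()) return;
    Counts &c = deps_[{it->second.id, loadId}];
    if (it->second.iteration == iteration_) ++c.sameIteration;
    else ++c.crossIteration;
  }

  // Look up the counts for one (store, load) pair; zero if never seen.
  Counts counts(std::uint32_t storeId, std::uint32_t loadId) const {
    auto it = deps_.find({storeId, loadId});
    return it == deps_.end() ? Counts{} : it->second;
  }

  // Step (2): write the profile so a later compiler pass can read it back.
  void dump(std::FILE *out) const {
    for (const auto &d : deps_)
      std::fprintf(out, "%u -> %u same=%llu cross=%llu\n",
                   static_cast<unsigned>(d.first.first),
                   static_cast<unsigned>(d.first.second),
                   static_cast<unsigned long long>(d.second.sameIteration),
                   static_cast<unsigned long long>(d.second.crossIteration));
  }

private:
  struct Writer { std::uint32_t id; std::uint64_t iteration; };
  std::unordered_map<std::uintptr_t, Writer> lastWriter_;
  std::map<std::pair<std::uint32_t, std::uint32_t>, Counts> deps_;
  std::uint64_t iteration_ = 0;
};

// Simulates `for (i = 0; i < n; ++i) a[i + 1] = a[i] + 1;` with 4-byte
// elements, where the load has id 1 and the store has id 2.
MemProfiler simulateRecurrence(unsigned n) {
  MemProfiler p;
  const std::uintptr_t base = 0x1000;
  for (std::uintptr_t i = 0; i < n; ++i) {
    p.onLoad(1, base + 4 * i);
    p.onStore(2, base + 4 * (i + 1));
    p.onBackEdge();
  }
  return p;
}
```

In the simulated loop every load except the first reads a value stored in the previous iteration, so all recorded dependences from store 2 to load 1 are cross-iteration ones. A real tool would also have to handle nested loops and partially overlapping accesses, which this sketch ignores.<br>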
<br>What would be a good way to get started on implementing this sort of system in LLVM? Specifically: as far as we know, there is no way to give individual instructions ids/tags so that pass (3) can uniquely identify them. Is this correct, or is there some other way to uniquely identify instructions (by just reading the .ll/.bc file)?<br>
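<br>One fallback we can imagine, sketched below purely as an assumption, is to number the load/store instructions by their position (function name, basic block index, instruction index) and to recompute the same numbering in pass (1) and pass (3). This only works if the IR is not modified between the two passes. ToyBlock, ToyFunction and numberMemoryOps are hypothetical stand-ins for illustration and are not LLVM classes.<br>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Toy stand-ins for the IR; hypothetical, not LLVM classes.
struct ToyBlock { std::vector<std::string> opcodes; };
struct ToyFunction { std::string name; std::vector<ToyBlock> blocks; };

// (function name, block index, instruction index) locates one instruction.
using Position = std::tuple<std::string, std::size_t, std::size_t>;

// Walks the module in a fixed order and gives each load/store the next id.
// Running this twice on the same unmodified module yields identical ids,
// which is what lets pass (3) match profile records back to instructions.
std::map<Position, std::uint32_t>
numberMemoryOps(const std::vector<ToyFunction> &module) {
  std::map<Position, std::uint32_t> ids;
  std::uint32_t next = 0;
  for (const ToyFunction &f : module)
    for (std::size_t b = 0; b < f.blocks.size(); ++b)
      for (std::size_t i = 0; i < f.blocks[b].opcodes.size(); ++i) {
        const std::string &op = f.blocks[b].opcodes[i];
        if (op == "load" || op == "store")
          ids[Position{f.name, b, i}] = next++;
      }
  return ids;
}
```

The obvious weakness is that any transformation run between the two passes invalidates the numbering, which is why ids/tags stored in the IR itself would be preferable.<br>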
<br><div class="gmail_quote">On Sat, Jun 28, 2008 at 6:58 PM, Chris Lattner &lt;<a href="mailto:sabre@nondot.org">sabre@nondot.org</a>&gt; wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<br>
On Jun 25, 2008, at 6:33 AM, Matthijs Kooijman wrote:<br>
> Hi all,<br>
<br>
Howdy Matthijs,<br>
<div class="Ih2E3d"><br>
> I've also been developing an interest in using IR annotations for my<br>
> compiler.<br>
> Discussion with Bart revealed that he has implemented some<br>
> code to parse<br>
> the llvm.global.annotations array, but in no way integrated or<br>
> reusable.<br>
> We've spent some thought about how this could be done properly,<br>
> which I will<br>
> share here.<br>
<br>
</div>Ok, cool. Annotations are tricky to do right :)<br>
<div class="Ih2E3d"><br>
> Firstly, however, I was wondering about the format of the<br>
> llvm.global.annotations array. It does not seem to be defined in<br>
> the LLVM<br>
> language reference, shouldn't it be? Its name suggests that it is a<br>
> reserved<br>
> variable name with a fixed type (similar to intrinsic functions?).<br>
<br>
</div>Yes, we should document it. It is a convention established by the<br>
__builtin_annotate function in the C compilers. We should standardize<br>
it and document it.<br>
<div class="Ih2E3d"><br>
> Furthermore, it seems that the AnnotationManager that is currently<br>
> implemented<br>
> is capable of keeping a list of Annotations for any Annotatable<br>
> (currently<br>
> only Function). These annotations are kept in memory only and really<br>
> have<br>
> nothing to do at all with the annotations in the IR.<br>
<br>
</div>Yes, this is a really old mechanism that we should rip out.<br>
MachineFunction should be moved to be an analysis that is preserved as<br>
an actual part of the passmanager, instead of being a thing we tack<br>
onto the Function object. We have killed all uses of this old<br>
annotation mechanism except MachineFunction.<br>
<div class="Ih2E3d"><br>
> Still, using the AnnotationManager to make the IR<br>
> annotations<br>
> accessible seems like a decent approach.<br>
<br>
</div>I agree that *having* an annotationmanager makes sense, but the<br>
existing one should die and be replaced. :)<br>
<div class="Ih2E3d"><br>
> The way I see this is having some pass, or probably the assembly<br>
> reader or the<br>
> AnnotationManager itself, parsing the llvm.global.annotations<br>
> variable and<br>
> adding annotations to the corresponding GlobalValues. This would<br>
> just leave the<br>
> annotations in the IR as well, so that transformation passes would<br>
> properly<br>
> preserve them (and, just like debug info, sometimes be prevented from<br>
> modifying some annotated global values unless they are taught how to<br>
> preserve<br>
> the annotations).<br>
<br>
</div>Makes sense. This is similar to how the MachineDebugInfo stuff<br>
deserializes debug info out of the LLVM IR and presents it for easy<br>
consumption of the code generator.<br>
<div class="Ih2E3d"><br>
> By using a subclass of Annotation (say, GlobalAnnotation) we can<br>
> distinguish<br>
> between annotations that are (or should be) in the IR and (the<br>
> existing)<br>
> annotations that should be in memory only. This would also allow for<br>
> newly<br>
> added annotations to immediately be added to the IR, ensuring<br>
> that the<br>
> AnnotationManager's view remains consistent with the IR.<br>
<br>
</div>I think we need to distinguish between two forms of annotation:<br>
<br>
1. there are some "annotations" like "readonly", "nounwind", etc that<br>
are baked into the LLVM IR and are/should be documented in LangRef.<br>
<br>
2. There are annotations that are really "cheap extensions" of the<br>
LLVM IR that are either experimental, very domain specific, or that<br>
are just metadata about the code.<br>
<br>
For #1, the current "parameter attributes" we have work reasonably<br>
well, and Devang is actually cooking up a proposal to extend them a<br>
bit (to fix some issues with LTO). #2 is something that llvm.annotate<br>
handles reasonably well, but I agree it would be great to have a nice<br>
interface to update/read them.<br>
<br>
The advantage of #1 is that the compiler as a whole knows about the<br>
attributes, but this means that adding one is "hard". The advantage<br>
of #2 is that they are easy to add, but they have limitations and can<br>
impact codegen (e.g. they disable IPO in some cases).<br>
<div class="Ih2E3d"><br>
> A problem I could imagine using this approach would be name<br>
> conflicts. Since<br>
> any annotation name could come from the IR, these could conflict with<br>
> the other<br>
> names already in use (such as "CodeGen::MachineFunction" IIRC). This<br>
> could be<br>
> solved by using a "GlobalAnnotation::" prefix for the name, or<br>
> something<br>
> similar.<br>
<br>
</div>It could also be solved by making them completely string-based and<br>
just providing a simple string interface? That way you don't need<br>
classes for each attribute.<br>
<font color="#888888"><br>
-Chris<br>
</font><div><div></div><div class="Wj3C7c">
</div></div></blockquote></div><br>