[llvm-dev] Getting LLVM Instructions

Mon Jul 20 14:01:38 PDT 2020

If you have the Module, you could record in a separate table just a
series of numbers - like "5th function, 10th instruction"?
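
A rough sketch of what I mean, in plain C++ with no LLVM dependency. The
names TraceEntry, writeTrace and readTrace are made up for illustration,
and it assumes the indexes come from iterating the Module's functions and
instructions in their normal order, with the Module left unmodified
between recording and reading. The Module itself would be written out once
(for instance with llvm::WriteBitcodeToFile), and each trace becomes a
small side file of index pairs in execution order:

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical record: position of one executed instruction inside the
// Module, e.g. {4, 9} for "5th function, 10th instruction" (zero-based).
struct TraceEntry {
  std::uint32_t function_index;
  std::uint32_t instruction_index;
};

// Write one 32-bit value as four little-endian bytes.
static void putU32(std::ostream &out, std::uint32_t v) {
  for (int shift = 0; shift < 32; shift += 8)
    out.put(static_cast<char>((v >> shift) & 0xFF));
}

// Read one little-endian 32-bit value; returns false at end of stream.
static bool getU32(std::istream &in, std::uint32_t &v) {
  char bytes[4];
  if (!in.read(bytes, 4))
    return false;
  v = 0;
  for (int i = 0; i < 4; ++i)
    v |= static_cast<std::uint32_t>(static_cast<unsigned char>(bytes[i]))
         << (8 * i);
  return true;
}

// Serialize the trace in execution order, 8 bytes per entry.
void writeTrace(std::ostream &out, const std::vector<TraceEntry> &trace) {
  for (const TraceEntry &e : trace) {
    putU32(out, e.function_index);
    putU32(out, e.instruction_index);
  }
}

// Read the entries back in the same order they were written.
std::vector<TraceEntry> readTrace(std::istream &in) {
  std::vector<TraceEntry> trace;
  TraceEntry e;
  while (getU32(in, e.function_index)) {
    bool ok = getU32(in, e.instruction_index);
    assert(ok && "truncated trace: function index without instruction index");
    (void)ok;
    trace.push_back(e);
  }
  return trace;
}
```

The reader would then load the Module once, build the same
index-to-Instruction mapping by walking it in the same order, and replay
each trace by lookup. Repeated instructions cost 8 bytes each in the trace
instead of a full textual print.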

On Mon, Jul 20, 2020 at 1:58 PM Yugesh Kothari <kothariyugesh at gmail.com> wrote:
>
> It would have been easier to do that - only thing is that the order of the instructions is important and unique for each trace.
> The llvm::Module will be more like the same program but in llvm IR assembly not C.
> I am interested in preserving the order in which instructions and function calls happen in each trace.
> I am not sure if I can create a module that would fit this requirement.
>
> Thanks!
>
> On Tue, 21 Jul, 2020, 1:56 am David Blaikie, <dblaikie at gmail.com> wrote:
>>
>> Somewhat - though perhaps it's easier to emit the whole Module you
>> already have, then? & go hunting for the desired instructions from
>> scratch when you parse that Module back in? Rather than filtering
>> before writing, write IR unfiltered, and filter when reading it.
>>
>> Yes, it would be difficult to create any coherent representation of a
>> collection of unrelated Instructions to write out to a Module.
>>
>> On Mon, Jul 20, 2020 at 1:01 PM Yugesh Kothari <kothariyugesh at gmail.com> wrote:
>> >
>> > Maybe I did not state my use case correctly, I apologise for the confusion.
>> >
>> > I have two different use cases -
>> >
>> > 1. I have a list of function call instructions from which I can get a list of functions, and subsequently print them out instruction by instruction.
>> >
>> > 2. I have a list of llvm::Instruction* directly (obtained by storing each instruction into a vector as it was executed).
>> >
>> > For the second case, the number of instructions is well over 10,000. (Even for the first case, going through each Instruction of each function call, the total number of instructions to be printed is huge).
>> >
>> > I have 25-30 such traces, and printing everything with llvm::Value::print takes a couple of hours, so a textual representation produced that way is not practical.
>> >
>> > Binary dumps would include a lot of handling (since I need to resolve pointers of all objects I want to dump).
>> >
>> > In the best case, it would be nice if I could club together the instructions into some container that I can use the `clang -emit-llvm` method on. I am inclined to think this cannot be an llvm::Module because (as I understand) just a list of instructions cannot be clubbed together to create a valid Module.
>> >
>> > Does that offer more clarity for my use case (and why I am disinclined to use llvm::Value::print)?
>> >
>> > Thanks!
>> >
>> >
>> > On Tue, 21 Jul, 2020, 1:16 am David Blaikie, <dblaikie at gmail.com> wrote:
>> >>
>> >> I'm not sure that LLVM's bitcode format would natively support just a
>> >> handful of Instructions, rather than a whole llvm::Module.
>> >>
>> >> If you really want just a handful of instructions, maybe text is the
>> >> way to go - it sounded like you were serializing whole functions, at
>> >> least - which could be copied/cloned/moved into a standalone
>> >> llvm::Module and serialized from there. If it's only select
>> >> instructions, then maybe text is fine? Or maybe you can summarize the
>> >> information you want from the call more succinctly than LLVM's textual
>> >> representation.
>> >>
>> >> On Mon, Jul 20, 2020 at 12:00 PM Yugesh Kothari <kothariyugesh at gmail.com> wrote:
>> >> >
>> >> > Replicating what clang -emit-llvm does sounds like the better way to do it.
>> >> >
>> >> > I was looking under IRPrintingPasses but couldn't find anything specific that would allow me to print out, say, a std::vector<llvm::Instruction*>.
>> >> >
>> >> > What do you think would be the easiest way to do this? Can I do some hack where I can get away without writing my own llvm pass? I'm not even sure what the right question to ask is, since this is the first time I'm working with llvm.
>> >> >
>> >> > Thanks!
>> >> >
>> >> > On Mon, 20 Jul, 2020, 10:37 pm David Blaikie, <dblaikie at gmail.com> wrote:
>> >> >>
>> >> >> If you're trying to serialize LLVM IR and read it back again later -
>> >> >> yeah, probably best to use the binary serialization rather than the
>> >> >> textual. If I were doing this I'd try building something using clang
>> >> >> with -emit-llvm (that'll produce LLVM IR bitcode in the .o file) and
>> >> >> debug that to see which APIs are used to do that.
>> >> >>
>> >> >> On Mon, Jul 20, 2020 at 3:19 AM Yugesh Kothari via llvm-dev
>> >> >> <llvm-dev at lists.llvm.org> wrote:
>> >> >> >
>> >> >> > Hi,
>> >> >> >
>> >> >> > I am working on a project where I need to get a list of LLVM Functions that were called during an execution (for further analysis).
>> >> >> > To do this I have maintained a vector<llvm::Function*> which I print out to a .ll file at the end. However, this takes a lot of time since the number of call Instructions is HUGE.
>> >> >> > I feel that the bottleneck is the conversion from llvm::Function to std::string.
>> >> >> >
>> >> >> > How can I speed this up?
>> >> >> >
>> >> >> > I don't necessarily need it in .ll format. If there is a way to dump the entire llvm::Function object as a byte stream to a .dat file and read it back as objects in a separate script, that would work too. I'm not sure how to do this (I tried a few things that didn't work), so any help would be appreciated!
>> >> >> >
>> >> >> > Thanks!