[llvm-dev] Is the flow "llvm-extract -> llvm-link -> clang++ " supposed to be used in this way? To Extract and Re-insert functions?

Davide Italiano via llvm-dev llvm-dev at lists.llvm.org
Wed Aug 30 12:29:38 PDT 2017


On Wed, Aug 30, 2017 at 12:26 PM, Davide Italiano <davide at freebsd.org> wrote:
> On Tue, Aug 29, 2017 at 3:10 PM, Nicolas Agostini via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> Hi all,
>> First post to the list, I hope you can help or guide me on this task.
>>
>> I am involved in a project that requires re-linking extracted and edited IR
>> code.
>>
>> So I would like to know whether these tools can be used in this way:
>>
>> clang++-4.0 code03.cpp -emit-llvm -S -o code03.ll
>> llvm-extract-4.0 code03.ll -func main -S -o extracted_main.ll
>> llvm-link-4.0 code03.ll -only-needed -override extracted_main.ll -S -o linked_main.ll
>> clang++-4.0 linked_main.ll -o main.out
>>
>>
>> where code03.cpp is:
>>
>>> #include <iostream>
>>> using namespace std;
>>> int main()
>>> {
>>>   cout << "First Message\n ";
>>>   cout << "Second Message\n ";
>>>   cout << "Third Message\n ";
>>>   return 0;
>>> }
>>
>>
>>
>> I have been trying to extract a function's LLVM IR, modify it (preserving its
>> signature or not), and re-insert the function into the original IR file.
>> However, I get an error during the final compilation step
>> (clang++-4.0 linked_main.ll -o main.out):
>>
>>> main.ll:(.text+0x14): undefined reference to `.str'
>>> main.ll:(.text+0x34): undefined reference to `.str.1'
>>> main.ll:(.text+0x51): undefined reference to `.str.2'
>>
>>
>> and the linked_main.ll file has this section:
>>
>>> @.str.4 = private unnamed_addr constant [16 x i8] c"First Message\0A \00", align 1
>>> @.str.1.6 = private unnamed_addr constant [17 x i8] c"Second Message\0A \00", align 1
>>> @.str.2.8 = private unnamed_addr constant [16 x i8] c"Third Message\0A \00", align 1
>>> @.str = external hidden unnamed_addr constant [16 x i8], align 1
>>> @.str.1 = external hidden unnamed_addr constant [17 x i8], align 1
>>> @.str.2 = external hidden unnamed_addr constant [16 x i8], align 1
>>
>>
>>
>> But the function does not use the correct versions of the strings: the
>> linked "extracted_main" keeps referring to .str, .str.1 and .str.2. Am I not
>> supposed to do it this way?
>>
>
> llvm-extract changes the semantics, as it gives every GlobalValue
> external linkage for simplicity.
> Therefore, if you have GVs with internal (or, as here, private) linkage
> when you run llvm-extract, that information is lost.
> You may want to fix this; the relevant code is around here
> (Transforms/IPO/ExtractGV.cpp):
>
> ```
>       // For simplicity, just give all GlobalValues ExternalLinkage. A trickier
>       // implementation could figure out which GlobalValues are actually
>       // referenced by the Named set, and which GlobalValues in the rest of
>       // the module are referenced by the NamedSet, and get away with leaving
>       // more internal and private things internal and private. But for now,
>       // be conservative and simple.
>
>       // Visit the GlobalVariables.
>       for (Module::global_iterator I = M.global_begin(), E = M.global_end();
>            I != E; ++I) {
>         bool Delete =
>             deleteStuff == (bool)Named.count(&*I) && !I->isDeclaration();
>         if (!Delete) {
>           if (I->hasAvailableExternallyLinkage())
>             continue;
>           if (I->getName() == "llvm.global_ctors")
>             continue;
>         }
> ```
>
> Thanks,
>

I had forgotten, but apparently I opened a bug about this a while ago:
https://bugs.llvm.org/show_bug.cgi?id=31674

--
Davide

