[llvm-dev] Is the flow "llvm-extract -> llvm-link -> clang++ " supposed to be used in this way? To Extract and Re-insert functions?

Davide Italiano via llvm-dev llvm-dev at lists.llvm.org
Wed Aug 30 12:26:22 PDT 2017


On Tue, Aug 29, 2017 at 3:10 PM, Nicolas Agostini via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Hi all,
> First post to the list, I hope you can help or guide me on this task.
>
> I am involved in a project that requires re-linking extracted and edited
> IR code, so I want to know whether these tools can be used in this way:
>
> clang++-4.0 code03.cpp -emit-llvm -S -o code03.ll
> llvm-extract-4.0 code03.ll -func main -S -o extracted_main.ll
> llvm-link-4.0 code03.ll -only-needed -override extracted_main.ll -S -o linked_main.ll
> clang++-4.0 linked_main.ll -o main.out
>
>
> where code03.cpp is:
>
>> #include <iostream>
>> using namespace std;
>> int main()
>> {
>>   cout << "First Message\n ";
>>   cout << "Second Message\n ";
>>   cout << "Third Message\n ";
>>   return 0;
>> }
>
>
>
> I have been trying to extract a function's LLVM IR, modify it (preserving
> its signature or not), and re-insert it into the original IR file. However,
> I get an error during the final compilation step
> (clang++-4.0 linked_main.ll -o main.out):
>
>> main.ll:(.text+0x14): undefined reference to `.str'
>> main.ll:(.text+0x34): undefined reference to `.str.1'
>> main.ll:(.text+0x51): undefined reference to `.str.2'
>
>
>  and linked_main.ll file has this section:
>
>> @.str.4 = private unnamed_addr constant [16 x i8] c"First Message\0A \00", align 1
>> @.str.1.6 = private unnamed_addr constant [17 x i8] c"Second Message\0A \00", align 1
>> @.str.2.8 = private unnamed_addr constant [16 x i8] c"Third Message\0A \00", align 1
>> @.str = external hidden unnamed_addr constant [16 x i8], align 1
>> @.str.1 = external hidden unnamed_addr constant [17 x i8], align 1
>> @.str.2 = external hidden unnamed_addr constant [16 x i8], align 1
>
>
>
> But the function does not use the correct versions of the strings, because
> the linked "extracted_main" still refers to .str, .str.1 and .str.2. Am I
> not supposed to do it this way?
>

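llvm-extract changes the semantics of the module: for simplicity, it gives
every GlobalValue external linkage. Any GV that had internal or private
linkage when you ran llvm-extract therefore loses that information. In your
case the private string constants become external declarations in
extracted_main.ll, and the private definitions in code03.ll cannot satisfy
them at link time, hence the renamed copies and the undefined references.
If you want to fix this, the relevant code is around here
(Transforms/IPO/ExtractGV.cpp):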

```
      // For simplicity, just give all GlobalValues ExternalLinkage. A trickier
      // implementation could figure out which GlobalValues are actually
      // referenced by the Named set, and which GlobalValues in the rest of
      // the module are referenced by the NamedSet, and get away with leaving
      // more internal and private things internal and private. But for now,
      // be conservative and simple.

      // Visit the GlobalVariables.
      for (Module::global_iterator I = M.global_begin(), E = M.global_end();
           I != E; ++I) {
        bool Delete =
            deleteStuff == (bool)Named.count(&*I) && !I->isDeclaration();
        if (!Delete) {
          if (I->hasAvailableExternallyLinkage())
            continue;
          if (I->getName() == "llvm.global_ctors")
            continue;
        }
```
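
To make the failure mode concrete, here is a toy model of it in plain C++.
It does not use LLVM at all: Linkage, Global, Module, extractModel, linkModel
and countUnresolved are hypothetical names invented for this sketch, and the
fixed ".renamed" suffix stands in for the numeric suffixes seen in
linked_main.ll. It only assumes the two behaviors described above: globals
left out of the extraction become external declarations, and a private
definition is renamed on a name clash rather than resolving a reference from
another module.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only: these types and functions are hypothetical, not LLVM's API.
enum class Linkage { Private, External };

struct Global {
  Linkage linkage;
  bool isDefinition; // false: declaration only
};

using Module = std::map<std::string, Global>;

// Models the behavior described above for globals that are not extracted:
// each one is kept as a declaration and given external linkage.
Module extractModel(const Module &original) {
  Module out;
  for (const auto &entry : original)
    out[entry.first] = Global{Linkage::External, false};
  return out;
}

// Models linking src into dst. A private definition is local to its module,
// so on a name clash it is renamed instead of satisfying the declaration.
Module linkModel(const Module &dst, const Module &src) {
  Module out;
  for (const auto &entry : dst) {
    bool clash = src.count(entry.first) != 0;
    if (entry.second.linkage == Linkage::Private && clash) {
      // LLVM picks a numeric suffix; a fixed one keeps the sketch short.
      std::string renamed = entry.first + ".renamed";
      assert(out.count(renamed) == 0 && "toy suffix must stay unique");
      out[renamed] = entry.second;
    } else {
      out[entry.first] = entry.second;
    }
  }
  for (const auto &entry : src) {
    auto it = out.find(entry.first);
    if (it == out.end())
      out[entry.first] = entry.second; // nothing to resolve against
    else if (!it->second.isDefinition && entry.second.isDefinition)
      it->second = entry.second; // definition replaces declaration
  }
  return out;
}

// Counts external declarations left without a definition after linking.
int countUnresolved(const Module &m) {
  int n = 0;
  for (const auto &entry : m)
    if (entry.second.linkage == Linkage::External && !entry.second.isDefinition)
      ++n;
  return n;
}
```

In this model, a module holding three private definitions (.str, .str.1,
.str.2) linked against its own extracted copy ends up with three unresolved
declarations, which mirrors the three undefined references in the report. If
the same three definitions start out external, the count is zero.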

Thanks,

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare

