[llvm-dev] target-features attribute prevents inlining?

Haoran Xu via llvm-dev llvm-dev at lists.llvm.org
Fri Jun 12 21:20:58 PDT 2020


Hello,

I'm new to LLVM and I recently hit a weird problem with inlining behavior.
I managed to get a minimal repro and pin down the symptom, but I don't
understand the root cause or how I should properly handle it.

Below is IR consisting of two functions, '_Z2fnP10TestStructi' and
'testfn', with the latter calling the former. One would expect the
optimizer to inline the calls to '_Z2fnP10TestStructi', but it doesn't.
(The command line I used is 'opt -O3 test.ll -o test2.bc')

> source_filename = "a.cpp"
> target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
> target triple = "x86_64-unknown-linux-gnu"
>
> %struct.TestStruct = type { i8*, i32 }
>
> define dso_local i32 @_Z2fnP10TestStructi(%struct.TestStruct* %0, i32 %1) #0 {
>   %3 = getelementptr inbounds %struct.TestStruct, %struct.TestStruct* %0, i64 0, i32 0
>   %4 = load i8*, i8** %3, align 8
>   %5 = icmp eq i8* %4, null
>   %6 = add nsw i32 %1, 1
>   %7 = shl nsw i32 %1, 1
>   %8 = select i1 %5, i32 %6, i32 %7
>   ret i32 %8
> }
>
> define i32 @testfn(%struct.TestStruct* %0) {
> body:
>   %1 = call i32 @_Z2fnP10TestStructi(%struct.TestStruct* %0, i32 1)
>   %2 = call i32 @_Z2fnP10TestStructi(%struct.TestStruct* %0, i32 %1)
>   %3 = call i32 @_Z2fnP10TestStructi(%struct.TestStruct* %0, i32 %2)
>   %4 = call i32 @_Z2fnP10TestStructi(%struct.TestStruct* %0, i32 %3)
>   %5 = call i32 @_Z2fnP10TestStructi(%struct.TestStruct* %0, i32 %4)
>   %6 = call i32 @_Z2fnP10TestStructi(%struct.TestStruct* %0, i32 %5)
>   %7 = call i32 @_Z2fnP10TestStructi(%struct.TestStruct* %0, i32 %6)
>   %8 = call i32 @_Z2fnP10TestStructi(%struct.TestStruct* %0, i32 %7)
>   %9 = call i32 @_Z2fnP10TestStructi(%struct.TestStruct* %0, i32 %8)
>   %10 = call i32 @_Z2fnP10TestStructi(%struct.TestStruct* %0, i32 %9)
>   ret i32 %10
> }
>
> attributes #0 = { "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" }
>

It turns out that the failure to inline is caused by the 'target-features'
attribute in the last line. The function inlines properly if I remove the
'target-features' attribute from '_Z2fnP10TestStructi', or if I also attach
attribute group '#0' to 'testfn'.

So I think the symptom is that inlining does not work when two functions
have different 'target-features' attributes. However, I do not understand
the reasoning behind this, or how I should properly avoid the issue.
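
To make the suspected rule concrete, here is a small self-contained C++
model of it. This is only my guess from the symptom, not LLVM's actual code:
it assumes the callee may be inlined only when every feature it enables is
also enabled in the caller. The names 'parseFeatures' and 'inlineCompatible'
are made up for this sketch.

```cpp
#include <algorithm>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: split a "target-features" string such as
// "+sse,+sse2" into the set of enabled feature names.
// Entries that do not start with '+' (e.g. disabled features) are ignored.
std::set<std::string> parseFeatures(const std::string &attr) {
    std::set<std::string> enabled;
    std::stringstream ss(attr);
    std::string item;
    while (std::getline(ss, item, ',')) {
        if (item.size() > 1 && item[0] == '+')
            enabled.insert(item.substr(1));
    }
    return enabled;
}

// Assumed rule (a guess that fits the observed behavior): inlining is
// allowed only if the callee's enabled features are a subset of the
// caller's enabled features.
bool inlineCompatible(const std::string &callerAttr,
                      const std::string &calleeAttr) {
    const std::set<std::string> caller = parseFeatures(callerAttr);
    const std::set<std::string> callee = parseFeatures(calleeAttr);
    // std::includes needs sorted ranges; std::set iterates in sorted order.
    return std::includes(caller.begin(), caller.end(),
                         callee.begin(), callee.end());
}
```

Under this model, 'testfn' with no attribute and '_Z2fnP10TestStructi' with
'#0' are incompatible, while removing the attribute from the callee or
adding the same attribute to the caller makes them compatible. That matches
all three cases I observed, but it does not tell me whether the model is
right.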

Just for additional information, in my use case, the function
'_Z2fnP10TestStructi' is automatically extracted from IR generated by
clang++ with -O3, so the IR contains a bunch of attributes and
MetadataNodes. The function 'testfn' is generated by my logic using
llvm::IRBuilder at runtime, so the function does not contain any of those
attributes and MetadataNodes initially. The functions generated by clang++
and my functions are then fed together into optimization passes, and I
expect the optimizer to inline clang++ functions into my functions as
needed.

So, what is the proper workaround for this? Should I delete all the
attributes and MetadataNodes from the clang++-generated IR (and if yes, is
that sufficient to prevent all weird cases like this one)? I thought that
was a bad idea because they provide more information to the optimizer. If
not, what is the proper way of handling this?
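
One alternative I can imagine, based on the observation that giving
'testfn' the same attribute makes inlining work, is to give each generated
function the union of the 'target-features' of the functions it calls. The
sketch below only models the string handling; 'mergedFeatures' is a
hypothetical helper, and I have not verified that this is safe or
recommended (for example, the resulting features would still have to be
supported by the machine that runs the code).

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: build a "target-features" string for a generated
// caller that enables every feature enabled by any of its callees.
// Only entries starting with '+' are kept; the result is sorted and
// free of duplicates.
std::string mergedFeatures(const std::vector<std::string> &calleeAttrs) {
    std::set<std::string> enabled;
    for (const std::string &attr : calleeAttrs) {
        std::stringstream ss(attr);
        std::string item;
        while (std::getline(ss, item, ',')) {
            if (item.size() > 1 && item[0] == '+')
                enabled.insert(item);
        }
    }
    // Join the collected entries back into a comma-separated string.
    std::string result;
    for (const std::string &feature : enabled) {
        if (!result.empty())
            result += ',';
        result += feature;
    }
    return result;
}
```

In the real code I would presumably read the string from each callee's
"target-features" function attribute and set the merged string on the
generated function before running the optimization passes.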

Thanks!

Best regards,
Haoran