[llvm-dev] llvm emits unoptimized code

Thu Oct 31 11:36:33 PDT 2019

On Thu, Oct 31, 2019 at 11:26 AM David Blaikie <dblaikie at gmail.com> wrote:

> On Thu, Oct 31, 2019 at 11:17 AM Jorg Brown via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> On Thu, Oct 31, 2019 at 8:50 AM kamlesh kumar via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Hi Devs,
>>> Consider testcase here
>>> https://godbolt.org/z/qHZzqw
>>> When the optimization level is -O1 or above, it produces unoptimized code
>>> because it calls __tls_get_addr in loops,
>>> while with optimization disabled
>>> it produces a single call to __tls_get_addr outside of the loop.
>>> Is this a missed optimization by LLVM?
>>>
>>
>> It's interesting to me that there's a big difference between -fpie and -fpic.
>>
>> https://godbolt.org/z/klX3q3
>>
>> In particular, with -fpie, no call to __tls_get_addr is needed, so the
>> underlying considerations for optimization change.  This feels like the
>> optimizer isn't taking into account the overhead of -fpic, when
>> determining whether to hoist the address calculation out of the loop.
>>
>> On Thu, Oct 31, 2019 at 10:36 AM David Blaikie via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Looks pretty similar to the GCC generated code
>>
>>
>> Challenge accepted => https://godbolt.org/z/8PX2La
>>
>
> Which challenge? Sorry, could've linked to the godbolt I was looking at
> when I said that: https://godbolt.org/z/_07tOk - comparing GCC and Clang
> trunk on the code linked in the original post.
>

Right, your example showed where gcc and clang were similar.

My example https://godbolt.org/z/8PX2La showed where gcc produced code that
was possibly twice as fast as clang's code.
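
For anyone following along, here is a minimal sketch of what hoisting the
address calculation out of the loop looks like at the source level. This is
hypothetical code, not the test case behind any of the godbolt links above;
the names tls_sum, add_direct and add_hoisted are made up for illustration.

```cpp
#include <cstddef>

// Hypothetical thread-local accumulator (not taken from the linked examples).
thread_local long tls_sum = 0;

// Names the thread-local directly in the loop body. Under -fpic the address
// of tls_sum may require a call to __tls_get_addr, and it is up to the
// optimizer whether that call ends up inside or outside the loop.
void add_direct(const int* data, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        tls_sum += data[i];
}

// Takes the address once before the loop, so the loop body only
// dereferences an ordinary pointer.
void add_hoisted(const int* data, std::size_t n) {
    long* sum = &tls_sum;
    for (std::size_t i = 0; i < n; ++i)
        *sum += data[i];
}
```

Both functions compute the same result; what the compiler emits for each
depends on the compiler version and on flags such as -fpic or -fpie, so the
generated assembly is worth checking rather than assuming.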

-- Jorg