[cfe-dev] No loop optimisation?

mats petersson via cfe-dev cfe-dev at lists.llvm.org
Fri Nov 6 15:12:08 PST 2015


Ah, that explains why the "small example above" is not showing the
problem...

Thanks for the explanation.

Shall I raise a bug?

--
Mats

2015-11-06 22:57 GMT+00:00 Richard Smith <richard at metafoo.co.uk>:

> On Fri, Nov 6, 2015 at 2:43 PM, mats petersson via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> Actual source below. Now with:
>>
>> $ clang++ --version
>> clang version 3.8.0 (http://llvm.org/git/clang.git
>> 27b8a29273862fa1e9e296be8f9850a1459115c8) (http://llvm.org/git/llvm.git
>> 91950eea5582709ea263cc48bd97fdea817b53d1)
>> Target: x86_64-unknown-linux-gnu
>> Thread model: posix
>> InstalledDir: /usr/local/bin
>>
>>
>> #include <cstdio>
>> #include <cstring>
>>
>> #define LOOP_COUNT 1000000000
>>
>> unsigned long long rdtscl(void)
>> {
>>     unsigned int lo, hi;
>>     __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
>>     return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 );
>> }
>>
>> int main()
>> {
>>     unsigned long long before = rdtscl();
>>     size_t ret;
>>     for (int i = 0; i < LOOP_COUNT; i++)
>>         ret = strlen("abcd");
>>     unsigned long long after = rdtscl();
>>     printf("Strlen %lld ret=%zd\n",(after - before),  ret);
>>
>>     before = rdtscl();
>>     for (int i = 0; i < LOOP_COUNT; i++)
>>         ret = sizeof("abcd");
>>     after = rdtscl();
>>     printf("Strlen %lld ret=%zd\n",(after - before),  ret);
>> }
>>
>>
>> llvm generated:
>>
>> ; Function Attrs: nounwind uwtable
>> define i32 @main() #0 {
>> entry:
>>   %0 = tail call { i32, i32 } asm sideeffect "rdtsc", "={ax},={dx},~{dirflag},~{fpsr},~{flags}"() #2, !srcloc !1
>>   %asmresult1.i = extractvalue { i32, i32 } %0, 1
>>   %conv2.i = zext i32 %asmresult1.i to i64
>>   %shl.i = shl nuw i64 %conv2.i, 32
>>   %asmresult.i = extractvalue { i32, i32 } %0, 0
>>   %conv.i = zext i32 %asmresult.i to i64
>>   %or.i = or i64 %shl.i, %conv.i
>>   %1 = tail call { i32, i32 } asm sideeffect "rdtsc", "={ax},={dx},~{dirflag},~{fpsr},~{flags}"() #2, !srcloc !1
>>   %asmresult.i.25 = extractvalue { i32, i32 } %1, 0
>>   %asmresult1.i.26 = extractvalue { i32, i32 } %1, 1
>>   %conv.i.27 = zext i32 %asmresult.i.25 to i64
>>   %conv2.i.28 = zext i32 %asmresult1.i.26 to i64
>>   %shl.i.29 = shl nuw i64 %conv2.i.28, 32
>>   %or.i.30 = or i64 %shl.i.29, %conv.i.27
>>   %sub = sub i64 %or.i.30, %or.i
>>   %call2 = tail call i32 (i8*, ...) @printf(i8* nonnull getelementptr inbounds ([21 x i8], [21 x i8]* @.str, i64 0, i64 0), i64 %sub, i64 4)
>>   %2 = tail call { i32, i32 } asm sideeffect "rdtsc", "={ax},={dx},~{dirflag},~{fpsr},~{flags}"() #2, !srcloc !1
>>   %asmresult1.i.32 = extractvalue { i32, i32 } %2, 1
>>   %conv2.i.34 = zext i32 %asmresult1.i.32 to i64
>>   %shl.i.35 = shl nuw i64 %conv2.i.34, 32
>>   br label %for.cond.5
>>
>> for.cond.5:                                       ; preds = %for.cond.5, %entry
>>   %i4.0 = phi i32 [ 0, %entry ], [ %inc10.18, %for.cond.5 ]
>>   %inc10.18 = add nsw i32 %i4.0, 19
>>   %exitcond.18 = icmp eq i32 %inc10.18, 1000000001
>>   br i1 %exitcond.18, label %for.cond.cleanup.7, label %for.cond.5
>>
>> for.cond.cleanup.7:                               ; preds = %for.cond.5
>>   %asmresult.i.31 = extractvalue { i32, i32 } %2, 0
>>   %conv.i.33 = zext i32 %asmresult.i.31 to i64
>>   %or.i.36 = or i64 %shl.i.35, %conv.i.33
>>   %3 = tail call { i32, i32 } asm sideeffect "rdtsc", "={ax},={dx},~{dirflag},~{fpsr},~{flags}"() #2, !srcloc !1
>>   %asmresult.i.37 = extractvalue { i32, i32 } %3, 0
>>   %asmresult1.i.38 = extractvalue { i32, i32 } %3, 1
>>   %conv.i.39 = zext i32 %asmresult.i.37 to i64
>>   %conv2.i.40 = zext i32 %asmresult1.i.38 to i64
>>   %shl.i.41 = shl nuw i64 %conv2.i.40, 32
>>   %or.i.42 = or i64 %shl.i.41, %conv.i.39
>>   %sub13 = sub i64 %or.i.42, %or.i.36
>>   %call14 = tail call i32 (i8*, ...) @printf(i8* nonnull getelementptr inbounds ([21 x i8], [21 x i8]* @.str, i64 0, i64 0), i64 %sub13, i64 5)
>>   ret i32 0
>> }
>>
>>
>> Am I missing something?
>>
>
> Definitely looks like a bug. It's a consequence of both loops using the
> same 'ret' variable. More specifically, the second loop body ends up with a
> phi for the value of 'ret', whose value is either 4 or 5 depending on
> whether the loop body is executed 0 times or more than 0 times, and the
> existence of that phi prevents us from removing the second loop. (The phi
> disappears once we remove the first loop, but we don't try to remove the
> second loop again afterwards.)
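
To make the phi concrete, here is a minimal sketch with the trip count turned into a parameter. The names ret_after_second_loop and trip_count are hypothetical and do not appear in the test program. The value of 'ret' after the loop is 4 when the body never runs and 5 otherwise, which is the two-way merge that the phi represents.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the original program): models the value
// of 'ret' after the second loop for an arbitrary trip count.
std::size_t ret_after_second_loop(int trip_count)
{
    // Value left behind by the first loop: strlen("abcd") is 4.
    std::size_t ret = std::strlen("abcd");

    // Second loop: overwrites 'ret' only if the body runs at least once.
    for (int i = 0; i < trip_count; i++)
        ret = sizeof("abcd");   // 5, because sizeof counts the trailing '\0'

    // 4 if the body ran zero times, 5 otherwise.
    return ret;
}
```

In the real program the trip count is the constant LOOP_COUNT, so only the value 5 is reachable, which matches the 'i64 5' passed to the second printf in the IR.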
>
> [The 'add 19' is a comical red herring; we unroll the loop 19 times (as 19
> is a largeish factor of the trip count), then fold all 19 'add 1'
> instructions into one, but this happens after simplifycfg has already tried
> and failed to remove the second loop.]
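
The arithmetic behind the 'add 19' can be checked directly. Below is a small sketch using the two constants visible in the IR. The names kStep, kExitValue and unrolled_iterations are made up for illustration. Stepping by 19 from 0 lands exactly on 1000000001, which is LOOP_COUNT + 1, so the equality test in the unrolled loop does terminate.

```cpp
#include <cstdint>

// Constants read off the IR in this thread; the names are hypothetical.
constexpr std::int64_t kStep = 19;               // add nsw i32 %i4.0, 19
constexpr std::int64_t kExitValue = 1000000001;  // icmp eq i32 %inc10.18, 1000000001

// 19 divides the exit value exactly, so stepping by 19 cannot skip over it.
static_assert(kExitValue % kStep == 0, "19 must divide the exit value");

// Counts how many times the unrolled loop block runs before the
// equality test against kExitValue succeeds.
std::int64_t unrolled_iterations()
{
    std::int64_t iterations = 0;
    std::int64_t i = 0;
    do {
        i += kStep;       // the folded result of 19 separate 'add 1' steps
        ++iterations;
    } while (i != kExitValue);
    return iterations;
}
```

The function returns 52631579, and 19 * 52631579 = 1000000001.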
>
>
>> --
>> Mats
>>
>> On 6 November 2015 at 16:10, mats petersson <mats at planetcatfish.com>
>> wrote:
>>
>>> Just to be clear: I was using -O2, and the "first loop" which has strlen
>>> in it was optimised to a constant value and no loop, the second loop, with
>>> sizeof, was not "removed".
>>>
>>> --
>>> Mats
>>>
>>> On 6 November 2015 at 14:30, Renato Golin via cfe-dev <
>>> cfe-dev at lists.llvm.org> wrote:
>>>
>>>> On 6 November 2015 at 14:25, Joerg Sonnenberger via cfe-dev
>>>> <cfe-dev at lists.llvm.org> wrote:
>>>> > Also, if you look at IR make sure to run the appropriate passes first.
>>>> > Clang output by default has very few optimisations in the -emit-llvm
>>>> > output.
>>>>
>>>> That's intentional. Try -O2 and your IR will look much nicer. :)
>>>>
>>>> --renato
>>>> _______________________________________________
>>>> cfe-dev mailing list
>>>> cfe-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>
>>>
>>>
>>
>>
>>
>